I agree with the blog post that using K8s + containers for GPU virtualization is a security disaster waiting to happen. Even if you configure your container right (which is extremely hard to do), you don't get seccomp-bpf.
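For readers unfamiliar: seccomp-bpf is the mechanism container runtimes normally use to restrict a workload's syscall surface, via a JSON allowlist profile roughly like this (an illustrative fragment, not a complete or working profile):

```json
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {
      "names": ["read", "write", "openat", "close", "mmap", "ioctl"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
```

The catch is that GPU workloads need broad, driver-specific ioctl access to the /dev/nvidia* devices, and seccomp filters syscall numbers and flat arguments, not driver semantics — so GPU pods typically end up privileged or with seccomp effectively disabled.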

People started using K8s for training, where you already had a network-isolated cluster. Extending the K8s+container pattern to multi-tenant environments is scary at best.

I didn't understand the following part though.

> Instead, we burned months trying (and ultimately failing) to get Nvidia’s host drivers working to map virtualized GPUs into Intel Cloud Hypervisor.

Why was this part so hard? Doing PCI passthrough with the Cloud Hypervisor (CH) is relatively common. Was it the transition from Firecracker to CH that was tricky?
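For reference, whole-device VFIO passthrough to Cloud Hypervisor is the well-trodden path. A minimal sketch (the PCI address 0000:01:00.0 and file names are hypothetical; requires root, an enabled IOMMU, and the vfio-pci module):

```shell
# Unbind the GPU from whatever driver currently owns it
echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind

# Bind it to vfio-pci via driver_override
echo vfio-pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
echo 0000:01:00.0 > /sys/bus/pci/drivers/vfio-pci/bind

# Hand the whole device to the guest with Cloud Hypervisor's --device flag
cloud-hypervisor \
    --kernel vmlinux \
    --disk path=rootfs.img \
    --cpus boot=4 --memory size=8G \
    --device path=/sys/bus/pci/devices/0000:01:00.0/
```

Presumably the months they burned were not on this, but on Nvidia's vGPU host drivers, which slice one physical GPU into several virtual devices — a far less documented path than passing through the whole card.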

This actually brings up an interesting point. Kubernetes is nothing more than an API interface. Should someone be working on building a multi-tenant Kubernetes (so that customers don't need to manage nodes or clusters) which enforces VM-level security (obviously you cannot safely co-locate multiple tenants' containers on the same VM)?
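For what it's worth, the closest existing building block is Kubernetes' RuntimeClass plus a VM-backed runtime like Kata Containers, where each pod gets its own lightweight VM. A sketch (assumes a node with a kata handler configured in the CRI runtime; the image name is made up):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata          # must match the runtime configured in containerd/CRI-O
---
apiVersion: v1
kind: Pod
metadata:
  name: tenant-a-workload
spec:
  runtimeClassName: kata   # pod runs inside its own microVM, not on a shared kernel
  containers:
  - name: app
    image: registry.example.com/tenant-a/app:latest
```

That gets you VM-level isolation per pod, but it doesn't make the control plane itself multi-tenant — which is arguably the harder part of what you're describing.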