
GitHub Issue: GPU Integration in k3s Using nvidia-open Drivers on RTX 5080 (Containerd + No Proprietary Runtime) #1031

@vialianit

Description

🧩 Summary
I'm running a single-node k3s cluster on a workstation with an NVIDIA RTX 5080 GPU using the open-source drivers (nvidia-open) on Arch Linux. I want to run vLLM, Whisper, and other CUDA-dependent workloads through Kubernetes with full GPU acceleration.

The key challenge:
📛 nvidia-container-runtime does not support nvidia-open, but nvidia-container-toolkit does detect the GPU and CUDA correctly.

🛠 System Setup
OS: Arch Linux, kernel 6.x

GPU: NVIDIA GeForce RTX 5080

Driver: nvidia-open (NOT proprietary)

CUDA: 12.8 (confirmed via nvidia-container-cli info)

k3s: latest (uses containerd internally)

containerd: working, default config generated and customized

NVIDIA container toolkit: installed and working

No nvidia runtime detected via ctr plugins ls
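
For reference, the checks behind the points in this list (CUDA confirmed via nvidia-container-cli info, no nvidia runtime visible via ctr plugins ls) were roughly the following; the k3s ctr wrapper and socket path are the k3s defaults and may differ on other installs:

```sh
# Host-side driver / toolkit sanity checks (nvidia-open stack)
nvidia-smi
nvidia-container-cli info            # reports CUDA 12.8 on this machine

# k3s bundles its own containerd; inspect it via the k3s ctr wrapper
k3s ctr plugins ls | grep -i nvidia
# or point a standalone ctr at k3s's socket:
# ctr --address /run/k3s/containerd/containerd.sock plugins ls
```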

✅ What Works
nvidia-smi inside privileged containers (via Docker or pod with hostPath + privileged)

Running LLMs via docker compose with --gpus all

Custom edits to config.toml.tmpl in k3s, confirmed preserved across restarts (a sketch of the edit follows this list)

/dev/nvidia* and /proc/driver/nvidia are visible inside privileged pods
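
The config.toml.tmpl edit mentioned above is essentially the standard way to hand an extra runtime handler to k3s's embedded containerd. A minimal sketch, assuming nvidia-container-runtime lives at /usr/bin/nvidia-container-runtime and containerd 1.x CRI section names (containerd 2.x uses different plugin paths, and older k3s versions without the "base" template need the full generated config copied into the template instead):

```toml
# /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
{{ template "base" . }}

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia"]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia".options]
  BinaryName = "/usr/bin/nvidia-container-runtime"
```

Recent k3s releases also try to detect nvidia-container-runtime on PATH at startup and add an equivalent runtime entry automatically, so it may be worth checking the generated config.toml before maintaining a custom template.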

❌ What Fails
nvidia-container-runtime does not register with containerd

nvidia-device-plugin DaemonSet fails with "network plugin not initialized" (after config changes)

No nvidia.com/gpu resource exposed to Kubernetes (a smoke-test manifest for this follows the list)

ctr plugins ls | grep nvidia → empty
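
Once a runtime handler named nvidia is registered with containerd and the device plugin advertises nvidia.com/gpu, the failing path above would be exercised by something like the following smoke test (a sketch only; the CUDA image tag is an assumption, use whatever base image matches the driver):

```yaml
# RuntimeClass pointing at the "nvidia" handler from containerd's config
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
---
# Smoke-test pod: should only schedule if nvidia.com/gpu is actually advertised
apiVersion: v1
kind: Pod
metadata:
  name: cuda-smoke-test
spec:
  runtimeClassName: nvidia
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.8.0-base-ubuntu24.04   # assumption: adjust to your setup
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
```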

🤔 Questions to the Community
Is it currently possible to use nvidia-open drivers with k3s + containerd + Kubernetes GPU support in any official or semi-official way?

Has anyone successfully registered a GPU runtime with containerd using nvidia-open?

Would you recommend sticking with Docker Compose + --gpus all (Compose sketch after this list) until nvidia-container-runtime gains support for nvidia-open?

Is there an official roadmap for nvidia-container-runtime to support nvidia-open?

Any known working workarounds other than privileged pods and hostPath volumes?
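
On question 3, the Docker Compose fallback is just the Compose-file equivalent of --gpus all; a sketch, assuming Compose v2 and an arbitrary vLLM serving image:

```yaml
# docker-compose.yml (fallback path, outside Kubernetes)
services:
  vllm:
    image: vllm/vllm-openai:latest   # assumption: any CUDA-based inference image is wired up the same way
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```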

🔧 Additional Context
We're developing a local GPU inference platform (LLMs, Whisper, image models) using Portainer and Rancher, exploring Kubernetes orchestration in the long term. Right now, we're considering falling back to Docker Compose to keep moving, unless there's a viable way to unlock GPU scheduling within k3s using the open drivers.

💬 Would Appreciate
Community confirmations / working examples

Links to related PRs or tracking issues

Recommendations for better architecture

Thanks!
