
nvidia-container-toolkit Missing from Bluefin DX NVIDIA Image #3560

@castrojo


Discussed in #3559

Originally posted by MCancian November 1, 2025
Sorry for the AI-assisted bug report, but I'm primarily a political scientist! I'm using GPU passthrough in a container to do my OCR work, but new processes that I start afterwards can't use the GPUs. Here is the Claude Code bug report:

Description

The nvidia-container-toolkit is not available on Bluefin DX NVIDIA variant, preventing proper GPU sharing between the host and Podman containers. This causes GPU access issues where host services cannot use the GPUs after containers with GPU passthrough are started, despite sufficient VRAM being available.

System Information

  • Distribution: Bluefin DX (Developer eXperience)
  • Variant: bluefin-dx-nvidia-open
  • Version: gts-41.20251019 (Silverblue)
  • Fedora Base: Fedora 41
  • Kernel: 6.16.8-100.fc41.x86_64
  • Podman Version: 5.6.2
  • Build ID: 282a7f2

GPU Configuration

  • GPU 0: NVIDIA RTX PRO 6000 Blackwell Workstation Edition (97887 MiB VRAM)
  • GPU 1: NVIDIA RTX PRO 6000 Blackwell Workstation Edition (97887 MiB VRAM)
  • Driver Version: 580.95.05 (Open kernel modules)
  • Compute Mode: Default (shared) on both GPUs

Problem

When using Podman containers with GPU passthrough via direct device mapping (--device=/dev/nvidia0, etc.), the GPUs become unavailable to host services even though:

  1. Both GPUs are in shared compute mode
  2. Significant VRAM remains available
  3. The container is only using resources from one GPU

This is a known limitation of direct device passthrough. The standard solution is to use nvidia-container-toolkit with CDI (Container Device Interface), which properly manages GPU contexts and allows sharing between host and containers.
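With the toolkit installed, the usual CDI flow is to generate a device specification once and then reference GPUs by name. A sketch of the setup steps, assuming nvidia-container-toolkit were available on the image:

```shell
# Generate a CDI specification describing the installed NVIDIA GPUs
# (requires nvidia-container-toolkit; writes /etc/cdi/nvidia.yaml)
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# List the device names the spec exposes,
# e.g. nvidia.com/gpu=0, nvidia.com/gpu=1, nvidia.com/gpu=all
nvidia-ctk cdi list
```

Podman 4.1+ reads CDI specs from /etc/cdi automatically, so no further runtime configuration is needed for rootful or rootless containers.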

Expected Behavior

The nvidia-container-toolkit package should be included in Bluefin DX NVIDIA images to enable proper GPU resource management with containers.

Current State

nvidia-container-toolkit not found:

$ which nvidia-ctk nvidia-container-toolkit nvidia-container-runtime
/usr/bin/which: no nvidia-ctk in (/path/to/bin...)
/usr/bin/which: no nvidia-container-toolkit in (/path/to/bin...)
/usr/bin/which: no nvidia-container-runtime in (/path/to/bin...)

An RPM query shows no container-toolkit runtime package installed:

$ rpm -qa | grep -i nvidia-container
(no output)

A DNF search shows only the golang devel package:

$ dnf search nvidia-container-toolkit
Matched fields: name
 golang-github-nvidia-container-toolkit-devel.noarch	Build and run containers leveraging NVIDIA GPUs

The actual runtime package (nvidia-container-toolkit) is not available in the enabled repositories.

Workaround Currently Using

The devcontainer configuration is currently forced to use direct device passthrough:

"runArgs": [
  "--device=/dev/nvidia0",
  "--device=/dev/nvidiactl",
  "--device=/dev/nvidia-uvm",
  "--security-opt", "label=disable",
  "--runtime=runc"
]

This works but prevents host GPU access while containers are running.

Proposed Solution

Include nvidia-container-toolkit in the Bluefin DX NVIDIA image, either:

  1. Pre-installed in the base image, or
  2. Available via rpm-ostree install nvidia-container-toolkit
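For option 2, layering the package on an rpm-ostree system would look like the following (this assumes the runtime package is added to a repository the image enables, which is the request of this issue):

```shell
# Layer the toolkit onto the immutable base image
sudo rpm-ostree install nvidia-container-toolkit

# Reboot into the new deployment to activate the layered package
systemctl reboot
```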

This would enable proper CDI-based GPU sharing:

"runArgs": [
  "--device=nvidia.com/gpu=all",
  "--security-opt", "label=disable"
]
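As a quick check outside the devcontainer, the equivalent podman invocation would be something like the sketch below (assuming the CDI spec has already been generated; the CUDA image tag is illustrative):

```shell
# Run nvidia-smi in a container with all GPUs injected via CDI;
# unlike raw --device=/dev/nvidia0 passthrough, the host keeps
# normal access to the devices
podman run --rm \
  --device nvidia.com/gpu=all \
  --security-opt label=disable \
  docker.io/nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```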

Impact

This issue affects:

  • Developers running ML/AI workloads in containers while needing host GPU access
  • Users with multiple GPUs who want efficient resource utilization
  • DevContainer users following NVIDIA's recommended Podman GPU practices

Additional Context

  • NVIDIA devices are present and working: /dev/nvidia0, /dev/nvidia1, /dev/nvidiactl, /dev/nvidia-uvm
  • Driver installation is correct (via ublue-os-nvidia-addons)
  • This is specifically about container runtime integration, not driver issues


Labels: dx (Developer Experience Image specific), enhancement (New feature or request)
