nvidia-ctk doesn't respect spec-dir override #939

@antverpp

$ nvidia-ctk --version
NVIDIA Container Toolkit CLI version 1.17.4
commit: 9b69590

$ nvidia-ctk config
disable-require = false
supported-driver-capabilities = "compat32,compute,display,graphics,ngx,utility,video"
[nvidia-container-cli]
environment = []
ldconfig = "@/sbin/ldconfig"
load-kmods = true
no-cgroups = true
[nvidia-container-runtime]
debug = "/app/home/podman/.local/nvidia-container-runtime.log"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc", "crun"]
[nvidia-container-runtime.modes]
[nvidia-container-runtime.modes.cdi]
annotation-prefixes = ["cdi.k8s.io/"]
default-kind = "nvidia.com/gpu"
spec-dirs = ["/etc/cdi", "/var/run/cdi", "/app/home/podman/cdi"]
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
[nvidia-container-runtime-hook]
path = "nvidia-container-runtime-hook"
skip-mode-detection = false
[nvidia-ctk]
path = "nvidia-ctk"
$ nvidia-ctk cdi list --spec-dir "/app/home/podman/cdi"
INFO[0000] Found 5 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=1
nvidia.com/gpu=GPU-a69be7f2-776f-034f-d202-ad700be58eac
nvidia.com/gpu=GPU-b6705994-caa0-dad7-095d-554686d20f12
nvidia.com/gpu=all
$ nvidia-ctk cdi list
INFO[0000] Found 0 CDI devices

Problem: as a non-privileged user, I ran nvidia-ctk cdi generate and put the resulting nvidia.yml in a custom directory.
I set XDG_CONFIG_HOME to my home directory and created nvidia-container-runtime there, with a config.toml that redefines spec-dirs with custom values.
Running nvidia-ctk config shows that the configuration is at least read correctly from my custom config file.
The nvidia.yml file is in /app/home/podman/cdi.
However, nvidia-ctk cdi list shows nothing.
It does show the devices when I run it with the --spec-dir flag, so it definitely should work.
I expect nvidia-ctk cdi list to show the devices without the additional flag, since spec-dirs is redefined in the config.
What else am I missing here?
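
Until the config value is honored, one possible workaround is to read spec-dirs from the config file and pass each entry explicitly. The sketch below is only an illustration: the helper names spec_dirs_from_config and spec_dir_flags are hypothetical, the parsing assumes the spec-dirs array sits on a single line (as in the nvidia-ctk config output above) and that no path contains whitespace, and it assumes --spec-dir can be given more than once, which I have not verified.

```shell
#!/bin/sh
# Hypothetical helper: print each entry of the first `spec-dirs = [...]`
# line in a config.toml, one path per line.
# Assumes the whole array is written on a single line.
spec_dirs_from_config() {
    grep -E '^[[:space:]]*spec-dirs[[:space:]]*=' "$1" \
        | head -n 1 \
        | sed -e 's/^[^[]*\[//' -e 's/\].*$//' \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*"//' -e 's/"[[:space:]]*$//' -e '/^$/d'
}

# Hypothetical helper: turn those paths into repeated --spec-dir flags.
# Assumes paths contain no whitespace and that the flag may be repeated.
spec_dir_flags() {
    flags=""
    for dir in $(spec_dirs_from_config "$1"); do
        flags="$flags --spec-dir $dir"
    done
    # Drop the leading space left by the loop.
    printf '%s\n' "${flags# }"
}
```

With the setup described above, this would be used as: nvidia-ctk cdi list $(spec_dir_flags "$XDG_CONFIG_HOME/nvidia-container-runtime/config.toml"). The command substitution is left unquoted on purpose so that each flag and path becomes a separate argument.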

Labels: lifecycle/stale (denotes an issue or PR that has remained open with no activity and has become stale)
