Description
Hello everyone,
we recently set up a rootless Docker instance alongside our existing Docker installation on one of our servers, but ran into issues mounting host GPUs into the rootless containers. A workaround was presented in issue #85 (toggling no-cgroups to switch between rootful and rootless), with a mention of a better solution, NVIDIA CDI, arriving as an experimental feature in Docker 25.
After updating to the newest Docker releases and setting up CDI, our regular Docker instance behaved as we expected based on the documentation, but the rootless instance still runs into issues.
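For reference, Docker 25+'s native CDI support is gated behind an experimental feature flag in daemon.json (for the rootless daemon this file lives under ~/.config/docker/daemon.json rather than /etc/docker/daemon.json). A minimal sketch of that step, not necessarily our exact configuration:
{
  "features": {
    "cdi": true
  }
}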
Setup to reproduce:
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
NVIDIA Container Toolkit CLI version 1.14.6
commit: 5605d191332dcfeea802c4497360d60a65c7887e
rootless: containerd github.com/containerd/containerd v1.7.13 7c3aca7a610df76212171d200ca3811ff6096eb8
rootful: containerd containerd.io 1.6.28 ae07eda36dd25f8a1b98dfbf587313b99c0190bb
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07 Driver Version: 535.161.07 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100-SXM4-40GB On | 00000000:01:00.0 Off | 0 |
| N/A 40C P0 61W / 275W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA A100-SXM4-40GB On | 00000000:47:00.0 Off | 0 |
| N/A 39C P0 55W / 275W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 2 NVIDIA A100-SXM4-40GB On | 00000000:81:00.0 Off | 0 |
| N/A 39C P0 57W / 275W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 3 NVIDIA DGX Display On | 00000000:C1:00.0 Off | N/A |
| 34% 41C P8 N/A / 50W | 1MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 4 NVIDIA A100-SXM4-40GB On | 00000000:C2:00.0 Off | 0 |
| N/A 39C P0 58W / 275W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
config.toml:
#accept-nvidia-visible-devices-as-volume-mounts = false
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
[nvidia-container-cli]
#debug = "/var/log/nvidia-container-toolkit.log"
environment = []
#ldcache = "/etc/ld.so.cache"
ldconfig = "@/sbin/ldconfig.real"
load-kmods = true
#no-cgroups = true
#no-cgroups = false
#path = "/usr/bin/nvidia-container-cli"
#root = "/run/nvidia/driver"
#user = "root:video"
[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc"]
[nvidia-container-runtime.modes]
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
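A note on the runtime mode: the mode = "auto" setting above lets the nvidia runtime choose between legacy and CDI handling at run time. If auto-detection ever picks the wrong mode, CDI can presumably be forced with a one-line change (a sketch, we have not needed this on the rootful side):
[nvidia-container-runtime]
mode = "cdi"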
- sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia.yaml:
cdiVersion: 0.5.0
containerEdits:
deviceNodes:
- path: /dev/nvidia-modeset
- path: /dev/nvidia-uvm
- path: /dev/nvidia-uvm-tools
- path: /dev/nvidiactl
hooks:
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- libglxserver_nvidia.so.535.161.07::/lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- update-ldcache
- --folder
- /lib/x86_64-linux-gnu
hookName: createContainer
path: /usr/bin/nvidia-ctk
mounts:
- containerPath: /lib/x86_64-linux-gnu/libEGL_nvidia.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libEGL_nvidia.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libGLESv2_nvidia.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libGLESv2_nvidia.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libGLX_nvidia.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libGLX_nvidia.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libcuda.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libcuda.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libcudadebugger.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libcudadebugger.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvcuvid.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvcuvid.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-allocator.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-allocator.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-cfg.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-cfg.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1.1.0
hostPath: /lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1.1.0
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-encode.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-encode.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-fbc.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-fbc.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-glsi.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-glsi.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-ml.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-ml.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-ngx.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-ngx.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-nscq.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-nscq.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-nvvm.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-nvvm.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-opencl.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-opencl.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-opticalflow.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-opticalflow.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-rtcore.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-rtcore.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-tls.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-tls.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvidia-vulkan-producer.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvidia-vulkan-producer.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/libnvoptix.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/libnvoptix.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /run/nvidia-persistenced/socket
hostPath: /run/nvidia-persistenced/socket
options:
- ro
- nosuid
- nodev
- bind
- noexec
- containerPath: /usr/bin/nvidia-cuda-mps-control
hostPath: /usr/bin/nvidia-cuda-mps-control
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/bin/nvidia-cuda-mps-server
hostPath: /usr/bin/nvidia-cuda-mps-server
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/bin/nvidia-debugdump
hostPath: /usr/bin/nvidia-debugdump
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/bin/nvidia-persistenced
hostPath: /usr/bin/nvidia-persistenced
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/bin/nvidia-smi
hostPath: /usr/bin/nvidia-smi
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/share/nvidia/nvoptix.bin
hostPath: /usr/share/nvidia/nvoptix.bin
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/firmware/nvidia/535.161.07/gsp_ga10x.bin
hostPath: /lib/firmware/nvidia/535.161.07/gsp_ga10x.bin
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/firmware/nvidia/535.161.07/gsp_tu10x.bin
hostPath: /lib/firmware/nvidia/535.161.07/gsp_tu10x.bin
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so.535.161.07
hostPath: /lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so.535.161.07
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
hostPath: /lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/share/X11/xorg.conf.d/10-nvidia.conf
hostPath: /usr/share/X11/xorg.conf.d/10-nvidia.conf
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json
hostPath: /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/share/glvnd/egl_vendor.d/10_nvidia.json
hostPath: /usr/share/glvnd/egl_vendor.d/10_nvidia.json
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/share/vulkan/icd.d/nvidia_icd.json
hostPath: /usr/share/vulkan/icd.d/nvidia_icd.json
options:
- ro
- nosuid
- nodev
- bind
- containerPath: /usr/share/vulkan/implicit_layer.d/nvidia_layers.json
hostPath: /usr/share/vulkan/implicit_layer.d/nvidia_layers.json
options:
- ro
- nosuid
- nodev
- bind
devices:
- containerEdits:
deviceNodes:
- path: /dev/nvidia4
- path: /dev/dri/card5
- path: /dev/dri/renderD132
hooks:
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- ../card5::/dev/dri/by-path/pci-0000:01:00.0-card
- --link
- ../renderD132::/dev/dri/by-path/pci-0000:01:00.0-render
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- chmod
- --mode
- "755"
- --path
- /dev/dri
hookName: createContainer
path: /usr/bin/nvidia-ctk
name: "0"
- containerEdits:
deviceNodes:
- path: /dev/nvidia3
- path: /dev/dri/card4
- path: /dev/dri/renderD131
hooks:
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- ../card4::/dev/dri/by-path/pci-0000:47:00.0-card
- --link
- ../renderD131::/dev/dri/by-path/pci-0000:47:00.0-render
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- chmod
- --mode
- "755"
- --path
- /dev/dri
hookName: createContainer
path: /usr/bin/nvidia-ctk
name: "1"
- containerEdits:
deviceNodes:
- path: /dev/nvidia2
- path: /dev/dri/card3
- path: /dev/dri/renderD130
hooks:
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- ../card3::/dev/dri/by-path/pci-0000:81:00.0-card
- --link
- ../renderD130::/dev/dri/by-path/pci-0000:81:00.0-render
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- chmod
- --mode
- "755"
- --path
- /dev/dri
hookName: createContainer
path: /usr/bin/nvidia-ctk
name: "2"
- containerEdits:
deviceNodes:
- path: /dev/nvidia1
- path: /dev/dri/card2
- path: /dev/dri/renderD129
hooks:
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- ../card2::/dev/dri/by-path/pci-0000:c2:00.0-card
- --link
- ../renderD129::/dev/dri/by-path/pci-0000:c2:00.0-render
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- chmod
- --mode
- "755"
- --path
- /dev/dri
hookName: createContainer
path: /usr/bin/nvidia-ctk
name: "4"
- containerEdits:
deviceNodes:
- path: /dev/nvidia1
- path: /dev/nvidia2
- path: /dev/nvidia3
- path: /dev/nvidia4
- path: /dev/dri/card2
- path: /dev/dri/card3
- path: /dev/dri/card4
- path: /dev/dri/card5
- path: /dev/dri/renderD129
- path: /dev/dri/renderD130
- path: /dev/dri/renderD131
- path: /dev/dri/renderD132
hooks:
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- ../card5::/dev/dri/by-path/pci-0000:01:00.0-card
- --link
- ../renderD132::/dev/dri/by-path/pci-0000:01:00.0-render
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- chmod
- --mode
- "755"
- --path
- /dev/dri
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- ../card4::/dev/dri/by-path/pci-0000:47:00.0-card
- --link
- ../renderD131::/dev/dri/by-path/pci-0000:47:00.0-render
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- ../card3::/dev/dri/by-path/pci-0000:81:00.0-card
- --link
- ../renderD130::/dev/dri/by-path/pci-0000:81:00.0-render
hookName: createContainer
path: /usr/bin/nvidia-ctk
- args:
- nvidia-ctk
- hook
- create-symlinks
- --link
- ../card2::/dev/dri/by-path/pci-0000:c2:00.0-card
- --link
- ../renderD129::/dev/dri/by-path/pci-0000:c2:00.0-render
hookName: createContainer
path: /usr/bin/nvidia-ctk
name: all
kind: nvidia.com/gpu
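The devices the spec resolves to can be listed with the toolkit CLI; presumably the output below comes from this command:
$ nvidia-ctk cdi list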
INFO[0000] Found 5 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=1
nvidia.com/gpu=2
nvidia.com/gpu=4
nvidia.com/gpu=all
- Rootful Docker version 26.0.0, build 2ae903e
- Rootless Docker version 26.0.0, build 2ae903e (install script)
The issue:
When no-cgroups = false, CDI injection works fine for the regular (rootful) Docker instance:
$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-b6022b4d-71db-8f15-15de-26a719f6b3e1)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-22420f7d-6edb-e44a-c322-4ce539cade19)
GPU 2: NVIDIA A100-SXM4-40GB (UUID: GPU-5e3444e2-8577-0e99-c6ee-72f6eb2bd28c)
GPU 3: NVIDIA A100-SXM4-40GB (UUID: GPU-dd1f811d-a280-7e2e-bf7e-b84f7a977cc1)
but produces the following error for the rootless instance:
$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.
Running docker run --rm --gpus all ubuntu nvidia-smi on the rootless instance results in the same error as before CDI was set up. This seems to be consistent across all variations listed on the Specialized Configurations for Docker page:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: failed to add device rules: unable to find any existing device filters attached to the cgroup: bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation not permitted: unknown.
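For context, the #85 workaround referenced above boils down to toggling a single line in /etc/nvidia-container-runtime/config.toml (shown commented out in the config above):
[nvidia-container-cli]
no-cgroups = true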
Interestingly, setting no-cgroups = true disables the regular use of GPUs with rootful Docker:
$ docker run --rm --gpus all ubuntu nvidia-smi
Failed to initialize NVML: Unknown Error
but still allows for CDI injections:
$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-b6022b4d-71db-8f15-15de-26a719f6b3e1)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-22420f7d-6edb-e44a-c322-4ce539cade19)
GPU 2: NVIDIA A100-SXM4-40GB (UUID: GPU-5e3444e2-8577-0e99-c6ee-72f6eb2bd28c)
GPU 3: NVIDIA A100-SXM4-40GB (UUID: GPU-dd1f811d-a280-7e2e-bf7e-b84f7a977cc1)
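As an aside, with Docker's native CDI support enabled, the same injection should presumably also work without going through the nvidia runtime, using the --device syntax (untested on our side):
$ docker run --rm --device nvidia.com/gpu=all ubuntu nvidia-smi -L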
With control groups disabled, the rootless daemon is able to use exposed GPUs as outlined in the Docker docs:
$ docker run -it --rm --gpus '"device=0,2"' ubuntu nvidia-smi
Mon Apr 1 16:33:52 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07 Driver Version: 535.161.07 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100-SXM4-40GB Off | 00000000:01:00.0 Off | 0 |
| N/A 37C P0 60W / 275W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA A100-SXM4-40GB Off | 00000000:81:00.0 Off | 0 |
| N/A 36C P0 56W / 275W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
TL;DR
Disabling cgroups allows rootless containers to use exposed GPUs via the regular docker run --gpus flag, but this in turn disables GPU access for rootful containers. Leaving cgroups enabled reverses the effect, as outlined in #85.
With cgroups disabled and NVIDIA CDI in use, rootful Docker can still use CDI injection even though its regular GPU access is barred, while rootless containers can use the exposed GPUs. CDI injection for rootless fails either way, however.
This seems like a definite improvement, but I'm not sure it is intended behavior. The fact that CDI injection fails with rootless regardless of the cgroup setting leads me to believe this is unintended, unless rootless is not yet supported by NVIDIA CDI.
Any insights would be greatly appreciated!