Error running NVIDIA Container with Docker on Ubuntu 22.04 #1051

Hi,

I set up a new Ubuntu 22.04 machine and performed the following actions.

1. Installed the GPU driver following the instructions here: https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/#ubuntu
2. Installed Docker on Ubuntu: https://docs.docker.com/engine/install/ubuntu/
3. Installed the NVIDIA Container Toolkit following these instructions: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
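
A quick sanity check for the three installs above is to confirm that each command-line tool is on `PATH`. This is only a minimal sketch: the helper name `check_tool` is hypothetical, and the list of binaries assumes the default names shipped by the driver, Docker, and the NVIDIA Container Toolkit packages.

```shell
#!/bin/sh
# Print whether a command is available on PATH; return 1 when it is missing.
# check_tool is a hypothetical helper name used only for this illustration.
check_tool() {
    if tool_path=$(command -v "$1" 2>/dev/null); then
        echo "found: $1 ($tool_path)"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumed default binary names for the driver, Docker, and the container toolkit.
for tool in nvidia-smi docker nvidia-ctk nvidia-container-cli; do
    check_tool "$tool" || true
done
```

A "missing" line only shows that the binary is not on the current `PATH`; it does not by itself prove the package is not installed.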

**uname -m && cat /etc/*release***
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=24.04
DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.2 LTS"
PRETTY_NAME="Ubuntu 24.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.2 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

**nvidia-smi**
Fri Apr 25 10:52:11 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.06             Driver Version: 570.124.06     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40S                    Off |   00000000:59:00.0 Off |                    0 |
| N/A   31C    P8             24W /  350W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L40S                    Off |   00000000:C2:00.0 Off |                    0 |
| N/A   32C    P8             24W /  350W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

**export PATH=/usr/local/cuda-12.8/bin${PATH:+:${PATH}}**
**export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}**
**nvcc --version**
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0
root@pcai-grkr1:~#


**docker run --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu22.04 nvidia-smi**
Unable to find image 'nvidia/cuda:12.8.0-base-ubuntu22.04' locally
12.8.0-base-ubuntu22.04: Pulling from nvidia/cuda
6414378b6477: Pull complete
ad69d3880477: Pull complete
2d01ee89ef0b: Pull complete
7d21de8cade1: Pull complete
4b650590013c: Pull complete
Digest: sha256:12242992c121f6cab0ca11bccbaaf757db893b3065d7db74b933e59f321b2cf4
Status: Downloaded newer image for nvidia/cuda:12.8.0-base-ubuntu22.04
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
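
Since the error says `libnvidia-ml.so.1` cannot be opened, one way to narrow it down is to check whether the host's dynamic linker can see the library at all. The following is a minimal sketch, not a step from the toolkit documentation: the helper name `lib_in_listing` is hypothetical, and the directories passed to `find` are assumptions about where the driver libraries might be installed.

```shell
#!/bin/sh
# Succeed when the library name given as $1 appears in the listing read from stdin.
# lib_in_listing is a hypothetical helper name used only for this illustration.
lib_in_listing() {
    grep -F -q -- "$1"
}

LIB="libnvidia-ml.so.1"

# 1. Is the library registered in the dynamic linker cache?
if ldconfig -p 2>/dev/null | lib_in_listing "$LIB"; then
    echo "$LIB is in the linker cache"
else
    echo "$LIB is NOT in the linker cache"
fi

# 2. Is the file present on disk at all? The search roots below are assumptions.
find /usr/lib /usr/lib64 /usr/local -name "${LIB}*" 2>/dev/null || true
```

If the file exists on disk but is not in the cache, running `ldconfig` as root rebuilds the cache; if it is not on disk at all, the driver's user-space libraries are probably not fully installed.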


**cat /var/log/nvidia-container-toolkit.log**

-- WARNING, the following logs are for debugging purposes only --

I0425 10:59:51.731803 14273 nvc.c:393] initializing library context (version=1.15.0, build=6c8f1df7fd32cea3280cf2a2c6e931c9b3132465)
I0425 10:59:51.731853 14273 nvc.c:364] using root /
I0425 10:59:51.731859 14273 nvc.c:365] using ldcache /etc/ld.so.cache
I0425 10:59:51.731864 14273 nvc.c:366] using unprivileged user 65534:65534
I0425 10:59:51.731882 14273 nvc.c:410] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0425 10:59:51.732038 14273 nvc.c:412] dxcore initialization failed, continuing assuming a non-WSL environment
I0425 10:59:51.762413 14282 nvc.c:278] loading kernel module nvidia
I0425 10:59:51.762573 14282 nvc.c:282] running mknod for /dev/nvidiactl
I0425 10:59:51.762633 14282 nvc.c:286] running mknod for /dev/nvidia0
I0425 10:59:51.762658 14282 nvc.c:286] running mknod for /dev/nvidia1
I0425 10:59:51.762681 14282 nvc.c:290] running mknod for all nvcaps in /dev/nvidia-caps
I0425 10:59:51.767123 14282 nvc.c:218] running mknod for /dev/nvidia-caps/nvidia-cap1 from /proc/driver/nvidia/capabilities/mig/config
I0425 10:59:51.767210 14282 nvc.c:218] running mknod for /dev/nvidia-caps/nvidia-cap2 from /proc/driver/nvidia/capabilities/mig/monitor
I0425 10:59:51.768652 14282 nvc.c:301] loading kernel module nvidia_uvm
I0425 10:59:51.768672 14282 nvc.c:305] running mknod for /dev/nvidia-uvm
I0425 10:59:51.768720 14282 nvc.c:310] loading kernel module nvidia_modeset
I0425 10:59:51.768744 14282 nvc.c:314] running mknod for /dev/nvidia-modeset
I0425 10:59:51.769504 14283 rpc.c:71] starting driver rpc service
I0425 10:59:51.769958 14273 rpc.c:132] driver rpc service terminated with signal 15
I0425 10:59:51.770011 14273 nvc.c:452] shutting down library context



