Description
Hi,
I installed a new Ubuntu 22.04 machine and performed the following actions.
- Installed the GPU driver as per the instructions here: https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/#ubuntu
- Installed Docker on Ubuntu: https://docs.docker.com/engine/install/ubuntu/
- Installed the NVIDIA Container Toolkit as per the following instructions: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
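The Container Toolkit guide linked in the last step also has a Docker configuration step after the package install. A minimal sketch for checking whether that step has been applied, assuming the daemon config lives at /etc/docker/daemon.json and that the configure step writes a runtime entry whose path is `nvidia-container-runtime`; the helper name `has_nvidia_runtime` is hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: check whether a Docker daemon config file references the NVIDIA runtime.

# has_nvidia_runtime [FILE] -> exit status 0 if FILE mentions nvidia-container-runtime.
# FILE defaults to /etc/docker/daemon.json (assumed location of the daemon config).
has_nvidia_runtime() {
    local cfg="${1:-/etc/docker/daemon.json}"
    [ -r "$cfg" ] && grep -q 'nvidia-container-runtime' "$cfg"
}

if has_nvidia_runtime; then
    echo "daemon.json references nvidia-container-runtime"
else
    # These two commands are the configure step from the install guide.
    echo "no NVIDIA runtime entry found; the guide's configure step may be missing:"
    echo "  sudo nvidia-ctk runtime configure --runtime=docker"
    echo "  sudo systemctl restart docker"
fi
```

A missing entry does not by itself explain the failure reported here, but it is a quick thing to rule out.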
uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=24.04
DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.2 LTS"
PRETTY_NAME="Ubuntu 24.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.2 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
nvidia-smi
Fri Apr 25 10:52:11 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.06             Driver Version: 570.124.06     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40S                    Off |   00000000:59:00.0 Off |                    0 |
| N/A   31C    P8             24W /  350W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L40S                    Off |   00000000:C2:00.0 Off |                    0 |
| N/A   32C    P8             24W /  350W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
export PATH=/usr/local/cuda-12.8/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0
root@pcai-grkr1:~#
docker run --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu22.04 nvidia-smi
Unable to find image 'nvidia/cuda:12.8.0-base-ubuntu22.04' locally
12.8.0-base-ubuntu22.04: Pulling from nvidia/cuda
6414378b6477: Pull complete
ad69d3880477: Pull complete
2d01ee89ef0b: Pull complete
7d21de8cade1: Pull complete
4b650590013c: Pull complete
Digest: sha256:12242992c121f6cab0ca11bccbaaf757db893b3065d7db74b933e59f321b2cf4
Status: Downloaded newer image for nvidia/cuda:12.8.0-base-ubuntu22.04
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
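The error says the loader could not open libnvidia-ml.so.1, even though nvidia-smi works on the host. One way to narrow this down is to check whether the library is listed in the dynamic linker cache. This is a minimal sketch, not a confirmed diagnosis; the helper name `lib_in_ldcache` is hypothetical, and it assumes `ldconfig` is installed (possibly under /sbin or /usr/sbin):

```shell
#!/usr/bin/env bash
# Sketch: report whether a shared library is listed in the dynamic linker cache.

# lib_in_ldcache NAME -> exit status 0 if NAME appears in the output of `ldconfig -p`.
lib_in_ldcache() {
    local lib="$1"
    # ldconfig -p prints the libraries recorded in the linker cache.
    # /sbin and /usr/sbin are appended because ldconfig is often not on a user's PATH.
    PATH="$PATH:/sbin:/usr/sbin" ldconfig -p 2>/dev/null | grep -F -q -- "$lib"
}

if lib_in_ldcache "libnvidia-ml.so.1"; then
    echo "libnvidia-ml.so.1: listed in the linker cache"
else
    echo "libnvidia-ml.so.1: NOT listed in the linker cache"
fi
```

If the library is listed here but the container hook still cannot load it, the hook may be running in an environment that sees a different filesystem or cache than this shell does.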
cat /var/log/nvidia-container-toolkit.log
-- WARNING, the following logs are for debugging purposes only --
I0425 10:59:51.731803 14273 nvc.c:393] initializing library context (version=1.15.0, build=6c8f1df7fd32cea3280cf2a2c6e931c9b3132465)
I0425 10:59:51.731853 14273 nvc.c:364] using root /
I0425 10:59:51.731859 14273 nvc.c:365] using ldcache /etc/ld.so.cache
I0425 10:59:51.731864 14273 nvc.c:366] using unprivileged user 65534:65534
I0425 10:59:51.731882 14273 nvc.c:410] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0425 10:59:51.732038 14273 nvc.c:412] dxcore initialization failed, continuing assuming a non-WSL environment
I0425 10:59:51.762413 14282 nvc.c:278] loading kernel module nvidia
I0425 10:59:51.762573 14282 nvc.c:282] running mknod for /dev/nvidiactl
I0425 10:59:51.762633 14282 nvc.c:286] running mknod for /dev/nvidia0
I0425 10:59:51.762658 14282 nvc.c:286] running mknod for /dev/nvidia1
I0425 10:59:51.762681 14282 nvc.c:290] running mknod for all nvcaps in /dev/nvidia-caps
I0425 10:59:51.767123 14282 nvc.c:218] running mknod for /dev/nvidia-caps/nvidia-cap1 from /proc/driver/nvidia/capabilities/mig/config
I0425 10:59:51.767210 14282 nvc.c:218] running mknod for /dev/nvidia-caps/nvidia-cap2 from /proc/driver/nvidia/capabilities/mig/monitor
I0425 10:59:51.768652 14282 nvc.c:301] loading kernel module nvidia_uvm
I0425 10:59:51.768672 14282 nvc.c:305] running mknod for /dev/nvidia-uvm
I0425 10:59:51.768720 14282 nvc.c:310] loading kernel module nvidia_modeset
I0425 10:59:51.768744 14282 nvc.c:314] running mknod for /dev/nvidia-modeset
I0425 10:59:51.769504 14283 rpc.c:71] starting driver rpc service
I0425 10:59:51.769958 14273 rpc.c:132] driver rpc service terminated with signal 15
I0425 10:59:51.770011 14273 nvc.c:452] shutting down library context
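Every line in the log above carries an `I` (info) severity prefix, so the log itself records no explicit error. For longer logs, a small filter can surface anything that is not info-level. This is a sketch under the assumption that warnings, errors, and fatals use `W`, `E`, and `F` prefixes in the same format; the helper name `non_info_lines` is hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: print only the non-info lines of a log that uses severity-letter prefixes.

# non_info_lines FILE -> print lines starting with W, E, or F followed by a 4-digit date.
# grep exits non-zero when no such line exists.
non_info_lines() {
    grep -E '^[WEF][0-9]{4} ' -- "$1"
}

log=/var/log/nvidia-container-toolkit.log
if [ -r "$log" ]; then
    non_info_lines "$log" || echo "no warning or error lines in $log"
else
    echo "$log is not readable here"
fi
```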