Closed
Labels
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
Description
I have installed the NVIDIA Container Toolkit and docker-ce on Ubuntu 20.04.
Whenever I restart the server, I get the following error when I run docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi, or any other Docker container that uses the GPU:
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
ERROR: Encountered errors while bringing up the project
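Since the error says libnvidia-ml.so.1 could not be loaded, one thing worth checking after a reboot is whether the host's dynamic linker cache lists that library at all. The snippet below is only a diagnostic sketch, not a fix: the function names lib_in_cache and report_nvml are made up for illustration, and it assumes ldconfig is available on the host.

```shell
#!/usr/bin/env bash
# Diagnostic sketch (hypothetical helper names, not part of any NVIDIA tool).

# Returns 0 if the dynamic linker cache lists the given library name.
lib_in_cache() {
    local lib="$1"
    # `ldconfig -p` prints the libraries currently in the linker cache.
    ldconfig -p 2>/dev/null | grep -qF -- "$lib"
}

# Prints whether the library named in the error message is visible on the host.
report_nvml() {
    if lib_in_cache "libnvidia-ml.so.1"; then
        echo "libnvidia-ml.so.1: found in linker cache"
    else
        echo "libnvidia-ml.so.1: NOT found in linker cache"
    fi
}

report_nvml
```

If the library is reported as found on the host but the container still fails, the problem is more likely in how Docker or the container runtime is set up than in the driver installation itself.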
I have to purge and reinstall the Docker packages for Docker to work properly with the GPU. I have written the following bash script, which I have to run on every startup for Docker to work with the NVIDIA Container Toolkit.
sudo apt-get -y purge docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-ce-rootless-extras
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove -y "$pkg"; done
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl restart docker
How can I resolve this so that I don't have to run the above bash script on every startup?