Description
After driver installation finishes, pods in the GPU Operator rely on the container toolkit, which starts on the node, to set up the NVIDIA container runtime:
Download the NVIDIA container runtime, hooks, and container toolkit CLI (nvidia-ctk), and copy them from the container to /run/nvidia/toolkit on the host. [link]
Update the containerd config file according to the container runtime in use, such as nvidia or nvidia-cdi.
Generate the CDI spec for management containers if the runtime is nvidia-cdi.
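The three steps above can be sketched as a small planning function. This is only an illustration of the ordering and of the nvidia-cdi condition, not the toolkit's actual implementation: the function name `plan_toolkit_setup` and the step strings are hypothetical, and only the host path and runtime names come from the description.

```python
# Host directory named in the description; everything else here is illustrative.
TOOLKIT_ROOT = "/run/nvidia/toolkit"


def plan_toolkit_setup(runtime: str, toolkit_root: str = TOOLKIT_ROOT) -> list[str]:
    """Return the ordered setup steps for a given container runtime name.

    Hypothetical helper: it only describes the steps, it does not run them.
    """
    steps = [
        # Step 1: binaries are copied from the toolkit container to the host.
        f"copy container runtime, hooks, and nvidia-ctk to {toolkit_root}",
        # Step 2: containerd is configured for the selected runtime.
        f"update containerd config for runtime '{runtime}'",
    ]
    # Step 3 applies only when the runtime is nvidia-cdi.
    if runtime == "nvidia-cdi":
        steps.append("generate CDI spec for management containers")
    return steps


if __name__ == "__main__":
    for step in plan_toolkit_setup("nvidia-cdi"):
        print(step)
```

Calling it with `"nvidia"` yields the first two steps only, while `"nvidia-cdi"` adds the CDI spec generation step.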
The toolkit is a necessary component of the GPU Operator. To make it work on COS, we need to:
Support the necessary container toolkit binaries on the COS platform. So far, the container runtime, hooks, and nvidia-ctk are not yet supported there (supported platform lists).
Starting from COS 109, nvidia-ctk is pre-built in COS. However, in the current state (intermediate CDI mode), the nvidia container runtime (nvidia-cdi) binaries are still required. For legacy mode support, the nvidia container runtime (nvidia-legacy) and its hooks are also required.
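As a rough sketch of the requirement above, the following computes which components would still have to be provided on COS for a given mode. The names `REQUIRED_BY_MODE` and `components_to_provide` are hypothetical, the strings are descriptive labels rather than real file names, and listing nvidia-ctk as a CDI-mode requirement is an assumption read from the description.

```python
# Descriptive labels paraphrasing the description; these are not real file names.
REQUIRED_BY_MODE = {
    # Assumption: CDI mode uses nvidia-ctk in addition to the nvidia-cdi runtime.
    "cdi": {"nvidia-ctk", "nvidia container runtime (nvidia-cdi)"},
    "legacy": {
        "nvidia container runtime (nvidia-legacy)",
        "nvidia container runtime hooks",
    },
}

# Per the description, nvidia-ctk is pre-built in COS starting from COS 109.
PREBUILT_FROM_COS_109 = {"nvidia-ctk"}


def components_to_provide(mode: str, cos_milestone: int) -> set[str]:
    """Return the components that COS does not already ship for this mode."""
    required = REQUIRED_BY_MODE[mode]
    prebuilt = PREBUILT_FROM_COS_109 if cos_milestone >= 109 else set()
    return required - prebuilt


if __name__ == "__main__":
    print(sorted(components_to_provide("cdi", 109)))
    print(sorted(components_to_provide("legacy", 109)))
```

Under these assumptions, CDI mode on COS 109 still needs the nvidia-cdi runtime binaries, and legacy mode needs the nvidia-legacy runtime plus its hooks.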
The goal is to achieve the same container toolkit functionality on the COS platform, with a custom-installed driver and custom paths for the container runtime binaries.