Container Toolkit Support GKE/COS Platform and its test coverage  #209

@Dragoncell

Description

After driver installation finishes, pods in the GPU Operator rely on the container toolkit starting on the node to set up the NVIDIA container runtime. The toolkit container does the following:
Downloads the nvidia container runtime, hooks, and the container toolkit CLI (nvidia-ctk), and copies them from the container to /run/nvidia/toolkit on the host. [link]
Updates the containerd config file according to the configured container runtime, such as nvidia or nvidia-cdi.
Generates the CDI spec for management containers if the runtime is nvidia-cdi.
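
To make the containerd config step concrete, here is a minimal sketch that renders the kind of runtime entry such an update adds. It assumes containerd's version 2 CRI config layout (`plugins."io.containerd.grpc.v1.cri"`); the function name `containerd_runtime_stanza` and the exact binary path under /run/nvidia/toolkit are illustrative assumptions, not taken from the toolkit source.

```python
def containerd_runtime_stanza(runtime: str, binary_path: str) -> str:
    """Render a containerd CRI runtime entry for one runtime handler.

    Assumes the containerd config version 2 layout. `binary_path` is
    whatever runtime binary was copied to the host; the caller picks it.
    """
    table = f'plugins."io.containerd.grpc.v1.cri".containerd.runtimes.{runtime}'
    return "\n".join([
        f"[{table}]",
        '  runtime_type = "io.containerd.runc.v2"',
        f"  [{table}.options]",
        f'    BinaryName = "{binary_path}"',
    ])


if __name__ == "__main__":
    # Hypothetical binary location under the host toolkit directory.
    print(containerd_runtime_stanza(
        "nvidia", "/run/nvidia/toolkit/nvidia-container-runtime"))
```

On COS the binary path would need to point at wherever the runtime binaries actually get installed, which is why it is a parameter here rather than a constant.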

The toolkit is a required component of the GPU Operator. To make it work on COS, we need to:

Support the necessary container toolkit binaries on the COS platform. So far, the container runtime, hooks, and nvidia-ctk are not supported on COS (see the supported platform lists).

Starting with COS 109, nvidia-ctk is pre-built into COS. However, in the current state (intermediate CDI mode), the nvidia container runtime (nvidia-cdi) binaries are still required. For legacy mode support, the nvidia container runtime (nvidia-legacy) and its hooks are also required.
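
The dependency described above can be written down as a small lookup. This is only a sketch of one reading of the paragraph: it assumes legacy mode needs the nvidia-legacy runtime plus its hooks, that CDI mode needs only the nvidia-cdi runtime, and that nvidia-ctk must be installed only when the image does not already ship it. The function name `components_to_install` and the component labels are hypothetical.

```python
from typing import List


def components_to_install(mode: str, ctk_prebuilt: bool) -> List[str]:
    """List the toolkit components the node still needs, per runtime mode.

    mode: "cdi" or "legacy".
    ctk_prebuilt: True when the OS image already ships nvidia-ctk
    (per the issue, COS 109 and later).
    """
    if mode == "cdi":
        needed = ["nvidia container runtime (nvidia-cdi)"]
    elif mode == "legacy":
        needed = [
            "nvidia container runtime (nvidia-legacy)",
            "nvidia container runtime hooks",
        ]
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    # nvidia-ctk only needs installing when the image lacks it.
    if not ctk_prebuilt:
        needed.append("nvidia-ctk")
    return needed


if __name__ == "__main__":
    print(components_to_install("cdi", ctk_prebuilt=True))
    print(components_to_install("legacy", ctk_prebuilt=False))
```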

The goal is to achieve the same container toolkit functionality on the COS platform, with custom paths for the installed driver and the container runtime binaries.

Labels

feature: issue/PR that proposes a new feature or functionality
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
