Hi there,
In our use case, we have one k8s node (specially designed hardware) with 3 GPUs: 2 GPUs are used for container workloads and 1 GPU is used for display purposes. In the current setup, all three GPUs are exposed to containers by default. I would like to know how to make nvidia-container-runtime and Docker expose only 2 GPUs by default for any pods scheduled on this node. Specifically:
- When the nvidia-device-plugin exposes nvidia.com/gpu in the k8s node capacity, it should report 2 instead of 3.
- When a pod on the node uses NVIDIA_VISIBLE_DEVICES=all, it should see only 2 GPUs instead of 3 (a short illustration follows this list).
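To make the two points above concrete, this is roughly what we observe today; the node name and image tag below are just placeholders for our setup:

```bash
# 1) The device plugin currently advertises all 3 GPUs in the node capacity:
kubectl get node special-node \
  -o jsonpath='{.status.capacity.nvidia\.com/gpu}'
# -> 3   (we would like this to report 2)

# 2) A container started with NVIDIA_VISIBLE_DEVICES=all sees all 3 GPUs:
docker run --rm --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi -L
# -> lists 3 GPUs   (we would like "all" to resolve to only the 2 compute GPUs)
```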
Note that we cannot drain that one GPU, since it is still needed for non-container GPU workloads on the host.
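For completeness: we know we can restrict individual containers by hand, e.g. by GPU index or UUID (the indices and UUID below are placeholders), but that has to be set on every workload and does not change what the device plugin reports, so it is not the node-level default we are looking for:

```bash
# Identify the display GPU so it can be excluded:
nvidia-smi -L
# GPU 2: ... (UUID: GPU-xxxxxxxx-...)   <- display GPU, must stay usable on the host

# Restricting a single container works, but only if every container sets it explicitly:
docker run --rm --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=0,1 \
  nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi -L
```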
I searched a bit, both in this repo and elsewhere online, but have not found a good solution so far. Thanks in advance for the help.
Jianan.