Description
Hi!
So far I've been experimenting with time-slicing, and I need a way to guarantee that when a pod requests gpu.shared: 2 (or any other number > 1), it gets access to two different physical GPUs, not the same GPU twice. Testing suggests this already happens through some implicit affinity, but I can't find an explicit way to enforce it through deployment manifests.
Additionally, for GPUs with different capacities, I'd like to be able to set a different number of time-slicing replicas per GPU.
I've tried using the devices field in the sharing.timeSlicing config, but the logs show that the only accepted value is "all".
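For reference, the device-plugin config I've been testing looks roughly like this (the replica count is just an example, and the devices entry is the field mentioned above):

```yaml
version: v1
sharing:
  timeSlicing:
    renameByDefault: true            # exposes the replicas as nvidia.com/gpu.shared
    resources:
      - name: nvidia.com/gpu
        replicas: 2                  # shared slices advertised per physical GPU
        # I tried scoping this entry to specific GPUs, but the plugin logs
        # indicate that "all" is the only value accepted here:
        devices: ["all"]
```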
Is there a way to explicitly control GPU allocation so that a pod is guaranteed access to multiple distinct GPUs when needed? Or is there another approach besides time-slicing that would address this use case?
I'm trying to achieve a setup where:
- All GPUs are shared
- Some pods need access to multiple different GPUs (not the same GPU multiple times); see the request sketch after this list
- Other pods need access to just one GPU per pod
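To make those two cases concrete, the resource requests I have in mind look like this (assuming renameByDefault: true, so the shared resource is nvidia.com/gpu.shared; pod names and image are placeholders):

```yaml
# Pod that should be backed by two *distinct* physical GPUs
apiVersion: v1
kind: Pod
metadata:
  name: multi-gpu-workload          # placeholder name
spec:
  containers:
    - name: app
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04   # placeholder image
      resources:
        limits:
          nvidia.com/gpu.shared: 2  # today this only *happens* to land on two GPUs
---
# Pod that only needs a single shared GPU
apiVersion: v1
kind: Pod
metadata:
  name: single-gpu-workload         # placeholder name
spec:
  containers:
    - name: app
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04   # placeholder image
      resources:
        limits:
          nvidia.com/gpu.shared: 1
```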
For context, I've verified that when using gpu.shared: 2, pods do seem to access different GPUs, but I'd like a more explicit way to guarantee and control this behavior.
Currently I have the gpu-operator v24.9.1 deployed via Helm on node(s) with 2+ GPUs, and Kubernetes is v1.30.6+rke2r1.
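In case it matters, the device-plugin config above is referenced from the operator's Helm values roughly like this (ConfigMap name and key are placeholders):

```yaml
devicePlugin:
  config:
    name: time-slicing-config       # ConfigMap holding the config above
    default: any                    # key within that ConfigMap
```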