
Multi-GPU allocation with precise control in shared environment #1400

@FourierMourier

Description

Hi!

So far I've been experimenting with time-slicing, and I need a way to guarantee that when a pod requests `gpu.shared: 2` (or any number > 1), it gets access to two different physical GPUs, not the same GPU twice. Testing suggests this already works through some implicit affinity, but there's no explicit way to enforce this behavior from the deployment manifests.

Additionally, for GPUs with different capacities, I'd like to be able to set different numbers of allowed instances per GPU.

I've tried using the `devices` field in the `sharing.timeSlicing` config, but the logs show the only accepted value is `"all"`.
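
For reference, this is roughly the shape of the config I'm working with (a minimal sketch of my setup; the ConfigMap name and the `any` key are illustrative, and `renameByDefault` is what produces the `gpu.shared` resource name):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config        # name is illustrative
  namespace: gpu-operator
data:
  any: |-
    version: v1
    sharing:
      timeSlicing:
        renameByDefault: true            # shared replicas are advertised as nvidia.com/gpu.shared
        failRequestsGreaterThanOne: false
        resources:
          - name: nvidia.com/gpu
            replicas: 4
            # devices: all               # the field mentioned above; per the logs, "all" is the only accepted value
```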

Is there a way to explicitly control GPU allocation to ensure a pod gets access to multiple distinct GPUs when needed? Perhaps there's another approach besides time-slicing that might address this use case?

I'm trying to achieve a setup where:

  1. All GPUs are shared
  2. Some pods need access specifically to multiple different GPUs (not the same GPU multiple times)
  3. Other pods need access to just one GPU per pod

For context, I've verified that when requesting `gpu.shared: 2` (as in the pod spec sketch below), pods do seem to land on different GPUs, but I'd like a more explicit way to guarantee and control this behavior.
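
Concretely, this is the kind of pod I mean (a sketch; the pod name and image are placeholders). With `renameByDefault: true` the shared resource shows up as `nvidia.com/gpu.shared`, and requesting 2 currently happens to land on two distinct physical GPUs, but nothing in the manifest expresses that requirement:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-gpu-worker             # placeholder name
spec:
  restartPolicy: Never
  containers:
    - name: worker
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04   # placeholder image
      command: ["nvidia-smi", "-L"]                        # prints the GPUs the container actually sees
      resources:
        limits:
          nvidia.com/gpu.shared: 2   # two time-sliced replicas; nothing here guarantees two *different* physical GPUs
```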

Currently I have the gpu-operator deployed via Helm (chart version v24.9.1) on nodes with 2+ GPUs; Kubernetes is v1.30.6+rke2r1.
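
For completeness, the operator is pointed at the ConfigMap above roughly like this (Helm values excerpt from my setup; the names match the sketch earlier and are assumptions):

```yaml
devicePlugin:
  config:
    name: time-slicing-config   # the ConfigMap above, in the gpu-operator namespace
    default: any                # key applied to nodes that don't carry an explicit config label
```

Individual nodes can be pointed at a different key in the same ConfigMap via the `nvidia.com/device-plugin.config=<key>` label, but as far as I can tell that only gives per-node granularity (different replica counts per node), not per physical GPU within a node, which is what I'd need for GPUs of different capacities.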
