Dedicated GPUs per bundle #193

@andlaz

Hi,

Apologies if this isn't the right forum for a feature request/question.

I've been poking around, trying to find a way to automatically assign a number of NVIDIA accelerators (unique devices connected to the host) to a specific bundle that uses the NVIDIA Container Runtime (NVCR). So far I'm coming up with very little, and I'm not certain whether this is in or out of scope for NVCR itself.

Before I jump into writing a new OCI shim that would work together with NVCR envvars such as NVIDIA_VISIBLE_DEVICES, and with other features of the container toolkit (discovery and filtering on other envvars), to maintain a bundle-to-device(s) "lease", I'd like to ask:

  • whether this is already easily attainable and I just haven't searched thoroughly enough,
  • whether this should live in NVCR or elsewhere in the toolkit, or
  • whether it should indeed be essentially a pre-processing step on the bundle JSON, so that by the time the OCI create reaches NVCR the bundle already has a specific NVIDIA_VISIBLE_DEVICES value set, even though the overlap with NVCR (parsing requirements expressed as envvars) may be large.
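To make the third option more concrete, here is a minimal sketch of what such a pre-processing step might look like. It assumes the bundle JSON follows the OCI runtime spec layout, where `process.env` is a list of `KEY=VALUE` strings, and that devices are identified by index or UUID strings. `DeviceLeases` and `pin_devices` are hypothetical names for illustration only; they are not part of NVCR or the container toolkit, and the lease table here is in-memory only, whereas a real shim would need to persist it and release leases when a container is deleted.

```python
import copy

ENV_KEY = "NVIDIA_VISIBLE_DEVICES"


class DeviceLeases:
    """Hypothetical in-memory lease table mapping bundle IDs to GPU IDs."""

    def __init__(self, device_ids):
        self.free = list(device_ids)  # devices not leased to any bundle
        self.leases = {}              # bundle_id -> list of leased device IDs

    def acquire(self, bundle_id, count):
        # Re-use an existing lease so repeated calls are idempotent.
        if bundle_id in self.leases:
            return self.leases[bundle_id]
        if count > len(self.free):
            raise RuntimeError(
                f"only {len(self.free)} free devices, {count} requested"
            )
        granted, self.free = self.free[:count], self.free[count:]
        self.leases[bundle_id] = granted
        return granted

    def release(self, bundle_id):
        # Return the bundle's devices to the free pool (no-op if unknown).
        self.free.extend(self.leases.pop(bundle_id, []))


def pin_devices(config, bundle_id, count, leases):
    """Return a copy of an OCI config dict with the device envvar set."""
    patched = copy.deepcopy(config)
    env = patched.setdefault("process", {}).setdefault("env", [])
    devices = leases.acquire(bundle_id, count)
    # Drop any previous value, then append the leased devices.
    env[:] = [e for e in env if not e.startswith(ENV_KEY + "=")]
    env.append(f"{ENV_KEY}={','.join(devices)}")
    return patched


if __name__ == "__main__":
    leases = DeviceLeases(["0", "1", "2", "3"])
    cfg = {"process": {"env": ["PATH=/usr/bin", "NVIDIA_VISIBLE_DEVICES=all"]}}
    print(pin_devices(cfg, "bundle-a", 2, leases)["process"]["env"])
```

The shim would write the patched dict back to the bundle before handing the create call on to the runtime, so NVCR only ever sees an explicit device list.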

My use case is coordinating containerized workloads on nodes with many accelerators. These workloads are a poor fit for Kubernetes scheduling at the moment (partly because some of them are containerized kubelets themselves), but I'll also accept an "all of this should really be done over Kubernetes" answer.

Thanks!
Andras

Labels: lifecycle/stale (denotes an issue or PR that has remained open with no activity and has become stale)
