Dedicated GPUs per bundle

Hi,

Apologies if this isn't the right forum for a feature request/question.

I've been poking around, trying to find a way to automatically assign a number of Nvidia accelerators (unique devices connected to the host) to a specific bundle that runs under the Nvidia Container Runtime (NVCR). So far I'm coming up with very little, and I'm not sure whether this is in or out of scope for NVCR itself.

Before I jump into writing a new OCI shim that would work together with NVCR envvars such as NVIDIA_VISIBLE_DEVICES, and with other features of the container toolkit (discovery and filtering on other envvars), to maintain a bundle-to-device(s) "lease", I'd like to ask:
- whether this is already easily attainable and I just haven't searched thoroughly enough,
- whether this should live in NVCR or elsewhere in the toolkit, or
- whether it should really be pre-processing of the bundle JSON (config.json), so that by the time the OCI create call reaches NVCR it already has a specific NVIDIA_VISIBLE_DEVICES value set, even though the overlap with NVCR (parsing requirements expressed as envvars) may be large.
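
To make the third option concrete, here is a minimal sketch of what such a pre-processing step might look like. It assumes the bundle's config.json has already been loaded into a dict, that devices are identified by index or UUID strings, and that the lease table is a plain in-memory dict; `lease_devices` and `pin_devices` are hypothetical helper names, not part of NVCR or the container toolkit, and a real shim would need to persist and release leases.

```python
import copy

ENV_KEY = "NVIDIA_VISIBLE_DEVICES"


def lease_devices(leases, bundle_id, count, all_devices):
    """Reserve `count` devices for `bundle_id`.

    `leases` maps device id -> bundle id and is updated in place.
    Devices already leased to this bundle are reused, so repeated
    calls for the same bundle are idempotent.
    """
    held = [d for d in all_devices if leases.get(d) == bundle_id]
    if len(held) >= count:
        return held[:count]
    free = [d for d in all_devices if d not in leases]
    needed = count - len(held)
    if len(free) < needed:
        raise RuntimeError("not enough free devices for " + bundle_id)
    for device in free[:needed]:
        leases[device] = bundle_id
    return held + free[:needed]


def pin_devices(config, devices):
    """Return a copy of an OCI config dict with NVIDIA_VISIBLE_DEVICES set.

    The OCI runtime spec stores the environment as a list of
    "KEY=VALUE" strings under process.env; any existing value for
    the key is replaced.
    """
    patched = copy.deepcopy(config)
    process = patched.setdefault("process", {})
    env = [e for e in process.get("env", []) if not e.startswith(ENV_KEY + "=")]
    env.append(ENV_KEY + "=" + ",".join(devices))
    process["env"] = env
    return patched
```

The shim would call `lease_devices` when a bundle is created, write the result of `pin_devices` back to config.json, and only then hand the bundle to the runtime. Whether this duplicates too much of what NVCR already parses from the environment is exactly the open question above.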

My use case is coordinating containerized workloads on nodes with many accelerators. These workloads are a poor fit for Kubernetes scheduling at the moment (partly because some of them are containerized kubelets themselves), but I'd also accept an "all of this should really be done through Kubernetes" answer.

Thanks!
Andras

Dedicated GPUs per bundle #193
