Description
We want to demonstrate advanced features like partitionable devices with the example driver. This raises the question: should we use artificial, fake devices, or should we mirror the behavior of a real vendor driver when it advertises real hardware?
On the one hand, we prefer to stay vendor-neutral. On the other hand, a real example is easier to understand and would make the example driver more useful for realistic benchmarking.
After some discussions at KubeCon and in https://kubernetes.slack.com/archives/C0409NGC1TK/p1743667181489199, here's a proposal. In the top-level README.md, we add a new section:
Configuration
Vendors are encouraged to work with the Kubernetes maintainers to enhance DRA for their use cases. When this leads to new features, it is desirable to extend the example driver so that it demonstrates those features by emulating a vendor driver for certain hardware. Later, novel usages of existing features may also justify extending the example driver.
At the moment, the driver supports the following profiles:
- gpu (default): 8 generic GPUs per node. Works on Kubernetes >= 1.32.
- nvidia-mig: two NVIDIA A100 GPUs per node, with attributes that are the same as for real hardware. Works on Kubernetes >= 1.33.
- google-tpu: models multi-host devices. Works on Kubernetes >= 1.33.
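The profile list above can be read as a small lookup table. The following is only a sketch of that idea, not the driver's actual code: the `Profile` type, the `PROFILES` table, and `supported_profiles` are hypothetical names introduced for illustration, and only the profile names, descriptions, and minimum Kubernetes versions come from the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical record for one profile of the example driver."""
    name: str
    description: str
    min_kubernetes: tuple[int, int]  # (major, minor), taken from the list above


# Assumption: the table layout is illustrative; only the values are from the proposal.
PROFILES = {
    p.name: p
    for p in (
        Profile("gpu", "8 generic GPUs per node", (1, 32)),
        Profile("nvidia-mig", "two NVIDIA A100 GPUs per node", (1, 33)),
        Profile("google-tpu", "multi-host devices", (1, 33)),
    )
}
DEFAULT_PROFILE = "gpu"


def supported_profiles(version: str) -> list[str]:
    """Return the profile names usable on a cluster version such as 'v1.33.1'."""
    major, minor = (int(part) for part in version.lstrip("v").split(".")[:2])
    return [
        name
        for name, profile in PROFILES.items()
        if (major, minor) >= profile.min_kubernetes
    ]
```

For example, `supported_profiles("1.32")` would yield only the default `gpu` profile, while a 1.33 cluster could use all three.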
Each deployment of the example driver uses exactly one profile and uses <profile name>.dra.example.com as its driver name. To configure the profile, ... [TBD]. These profiles do not actually emulate any real hardware. Instead, they merely inject environment variables that mirror the devices that were allocated.
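To make the naming rule and the environment variable injection concrete, here is a rough sketch. The `driver_name` and `mirror_env` helpers and the `<PROFILE>_DEVICE_<index>` variable naming are assumptions made for illustration; the proposal only fixes the `<profile name>.dra.example.com` pattern and does not specify how the variables are named.

```python
DRIVER_DOMAIN = "dra.example.com"


def driver_name(profile: str) -> str:
    """Build the driver name for a profile: <profile name>.dra.example.com."""
    return f"{profile}.{DRIVER_DOMAIN}"


def mirror_env(profile: str, allocated: list[str]) -> dict[str, str]:
    """Return environment variables that mirror the allocated devices.

    Assumption: the <PROFILE>_DEVICE_<index> naming is hypothetical and is
    not taken from the proposal.
    """
    prefix = profile.upper().replace("-", "_")
    return {
        f"{prefix}_DEVICE_{index}": device
        for index, device in enumerate(allocated)
    }
```

Under these assumptions, a claim that was allocated `gpu-0` and `gpu-1` from the `gpu` profile would see `GPU_DEVICE_0=gpu-0` and `GPU_DEVICE_1=gpu-1` in its container, served by the driver `gpu.dra.example.com`.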
/assign @bg-chun