Refactor cdi api #1166
Pull Request Test Coverage Report for Build 16072906327 (Coveralls)
0a2b879 to a6f8a10
2683a6b to 7753402
ArangoGutierrez
left a comment
LGTM
77fe5cf to 66af75b
ArangoGutierrez
left a comment
LGTM
Thanks for this, Evan. This is a great refactor of the API.
Signed-off-by: Evan Lezar <[email protected]>
0443041 to a248c76
Signed-off-by: Evan Lezar <[email protected]>
Pull Request Overview
This PR refactors the nvcdi API to unify CDI spec generation across all modes by introducing a factory-based generator pattern and removing redundant, mode-specific methods.
- Introduce `deviceSpecGeneratorFactory` and `DeviceSpecGenerator` to replace multiple Interface methods.
- Refactor NVML path to create and combine per-device generators with init/shutdown hooks.
- Deprecate `GetAllDeviceSpecs` and route calls through `GetDeviceSpecsByID("all")`.
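The second point above (combining per-device generators and wrapping them in init/shutdown hooks) can be sketched roughly as follows. This is only an illustration, not the code from the PR: `DeviceSpecGenerator` is the name used in the overview, but its method set, the simplified `Device` type, and the helper types `staticGenerator`, `combined`, and `withLifecycle` are all assumptions made for the example.

```go
package main

import "fmt"

// Device is a simplified, hypothetical stand-in for a CDI device spec entry.
type Device struct{ Name string }

// DeviceSpecGenerator uses the name from the PR; the method set is assumed.
type DeviceSpecGenerator interface {
	GetDeviceSpecs() ([]Device, error)
}

// staticGenerator is a hypothetical per-device generator that emits one device.
type staticGenerator string

func (g staticGenerator) GetDeviceSpecs() ([]Device, error) {
	return []Device{{Name: string(g)}}, nil
}

// combined merges the output of several per-device generators, in order.
type combined []DeviceSpecGenerator

func (c combined) GetDeviceSpecs() ([]Device, error) {
	var all []Device
	for _, g := range c {
		devices, err := g.GetDeviceSpecs()
		if err != nil {
			return nil, err
		}
		all = append(all, devices...)
	}
	return all, nil
}

// withLifecycle (hypothetical) runs init before generation and shutdown after it.
type withLifecycle struct {
	DeviceSpecGenerator
	init     func() error
	shutdown func()
}

func (w withLifecycle) GetDeviceSpecs() ([]Device, error) {
	if err := w.init(); err != nil {
		return nil, fmt.Errorf("failed to initialize: %w", err)
	}
	defer w.shutdown()
	return w.DeviceSpecGenerator.GetDeviceSpecs()
}

func main() {
	var events []string
	g := withLifecycle{
		DeviceSpecGenerator: combined{staticGenerator("0"), staticGenerator("1")},
		init:                func() error { events = append(events, "init"); return nil },
		shutdown:            func() { events = append(events, "shutdown") },
	}
	devices, err := g.GetDeviceSpecs()
	fmt.Println(devices, err, events)
}
```

Keeping the lifecycle in a single wrapper means every per-device generator runs inside one init/shutdown pair, which is one way to get the "unified init/shutdown logic" the file table mentions.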
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/nvcdi/wrapper.go | Replaced direct Interface embedding with a factory for generating specs |
| pkg/nvcdi/lib-nvml.go | Switched NVML path to return generators and unified init/shutdown logic |
| pkg/nvcdi/api.go | Updated Interface to use new SpecGenerator types and deprecated legacy methods |
Comments suppressed due to low confidence (2)
pkg/nvcdi/lib-nvml.go:80
- [nitpick] The variable name `DeviceSpecGenerators` shadows the type name and begins with an uppercase letter. Rename it to a lowercase, distinct name like `generators` or `dsgs`.
var DeviceSpecGenerators DeviceSpecGenerators
pkg/nvcdi/wrapper.go:78
- Add unit tests for `GetDeviceSpecsByID` (and `GetAllDeviceSpecs`) on the wrapper to ensure correct behavior across all factory implementations.
func (l *wrapper) GetDeviceSpecsByID(devices ...string) ([]specs.Device, error) {
// TODO: Rename this type
type deviceSpecGeneratorFactory interface {
Copilot AI, Jul 4, 2025
[nitpick] Remove or resolve the TODO comment by renaming deviceSpecGeneratorFactory to a more descriptive name (e.g., SpecFactory).
- // TODO: Rename this type
- type deviceSpecGeneratorFactory interface {
+ // SpecFactory is responsible for creating device spec generators and retrieving common edits.
+ type SpecFactory interface {
ArangoGutierrez
left a comment
LGTM
These changes fix a bug in CDI spec generation introduced in NVIDIA#1166 where device handles become invalid when NVML is shut down and initialized again. Here we explicitly store UUIDs and use these to query the device handles when generating the CDI specification. Signed-off-by: Evan Lezar <[email protected]>
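The follow-up fix described above can be illustrated with a small, self-contained sketch. It does not use the real NVML bindings: `fakeNVML`, `handle`, and `uuidGenerator` are hypothetical stand-ins, and the "generation" counter is simply an assumed way of modelling handles that stop being valid once the library is shut down and initialized again.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeNVML is a hypothetical stand-in for an NVML binding (not the real API).
type fakeNVML struct {
	generation  int // incremented on every Init
	initialized bool
	uuids       []string
}

// handle is only valid for the library generation that created it.
type handle struct {
	uuid       string
	generation int
}

func (l *fakeNVML) Init()     { l.generation++; l.initialized = true }
func (l *fakeNVML) Shutdown() { l.initialized = false }

// HandleByUUID returns a fresh handle tied to the current generation.
func (l *fakeNVML) HandleByUUID(uuid string) (handle, error) {
	if !l.initialized {
		return handle{}, errors.New("library not initialized")
	}
	for _, u := range l.uuids {
		if u == uuid {
			return handle{uuid: u, generation: l.generation}, nil
		}
	}
	return handle{}, fmt.Errorf("no device with UUID %q", uuid)
}

// Name rejects handles created before the most recent Init.
func (l *fakeNVML) Name(h handle) (string, error) {
	if !l.initialized || h.generation != l.generation {
		return "", errors.New("invalid device handle")
	}
	return "gpu-" + h.uuid, nil
}

// uuidGenerator stores only the UUID and resolves a fresh handle on each call.
type uuidGenerator struct {
	lib  *fakeNVML
	uuid string
}

func (g uuidGenerator) DeviceName() (string, error) {
	g.lib.Init()
	defer g.lib.Shutdown()
	h, err := g.lib.HandleByUUID(g.uuid)
	if err != nil {
		return "", err
	}
	return g.lib.Name(h)
}

func main() {
	lib := &fakeNVML{uuids: []string{"GPU-aaaa"}}

	// Obtain a handle, then shut the library down and initialize it again.
	lib.Init()
	stale, _ := lib.HandleByUUID("GPU-aaaa")
	lib.Shutdown()
	lib.Init()
	_, err := lib.Name(stale)
	fmt.Println("stale handle:", err)
	lib.Shutdown()

	// Storing the UUID and re-querying the handle avoids the stale reference.
	name, err := uuidGenerator{lib: lib, uuid: "GPU-aaaa"}.DeviceName()
	fmt.Println("by UUID:", name, err)
}
```

The design point is that a UUID is a stable identifier that survives re-initialization, whereas a handle is treated as a short-lived reference that should be looked up again inside each init/shutdown window.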
The original CDI spec generation API was focused specifically on NVML devices. Since then, we have replaced the more specific functions (for GPU and MIG devices) in the API with more generally applicable functions based on mode and device IDs.
This organic growth of the API also means that, for the NVML case specifically, we had multiple different implementations of CDI spec generation, which made it harder to keep things consistent.
These changes remove the redundant functions in the `nvcdi.Interface`, allowing devices to be requested by ID across all use cases. They also refactor the CDI spec generation for NVML devices to ensure that the same generation logic is used in all cases.
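As a rough sketch of what requesting devices by ID through a single code path might look like, the example below routes a deprecated `GetAllDeviceSpecs` through `GetDeviceSpecsByID("all")`, as the review summary describes. The names `deviceSpecGeneratorFactory`, `DeviceSpecGenerator`, `wrapper`, `GetDeviceSpecsByID`, and `GetAllDeviceSpecs` come from the PR; everything else (the `newGenerator` method, `fakeFactory`, `idGenerator`, the simplified `Device` type, and how `"all"` is expanded) is assumed for illustration.

```go
package main

import "fmt"

// Device is a simplified, hypothetical stand-in for specs.Device.
type Device struct{ Name string }

// DeviceSpecGenerator uses the name from the PR; the method set is assumed.
type DeviceSpecGenerator interface {
	GetDeviceSpecs() ([]Device, error)
}

// deviceSpecGeneratorFactory uses the name from the PR; newGenerator is hypothetical.
type deviceSpecGeneratorFactory interface {
	newGenerator(ids ...string) (DeviceSpecGenerator, error)
}

// idGenerator is a hypothetical generator that emits one device per stored ID.
type idGenerator []string

func (g idGenerator) GetDeviceSpecs() ([]Device, error) {
	devices := make([]Device, 0, len(g))
	for _, id := range g {
		devices = append(devices, Device{Name: id})
	}
	return devices, nil
}

// fakeFactory is a hypothetical factory that knows a fixed set of device IDs.
type fakeFactory struct{ known []string }

func (f fakeFactory) newGenerator(ids ...string) (DeviceSpecGenerator, error) {
	var selected idGenerator
	for _, id := range ids {
		if id == "all" {
			selected = append(selected, f.known...)
			continue
		}
		if !contains(f.known, id) {
			return nil, fmt.Errorf("unknown device ID %q", id)
		}
		selected = append(selected, id)
	}
	return selected, nil
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// wrapper delegates all spec generation to the factory.
type wrapper struct{ factory deviceSpecGeneratorFactory }

func (l *wrapper) GetDeviceSpecsByID(devices ...string) ([]Device, error) {
	g, err := l.factory.newGenerator(devices...)
	if err != nil {
		return nil, err
	}
	return g.GetDeviceSpecs()
}

// GetAllDeviceSpecs is kept only for compatibility and reuses the by-ID path.
func (l *wrapper) GetAllDeviceSpecs() ([]Device, error) {
	return l.GetDeviceSpecsByID("all")
}

func main() {
	w := &wrapper{factory: fakeFactory{known: []string{"0", "1"}}}
	all, err := w.GetAllDeviceSpecs()
	fmt.Println(all, err)
	one, err := w.GetDeviceSpecsByID("1")
	fmt.Println(one, err)
}
```

Because both entry points end up in the same factory call, any mode that implements the factory gets consistent behavior for "all" and for explicit IDs without a second implementation.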