-
Notifications
You must be signed in to change notification settings - Fork 435
Make CDI device requests consistent with other methods #1132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make CDI device requests consistent with other methods #1132
Conversation
|
|
||
| // DevicesFromMounts returns a list of device specified as mounts. | ||
| func (i CUDA) DevicesFromMounts() []string { | ||
| // requestsFromMounts returns a list of device specified as mounts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed this function since we're returning all requests including:
- device requests (legacy and CDI)
- imex channel requests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: DevicesFromMounts has been renamed and turned private to the config pkg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It was not used outside this package.
5d667fc to
0e11147
Compare
Pull Request Test Coverage Report for Build 15635726706Details
💛 - Coveralls |
58c6c17 to
1b0f07a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors CDI device request handling to use image.VisibleDevices uniformly, adds a new cdiModifier wrapper for spec-based retrieval, and updates related tests.
- Introduce
cdiModifierto centralize spec parsing and device extraction - Replace direct calls to
CDIDevicesFromMounts/VisibleDevicesFromEnvVarwithVisibleDevices - Update tests in
internal/modifier,internal/info, andinternal/configto reflect the new behavior
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/modifier/cdi.go | Add cdiModifier type, consolidate device extraction |
| internal/modifier/cdi_test.go | New tests for getDevicesFromSpec |
| internal/info/auto_test.go | Adjust TestResolveAutoMode expectations and setup |
| internal/config/image/cuda_image.go | Rename/move mount helpers, parse CDI mount requests |
| internal/config/image/cuda_image_test.go | Update expected mount-based CDI device inclusion |
Comments suppressed due to low confidence (1)
internal/config/image/cuda_image.go:219
- The mount-based CDI device checks were removed, so
OnlyFullyQualifiedCDIDevicesno longer accounts for devices requested via mounts. Consider re-adding logic usingrequestsFromMountsand parsingcdiDeviceMountRequestto correctly detect mount-based CDI devices.
func (i CUDA) OnlyFullyQualifiedCDIDevices() bool {
internal/config/image/cuda_image.go
Outdated
| case strings.HasPrefix(device, volumeMountDevicePrefixCDI): | ||
| name, err := cdiDeviceMountRequest(device).qualifiedName() | ||
| if err != nil { | ||
| i.logger.Warningf("Ignoring invalid mount request for CDI device %v: %w", device, err) |
Copilot
AI
Jun 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The %w verb is only supported by fmt.Errorf. In logging calls, use %v or %s to ensure the error is formatted correctly.
| i.logger.Warningf("Ignoring invalid mount request for CDI device %v: %w", device, err) | |
| i.logger.Warningf("Ignoring invalid mount request for CDI device %v: %v", device, err) |
internal/modifier/cdi.go
Outdated
| cdiModifier := &cdiModifier{ | ||
| logger: logger, | ||
| acceptDeviceListAsVolumeMounts: cfg.AcceptDeviceListAsVolumeMounts, | ||
| acceptEnvvarUnprivileged: cfg.AcceptEnvvarUnprivileged, | ||
| annotationPrefixes: cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.AnnotationPrefixes, | ||
| defaultKind: cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.DefaultKind, | ||
| } | ||
| return cdiModifier.getDevicesFromSpec(ociSpec) | ||
| } | ||
|
|
||
| // TODO: We should rename this type. | ||
| type cdiModifier struct { |
Copilot
AI
Jun 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The cdiModifier type has a TODO to rename it. Consider selecting a more descriptive name or removing the TODO if this is the intended name.
| cdiModifier := &cdiModifier{ | |
| logger: logger, | |
| acceptDeviceListAsVolumeMounts: cfg.AcceptDeviceListAsVolumeMounts, | |
| acceptEnvvarUnprivileged: cfg.AcceptEnvvarUnprivileged, | |
| annotationPrefixes: cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.AnnotationPrefixes, | |
| defaultKind: cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.DefaultKind, | |
| } | |
| return cdiModifier.getDevicesFromSpec(ociSpec) | |
| } | |
| // TODO: We should rename this type. | |
| type cdiModifier struct { | |
| deviceHandler := &CDIDeviceHandler{ | |
| logger: logger, | |
| acceptDeviceListAsVolumeMounts: cfg.AcceptDeviceListAsVolumeMounts, | |
| acceptEnvvarUnprivileged: cfg.AcceptEnvvarUnprivileged, | |
| annotationPrefixes: cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.AnnotationPrefixes, | |
| defaultKind: cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.DefaultKind, | |
| } | |
| return deviceHandler.getDevicesFromSpec(ociSpec) | |
| } | |
| // This type handles CDI-related device configurations and annotations. | |
| type CDIDeviceHandler struct { |
1b0f07a to
d68d401
Compare
Signed-off-by: Evan Lezar <[email protected]>
This change updates the image.CUDA type to also extract CDI device requests. These are only relevant IF CDI prefixes are specifically set. Signed-off-by: Evan Lezar <[email protected]>
Signed-off-by: Evan Lezar <[email protected]>
Following the refactoring of device request extraction, we can now make CDI device requests consistent with other methods. This change moves to using image.VisibleDevices instead of separate calls to CDIDevicesFromMounts and VisibleDevicesFromEnvVar. Signed-off-by: Evan Lezar <[email protected]>
This change includes annotation devices in CUDA.VisibleDevices with the highest priority. This allows for the CDI device request extraction to be consistent across all request mechanisms. Note that this does change behaviour in the following ways: 1. Annotations are considered when resolving the runtime mode. 2. Incorrectly formed device names in annotations are no longer treated as an error. Signed-off-by: Evan Lezar <[email protected]>
Signed-off-by: Evan Lezar <[email protected]>
Signed-off-by: Evan Lezar <[email protected]>
d68d401 to
8be03cf
Compare
jgehrcke
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
Cool. I didn't really look at this and don't fully understand this. But I am sure it's good work and I support you and trust that overall it's best to move forward! Also, there are tests so what can possibly go wrong? :)
| return requestedDevice, nil | ||
| } | ||
|
|
||
| parts := strings.SplitN(requestedDevice, "/", 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this trying to parse vendor, class, device from
/cdi/<vendor>/<class>/<device>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, volumeMountDevicePrefixCDI = "cdi/".
and the argument 3 probably means "at most two splits", i.e. max three tokens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a mounted CDI device will have the path /var/run/nvidia-container-devices/cdi/vendor.com/class/device or /var/run/nvidia-container-devices/cdi/vendor.com/class=device. This code handles the former case where we split vendor.com/class/device into AT MOST three parts.
From the SplitN docs:
// SplitN slices s into substrings separated by sep and returns a slice of
// the substrings between those separators.
//
// The count determines the number of substrings to return:
// - n > 0: at most n substrings; the last substring will be the unsplit remainder;
// - n == 0: the result is nil (zero substrings);
// - n < 0: all substrings.
Following the refactoring of device request extraction, we can now make CDI device requests consistent with other methods.
This change moves to using image.VisibleDevices instead of separate calls to CDIDevicesFromMounts and VisibleDevicesFromEnvVar. This also changes the way in which annotation requests are handled to be consistent with other mechanisms by including annotation devices in the list of VisibleDevices.
See: