Open
Labels
backport-25.8, robustness (issue/pr: edge cases & fault tolerance), usability (issue/pr related to UX)
Description
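We should filter the ADDITIONAL_NAMESPACES list to avoid errors like the one below. The install fails when the namespace the driver is being installed into is also listed in ADDITIONAL_NAMESPACES, because the same ServiceAccounts, Roles, and bindings are then created twice: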
helm install nvidia-dra-driver-gpu oci://ghcr.io/nvidia/k8s-dra-driver-gpu \
--namespace=gpu-operator \
--version "25.8.0-dev-5483fa4f-chart" \
--create-namespace \
--set nvidiaDriverRoot=/run/nvidia/driver \
--set gpuResourcesEnabledOverride=true \
--set controller.containers.computeDomain.env[0].name=ADDITIONAL_NAMESPACES \
--set controller.containers.computeDomain.env[0].value=gpu-operator
Pulled: ghcr.io/nvidia/k8s-dra-driver-gpu:25.8.0-dev-5483fa4f-chart
Digest: sha256:2ac2e0a4837064621b572453f7dcb999db043f6669a1a6623fd973668b42d707
Error: INSTALLATION FAILED: 6 errors occurred:
* serviceaccounts "compute-domain-daemon-service-account" already exists
* serviceaccounts "nvidia-dra-driver-gpu-service-account" already exists
* clusterrolebindings.rbac.authorization.k8s.io "nvidia-dra-driver-gpu-clusterrole-binding-gpu-operator" already exists
* clusterrolebindings.rbac.authorization.k8s.io "compute-domain-daemon-role-binding-gpu-operator" already exists
* roles.rbac.authorization.k8s.io "nvidia-dra-driver-gpu-role" already exists
* rolebindings.rbac.authorization.k8s.io "nvidia-dra-driver-gpu-role-binding" already exists
Originally posted by @klueska in #585 (comment)
Projects
Status
Backlog