Summary
When deploying the GPU Operator in an air-gapped (offline) cluster, the nvidia-driver-daemonset init container fails to start.
Root cause: the driver image ships with a public YUM repo enabled by default, which triggers yum errors in offline environments.
Additionally, the pod spec tries to mount /etc/yum.repos.d/redhat.repo as a hostPath volume of type File, but the file is absent on the node, so the kubelet rejects the volume with a "hostPath type check failed" error.
Node OS: RHEL 8.10
We rebuilt the driver image and removed /etc/yum.repos.d/redhat.repo
`volMountSubscriptionName := fmt.Sprintf("subscription-config-%d", num)`
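For context, the format string in the snippet above yields one volume name per subscription mount. The sketch below only illustrates the naming. The helper name `subscriptionVolumeName`, the list of paths, and the zero-based index are assumptions made for the example, not taken from the operator source.

```go
package main

import "fmt"

// subscriptionVolumeName builds a volume name with the same format string
// as the snippet above. The helper itself is hypothetical.
func subscriptionVolumeName(num int) string {
	return fmt.Sprintf("subscription-config-%d", num)
}

func main() {
	// Assumed example paths; only redhat.repo is named in this report.
	paths := []string{
		"/etc/yum.repos.d/redhat.repo",
		"/example/other/subscription/path",
	}
	for i, p := range paths {
		// The starting index is an assumption for illustration.
		fmt.Printf("%s -> %s\n", subscriptionVolumeName(i), p)
	}
}
```

The name of the failing volume in the pod events can be matched against this pattern to identify which host path the kubelet rejected.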
Related issue