Description
What happened:
When deploying a pod with no resource limits/requests defined, I neglected to remove the metadata.annotations entry k8s.v1.cni.cncf.io/networks: hostdev-rdma-device-sriov-gds-test-a-su-1, which left the pod stuck in a ContainerCreating state. Reviewing the events/description of the pod (in this scenario, the namespace was sriov-gds-test), there was an endless loop of errors similar to the below, cycling through every available IP within k8s-pod-network:
Normal AddedInterface 36s multus Add eth0 [192.168.179.237/32] from k8s-pod-network
Warning FailedCreatePodSandBox 36s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a164f0f20932003e4dfc144edeb844e764f0a4f792a6590c114f5143e464ba11": plugin type="multus" name="multus-cni-network" failed (add): [sriov-gds-test/gds-pvc-rdma-no-device-attached-67fc7b6746-46fkc/3c15dafb-4ffe-4681-9e1a-07f03e1b83c8:hostdev-rdma-device-sriov-gds-test-a-su-1]: error adding container to network "hostdev-rdma-device-sriov-gds-test-a-su-1": specify either "device", "hwaddr", "kernelpath" or "pciBusID"
Normal AddedInterface 36s multus Add eth0 [192.168.179.236/32] from k8s-pod-network
Warning FailedCreatePodSandBox 35s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f51969d12c6daca8546a144cdfa4b03db04a73a390784a9157e1a95954db1d13": plugin type="multus" name="multus-cni-network" failed (add): [sriov-gds-test/gds-pvc-rdma-no-device-attached-67fc7b6746-46fkc/3c15dafb-4ffe-4681-9e1a-07f03e1b83c8:hostdev-rdma-device-sriov-gds-test-a-su-1]: error adding container to network "hostdev-rdma-device-sriov-gds-test-a-su-1": specify either "device", "hwaddr", "kernelpath" or "pciBusID"
Normal AddedInterface 34s multus Add eth0 [192.168.179.238/32] from k8s-pod-network
Warning FailedCreatePodSandBox 34s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4a57990b3cf143279ef543e798931c44d0898d282f589711ecff0f6352fe24d6": plugin type="multus" name="multus-cni-network" failed (add): [sriov-gds-test/gds-pvc-rdma-no-device-attached-67fc7b6746-46fkc/3c15dafb-4ffe-4681-9e1a-07f03e1b83c8:hostdev-rdma-device-sriov-gds-test-a-su-1]: error adding container to network "hostdev-rdma-device-sriov-gds-test-a-su-1": specify either "device", "hwaddr", "kernelpath" or "pciBusID"
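For context (as I understand it): the annotation points at a host-device NetworkAttachmentDefinition, and multus only injects the pciBusID of an allocated device into the CNI call when the pod actually requests the resource named in the NAD's k8s.v1.cni.cncf.io/resourceName annotation. With no resource request, nothing is allocated, so host-device receives none of device/hwaddr/kernelpath/pciBusID and fails as above. The actual NAD was not captured in this report; a hypothetical one matching the annotation name would look roughly like:

```yaml
# Hypothetical NetworkAttachmentDefinition -- the name matches the pod
# annotation, but the real CR in the cluster was not captured here;
# the IPAM pool name is illustrative.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: hostdev-rdma-device-sriov-gds-test-a-su-1
  namespace: sriov-gds-test
  annotations:
    k8s.v1.cni.cncf.io/resourceName: nvidia.com/rdma_device_a
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "host-device",
      "ipam": {
        "type": "nv-ipam",
        "poolName": "example-pool"
      }
    }
```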
What you expected to happen:
If no resource requests for SR-IOV/RDMA devices are defined, the annotation should either be ignored, allowing the pod to deploy, or the pod should fail with an error disclosing the actual issue: the annotation is present but no matching resources are requested.
How to reproduce it (as minimally and precisely as possible):
Below is an example manifest of what was used to reproduce the issue:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gds-pvc-rdma-no-device-attached
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gds-pvc-test-rdma
  template:
    metadata:
      labels:
        app: gds-pvc-test-rdma
        nvidia-nsight-profile: disabled
      annotations:
        k8s.v1.cni.cncf.io/networks: hostdev-rdma-device-sriov-gds-test-a-su-1
    spec:
      containers:
      - name: appcntr1
        image: quay.io/frollandnvidia/cuda-perftest:latest
        imagePullPolicy: IfNotPresent
        command:
        - sh
        - -c
        - |
          sleep inf
As you'll notice, no resource requests/limits were defined, such as:
resources:
  requests:
    nvidia.com/rdma_device_a: '1'
  limits:
    nvidia.com/rdma_device_a: '1'
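Restoring the resource request/limit (or simply removing the networks annotation) should let the pod schedule and attach the RDMA device normally. For completeness, the container spec with the resources from above put back:

```yaml
containers:
- name: appcntr1
  image: quay.io/frollandnvidia/cuda-perftest:latest
  imagePullPolicy: IfNotPresent
  command: ["sh", "-c", "sleep inf"]
  resources:
    requests:
      nvidia.com/rdma_device_a: '1'
    limits:
      nvidia.com/rdma_device_a: '1'
```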
Anything else we need to know?:
This cluster also runs the NVIDIA GPU Operator, with GDS/NFSoRDMA enabled.
Below are the Helm deployment values used:
$ helm get values gpu-operator
USER-SUPPLIED VALUES:
driver:
rdma:
enabled: true
gds:
enabled: true
$ helm ls
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
gpu-operator gpu-operator 1 2025-07-22 12:21:23.462651 -0400 EDT deployed gpu-operator-v25.3.0 v25.3.0
Logs:
- NicClusterPolicy CR spec and state:
$ k get nicclusterpolicies
NAME STATUS AGE
nic-cluster-policy ready 2025-07-23T16:35:15Z
apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"mellanox.com/v1alpha1","kind":"NicClusterPolicy","metadata":{"annotations":{},"name":"nic-cluster-policy"},"spec":{"nvIpam":{"enableWebhook":false,"image":"nvidia-k8s-ipam","imagePullSecrets":[],"repository":"ghcr.io/mellanox","version":"v0.3.7"},"ofedDriver":{"env":[{"name":"RESTORE_DRIVER_ON_POD_TERMINATION","value":"true"},{"name":"UNLOAD_STORAGE_MODULES","value":"true"},{"name":"CREATE_IFNAMES_UDEV","value":"true"},{"name":"ENABLE_NFSRDMA","value":"true"}],"forcePrecompiled":false,"image":"doca-driver","imagePullSecrets":[],"livenessProbe":{"initialDelaySeconds":30,"periodSeconds":30},"readinessProbe":{"initialDelaySeconds":10,"periodSeconds":30},"repository":"nvcr.io/nvidia/mellanox","startupProbe":{"initialDelaySeconds":10,"periodSeconds":20},"terminationGracePeriodSeconds":300,"upgradePolicy":{"autoUpgrade":true,"drain":{"deleteEmptyDir":true,"enable":true,"force":true,"podSelector":"","timeoutSeconds":300},"maxParallelUpgrades":1,"safeLoad":false},"version":"25.04-0.6.1.0-2"},"secondaryNetwork":{"cniPlugins":{"image":"plugins","imagePullSecrets":[],"repository":"ghcr.io/k8snetworkplumbingwg","version":"v1.5.0"},"multus":{"image":"multus-cni","imagePullSecrets":[],"repository":"ghcr.io/k8snetworkplumbingwg","version":"v4.1.0"}},"sriovDevicePlugin":{"config":"{\n \"resourceList\": [\n {\n \"resourcePrefix\": \"nvidia.com\",\n \"resourceName\": \"rdma_device_a\",\n \"selectors\": {\n \"vendors\": [\"15b3\"],\n \"devices\": [],\n \"drivers\": [],\n \"pfNames\": [],\n \"pciAddresses\": [\"0000:00:07.0\",\"0000:00:08.0\"],\n \"rootDevices\": [],\n \"linkTypes\": [],\n \"isRdma\": true\n }\n }\n ]\n}\n","image":"sriov-network-device-plugin","imagePullSecrets":[],"repository":"ghcr.io/k8snetworkplumbingwg","version":"v3.9.0"}}}
  creationTimestamp: "2025-07-23T16:35:15Z"
  generation: 2
  name: nic-cluster-policy
  resourceVersion: "41540363"
  uid: f2e5e6e0-46a0-40e9-9c8d-59eb70257ac2
spec:
  nvIpam:
    enableWebhook: false
    image: nvidia-k8s-ipam
    imagePullSecrets: []
    repository: ghcr.io/mellanox
    version: v0.3.7
  ofedDriver:
    env:
    - name: RESTORE_DRIVER_ON_POD_TERMINATION
      value: "true"
    - name: UNLOAD_STORAGE_MODULES
      value: "true"
    - name: CREATE_IFNAMES_UDEV
      value: "true"
    - name: ENABLE_NFSRDMA
      value: "true"
    forcePrecompiled: false
    image: doca-driver
    imagePullSecrets: []
    livenessProbe:
      initialDelaySeconds: 30
      periodSeconds: 30
    readinessProbe:
      initialDelaySeconds: 10
      periodSeconds: 30
    repository: nvcr.io/nvidia/mellanox
    startupProbe:
      initialDelaySeconds: 10
      periodSeconds: 20
    terminationGracePeriodSeconds: 300
    upgradePolicy:
      autoUpgrade: true
      drain:
        deleteEmptyDir: true
        enable: true
        force: true
        podSelector: ""
        timeoutSeconds: 300
      maxParallelUpgrades: 1
      safeLoad: false
    version: 25.04-0.6.1.0-2
  secondaryNetwork:
    cniPlugins:
      image: plugins
      imagePullSecrets: []
      repository: ghcr.io/k8snetworkplumbingwg
      version: v1.5.0
    multus:
      image: multus-cni
      imagePullSecrets: []
      repository: ghcr.io/k8snetworkplumbingwg
      version: v4.1.0
  sriovDevicePlugin:
    config: |
      {
        "resourceList": [
          {
            "resourcePrefix": "nvidia.com",
            "resourceName": "rdma_device_a",
            "selectors": {
              "vendors": ["15b3"],
              "devices": [],
              "drivers": [],
              "pfNames": [],
              "pciAddresses": ["0000:00:07.0"],
              "rootDevices": [],
              "linkTypes": [],
              "isRdma": true
            }
          }
        ]
      }
    image: sriov-network-device-plugin
    imagePullSecrets: []
    repository: ghcr.io/k8snetworkplumbingwg
    version: v3.9.0
status:
  appliedStates:
  - name: state-multus-cni
    state: ready
  - name: state-container-networking-plugins
    state: ready
  - name: state-ipoib-cni
    state: ignore
  - name: state-whereabouts-cni
    state: ignore
  - name: state-OFED
    state: ready
  - name: state-SRIOV-device-plugin
    state: ready
  - name: state-RDMA-device-plugin
    state: ignore
  - name: state-ib-kubernetes
    state: ignore
  - name: state-nv-ipam-cni
    state: ready
  - name: state-nic-feature-discovery
    state: ignore
  - name: state-doca-telemetry-service
    state: ignore
  - name: state-nic-configuration-operator
    state: ignore
  - name: state-spectrum-x-operator
    state: ignore
  state: ready
- Output of kubectl -n nvidia-network-operator get -A:
(the command as given is incorrect; I ran kubectl -n nvidia-network-operator get all instead)
$ kubectl -n nvidia-network-operator get all
NAME READY STATUS RESTARTS AGE
pod/cni-plugins-ds-7vwhm 1/1 Running 0 15h
pod/cni-plugins-ds-bjch2 1/1 Running 0 15h
pod/cni-plugins-ds-dbnlf 1/1 Running 0 15h
pod/cni-plugins-ds-n24jv 1/1 Running 0 15h
pod/cni-plugins-ds-nxnrm 1/1 Running 0 15h
pod/cni-plugins-ds-prdsp 1/1 Running 0 15h
pod/kube-multus-ds-6bwnz 1/1 Running 0 15h
pod/kube-multus-ds-9jz2n 1/1 Running 0 15h
pod/kube-multus-ds-cwm8w 1/1 Running 0 15h
pod/kube-multus-ds-gcl68 1/1 Running 0 15h
pod/kube-multus-ds-gl5c8 1/1 Running 0 15h
pod/kube-multus-ds-ts2gz 1/1 Running 0 15h
pod/mofed-ubuntu22.04-6dc4b88db4-ds-r7j5f 1/1 Running 0 15h
pod/mofed-ubuntu22.04-6dc4b88db4-ds-ssstp 1/1 Running 0 15h
pod/mofed-ubuntu22.04-889fff7f4-ds-zbfmc 1/1 Running 0 15h
pod/network-operator-84f7bf449d-tztgs 1/1 Running 0 15h
pod/network-operator-sriov-device-plugin-6fwph 1/1 Running 0 15h
pod/network-operator-sriov-device-plugin-bm6j6 1/1 Running 0 15h
pod/network-operator-sriov-device-plugin-d5kc9 1/1 Running 0 15h
pod/nv-ipam-controller-7748fb76b9-ckgcm 1/1 Running 0 15h
pod/nv-ipam-controller-7748fb76b9-nj2mb 1/1 Running 0 15h
pod/nv-ipam-node-2xx2c 1/1 Running 0 15h
pod/nv-ipam-node-chxcf 1/1 Running 0 15h
pod/nv-ipam-node-glthb 1/1 Running 0 15h
pod/nv-ipam-node-jfhtz 1/1 Running 0 15h
pod/nv-ipam-node-s9r28 1/1 Running 0 15h
pod/nv-ipam-node-w586m 1/1 Running 0 15h
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/cni-plugins-ds 6 6 6 6 6 <none> 7d18h
daemonset.apps/kube-multus-ds 6 6 6 6 6 <none> 7d18h
daemonset.apps/mofed-ubuntu22.04-6dc4b88db4-ds 2 2 2 2 2 feature.node.kubernetes.io/kernel-version.full=5.15.0-144-generic,feature.node.kubernetes.io/pci-15b3.present=true,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=22.04 7d18h
daemonset.apps/mofed-ubuntu22.04-889fff7f4-ds 1 1 1 1 1 feature.node.kubernetes.io/kernel-version.full=5.15.0-151-generic,feature.node.kubernetes.io/pci-15b3.present=true,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=22.04 15h
daemonset.apps/network-operator-sriov-device-plugin 3 3 3 3 3 feature.node.kubernetes.io/pci-15b3.present=true,network.nvidia.com/operator.mofed.wait=false 7d18h
daemonset.apps/nv-ipam-node 6 6 6 6 6 <none> 7d18h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/network-operator 1/1 1 1 8d
deployment.apps/nv-ipam-controller 2/2 2 2 7d18h
NAME DESIRED CURRENT READY AGE
replicaset.apps/network-operator-76f4c8779d 0 0 0 19h
replicaset.apps/network-operator-798476bc67 0 0 0 8d
replicaset.apps/network-operator-84f7bf449d 1 1 1 15h
replicaset.apps/network-operator-868b657597 0 0 0 7d15h
replicaset.apps/nv-ipam-controller-558cd88566 0 0 0 7d15h
replicaset.apps/nv-ipam-controller-7748fb76b9 2 2 2 15h
replicaset.apps/nv-ipam-controller-7f575676f7 0 0 0 19h
replicaset.apps/nv-ipam-controller-cdcb7db5c 0 0 0 7d18h
- Network Operator version:
$ k images -u
[Summary]: 1 namespaces, 27 pods, 39 containers and 7 different images
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| Pod | Container | Image |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| cni-plugins-ds-7vwhm | cni-plugins | ghcr.io/k8snetworkplumbingwg/plugins:v1.5.0 |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| kube-multus-ds-6bwnz | kube-multus | ghcr.io/k8snetworkplumbingwg/multus-cni:v4.1.0 |
+ +---------------------------------+ +
| | (init) install-multus-binary | |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| mofed-ubuntu22.04-6dc4b88db4-ds-r7j5f | mofed-container | nvcr.io/nvidia/mellanox/doca-driver:25.04-0.6.1.0-2-ubuntu22.04-amd64 |
+ +---------------------------------+-----------------------------------------------------------------------+
| | (init) | ghcr.io/mellanox/network-operator-init-container:v0.0.3 |
| | network-operator-init-container | |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| network-operator-84f7bf449d-tztgs | network-operator | nvcr.io/nvidia/cloud-native/network-operator:v25.4.0 |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| network-operator-sriov-device-plugin-6fwph | kube-sriovdp | ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.9.0 |
+ +---------------------------------+ +
| | (init) ofed-driver-validation | |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| nv-ipam-controller-7748fb76b9-ckgcm | nv-ipam-controller | ghcr.io/mellanox/nvidia-k8s-ipam:v0.3.7 |
+--------------------------------------------+---------------------------------+ +
| nv-ipam-node-2xx2c | nv-ipam-node | |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
- Logs of Network Operator controller:
- Logs of the various Pods in nvidia-network-operator namespace:
Helm Configuration (if applicable):
$ helm ls
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
network-operator nvidia-network-operator 2 2025-07-22 18:44:05.406250499 +0000 UTC deployed network-operator-25.4.0 v25.4.0
USER-SUPPLIED VALUES:
deployCR: true
imagePullSecrets: []
maintenance-operator-chart:
  operator:
    admissionController:
      certificates:
        certManager:
          enable: false
          generateSelfSigned: false
        custom:
          enable: false
        secretNames:
          operator: maintenance-webhook-cert
      enable: false
    image:
      name: maintenance-operator
      repository: ghcr.io/mellanox
      tag: v0.2.2
maintenanceOperator:
  enabled: false
nfd:
  NodeFeatureRule: false
  deployNodeFeatureRules: true
  enabled: false
nic-configuration-operator-chart:
  configDaemon:
    image:
      name: nic-configuration-operator-daemon
      repository: ghcr.io/mellanox
      tag: v1.0.3
  operator:
    image:
      name: nic-configuration-operator
      repository: ghcr.io/mellanox
      tag: v1.0.3
nicConfigurationOperator:
  enabled: false
node-feature-discovery:
  enableNodeFeatureApi: true
  featureGates:
    NodeFeatureAPI: true
  gc:
    enable: true
    replicaCount: 1
    serviceAccount:
      create: false
      name: node-feature-discovery
  master:
    config:
      extraLabelNs:
      - nvidia.com
    serviceAccount:
      create: true
      name: node-feature-discovery
  postDeleteCleanup: false
  worker:
    config:
      sources:
        pci:
          deviceClassWhitelist:
          - "0300"
          - "0302"
          deviceLabelFields:
          - vendor
    serviceAccount:
      create: false
      name: node-feature-discovery
    tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/master
      operator: Exists
    - effect: NoSchedule
      key: node-role.kubernetes.io/control-plane
      operator: Exists
    - effect: NoSchedule
      key: nvidia.com/gpu
      operator: Exists
nvIpam:
  deploy: true
operator:
  admissionController:
    enabled: false
    useCertManager: true
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: node-role.kubernetes.io/master
            operator: In
            values:
            - ""
        weight: 1
      - preference:
          matchExpressions:
          - key: node-role.kubernetes.io/control-plane
            operator: In
            values:
            - ""
        weight: 1
  cniBinDirectory: /opt/cni/bin
  fullnameOverride: ""
  image: network-operator
  maintenanceOperator:
    nodeMaintenanceNamePrefix: network-operator
    nodeMaintenanceNamespace: default
    requestorID: nvidia.network.operator
    useRequestor: false
  nameOverride: ""
  nodeSelector: {}
  ofedDriver:
    initContainer:
      enable: true
      image: network-operator-init-container
      repository: ghcr.io/mellanox
      version: v0.0.3
  repository: nvcr.io/nvidia/cloud-native
  resources:
    limits:
      cpu: 500m
      memory: 128Mi
    requests:
      cpu: 5m
      memory: 64Mi
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Equal
    value: ""
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
    operator: Equal
    value: ""
  useDTK: true
sriov-network-operator:
  images:
    ibSriovCni: ghcr.io/k8snetworkplumbingwg/ib-sriov-cni:v1.2.1
    operator: nvcr.io/nvidia/mellanox/sriov-network-operator:network-operator-25.4.0
    ovsCni: ghcr.io/k8snetworkplumbingwg/ovs-cni-plugin:v0.38.2
    resourcesInjector: ghcr.io/k8snetworkplumbingwg/network-resources-injector:v1.7.0
    sriovCni: ghcr.io/k8snetworkplumbingwg/sriov-cni:v2.8.1
    sriovConfigDaemon: nvcr.io/nvidia/mellanox/sriov-network-operator-config-daemon:network-operator-25.4.0
    sriovDevicePlugin: ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.9.0
    webhook: nvcr.io/nvidia/mellanox/sriov-network-operator-webhook:network-operator-25.4.0
  operator:
    admissionControllers:
      certificates:
        certManager:
          enabled: true
          generateSelfSigned: true
        custom:
          enabled: false
        secretNames:
          injector: network-resources-injector-cert
          operator: operator-webhook-cert
      enabled: false
    resourcePrefix: nvidia.com
  sriovOperatorConfig:
    configDaemonNodeSelector:
      beta.kubernetes.io/os: linux
      network.nvidia.com/operator.mofed.wait: "false"
    deploy: true
sriovNetworkOperator:
  enabled: false
test:
  pf: ens2f0
upgradeCRDs: true
- Kubernetes node information (labels, annotations and status), from kubectl get node -o yaml:
labels: {
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"feature.node.kubernetes.io/cpu-cpuid.ADX": "true",
"feature.node.kubernetes.io/cpu-cpuid.AESNI": "true",
"feature.node.kubernetes.io/cpu-cpuid.AMXBF16": "true",
"feature.node.kubernetes.io/cpu-cpuid.AMXINT8": "true",
"feature.node.kubernetes.io/cpu-cpuid.AMXTILE": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX2": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512BF16": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512BITALG": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512BW": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512CD": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512DQ": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512F": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512FP16": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512IFMA": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512VBMI": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512VBMI2": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512VL": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512VNNI": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX512VPOPCNTDQ": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVXVNNI": "true",
"feature.node.kubernetes.io/cpu-cpuid.CLDEMOTE": "true",
"feature.node.kubernetes.io/cpu-cpuid.CMPSB_SCADBS_SHORT": "true",
"feature.node.kubernetes.io/cpu-cpuid.CMPXCHG8": "true",
"feature.node.kubernetes.io/cpu-cpuid.FMA3": "true",
"feature.node.kubernetes.io/cpu-cpuid.FSRM": "true",
"feature.node.kubernetes.io/cpu-cpuid.FXSR": "true",
"feature.node.kubernetes.io/cpu-cpuid.FXSROPT": "true",
"feature.node.kubernetes.io/cpu-cpuid.GFNI": "true",
"feature.node.kubernetes.io/cpu-cpuid.HYPERVISOR": "true",
"feature.node.kubernetes.io/cpu-cpuid.IA32_ARCH_CAP": "true",
"feature.node.kubernetes.io/cpu-cpuid.IBPB": "true",
"feature.node.kubernetes.io/cpu-cpuid.LAHF": "true",
"feature.node.kubernetes.io/cpu-cpuid.MOVBE": "true",
"feature.node.kubernetes.io/cpu-cpuid.MOVDIR64B": "true",
"feature.node.kubernetes.io/cpu-cpuid.MOVDIRI": "true",
"feature.node.kubernetes.io/cpu-cpuid.MOVSB_ZL": "true",
"feature.node.kubernetes.io/cpu-cpuid.OSXSAVE": "true",
"feature.node.kubernetes.io/cpu-cpuid.SERIALIZE": "true",
"feature.node.kubernetes.io/cpu-cpuid.SHA": "true",
"feature.node.kubernetes.io/cpu-cpuid.SPEC_CTRL_SSBD": "true",
"feature.node.kubernetes.io/cpu-cpuid.STOSB_SHORT": "true",
"feature.node.kubernetes.io/cpu-cpuid.SYSCALL": "true",
"feature.node.kubernetes.io/cpu-cpuid.SYSEE": "true",
"feature.node.kubernetes.io/cpu-cpuid.TSXLDTRK": "true",
"feature.node.kubernetes.io/cpu-cpuid.VAES": "true",
"feature.node.kubernetes.io/cpu-cpuid.VPCLMULQDQ": "true",
"feature.node.kubernetes.io/cpu-cpuid.WBNOINVD": "true",
"feature.node.kubernetes.io/cpu-cpuid.X87": "true",
"feature.node.kubernetes.io/cpu-cpuid.XGETBV1": "true",
"feature.node.kubernetes.io/cpu-cpuid.XSAVE": "true",
"feature.node.kubernetes.io/cpu-cpuid.XSAVEC": "true",
"feature.node.kubernetes.io/cpu-cpuid.XSAVEOPT": "true",
"feature.node.kubernetes.io/cpu-cpuid.XSAVES": "true",
"feature.node.kubernetes.io/cpu-hardware_multithreading": "false",
"feature.node.kubernetes.io/cpu-model.family": "6",
"feature.node.kubernetes.io/cpu-model.id": "143",
"feature.node.kubernetes.io/cpu-model.vendor_id": "Intel",
"feature.node.kubernetes.io/kernel-config.NO_HZ": "true",
"feature.node.kubernetes.io/kernel-config.NO_HZ_IDLE": "true",
"feature.node.kubernetes.io/kernel-version.full": "5.15.0-144-generic",
"feature.node.kubernetes.io/kernel-version.major": "5",
"feature.node.kubernetes.io/kernel-version.minor": "15",
"feature.node.kubernetes.io/kernel-version.revision": "0",
"feature.node.kubernetes.io/pci-0300_1234.present": "true",
"feature.node.kubernetes.io/pci-0302_10de.present": "true",
"feature.node.kubernetes.io/pci-10de.present": "true",
"feature.node.kubernetes.io/pci-1234.present": "true",
"feature.node.kubernetes.io/pci-15b3.present": "true",
"feature.node.kubernetes.io/pci-1af4.present": "true",
"feature.node.kubernetes.io/rdma.available": "true",
"feature.node.kubernetes.io/rdma.capable": "true",
"feature.node.kubernetes.io/system-os_release.ID": "ubuntu",
"feature.node.kubernetes.io/system-os_release.VERSION_ID": "22.04",
"feature.node.kubernetes.io/system-os_release.VERSION_ID.major": "22",
"feature.node.kubernetes.io/system-os_release.VERSION_ID.minor": "04",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "wh-sriov-gpu-worker-0",
"kubernetes.io/os": "linux",
"network.nvidia.com/operator.mofed.wait": "false",
"node-role.kubernetes.io/worker": "",
"nvidia.com/cuda.driver-version.full": "570.124.06",
"nvidia.com/cuda.driver-version.major": "570",
"nvidia.com/cuda.driver-version.minor": "124",
"nvidia.com/cuda.driver-version.revision": "06",
"nvidia.com/cuda.driver.major": "570",
"nvidia.com/cuda.driver.minor": "124",
"nvidia.com/cuda.driver.rev": "06",
"nvidia.com/cuda.runtime-version.full": "12.8",
"nvidia.com/cuda.runtime-version.major": "12",
"nvidia.com/cuda.runtime-version.minor": "8",
"nvidia.com/cuda.runtime.major": "12",
"nvidia.com/cuda.runtime.minor": "8",
"nvidia.com/gfd.timestamp": "1753903488",
"nvidia.com/gpu-driver-upgrade-state": "upgrade-done",
"nvidia.com/gpu.compute.major": "8",
"nvidia.com/gpu.compute.minor": "9",
"nvidia.com/gpu.count": "1",
"nvidia.com/gpu.deploy.container-toolkit": "true",
"nvidia.com/gpu.deploy.dcgm": "true",
"nvidia.com/gpu.deploy.dcgm-exporter": "true",
"nvidia.com/gpu.deploy.device-plugin": "true",
"nvidia.com/gpu.deploy.driver": "true",
"nvidia.com/gpu.deploy.gpu-feature-discovery": "true",
"nvidia.com/gpu.deploy.node-status-exporter": "true",
"nvidia.com/gpu.deploy.nvsm": "",
"nvidia.com/gpu.deploy.operator-validator": "true",
"nvidia.com/gpu.family": "ada-lovelace",
"nvidia.com/gpu.machine": "AHV",
"nvidia.com/gpu.memory": "46068",
"nvidia.com/gpu.mode": "compute",
"nvidia.com/gpu.present": "true",
"nvidia.com/gpu.product": "NVIDIA-L40S",
"nvidia.com/gpu.replicas": "1",
"nvidia.com/gpu.sharing-strategy": "none",
"nvidia.com/mig.capable": "false",
"nvidia.com/mig.strategy": "single",
"nvidia.com/mps.capable": "false",
"nvidia.com/ofed-driver-upgrade-state": "upgrade-done",
"nvidia.com/vgpu.present": "false"
}
annotations: {
"cluster.x-k8s.io/cluster-name": "wh-sriov",
"cluster.x-k8s.io/cluster-namespace": "default",
"cluster.x-k8s.io/labels-from-machine": "",
"cluster.x-k8s.io/machine": "wh-sriov-gpu-nodepool-bhqss-z2nvx",
"cluster.x-k8s.io/owner-kind": "MachineSet",
"cluster.x-k8s.io/owner-name": "wh-sriov-gpu-nodepool-bhqss",
"csi.volume.kubernetes.io/nodeid": "{\"csi.nutanix.com\":\"wh-sriov-gpu-worker-0\",\"csi.tigera.io\":\"wh-sriov-gpu-worker-0\"}",
"kubeadm.alpha.kubernetes.io/cri-socket": "unix:///run/containerd/containerd.sock",
"nfd.node.kubernetes.io/feature-labels": "cpu-cpuid.ADX,cpu-cpuid.AESNI,cpu-cpuid.AMXBF16,cpu-cpuid.AMXINT8,cpu-cpuid.AMXTILE,cpu-cpuid.AVX,cpu-cpuid.AVX2,cpu-cpuid.AVX512BF16,cpu-cpuid.AVX512BITALG,cpu-cpuid.AVX512BW,cpu-cpuid.AVX512CD,cpu-cpuid.AVX512DQ,cpu-cpuid.AVX512F,cpu-cpuid.AVX512FP16,cpu-cpuid.AVX512IFMA,cpu-cpuid.AVX512VBMI,cpu-cpuid.AVX512VBMI2,cpu-cpuid.AVX512VL,cpu-cpuid.AVX512VNNI,cpu-cpuid.AVX512VPOPCNTDQ,cpu-cpuid.AVXVNNI,cpu-cpuid.CLDEMOTE,cpu-cpuid.CMPSB_SCADBS_SHORT,cpu-cpuid.CMPXCHG8,cpu-cpuid.FMA3,cpu-cpuid.FSRM,cpu-cpuid.FXSR,cpu-cpuid.FXSROPT,cpu-cpuid.GFNI,cpu-cpuid.HYPERVISOR,cpu-cpuid.IA32_ARCH_CAP,cpu-cpuid.IBPB,cpu-cpuid.LAHF,cpu-cpuid.MOVBE,cpu-cpuid.MOVDIR64B,cpu-cpuid.MOVDIRI,cpu-cpuid.MOVSB_ZL,cpu-cpuid.OSXSAVE,cpu-cpuid.SERIALIZE,cpu-cpuid.SHA,cpu-cpuid.SPEC_CTRL_SSBD,cpu-cpuid.STOSB_SHORT,cpu-cpuid.SYSCALL,cpu-cpuid.SYSEE,cpu-cpuid.TSXLDTRK,cpu-cpuid.VAES,cpu-cpuid.VPCLMULQDQ,cpu-cpuid.WBNOINVD,cpu-cpuid.X87,cpu-cpuid.XGETBV1,cpu-cpuid.XSAVE,cpu-cpuid.XSAVEC,cpu-cpuid.XSAVEOPT,cpu-cpuid.XSAVES,cpu-hardware_multithreading,cpu-model.family,cpu-model.id,cpu-model.vendor_id,kernel-config.NO_HZ,kernel-config.NO_HZ_IDLE,kernel-version.full,kernel-version.major,kernel-version.minor,kernel-version.revision,nvidia.com/cuda.driver-version.full,nvidia.com/cuda.driver-version.major,nvidia.com/cuda.driver-version.minor,nvidia.com/cuda.driver-version.revision,nvidia.com/cuda.driver.major,nvidia.com/cuda.driver.minor,nvidia.com/cuda.driver.rev,nvidia.com/cuda.runtime-version.full,nvidia.com/cuda.runtime-version.major,nvidia.com/cuda.runtime-version.minor,nvidia.com/cuda.runtime.major,nvidia.com/cuda.runtime.minor,nvidia.com/gfd.timestamp,nvidia.com/gpu.compute.major,nvidia.com/gpu.compute.minor,nvidia.com/gpu.count,nvidia.com/gpu.family,nvidia.com/gpu.machine,nvidia.com/gpu.memory,nvidia.com/gpu.mode,nvidia.com/gpu.product,nvidia.com/gpu.replicas,nvidia.com/gpu.sharing-strategy,nvidia.com/mig.capable,nvidia.com/mig.strategy,nvidia.com/mps.capable
,nvidia.com/vgpu.present,pci-0300_1234.present,pci-0302_10de.present,pci-10de.present,pci-1234.present,pci-15b3.present,pci-1af4.present,rdma.available,rdma.capable,system-os_release.ID,system-os_release.VERSION_ID,system-os_release.VERSION_ID.major,system-os_release.VERSION_ID.minor",
"node.alpha.kubernetes.io/ttl": "0",
"nvidia.com/gpu-driver-upgrade-enabled": "true",
"projectcalico.org/IPv4Address": "10.122.7.57/24",
"projectcalico.org/IPv4IPIPTunnelAddr": "192.168.179.192",
"volumes.kubernetes.io/controller-managed-attach-detach": "true"
}
status: {
"addresses": [
{
"address": "10.122.7.57",
"type": "InternalIP"
},
{
"address": "wh-sriov-gpu-worker-0",
"type": "Hostname"
}
],
"allocatable": {
"cpu": "16",
"ephemeral-storage": "280794130787",
"hugepages-1Gi": "0",
"hugepages-2Mi": "0",
"memory": "65666324Ki",
"nvidia.com/gpu": "1",
"nvidia.com/rdma_device_a": "1",
"nvidia.com/rdma_device_b": "0",
"pods": "110"
},
"capacity": {
"cpu": "16",
"ephemeral-storage": "304681132Ki",
"hugepages-1Gi": "0",
"hugepages-2Mi": "0",
"memory": "65768724Ki",
"nvidia.com/gpu": "1",
"nvidia.com/rdma_device_a": "1",
"nvidia.com/rdma_device_b": "0",
"pods": "110"
},
"conditions": [
{
"lastHeartbeatTime": "2025-07-30T19:12:03Z",
"lastTransitionTime": "2025-07-30T19:12:03Z",
"message": "Calico is running on this node",
"reason": "CalicoIsUp",
"status": "False",
"type": "NetworkUnavailable"
},
{
"lastHeartbeatTime": "2025-07-31T10:47:21Z",
"lastTransitionTime": "2025-07-30T19:12:00Z",
"message": "kubelet has sufficient memory available",
"reason": "KubeletHasSufficientMemory",
"status": "False",
"type": "MemoryPressure"
},
{
"lastHeartbeatTime": "2025-07-31T10:47:21Z",
"lastTransitionTime": "2025-07-30T19:12:00Z",
"message": "kubelet has no disk pressure",
"reason": "KubeletHasNoDiskPressure",
"status": "False",
"type": "DiskPressure"
},
{
"lastHeartbeatTime": "2025-07-31T10:47:21Z",
"lastTransitionTime": "2025-07-30T19:12:00Z",
"message": "kubelet has sufficient PID available",
"reason": "KubeletHasSufficientPID",
"status": "False",
"type": "PIDPressure"
},
{
"lastHeartbeatTime": "2025-07-31T10:47:21Z",
"lastTransitionTime": "2025-07-30T19:12:00Z",
"message": "kubelet is posting ready status",
"reason": "KubeletReady",
"status": "True",
"type": "Ready"
}
],
"daemonEndpoints": {
"kubeletEndpoint": {
"Port": 10250
}
},
"images": [
{
"names": [
"ghcr.io/coreweave/nccl-tests@sha256:c926650b8f5d34db90409265436a7369f10bc2f5ef820291fab75785b18bef71",
"ghcr.io/coreweave/nccl-tests:12.9.1-devel-ubuntu22.04-nccl2.27.6-1-7c12c62"
],
"sizeBytes": 8323670310
},
{
"names": [
"nvcr.io/nvidia/cuda@sha256:a3460ae0897335607bef01cb903b56f1bf07a9651e480fc6bb621279e39f9214",
"nvcr.io/nvidia/cuda:12.9.1-cudnn-devel-ubuntu22.04"
],
"sizeBytes": 6379183063
},
{
"names": [
"nvcr.io/nvidia/cuda-dl-base@sha256:ab128a0b5d4298e62c691e478e42e0af98aecdb71ea17b1fea0261875faf4611",
"nvcr.io/nvidia/cuda-dl-base:25.06-cuda12.9-devel-ubuntu24.04"
],
"sizeBytes": 6079048112
},
{
"names": [
"artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-kserve-huggingfaceserver@sha256:674f2321c4665b5b97a38c1efde8a0cf2cbdd7c64452942dda29915ed706fca0",
"artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-kserve-huggingfaceserver:v0.15.2-gpu"
],
"sizeBytes": 5942716497
},
{
"names": [
"quay.io/frollandnvidia/cuda-perftest@sha256:c10d648df4a09ae6afb48ac7f220233df9122ef59616aabda7d6fed7f65517ab",
"quay.io/frollandnvidia/cuda-perftest:latest"
],
"sizeBytes": 4137623786
},
{
"names": [
"artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-kserve-custom-model-server@sha256:23815b51e12abb8705e18ec36b54ec2371155a1652bbcfd84361a45aaf197931",
"artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-kserve-custom-model-server:320"
],
"sizeBytes": 3297017421
},
{
"names": [
"nvcr.io/nvidia/driver@sha256:36564482ecae5a01fa0099823aaca2212e497d80021602391b7c306453baef7d",
"nvcr.io/nvidia/driver:570.124.06-ubuntu22.04"
],
"sizeBytes": 740871082
},
{
"names": [
"nvcr.io/nvidia/driver@sha256:316963bc85f3d3a95e046b8d264072c7b41145a22de9ac1df1683e1bc8b7d207",
"nvcr.io/nvidia/driver:550.127.05-ubuntu22.04"
],
"sizeBytes": 675354846
},
{
"names": [
"nvcr.io/nvidia/mellanox/doca-driver@sha256:cb436254179a539a33c31b97c3eb84f78011e5eba3b7a49780245fed233076cf",
"nvcr.io/nvidia/mellanox/doca-driver:25.04-0.6.1.0-2-ubuntu22.04-amd64"
],
"sizeBytes": 389317968
},
{
"names": [
"nvcr.io/nvidia/cloud-native/nvidia-fs@sha256:f95fd7f39991fd3b37e64beeed7fe31f2cc98ff4c174d056a1a8d8e14e1a09dc",
"nvcr.io/nvidia/cloud-native/nvidia-fs:2.20.5-ubuntu22.04"
],
"sizeBytes": 299283866
},
{
"names": [
"nvcr.io/nvidia/cloud-native/k8s-driver-manager@sha256:c525320fd1e771b911b68f8e760b83e8fccf1beea43bf9b009c4f0c591e193ea",
"nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.8.0"
],
"sizeBytes": 238542288
},
{
"names": [
"nvcr.io/nvidia/k8s-device-plugin@sha256:af31e2b7c7f89834c4e5219860def7ac2e49a207b3d4e8610d5a26772b7738e5",
"nvcr.io/nvidia/k8s-device-plugin:v0.17.1"
],
"sizeBytes": 198670519
},
{
"names": [
"nvcr.io/nvidia/cloud-native/gpu-operator-validator@sha256:07b93914425148f936157ad295649ce100b91b29394669031a585d2458c9f39f",
"nvcr.io/nvidia/cloud-native/gpu-operator-validator:v25.3.0"
],
"sizeBytes": 188028070
},
{
"names": [
"nvcr.io/nvidia/k8s/dcgm-exporter@sha256:5e0a7eb08d446042ad7eac82dd871c0ea2b12a344a1f3ae9b106357618714565",
"nvcr.io/nvidia/k8s/dcgm-exporter:4.2.3-4.3.0-ubuntu22.04"
],
"sizeBytes": 187331328
},
{
"names": [
"ghcr.io/k8snetworkplumbingwg/multus-cni@sha256:aa59e65256324c83efb9eaebca9e78877b38c33ad30ff8df71e02610aa968fb7",
"ghcr.io/k8snetworkplumbingwg/multus-cni:v4.1.0"
],
"sizeBytes": 182498028
},
{
"names": [
"ghcr.io/k8snetworkplumbingwg/plugins@sha256:fe8efec170b498922b3367aabbb6dc57966eb930c8aa086a5f5fb369cefa6064",
"ghcr.io/k8snetworkplumbingwg/plugins:v1.5.0"
],
"sizeBytes": 167150249
},
{
"names": [
"docker.io/calico/node@sha256:eed399f2a727cfc1f374ab5c9cda6123c207e794ed8dc66c7eb6d8db412669e1",
"docker.io/calico/node:v3.29.3"
],
"sizeBytes": 144069230
},
{
"names": [
"nvcr.io/nvidia/k8s/dcgm-exporter@sha256:e62659741497b046dd1586bdca61bbbaeb8022e17ccbe8d2a7e8b1745a3e12ce",
"nvcr.io/nvidia/k8s/dcgm-exporter:4.1.1-4.0.4-ubuntu22.04"
],
"sizeBytes": 140702544
},
{
"names": [
"nvcr.io/nvidia/k8s/container-toolkit@sha256:10d10f951431986a4aa23a586266022a29350d45fc50cc8b6fd1ca4feb771959",
"nvcr.io/nvidia/k8s/container-toolkit:v1.17.5-ubuntu20.04"
],
"sizeBytes": 139876707
},
{
"names": [
"quay.io/karbon/ntnx-csi@sha256:850f01fce01dd924e442cefabd07c05193dd2d22fa26391b4b61303c386014e1",
"quay.io/karbon/ntnx-csi:v2.6.10"
],
"sizeBytes": 136921604
},
{
"names": [
"docker.io/bitnami/kubectl@sha256:f65b74480c37b65099453fb3a5ca7eaaea235b3d4268ef3b1ed0f0150d340646",
"docker.io/bitnami/kubectl:1.32.3"
],
"sizeBytes": 111999120
},
{
"names": [
"quay.io/prometheus/prometheus@sha256:497fe921f22fea8535fa2bcb1c193dacc6ce98c08274257b3d18a4eaae0f9647",
"quay.io/prometheus/prometheus:v2.54.0"
],
"sizeBytes": 108261651
},
{
"names": [
"docker.io/calico/cni@sha256:53f826d3f565a6635b4d58ea4fcfdc0e7ea418ffd4dbb495b4c801074e6eb99c",
"docker.io/calico/cni:v3.29.3"
],
"sizeBytes": 99286923
},
{
"names": [
"registry.k8s.io/nfd/node-feature-discovery@sha256:b86ad9a33a42dc371fe32a7c67d0705c6246db0a7fa8f0c119810582e6199241",
"registry.k8s.io/nfd/node-feature-discovery:v0.17.2",
"registry.k8s.io/nfd/node-feature-discovery:v0.17.2-minimal"
],
"sizeBytes": 80696356
},
{
"names": [
"artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-inference-ui@sha256:5a88ee8e1b25141e7634aef57210715a9b7b821dbcd2caa2c0967befd961b8a4",
"artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-inference-ui:v2.4.0-rc0"
],
"sizeBytes": 76009953
},
{
"names": [
"ghcr.io/mellanox/nvidia-k8s-ipam@sha256:1b20b78f889339834ed74e0da621fc5da582719b2537b36d8967ddc6a04679b8",
"ghcr.io/mellanox/nvidia-k8s-ipam:v0.3.7"
],
"sizeBytes": 74762960
},
{
"names": [
"docker.io/envoyproxy/gateway@sha256:d6e5e3c7291e246f3c13311b640dc8a475dfaefe7961759e1dc2b622a8f9c1a5",
"docker.io/envoyproxy/gateway:v1.3.2"
],
"sizeBytes": 68030691
},
{
"names": [
"ghcr.io/mesosphere/local-volume-provisioner@sha256:53290aa2b2764f0100cb6b1d601a52c35d9310b3d828d433293240005d2ceae3",
"ghcr.io/mesosphere/local-volume-provisioner:v2.7.0-d2iq.2"
],
"sizeBytes": 63131919
},
{
"names": [
"docker.io/library/import-2025-07-22@sha256:ba9f29796bdcdffdefdec1f11de5d095263def4bd33b865eec1fbcf0ba39bbbf",
"registry.k8s.io/etcd:3.5.16-0"
],
"sizeBytes": 57677613
},
{
"names": [
"artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-iep-operator@sha256:ab339bdedc52f7ed4c503d7b6218de2f5a3588ba05a957c0727148f3002d9a85",
"artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-iep-operator:v2.4.0-rc0"
],
"sizeBytes": 56004493
},
{
"names": [
"quay.io/metallb/speaker@sha256:fd86bfc502601d6525739d411a0045e7085a4008a732be7e271c851800952142",
"quay.io/metallb/speaker:v0.14.8"
],
"sizeBytes": 53146149
},
{
"names": [
"docker.io/envoyproxy/envoy@sha256:42653673bbc413c41c545ce6f134e0847d88c44932fc3c8e5d3b0907b36ffa31",
"docker.io/envoyproxy/envoy:distroless-v1.33.1"
],
"sizeBytes": 33512557
},
{
"names": [
"quay.io/brancz/kube-rbac-proxy@sha256:7de54b6dedc8006ffd447267b826eb417a648c00f2b735b6d313395411803719",
"quay.io/brancz/kube-rbac-proxy:v0.18.2"
],
"sizeBytes": 32165573
},
{
"names": [
"docker.io/library/import-2025-07-22@sha256:f7633477debae9281f32c926502aab93c515bf702efca919bf937ba565978b70",
"registry.k8s.io/kube-proxy:v1.32.3"
],
"sizeBytes": 30915698
},
{
"names": [
"docker.io/library/ubuntu@sha256:1ec65b2719518e27d4d25f104d93f9fac60dc437f81452302406825c46fcc9cb",
"docker.io/library/ubuntu:22.04"
],
"sizeBytes": 29545886
},
{
"names": [
"nvcr.io/nvidia/distroless/python@sha256:2cf6b3df9e9e07f8a01846c51f83af9e0b244dae0b2f8fccc2d3c3f0430111e9",
"nvcr.io/nvidia/distroless/python:3.12-v3.4.6"
],
"sizeBytes": 29422791
},
{
"names": [
"docker.io/library/import-2025-07-22@sha256:497a1d7ff11bec63cd7c1a371ad7ded8d6a22ae1e327296c6f25ac5d8db39329",
"registry.k8s.io/kube-apiserver:v1.32.3"
],
"sizeBytes": 28677203
},
{
"names": [
"docker.io/library/debian@sha256:2424c1850714a4d94666ec928e24d86de958646737b1d113f5b2207be44d37d8",
"docker.io/library/debian:bookworm-slim"
],
"sizeBytes": 28240304
},
{
"names": [
"docker.io/library/import-2025-07-22@sha256:6c4248bb462de0630deb1ae7ff457e374f71c9a120f5da1eb6d6861fedc1e284",
"ghcr.io/mesosphere/dynamic-credential-provider:v0.5.3"
],
"sizeBytes": 28195777
},
{
"names": [
"ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin@sha256:cabce074d10a0f1d62135e2cc5442d65b49094b95b8297fdd024a1a5f461319f",
"ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.9.0"
],
"sizeBytes": 28126576
},
{
"names": [
"ghcr.io/mellanox/network-operator-init-container@sha256:67e93ccf3ecb61f17597567faf0f72e1b8ddcf73c5d7440baeadcc1cb6bb811b",
"ghcr.io/mellanox/network-operator-init-container:v0.0.3"
],
"sizeBytes": 27938033
},
{
"names": [
"docker.io/library/import-2025-07-22@sha256:914507ed33938406f92a1f46fac4c08e607d611a9e53a60e69b48ac465141179",
"ghcr.io/mesosphere/dynamic-credential-provider:v0.2.0"
],
"sizeBytes": 27487612
},
{
"names": [
"docker.io/library/import-2025-07-22@sha256:91b7907ea73cb15d4a6bbe4cf133e8b83c8e0b47fe06576191cedc36b2c6f0ef",
"registry.k8s.io/kube-controller-manager:v1.32.3"
],
"sizeBytes": 26265265
},
{
"names": [
"registry.k8s.io/build-image/debian-base@sha256:0a17678966f63e82e9c5e246d9e654836a33e13650a698adefede61bb5ca099e",
"registry.k8s.io/build-image/debian-base:bookworm-v1.0.4"
],
"sizeBytes": 25514632
},
{
"names": [
"docker.io/library/import-2025-07-22@sha256:4b7864ec8c3ab64a5e667d9831897744cacd50986dd63090bd5c6162a5b6e7e5",
"registry.k8s.io/kube-scheduler:v1.32.3"
],
"sizeBytes": 20657018
},
{
"names": [
"docker.io/library/import-2025-07-22@sha256:22b54476eaa7be21b5304972f62caa919d40a1407d0a41ff676a857a1926ddd5",
"k8s.gcr.io/coredns:v1.11.3",
"registry.k8s.io/coredns/coredns:v1.11.3"
],
"sizeBytes": 18559366
},
{
"names": [
"docker.io/calico/node-driver-registrar@sha256:adcc7f3e0534d0ca1c8e5c25dc666d177f4ed01043262ec2933ac8439c90b5cf",
"docker.io/calico/node-driver-registrar:v3.29.3"
],
"sizeBytes": 15484965
},
{
"names": [
"quay.io/prometheus-operator/prometheus-config-reloader@sha256:959d47672fbff2776a04ec62b8afcec89e8c036af84dc5fade50019dab212746",
"quay.io/prometheus-operator/prometheus-config-reloader:v0.81.0"
],
"sizeBytes": 14433657
},
{
"names": [
"quay.io/prometheus/node-exporter@sha256:d00a542e409ee618a4edc67da14dd48c5da66726bbd5537ab2af9c1dfc442c8a",
"quay.io/prometheus/node-exporter:v1.9.1"
],
"sizeBytes": 12955907
},
{
"names": [
"registry.k8s.io/sig-storage/csi-node-driver-registrar@sha256:2cddcc716c1930775228d56b0d2d339358647629701047edfdad5fcdfaf4ebcb",
"registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1"
],
"sizeBytes": 10755082
}
],
"nodeInfo": {
"architecture": "amd64",
"bootID": "8800d241-74de-4846-8d7b-2b8c3cac3ab6",
"containerRuntimeVersion": "containerd://1.7.26-d2iq.1",
"kernelVersion": "5.15.0-144-generic",
"kubeProxyVersion": "v1.32.3",
"kubeletVersion": "v1.32.3",
"machineID": "a833d33e380440fabe004f553674800b",
"operatingSystem": "linux",
"osImage": "Ubuntu 22.04.5 LTS",
"systemUUID": "a833d33e-3804-40fa-be00-4f553674800b"
}
}
Environment:
- Kubernetes version (use kubectl version):
$ kubectl version
Client Version: v1.32.2
Kustomize Version: v5.5.0
Server Version: v1.32.3
- Hardware configuration:
- Network adapter model and firmware version:
Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
Driver Version: mlx5_core:24.10-1.1.4
Firmware Version: 22.43.2026 (MT_0000000437)
- OS (e.g. cat /etc/os-release):
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
- Kernel (e.g. uname -a): 5.15.0-151-generic
network-operator-controller-logs.txt
- Others:
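For context, a minimal sketch of the pod spec shape that avoids the error described above. This is an assumption based on the reported behavior, not a verified fix: the k8s.v1.cni.cncf.io/networks annotation for a host-device RDMA network appears to require a matching device resource request/limit, so that the device plugin allocates a VF and the CNI receives a pciBusID. The resource name nvidia.com/hostdev and the pod/container names are hypothetical placeholders; substitute the resource name actually advertised by the device plugin in your cluster.

```yaml
# Hypothetical sketch: pairing the network annotation with a device
# resource request so the host-device CNI gets a pciBusID to attach.
apiVersion: v1
kind: Pod
metadata:
  name: rdma-test                      # placeholder name
  annotations:
    k8s.v1.cni.cncf.io/networks: hostdev-rdma-device-sriov-gds-test-a-su-1
spec:
  containers:
  - name: app                          # placeholder name
    image: ubuntu:22.04
    resources:
      requests:
        nvidia.com/hostdev: "1"        # assumed resource name; use the one your device plugin exposes
      limits:
        nvidia.com/hostdev: "1"
```

Without the resources stanza, no VF is allocated for the pod, which would be consistent with the CNI failing with 'specify either "device", "hwaddr", "kernelpath" or "pciBusID"'.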