Infinite loop of failed multus AddInterface ops when hostdevice network annotation is set with no devices in resources defined #1674

@jesse-gonzalez

Description

What happened:

When deploying a pod with no resource limits/requests defined, I neglected to remove the metadata.annotations entry k8s.v1.cni.cncf.io/networks: hostdev-rdma-device-sriov-gds-test-a-su-1, which left the pod stuck in the ContainerCreating state. The pod's events/description (in this scenario, the namespace was sriov-gds-test) showed an endless loop of errors similar to those below, cycling through every available IP in k8s-pod-network:

  Normal   AddedInterface          36s                 multus             Add eth0 [192.168.179.237/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  36s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a164f0f20932003e4dfc144edeb844e764f0a4f792a6590c114f5143e464ba11": plugin type="multus" name="multus-cni-network" failed (add): [sriov-gds-test/gds-pvc-rdma-no-device-attached-67fc7b6746-46fkc/3c15dafb-4ffe-4681-9e1a-07f03e1b83c8:hostdev-rdma-device-sriov-gds-test-a-su-1]: error adding container to network "hostdev-rdma-device-sriov-gds-test-a-su-1": specify either "device", "hwaddr", "kernelpath" or "pciBusID"
  Normal   AddedInterface          36s                 multus             Add eth0 [192.168.179.236/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  35s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f51969d12c6daca8546a144cdfa4b03db04a73a390784a9157e1a95954db1d13": plugin type="multus" name="multus-cni-network" failed (add): [sriov-gds-test/gds-pvc-rdma-no-device-attached-67fc7b6746-46fkc/3c15dafb-4ffe-4681-9e1a-07f03e1b83c8:hostdev-rdma-device-sriov-gds-test-a-su-1]: error adding container to network "hostdev-rdma-device-sriov-gds-test-a-su-1": specify either "device", "hwaddr", "kernelpath" or "pciBusID"
  Normal   AddedInterface          34s                 multus             Add eth0 [192.168.179.238/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  34s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4a57990b3cf143279ef543e798931c44d0898d282f589711ecff0f6352fe24d6": plugin type="multus" name="multus-cni-network" failed (add): [sriov-gds-test/gds-pvc-rdma-no-device-attached-67fc7b6746-46fkc/3c15dafb-4ffe-4681-9e1a-07f03e1b83c8:hostdev-rdma-device-sriov-gds-test-a-su-1]: error adding container to network "hostdev-rdma-device-sriov-gds-test-a-su-1": specify either "device", "hwaddr", "kernelpath" or "pciBusID"
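
To confirm that every retry creates a new sandbox and takes a fresh IP from k8s-pod-network, the event lines can be summarized with a short script. This is only a sketch: the function name summarize_events and both regular expressions are assumptions derived from the event format shown above, not part of any tool.

```python
import re

# Matches: "... AddedInterface ... Add eth0 [192.168.179.237/32] from k8s-pod-network"
ADDED = re.compile(r"AddedInterface.*?Add (\S+) \[([^/\]]+)/\d+\] from (\S+)")
# Matches the sandbox ID quoted in a FailedCreatePodSandBox event
FAILED = re.compile(r'FailedCreatePodSandBox.*?sandbox "([0-9a-f]+)"')


def summarize_events(text: str) -> dict:
    """Count failed sandbox attempts and collect the IPs handed out to eth0."""
    ips = []
    sandboxes = []
    for line in text.splitlines():
        added = ADDED.search(line)
        if added:
            ips.append(added.group(2))
            continue
        failed = FAILED.search(line)
        if failed:
            sandboxes.append(failed.group(1))
    return {
        "failed_attempts": len(sandboxes),
        "distinct_sandboxes": len(set(sandboxes)),
        "ips": sorted(set(ips)),
    }
```

Passing the saved output of kubectl describe pod to summarize_events should show the number of distinct IPs growing together with the number of failed attempts if each retry really allocates a new address.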

What you expected to happen:

If no resource requests for SR-IOV/RDMA devices are defined, the annotation should either be ignored so that the pod can be deployed, or the error should state the actual problem: the annotation is present but the corresponding resources are not requested.
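
The kind of check described here can be sketched as a pre-flight lint over the pod template. This is a minimal illustration, not Multus or operator code: the function names are hypothetical, and the nad_resources mapping (network name to device plugin resource) is an assumed input that would have to be built from the NetworkAttachmentDefinitions in the cluster.

```python
import json

NETWORKS_ANNOTATION = "k8s.v1.cni.cncf.io/networks"


def requested_networks(pod_template: dict) -> list[str]:
    """Return the network names listed in the pod's networks annotation."""
    annotations = pod_template.get("metadata", {}).get("annotations") or {}
    raw = annotations.get(NETWORKS_ANNOTATION, "").strip()
    if not raw:
        return []
    if raw.startswith("["):
        # JSON form: [{"name": "net-a", "namespace": "ns"}, ...]
        return [entry["name"] for entry in json.loads(raw)]
    names = []
    for item in raw.split(","):
        # Short form: [namespace/]name[@interface]
        name = item.strip().split("@", 1)[0].split("/")[-1]
        if name:
            names.append(name)
    return names


def requested_resources(pod_template: dict) -> set[str]:
    """Collect every resource name that any container requests or limits."""
    found = set()
    for container in pod_template.get("spec", {}).get("containers", []):
        resources = container.get("resources") or {}
        for section in ("requests", "limits"):
            found.update((resources.get(section) or {}).keys())
    return found


def missing_device_requests(pod_template: dict, nad_resources: dict[str, str]) -> list[str]:
    """Report networks whose backing resource is not requested by any container.

    nad_resources is a hypothetical mapping of network name -> resource name.
    """
    have = requested_resources(pod_template)
    problems = []
    for net in requested_networks(pod_template):
        resource = nad_resources.get(net)
        if resource and resource not in have:
            problems.append(
                f"network {net!r} needs resource {resource!r}, but no container requests it"
            )
    return problems


# Pod template from this report: annotation set, no resources defined.
template = {
    "metadata": {
        "annotations": {NETWORKS_ANNOTATION: "hostdev-rdma-device-sriov-gds-test-a-su-1"}
    },
    "spec": {"containers": [{"name": "appcntr1"}]},
}
mapping = {"hostdev-rdma-device-sriov-gds-test-a-su-1": "nvidia.com/rdma_device_a"}
for problem in missing_device_requests(template, mapping):
    print(problem)
```

A message of this shape, surfaced once at admission or sandbox creation, would point at the missing resource request directly instead of the host-device plugin's generic error.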

How to reproduce it (as minimally and precisely as possible):

Below is an example manifest that reproduces the issue.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gds-pvc-rdma-no-device-attached
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gds-pvc-test-rdma
  template:
    metadata:
      labels:
        app: gds-pvc-test-rdma
        nvidia-nsight-profile: disabled
      annotations:
        k8s.v1.cni.cncf.io/networks: hostdev-rdma-device-sriov-gds-test-a-su-1
    spec:
      containers:
      - name: appcntr1
        image: quay.io/frollandnvidia/cuda-perftest:latest
        imagePullPolicy: IfNotPresent
        command:
        - sh
        - -c
        - |
          sleep inf

As you can see, no resource requests or limits were defined, such as:

        resources:
          requests:
            nvidia.com/rdma_device_a: '1'
          limits:
            nvidia.com/rdma_device_a: '1'
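
A rough model of why the missing request produces this particular error, under one assumption that is not verified in this report: when a device is allocated to the pod, its PCI address is added to the host-device plugin config as pciBusID, and without an allocation nothing is added. The function names and the NAD config contents below are hypothetical. The error text is copied from the events above, and the PCI address is taken from the device plugin config in this report.

```python
REQUIRED_ANY = ("device", "hwaddr", "kernelpath", "pciBusID")


def with_allocated_device(nad_config: dict, allocated_pci_ids: list[str]) -> dict:
    """Return a copy of the config with pciBusID set when a device was allocated."""
    conf = dict(nad_config)
    if allocated_pci_ids:
        conf["pciBusID"] = allocated_pci_ids[0]
    return conf


def validate_host_device(conf: dict) -> None:
    """Mimic the check behind the error message seen in the pod events."""
    if not any(conf.get(key) for key in REQUIRED_ANY):
        raise ValueError('specify either "device", "hwaddr", "kernelpath" or "pciBusID"')


# Hypothetical NAD config that relies on an injected device ID.
nad_config = {"type": "host-device", "name": "hostdev-rdma-device-sriov-gds-test-a-su-1"}

# With a resource request, a device is allocated and the config is valid.
validate_host_device(with_allocated_device(nad_config, ["0000:00:07.0"]))

# Without a resource request, no device is allocated and validation fails.
try:
    validate_host_device(with_allocated_device(nad_config, []))
except ValueError as err:
    print(f"error adding container to network: {err}")
```

If this model is right, the failure is deterministic for the pod spec, so retrying the sandbox creation cannot succeed.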

Anything else we need to know?:

This cluster also runs the NVIDIA GPU Operator, with GDS/NFSoRDMA enabled.

Below are the Helm deployment configs used:

$ helm get values gpu-operator
USER-SUPPLIED VALUES:
driver:
  rdma:
    enabled: true
gds:
  enabled: true

$ helm ls
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
gpu-operator    gpu-operator    1               2025-07-22 12:21:23.462651 -0400 EDT    deployed        gpu-operator-v25.3.0    v25.3.0

Logs:

  • NicClusterPolicy CR spec and state:
$ k get nicclusterpolicies
NAME                 STATUS   AGE
nic-cluster-policy   ready    2025-07-23T16:35:15Z

apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"mellanox.com/v1alpha1","kind":"NicClusterPolicy","metadata":{"annotations":{},"name":"nic-cluster-policy"},"spec":{"nvIpam":{"enableWebhook":false,"image":"nvidia-k8s-ipam","imagePullSecrets":[],"repository":"ghcr.io/mellanox","version":"v0.3.7"},"ofedDriver":{"env":[{"name":"RESTORE_DRIVER_ON_POD_TERMINATION","value":"true"},{"name":"UNLOAD_STORAGE_MODULES","value":"true"},{"name":"CREATE_IFNAMES_UDEV","value":"true"},{"name":"ENABLE_NFSRDMA","value":"true"}],"forcePrecompiled":false,"image":"doca-driver","imagePullSecrets":[],"livenessProbe":{"initialDelaySeconds":30,"periodSeconds":30},"readinessProbe":{"initialDelaySeconds":10,"periodSeconds":30},"repository":"nvcr.io/nvidia/mellanox","startupProbe":{"initialDelaySeconds":10,"periodSeconds":20},"terminationGracePeriodSeconds":300,"upgradePolicy":{"autoUpgrade":true,"drain":{"deleteEmptyDir":true,"enable":true,"force":true,"podSelector":"","timeoutSeconds":300},"maxParallelUpgrades":1,"safeLoad":false},"version":"25.04-0.6.1.0-2"},"secondaryNetwork":{"cniPlugins":{"image":"plugins","imagePullSecrets":[],"repository":"ghcr.io/k8snetworkplumbingwg","version":"v1.5.0"},"multus":{"image":"multus-cni","imagePullSecrets":[],"repository":"ghcr.io/k8snetworkplumbingwg","version":"v4.1.0"}},"sriovDevicePlugin":{"config":"{\n  \"resourceList\": [\n    {\n      \"resourcePrefix\": \"nvidia.com\",\n      \"resourceName\": \"rdma_device_a\",\n      \"selectors\": {\n        \"vendors\": [\"15b3\"],\n        \"devices\": [],\n        \"drivers\": [],\n        \"pfNames\": [],\n        \"pciAddresses\": [\"0000:00:07.0\",\"0000:00:08.0\"],\n        \"rootDevices\": [],\n        \"linkTypes\": [],\n        \"isRdma\": true\n      }\n    }\n  ]\n}\n","image":"sriov-network-device-plugin","imagePullSecrets":[],"repository":"ghcr.io/k8snetworkplumbingwg","version":"v3.9.0"}}}
  creationTimestamp: "2025-07-23T16:35:15Z"
  generation: 2
  name: nic-cluster-policy
  resourceVersion: "41540363"
  uid: f2e5e6e0-46a0-40e9-9c8d-59eb70257ac2
spec:
  nvIpam:
    enableWebhook: false
    image: nvidia-k8s-ipam
    imagePullSecrets: []
    repository: ghcr.io/mellanox
    version: v0.3.7
  ofedDriver:
    env:
    - name: RESTORE_DRIVER_ON_POD_TERMINATION
      value: "true"
    - name: UNLOAD_STORAGE_MODULES
      value: "true"
    - name: CREATE_IFNAMES_UDEV
      value: "true"
    - name: ENABLE_NFSRDMA
      value: "true"
    forcePrecompiled: false
    image: doca-driver
    imagePullSecrets: []
    livenessProbe:
      initialDelaySeconds: 30
      periodSeconds: 30
    readinessProbe:
      initialDelaySeconds: 10
      periodSeconds: 30
    repository: nvcr.io/nvidia/mellanox
    startupProbe:
      initialDelaySeconds: 10
      periodSeconds: 20
    terminationGracePeriodSeconds: 300
    upgradePolicy:
      autoUpgrade: true
      drain:
        deleteEmptyDir: true
        enable: true
        force: true
        podSelector: ""
        timeoutSeconds: 300
      maxParallelUpgrades: 1
      safeLoad: false
    version: 25.04-0.6.1.0-2
  secondaryNetwork:
    cniPlugins:
      image: plugins
      imagePullSecrets: []
      repository: ghcr.io/k8snetworkplumbingwg
      version: v1.5.0
    multus:
      image: multus-cni
      imagePullSecrets: []
      repository: ghcr.io/k8snetworkplumbingwg
      version: v4.1.0
  sriovDevicePlugin:
    config: |
      {
        "resourceList": [
          {
            "resourcePrefix": "nvidia.com",
            "resourceName": "rdma_device_a",
            "selectors": {
              "vendors": ["15b3"],
              "devices": [],
              "drivers": [],
              "pfNames": [],
              "pciAddresses": ["0000:00:07.0"],
              "rootDevices": [],
              "linkTypes": [],
              "isRdma": true
            }
          }
        ]
      }
    image: sriov-network-device-plugin
    imagePullSecrets: []
    repository: ghcr.io/k8snetworkplumbingwg
    version: v3.9.0
status:
  appliedStates:
  - name: state-multus-cni
    state: ready
  - name: state-container-networking-plugins
    state: ready
  - name: state-ipoib-cni
    state: ignore
  - name: state-whereabouts-cni
    state: ignore
  - name: state-OFED
    state: ready
  - name: state-SRIOV-device-plugin
    state: ready
  - name: state-RDMA-device-plugin
    state: ignore
  - name: state-ib-kubernetes
    state: ignore
  - name: state-nv-ipam-cni
    state: ready
  - name: state-nic-feature-discovery
    state: ignore
  - name: state-doca-telemetry-service
    state: ignore
  - name: state-nic-configuration-operator
    state: ignore
  - name: state-spectrum-x-operator
    state: ignore
  state: ready

  • Output of: kubectl -n nvidia-network-operator get -A:

This command is incorrect; I ran kubectl -n nvidia-network-operator get all instead:

$ kubectl -n nvidia-network-operator get all
NAME                                             READY   STATUS    RESTARTS   AGE
pod/cni-plugins-ds-7vwhm                         1/1     Running   0          15h
pod/cni-plugins-ds-bjch2                         1/1     Running   0          15h
pod/cni-plugins-ds-dbnlf                         1/1     Running   0          15h
pod/cni-plugins-ds-n24jv                         1/1     Running   0          15h
pod/cni-plugins-ds-nxnrm                         1/1     Running   0          15h
pod/cni-plugins-ds-prdsp                         1/1     Running   0          15h
pod/kube-multus-ds-6bwnz                         1/1     Running   0          15h
pod/kube-multus-ds-9jz2n                         1/1     Running   0          15h
pod/kube-multus-ds-cwm8w                         1/1     Running   0          15h
pod/kube-multus-ds-gcl68                         1/1     Running   0          15h
pod/kube-multus-ds-gl5c8                         1/1     Running   0          15h
pod/kube-multus-ds-ts2gz                         1/1     Running   0          15h
pod/mofed-ubuntu22.04-6dc4b88db4-ds-r7j5f        1/1     Running   0          15h
pod/mofed-ubuntu22.04-6dc4b88db4-ds-ssstp        1/1     Running   0          15h
pod/mofed-ubuntu22.04-889fff7f4-ds-zbfmc         1/1     Running   0          15h
pod/network-operator-84f7bf449d-tztgs            1/1     Running   0          15h
pod/network-operator-sriov-device-plugin-6fwph   1/1     Running   0          15h
pod/network-operator-sriov-device-plugin-bm6j6   1/1     Running   0          15h
pod/network-operator-sriov-device-plugin-d5kc9   1/1     Running   0          15h
pod/nv-ipam-controller-7748fb76b9-ckgcm          1/1     Running   0          15h
pod/nv-ipam-controller-7748fb76b9-nj2mb          1/1     Running   0          15h
pod/nv-ipam-node-2xx2c                           1/1     Running   0          15h
pod/nv-ipam-node-chxcf                           1/1     Running   0          15h
pod/nv-ipam-node-glthb                           1/1     Running   0          15h
pod/nv-ipam-node-jfhtz                           1/1     Running   0          15h
pod/nv-ipam-node-s9r28                           1/1     Running   0          15h
pod/nv-ipam-node-w586m                           1/1     Running   0          15h

NAME                                                  DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                                                                                                                                                                                             AGE
daemonset.apps/cni-plugins-ds                         6         6         6       6            6           <none>                                                                                                                                                                                                                                    7d18h
daemonset.apps/kube-multus-ds                         6         6         6       6            6           <none>                                                                                                                                                                                                                                    7d18h
daemonset.apps/mofed-ubuntu22.04-6dc4b88db4-ds        2         2         2       2            2           feature.node.kubernetes.io/kernel-version.full=5.15.0-144-generic,feature.node.kubernetes.io/pci-15b3.present=true,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=22.04   7d18h
daemonset.apps/mofed-ubuntu22.04-889fff7f4-ds         1         1         1       1            1           feature.node.kubernetes.io/kernel-version.full=5.15.0-151-generic,feature.node.kubernetes.io/pci-15b3.present=true,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=22.04   15h
daemonset.apps/network-operator-sriov-device-plugin   3         3         3       3            3           feature.node.kubernetes.io/pci-15b3.present=true,network.nvidia.com/operator.mofed.wait=false                                                                                                                                             7d18h
daemonset.apps/nv-ipam-node                           6         6         6       6            6           <none>                                                                                                                                                                                                                                    7d18h

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/network-operator     1/1     1            1           8d
deployment.apps/nv-ipam-controller   2/2     2            2           7d18h

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/network-operator-76f4c8779d     0         0         0       19h
replicaset.apps/network-operator-798476bc67     0         0         0       8d
replicaset.apps/network-operator-84f7bf449d     1         1         1       15h
replicaset.apps/network-operator-868b657597     0         0         0       7d15h
replicaset.apps/nv-ipam-controller-558cd88566   0         0         0       7d15h
replicaset.apps/nv-ipam-controller-7748fb76b9   2         2         2       15h
replicaset.apps/nv-ipam-controller-7f575676f7   0         0         0       19h
replicaset.apps/nv-ipam-controller-cdcb7db5c    0         0         0       7d18h

  • Network Operator version:
$ k images -u
[Summary]: 1 namespaces, 27 pods, 39 containers and 7 different images
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
|                    Pod                     |            Container            |                                 Image                                 |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| cni-plugins-ds-7vwhm                       | cni-plugins                     | ghcr.io/k8snetworkplumbingwg/plugins:v1.5.0                           |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| kube-multus-ds-6bwnz                       | kube-multus                     | ghcr.io/k8snetworkplumbingwg/multus-cni:v4.1.0                        |
+                                            +---------------------------------+                                                                       +
|                                            | (init) install-multus-binary    |                                                                       |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| mofed-ubuntu22.04-6dc4b88db4-ds-r7j5f      | mofed-container                 | nvcr.io/nvidia/mellanox/doca-driver:25.04-0.6.1.0-2-ubuntu22.04-amd64 |
+                                            +---------------------------------+-----------------------------------------------------------------------+
|                                            | (init)                          | ghcr.io/mellanox/network-operator-init-container:v0.0.3               |
|                                            | network-operator-init-container |                                                                       |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| network-operator-84f7bf449d-tztgs          | network-operator                | nvcr.io/nvidia/cloud-native/network-operator:v25.4.0                  |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| network-operator-sriov-device-plugin-6fwph | kube-sriovdp                    | ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.9.0       |
+                                            +---------------------------------+                                                                       +
|                                            | (init) ofed-driver-validation   |                                                                       |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+
| nv-ipam-controller-7748fb76b9-ckgcm        | nv-ipam-controller              | ghcr.io/mellanox/nvidia-k8s-ipam:v0.3.7                               |
+--------------------------------------------+---------------------------------+                                                                       +
| nv-ipam-node-2xx2c                         | nv-ipam-node                    |                                                                       |
+--------------------------------------------+---------------------------------+-----------------------------------------------------------------------+

  • Logs of Network Operator controller:

  • Logs of the various Pods in nvidia-network-operator namespace:

  • Helm Configuration (if applicable):

$ helm ls
NAME                    NAMESPACE               REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
network-operator        nvidia-network-operator 2               2025-07-22 18:44:05.406250499 +0000 UTC deployed        network-operator-25.4.0 v25.4.0

USER-SUPPLIED VALUES:
deployCR: true
imagePullSecrets: []
maintenance-operator-chart:
  operator:
    admissionController:
      certificates:
        certManager:
          enable: false
          generateSelfSigned: false
        custom:
          enable: false
        secretNames:
          operator: maintenance-webhook-cert
      enable: false
    image:
      name: maintenance-operator
      repository: ghcr.io/mellanox
      tag: v0.2.2
maintenanceOperator:
  enabled: false
nfd:
  NodeFeatureRule: false
  deployNodeFeatureRules: true
  enabled: false
nic-configuration-operator-chart:
  configDaemon:
    image:
      name: nic-configuration-operator-daemon
      repository: ghcr.io/mellanox
      tag: v1.0.3
  operator:
    image:
      name: nic-configuration-operator
      repository: ghcr.io/mellanox
      tag: v1.0.3
nicConfigurationOperator:
  enabled: false
node-feature-discovery:
  enableNodeFeatureApi: true
  featureGates:
    NodeFeatureAPI: true
  gc:
    enable: true
    replicaCount: 1
    serviceAccount:
      create: false
      name: node-feature-discovery
  master:
    config:
      extraLabelNs:
      - nvidia.com
    serviceAccount:
      create: true
      name: node-feature-discovery
  postDeleteCleanup: false
  worker:
    config:
      sources:
        pci:
          deviceClassWhitelist:
          - "0300"
          - "0302"
          deviceLabelFields:
          - vendor
    serviceAccount:
      create: false
      name: node-feature-discovery
    tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/master
      operator: Exists
    - effect: NoSchedule
      key: node-role.kubernetes.io/control-plane
      operator: Exists
    - effect: NoSchedule
      key: nvidia.com/gpu
      operator: Exists
nvIpam:
  deploy: true
operator:
  admissionController:
    enabled: false
    useCertManager: true
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: node-role.kubernetes.io/master
            operator: In
            values:
            - ""
        weight: 1
      - preference:
          matchExpressions:
          - key: node-role.kubernetes.io/control-plane
            operator: In
            values:
            - ""
        weight: 1
  cniBinDirectory: /opt/cni/bin
  fullnameOverride: ""
  image: network-operator
  maintenanceOperator:
    nodeMaintenanceNamePrefix: network-operator
    nodeMaintenanceNamespace: default
    requestorID: nvidia.network.operator
    useRequestor: false
  nameOverride: ""
  nodeSelector: {}
  ofedDriver:
    initContainer:
      enable: true
      image: network-operator-init-container
      repository: ghcr.io/mellanox
      version: v0.0.3
  repository: nvcr.io/nvidia/cloud-native
  resources:
    limits:
      cpu: 500m
      memory: 128Mi
    requests:
      cpu: 5m
      memory: 64Mi
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Equal
    value: ""
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
    operator: Equal
    value: ""
  useDTK: true
sriov-network-operator:
  images:
    ibSriovCni: ghcr.io/k8snetworkplumbingwg/ib-sriov-cni:v1.2.1
    operator: nvcr.io/nvidia/mellanox/sriov-network-operator:network-operator-25.4.0
    ovsCni: ghcr.io/k8snetworkplumbingwg/ovs-cni-plugin:v0.38.2
    resourcesInjector: ghcr.io/k8snetworkplumbingwg/network-resources-injector:v1.7.0
    sriovCni: ghcr.io/k8snetworkplumbingwg/sriov-cni:v2.8.1
    sriovConfigDaemon: nvcr.io/nvidia/mellanox/sriov-network-operator-config-daemon:network-operator-25.4.0
    sriovDevicePlugin: ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.9.0
    webhook: nvcr.io/nvidia/mellanox/sriov-network-operator-webhook:network-operator-25.4.0
  operator:
    admissionControllers:
      certificates:
        certManager:
          enabled: true
          generateSelfSigned: true
        custom:
          enabled: false
        secretNames:
          injector: network-resources-injector-cert
          operator: operator-webhook-cert
      enabled: false
    resourcePrefix: nvidia.com
  sriovOperatorConfig:
    configDaemonNodeSelector:
      beta.kubernetes.io/os: linux
      network.nvidia.com/operator.mofed.wait: "false"
    deploy: true
sriovNetworkOperator:
  enabled: false
test:
  pf: ens2f0
upgradeCRDs: true

  • Kubernetes' nodes information (labels, annotations and status): kubectl get node -o yaml:
labels: {
  "beta.kubernetes.io/arch": "amd64",
  "beta.kubernetes.io/os": "linux",
  "feature.node.kubernetes.io/cpu-cpuid.ADX": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AESNI": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AMXBF16": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AMXINT8": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AMXTILE": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX2": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512BF16": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512BITALG": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512BW": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512CD": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512DQ": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512F": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512FP16": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512IFMA": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512VBMI": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512VBMI2": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512VL": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512VNNI": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVX512VPOPCNTDQ": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AVXVNNI": "true",
  "feature.node.kubernetes.io/cpu-cpuid.CLDEMOTE": "true",
  "feature.node.kubernetes.io/cpu-cpuid.CMPSB_SCADBS_SHORT": "true",
  "feature.node.kubernetes.io/cpu-cpuid.CMPXCHG8": "true",
  "feature.node.kubernetes.io/cpu-cpuid.FMA3": "true",
  "feature.node.kubernetes.io/cpu-cpuid.FSRM": "true",
  "feature.node.kubernetes.io/cpu-cpuid.FXSR": "true",
  "feature.node.kubernetes.io/cpu-cpuid.FXSROPT": "true",
  "feature.node.kubernetes.io/cpu-cpuid.GFNI": "true",
  "feature.node.kubernetes.io/cpu-cpuid.HYPERVISOR": "true",
  "feature.node.kubernetes.io/cpu-cpuid.IA32_ARCH_CAP": "true",
  "feature.node.kubernetes.io/cpu-cpuid.IBPB": "true",
  "feature.node.kubernetes.io/cpu-cpuid.LAHF": "true",
  "feature.node.kubernetes.io/cpu-cpuid.MOVBE": "true",
  "feature.node.kubernetes.io/cpu-cpuid.MOVDIR64B": "true",
  "feature.node.kubernetes.io/cpu-cpuid.MOVDIRI": "true",
  "feature.node.kubernetes.io/cpu-cpuid.MOVSB_ZL": "true",
  "feature.node.kubernetes.io/cpu-cpuid.OSXSAVE": "true",
  "feature.node.kubernetes.io/cpu-cpuid.SERIALIZE": "true",
  "feature.node.kubernetes.io/cpu-cpuid.SHA": "true",
  "feature.node.kubernetes.io/cpu-cpuid.SPEC_CTRL_SSBD": "true",
  "feature.node.kubernetes.io/cpu-cpuid.STOSB_SHORT": "true",
  "feature.node.kubernetes.io/cpu-cpuid.SYSCALL": "true",
  "feature.node.kubernetes.io/cpu-cpuid.SYSEE": "true",
  "feature.node.kubernetes.io/cpu-cpuid.TSXLDTRK": "true",
  "feature.node.kubernetes.io/cpu-cpuid.VAES": "true",
  "feature.node.kubernetes.io/cpu-cpuid.VPCLMULQDQ": "true",
  "feature.node.kubernetes.io/cpu-cpuid.WBNOINVD": "true",
  "feature.node.kubernetes.io/cpu-cpuid.X87": "true",
  "feature.node.kubernetes.io/cpu-cpuid.XGETBV1": "true",
  "feature.node.kubernetes.io/cpu-cpuid.XSAVE": "true",
  "feature.node.kubernetes.io/cpu-cpuid.XSAVEC": "true",
  "feature.node.kubernetes.io/cpu-cpuid.XSAVEOPT": "true",
  "feature.node.kubernetes.io/cpu-cpuid.XSAVES": "true",
  "feature.node.kubernetes.io/cpu-hardware_multithreading": "false",
  "feature.node.kubernetes.io/cpu-model.family": "6",
  "feature.node.kubernetes.io/cpu-model.id": "143",
  "feature.node.kubernetes.io/cpu-model.vendor_id": "Intel",
  "feature.node.kubernetes.io/kernel-config.NO_HZ": "true",
  "feature.node.kubernetes.io/kernel-config.NO_HZ_IDLE": "true",
  "feature.node.kubernetes.io/kernel-version.full": "5.15.0-144-generic",
  "feature.node.kubernetes.io/kernel-version.major": "5",
  "feature.node.kubernetes.io/kernel-version.minor": "15",
  "feature.node.kubernetes.io/kernel-version.revision": "0",
  "feature.node.kubernetes.io/pci-0300_1234.present": "true",
  "feature.node.kubernetes.io/pci-0302_10de.present": "true",
  "feature.node.kubernetes.io/pci-10de.present": "true",
  "feature.node.kubernetes.io/pci-1234.present": "true",
  "feature.node.kubernetes.io/pci-15b3.present": "true",
  "feature.node.kubernetes.io/pci-1af4.present": "true",
  "feature.node.kubernetes.io/rdma.available": "true",
  "feature.node.kubernetes.io/rdma.capable": "true",
  "feature.node.kubernetes.io/system-os_release.ID": "ubuntu",
  "feature.node.kubernetes.io/system-os_release.VERSION_ID": "22.04",
  "feature.node.kubernetes.io/system-os_release.VERSION_ID.major": "22",
  "feature.node.kubernetes.io/system-os_release.VERSION_ID.minor": "04",
  "kubernetes.io/arch": "amd64",
  "kubernetes.io/hostname": "wh-sriov-gpu-worker-0",
  "kubernetes.io/os": "linux",
  "network.nvidia.com/operator.mofed.wait": "false",
  "node-role.kubernetes.io/worker": "",
  "nvidia.com/cuda.driver-version.full": "570.124.06",
  "nvidia.com/cuda.driver-version.major": "570",
  "nvidia.com/cuda.driver-version.minor": "124",
  "nvidia.com/cuda.driver-version.revision": "06",
  "nvidia.com/cuda.driver.major": "570",
  "nvidia.com/cuda.driver.minor": "124",
  "nvidia.com/cuda.driver.rev": "06",
  "nvidia.com/cuda.runtime-version.full": "12.8",
  "nvidia.com/cuda.runtime-version.major": "12",
  "nvidia.com/cuda.runtime-version.minor": "8",
  "nvidia.com/cuda.runtime.major": "12",
  "nvidia.com/cuda.runtime.minor": "8",
  "nvidia.com/gfd.timestamp": "1753903488",
  "nvidia.com/gpu-driver-upgrade-state": "upgrade-done",
  "nvidia.com/gpu.compute.major": "8",
  "nvidia.com/gpu.compute.minor": "9",
  "nvidia.com/gpu.count": "1",
  "nvidia.com/gpu.deploy.container-toolkit": "true",
  "nvidia.com/gpu.deploy.dcgm": "true",
  "nvidia.com/gpu.deploy.dcgm-exporter": "true",
  "nvidia.com/gpu.deploy.device-plugin": "true",
  "nvidia.com/gpu.deploy.driver": "true",
  "nvidia.com/gpu.deploy.gpu-feature-discovery": "true",
  "nvidia.com/gpu.deploy.node-status-exporter": "true",
  "nvidia.com/gpu.deploy.nvsm": "",
  "nvidia.com/gpu.deploy.operator-validator": "true",
  "nvidia.com/gpu.family": "ada-lovelace",
  "nvidia.com/gpu.machine": "AHV",
  "nvidia.com/gpu.memory": "46068",
  "nvidia.com/gpu.mode": "compute",
  "nvidia.com/gpu.present": "true",
  "nvidia.com/gpu.product": "NVIDIA-L40S",
  "nvidia.com/gpu.replicas": "1",
  "nvidia.com/gpu.sharing-strategy": "none",
  "nvidia.com/mig.capable": "false",
  "nvidia.com/mig.strategy": "single",
  "nvidia.com/mps.capable": "false",
  "nvidia.com/ofed-driver-upgrade-state": "upgrade-done",
  "nvidia.com/vgpu.present": "false"
}
annotations: {
  "cluster.x-k8s.io/cluster-name": "wh-sriov",
  "cluster.x-k8s.io/cluster-namespace": "default",
  "cluster.x-k8s.io/labels-from-machine": "",
  "cluster.x-k8s.io/machine": "wh-sriov-gpu-nodepool-bhqss-z2nvx",
  "cluster.x-k8s.io/owner-kind": "MachineSet",
  "cluster.x-k8s.io/owner-name": "wh-sriov-gpu-nodepool-bhqss",
  "csi.volume.kubernetes.io/nodeid": "{\"csi.nutanix.com\":\"wh-sriov-gpu-worker-0\",\"csi.tigera.io\":\"wh-sriov-gpu-worker-0\"}",
  "kubeadm.alpha.kubernetes.io/cri-socket": "unix:///run/containerd/containerd.sock",
  "nfd.node.kubernetes.io/feature-labels": "cpu-cpuid.ADX,cpu-cpuid.AESNI,cpu-cpuid.AMXBF16,cpu-cpuid.AMXINT8,cpu-cpuid.AMXTILE,cpu-cpuid.AVX,cpu-cpuid.AVX2,cpu-cpuid.AVX512BF16,cpu-cpuid.AVX512BITALG,cpu-cpuid.AVX512BW,cpu-cpuid.AVX512CD,cpu-cpuid.AVX512DQ,cpu-cpuid.AVX512F,cpu-cpuid.AVX512FP16,cpu-cpuid.AVX512IFMA,cpu-cpuid.AVX512VBMI,cpu-cpuid.AVX512VBMI2,cpu-cpuid.AVX512VL,cpu-cpuid.AVX512VNNI,cpu-cpuid.AVX512VPOPCNTDQ,cpu-cpuid.AVXVNNI,cpu-cpuid.CLDEMOTE,cpu-cpuid.CMPSB_SCADBS_SHORT,cpu-cpuid.CMPXCHG8,cpu-cpuid.FMA3,cpu-cpuid.FSRM,cpu-cpuid.FXSR,cpu-cpuid.FXSROPT,cpu-cpuid.GFNI,cpu-cpuid.HYPERVISOR,cpu-cpuid.IA32_ARCH_CAP,cpu-cpuid.IBPB,cpu-cpuid.LAHF,cpu-cpuid.MOVBE,cpu-cpuid.MOVDIR64B,cpu-cpuid.MOVDIRI,cpu-cpuid.MOVSB_ZL,cpu-cpuid.OSXSAVE,cpu-cpuid.SERIALIZE,cpu-cpuid.SHA,cpu-cpuid.SPEC_CTRL_SSBD,cpu-cpuid.STOSB_SHORT,cpu-cpuid.SYSCALL,cpu-cpuid.SYSEE,cpu-cpuid.TSXLDTRK,cpu-cpuid.VAES,cpu-cpuid.VPCLMULQDQ,cpu-cpuid.WBNOINVD,cpu-cpuid.X87,cpu-cpuid.XGETBV1,cpu-cpuid.XSAVE,cpu-cpuid.XSAVEC,cpu-cpuid.XSAVEOPT,cpu-cpuid.XSAVES,cpu-hardware_multithreading,cpu-model.family,cpu-model.id,cpu-model.vendor_id,kernel-config.NO_HZ,kernel-config.NO_HZ_IDLE,kernel-version.full,kernel-version.major,kernel-version.minor,kernel-version.revision,nvidia.com/cuda.driver-version.full,nvidia.com/cuda.driver-version.major,nvidia.com/cuda.driver-version.minor,nvidia.com/cuda.driver-version.revision,nvidia.com/cuda.driver.major,nvidia.com/cuda.driver.minor,nvidia.com/cuda.driver.rev,nvidia.com/cuda.runtime-version.full,nvidia.com/cuda.runtime-version.major,nvidia.com/cuda.runtime-version.minor,nvidia.com/cuda.runtime.major,nvidia.com/cuda.runtime.minor,nvidia.com/gfd.timestamp,nvidia.com/gpu.compute.major,nvidia.com/gpu.compute.minor,nvidia.com/gpu.count,nvidia.com/gpu.family,nvidia.com/gpu.machine,nvidia.com/gpu.memory,nvidia.com/gpu.mode,nvidia.com/gpu.product,nvidia.com/gpu.replicas,nvidia.com/gpu.sharing-strategy,nvidia.com/mig.capable,nvidia.com/mig.strategy,nvidia.com/mps.capable,nvidia.com/vgpu.present,pci-0300_1234.present,pci-0302_10de.present,pci-10de.present,pci-1234.present,pci-15b3.present,pci-1af4.present,rdma.available,rdma.capable,system-os_release.ID,system-os_release.VERSION_ID,system-os_release.VERSION_ID.major,system-os_release.VERSION_ID.minor",
  "node.alpha.kubernetes.io/ttl": "0",
  "nvidia.com/gpu-driver-upgrade-enabled": "true",
  "projectcalico.org/IPv4Address": "10.122.7.57/24",
  "projectcalico.org/IPv4IPIPTunnelAddr": "192.168.179.192",
  "volumes.kubernetes.io/controller-managed-attach-detach": "true"
}

{
  "addresses": [
    {
      "address": "10.122.7.57",
      "type": "InternalIP"
    },
    {
      "address": "wh-sriov-gpu-worker-0",
      "type": "Hostname"
    }
  ],
  "allocatable": {
    "cpu": "16",
    "ephemeral-storage": "280794130787",
    "hugepages-1Gi": "0",
    "hugepages-2Mi": "0",
    "memory": "65666324Ki",
    "nvidia.com/gpu": "1",
    "nvidia.com/rdma_device_a": "1",
    "nvidia.com/rdma_device_b": "0",
    "pods": "110"
  },
  "capacity": {
    "cpu": "16",
    "ephemeral-storage": "304681132Ki",
    "hugepages-1Gi": "0",
    "hugepages-2Mi": "0",
    "memory": "65768724Ki",
    "nvidia.com/gpu": "1",
    "nvidia.com/rdma_device_a": "1",
    "nvidia.com/rdma_device_b": "0",
    "pods": "110"
  },
  "conditions": [
    {
      "lastHeartbeatTime": "2025-07-30T19:12:03Z",
      "lastTransitionTime": "2025-07-30T19:12:03Z",
      "message": "Calico is running on this node",
      "reason": "CalicoIsUp",
      "status": "False",
      "type": "NetworkUnavailable"
    },
    {
      "lastHeartbeatTime": "2025-07-31T10:47:21Z",
      "lastTransitionTime": "2025-07-30T19:12:00Z",
      "message": "kubelet has sufficient memory available",
      "reason": "KubeletHasSufficientMemory",
      "status": "False",
      "type": "MemoryPressure"
    },
    {
      "lastHeartbeatTime": "2025-07-31T10:47:21Z",
      "lastTransitionTime": "2025-07-30T19:12:00Z",
      "message": "kubelet has no disk pressure",
      "reason": "KubeletHasNoDiskPressure",
      "status": "False",
      "type": "DiskPressure"
    },
    {
      "lastHeartbeatTime": "2025-07-31T10:47:21Z",
      "lastTransitionTime": "2025-07-30T19:12:00Z",
      "message": "kubelet has sufficient PID available",
      "reason": "KubeletHasSufficientPID",
      "status": "False",
      "type": "PIDPressure"
    },
    {
      "lastHeartbeatTime": "2025-07-31T10:47:21Z",
      "lastTransitionTime": "2025-07-30T19:12:00Z",
      "message": "kubelet is posting ready status",
      "reason": "KubeletReady",
      "status": "True",
      "type": "Ready"
    }
  ],
  "daemonEndpoints": {
    "kubeletEndpoint": {
      "Port": 10250
    }
  },
  "images": [
    {
      "names": [
        "ghcr.io/coreweave/nccl-tests@sha256:c926650b8f5d34db90409265436a7369f10bc2f5ef820291fab75785b18bef71",
        "ghcr.io/coreweave/nccl-tests:12.9.1-devel-ubuntu22.04-nccl2.27.6-1-7c12c62"
      ],
      "sizeBytes": 8323670310
    },
    {
      "names": [
        "nvcr.io/nvidia/cuda@sha256:a3460ae0897335607bef01cb903b56f1bf07a9651e480fc6bb621279e39f9214",
        "nvcr.io/nvidia/cuda:12.9.1-cudnn-devel-ubuntu22.04"
      ],
      "sizeBytes": 6379183063
    },
    {
      "names": [
        "nvcr.io/nvidia/cuda-dl-base@sha256:ab128a0b5d4298e62c691e478e42e0af98aecdb71ea17b1fea0261875faf4611",
        "nvcr.io/nvidia/cuda-dl-base:25.06-cuda12.9-devel-ubuntu24.04"
      ],
      "sizeBytes": 6079048112
    },
    {
      "names": [
        "artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-kserve-huggingfaceserver@sha256:674f2321c4665b5b97a38c1efde8a0cf2cbdd7c64452942dda29915ed706fca0",
        "artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-kserve-huggingfaceserver:v0.15.2-gpu"
      ],
      "sizeBytes": 5942716497
    },
    {
      "names": [
        "quay.io/frollandnvidia/cuda-perftest@sha256:c10d648df4a09ae6afb48ac7f220233df9122ef59616aabda7d6fed7f65517ab",
        "quay.io/frollandnvidia/cuda-perftest:latest"
      ],
      "sizeBytes": 4137623786
    },
    {
      "names": [
        "artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-kserve-custom-model-server@sha256:23815b51e12abb8705e18ec36b54ec2371155a1652bbcfd84361a45aaf197931",
        "artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-kserve-custom-model-server:320"
      ],
      "sizeBytes": 3297017421
    },
    {
      "names": [
        "nvcr.io/nvidia/driver@sha256:36564482ecae5a01fa0099823aaca2212e497d80021602391b7c306453baef7d",
        "nvcr.io/nvidia/driver:570.124.06-ubuntu22.04"
      ],
      "sizeBytes": 740871082
    },
    {
      "names": [
        "nvcr.io/nvidia/driver@sha256:316963bc85f3d3a95e046b8d264072c7b41145a22de9ac1df1683e1bc8b7d207",
        "nvcr.io/nvidia/driver:550.127.05-ubuntu22.04"
      ],
      "sizeBytes": 675354846
    },
    {
      "names": [
        "nvcr.io/nvidia/mellanox/doca-driver@sha256:cb436254179a539a33c31b97c3eb84f78011e5eba3b7a49780245fed233076cf",
        "nvcr.io/nvidia/mellanox/doca-driver:25.04-0.6.1.0-2-ubuntu22.04-amd64"
      ],
      "sizeBytes": 389317968
    },
    {
      "names": [
        "nvcr.io/nvidia/cloud-native/nvidia-fs@sha256:f95fd7f39991fd3b37e64beeed7fe31f2cc98ff4c174d056a1a8d8e14e1a09dc",
        "nvcr.io/nvidia/cloud-native/nvidia-fs:2.20.5-ubuntu22.04"
      ],
      "sizeBytes": 299283866
    },
    {
      "names": [
        "nvcr.io/nvidia/cloud-native/k8s-driver-manager@sha256:c525320fd1e771b911b68f8e760b83e8fccf1beea43bf9b009c4f0c591e193ea",
        "nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.8.0"
      ],
      "sizeBytes": 238542288
    },
    {
      "names": [
        "nvcr.io/nvidia/k8s-device-plugin@sha256:af31e2b7c7f89834c4e5219860def7ac2e49a207b3d4e8610d5a26772b7738e5",
        "nvcr.io/nvidia/k8s-device-plugin:v0.17.1"
      ],
      "sizeBytes": 198670519
    },
    {
      "names": [
        "nvcr.io/nvidia/cloud-native/gpu-operator-validator@sha256:07b93914425148f936157ad295649ce100b91b29394669031a585d2458c9f39f",
        "nvcr.io/nvidia/cloud-native/gpu-operator-validator:v25.3.0"
      ],
      "sizeBytes": 188028070
    },
    {
      "names": [
        "nvcr.io/nvidia/k8s/dcgm-exporter@sha256:5e0a7eb08d446042ad7eac82dd871c0ea2b12a344a1f3ae9b106357618714565",
        "nvcr.io/nvidia/k8s/dcgm-exporter:4.2.3-4.3.0-ubuntu22.04"
      ],
      "sizeBytes": 187331328
    },
    {
      "names": [
        "ghcr.io/k8snetworkplumbingwg/multus-cni@sha256:aa59e65256324c83efb9eaebca9e78877b38c33ad30ff8df71e02610aa968fb7",
        "ghcr.io/k8snetworkplumbingwg/multus-cni:v4.1.0"
      ],
      "sizeBytes": 182498028
    },
    {
      "names": [
        "ghcr.io/k8snetworkplumbingwg/plugins@sha256:fe8efec170b498922b3367aabbb6dc57966eb930c8aa086a5f5fb369cefa6064",
        "ghcr.io/k8snetworkplumbingwg/plugins:v1.5.0"
      ],
      "sizeBytes": 167150249
    },
    {
      "names": [
        "docker.io/calico/node@sha256:eed399f2a727cfc1f374ab5c9cda6123c207e794ed8dc66c7eb6d8db412669e1",
        "docker.io/calico/node:v3.29.3"
      ],
      "sizeBytes": 144069230
    },
    {
      "names": [
        "nvcr.io/nvidia/k8s/dcgm-exporter@sha256:e62659741497b046dd1586bdca61bbbaeb8022e17ccbe8d2a7e8b1745a3e12ce",
        "nvcr.io/nvidia/k8s/dcgm-exporter:4.1.1-4.0.4-ubuntu22.04"
      ],
      "sizeBytes": 140702544
    },
    {
      "names": [
        "nvcr.io/nvidia/k8s/container-toolkit@sha256:10d10f951431986a4aa23a586266022a29350d45fc50cc8b6fd1ca4feb771959",
        "nvcr.io/nvidia/k8s/container-toolkit:v1.17.5-ubuntu20.04"
      ],
      "sizeBytes": 139876707
    },
    {
      "names": [
        "quay.io/karbon/ntnx-csi@sha256:850f01fce01dd924e442cefabd07c05193dd2d22fa26391b4b61303c386014e1",
        "quay.io/karbon/ntnx-csi:v2.6.10"
      ],
      "sizeBytes": 136921604
    },
    {
      "names": [
        "docker.io/bitnami/kubectl@sha256:f65b74480c37b65099453fb3a5ca7eaaea235b3d4268ef3b1ed0f0150d340646",
        "docker.io/bitnami/kubectl:1.32.3"
      ],
      "sizeBytes": 111999120
    },
    {
      "names": [
        "quay.io/prometheus/prometheus@sha256:497fe921f22fea8535fa2bcb1c193dacc6ce98c08274257b3d18a4eaae0f9647",
        "quay.io/prometheus/prometheus:v2.54.0"
      ],
      "sizeBytes": 108261651
    },
    {
      "names": [
        "docker.io/calico/cni@sha256:53f826d3f565a6635b4d58ea4fcfdc0e7ea418ffd4dbb495b4c801074e6eb99c",
        "docker.io/calico/cni:v3.29.3"
      ],
      "sizeBytes": 99286923
    },
    {
      "names": [
        "registry.k8s.io/nfd/node-feature-discovery@sha256:b86ad9a33a42dc371fe32a7c67d0705c6246db0a7fa8f0c119810582e6199241",
        "registry.k8s.io/nfd/node-feature-discovery:v0.17.2",
        "registry.k8s.io/nfd/node-feature-discovery:v0.17.2-minimal"
      ],
      "sizeBytes": 80696356
    },
    {
      "names": [
        "artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-inference-ui@sha256:5a88ee8e1b25141e7634aef57210715a9b7b821dbcd2caa2c0967befd961b8a4",
        "artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-inference-ui:v2.4.0-rc0"
      ],
      "sizeBytes": 76009953
    },
    {
      "names": [
        "ghcr.io/mellanox/nvidia-k8s-ipam@sha256:1b20b78f889339834ed74e0da621fc5da582719b2537b36d8967ddc6a04679b8",
        "ghcr.io/mellanox/nvidia-k8s-ipam:v0.3.7"
      ],
      "sizeBytes": 74762960
    },
    {
      "names": [
        "docker.io/envoyproxy/gateway@sha256:d6e5e3c7291e246f3c13311b640dc8a475dfaefe7961759e1dc2b622a8f9c1a5",
        "docker.io/envoyproxy/gateway:v1.3.2"
      ],
      "sizeBytes": 68030691
    },
    {
      "names": [
        "ghcr.io/mesosphere/local-volume-provisioner@sha256:53290aa2b2764f0100cb6b1d601a52c35d9310b3d828d433293240005d2ceae3",
        "ghcr.io/mesosphere/local-volume-provisioner:v2.7.0-d2iq.2"
      ],
      "sizeBytes": 63131919
    },
    {
      "names": [
        "docker.io/library/import-2025-07-22@sha256:ba9f29796bdcdffdefdec1f11de5d095263def4bd33b865eec1fbcf0ba39bbbf",
        "registry.k8s.io/etcd:3.5.16-0"
      ],
      "sizeBytes": 57677613
    },
    {
      "names": [
        "artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-iep-operator@sha256:ab339bdedc52f7ed4c503d7b6218de2f5a3588ba05a957c0727148f3002d9a85",
        "artifactory-edge-01.corp.p10y.ntnxdpro.com/canaveral-legacy-docker/nutanix-core/nai-iep-operator:v2.4.0-rc0"
      ],
      "sizeBytes": 56004493
    },
    {
      "names": [
        "quay.io/metallb/speaker@sha256:fd86bfc502601d6525739d411a0045e7085a4008a732be7e271c851800952142",
        "quay.io/metallb/speaker:v0.14.8"
      ],
      "sizeBytes": 53146149
    },
    {
      "names": [
        "docker.io/envoyproxy/envoy@sha256:42653673bbc413c41c545ce6f134e0847d88c44932fc3c8e5d3b0907b36ffa31",
        "docker.io/envoyproxy/envoy:distroless-v1.33.1"
      ],
      "sizeBytes": 33512557
    },
    {
      "names": [
        "quay.io/brancz/kube-rbac-proxy@sha256:7de54b6dedc8006ffd447267b826eb417a648c00f2b735b6d313395411803719",
        "quay.io/brancz/kube-rbac-proxy:v0.18.2"
      ],
      "sizeBytes": 32165573
    },
    {
      "names": [
        "docker.io/library/import-2025-07-22@sha256:f7633477debae9281f32c926502aab93c515bf702efca919bf937ba565978b70",
        "registry.k8s.io/kube-proxy:v1.32.3"
      ],
      "sizeBytes": 30915698
    },
    {
      "names": [
        "docker.io/library/ubuntu@sha256:1ec65b2719518e27d4d25f104d93f9fac60dc437f81452302406825c46fcc9cb",
        "docker.io/library/ubuntu:22.04"
      ],
      "sizeBytes": 29545886
    },
    {
      "names": [
        "nvcr.io/nvidia/distroless/python@sha256:2cf6b3df9e9e07f8a01846c51f83af9e0b244dae0b2f8fccc2d3c3f0430111e9",
        "nvcr.io/nvidia/distroless/python:3.12-v3.4.6"
      ],
      "sizeBytes": 29422791
    },
    {
      "names": [
        "docker.io/library/import-2025-07-22@sha256:497a1d7ff11bec63cd7c1a371ad7ded8d6a22ae1e327296c6f25ac5d8db39329",
        "registry.k8s.io/kube-apiserver:v1.32.3"
      ],
      "sizeBytes": 28677203
    },
    {
      "names": [
        "docker.io/library/debian@sha256:2424c1850714a4d94666ec928e24d86de958646737b1d113f5b2207be44d37d8",
        "docker.io/library/debian:bookworm-slim"
      ],
      "sizeBytes": 28240304
    },
    {
      "names": [
        "docker.io/library/import-2025-07-22@sha256:6c4248bb462de0630deb1ae7ff457e374f71c9a120f5da1eb6d6861fedc1e284",
        "ghcr.io/mesosphere/dynamic-credential-provider:v0.5.3"
      ],
      "sizeBytes": 28195777
    },
    {
      "names": [
        "ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin@sha256:cabce074d10a0f1d62135e2cc5442d65b49094b95b8297fdd024a1a5f461319f",
        "ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.9.0"
      ],
      "sizeBytes": 28126576
    },
    {
      "names": [
        "ghcr.io/mellanox/network-operator-init-container@sha256:67e93ccf3ecb61f17597567faf0f72e1b8ddcf73c5d7440baeadcc1cb6bb811b",
        "ghcr.io/mellanox/network-operator-init-container:v0.0.3"
      ],
      "sizeBytes": 27938033
    },
    {
      "names": [
        "docker.io/library/import-2025-07-22@sha256:914507ed33938406f92a1f46fac4c08e607d611a9e53a60e69b48ac465141179",
        "ghcr.io/mesosphere/dynamic-credential-provider:v0.2.0"
      ],
      "sizeBytes": 27487612
    },
    {
      "names": [
        "docker.io/library/import-2025-07-22@sha256:91b7907ea73cb15d4a6bbe4cf133e8b83c8e0b47fe06576191cedc36b2c6f0ef",
        "registry.k8s.io/kube-controller-manager:v1.32.3"
      ],
      "sizeBytes": 26265265
    },
    {
      "names": [
        "registry.k8s.io/build-image/debian-base@sha256:0a17678966f63e82e9c5e246d9e654836a33e13650a698adefede61bb5ca099e",
        "registry.k8s.io/build-image/debian-base:bookworm-v1.0.4"
      ],
      "sizeBytes": 25514632
    },
    {
      "names": [
        "docker.io/library/import-2025-07-22@sha256:4b7864ec8c3ab64a5e667d9831897744cacd50986dd63090bd5c6162a5b6e7e5",
        "registry.k8s.io/kube-scheduler:v1.32.3"
      ],
      "sizeBytes": 20657018
    },
    {
      "names": [
        "docker.io/library/import-2025-07-22@sha256:22b54476eaa7be21b5304972f62caa919d40a1407d0a41ff676a857a1926ddd5",
        "k8s.gcr.io/coredns:v1.11.3",
        "registry.k8s.io/coredns/coredns:v1.11.3"
      ],
      "sizeBytes": 18559366
    },
    {
      "names": [
        "docker.io/calico/node-driver-registrar@sha256:adcc7f3e0534d0ca1c8e5c25dc666d177f4ed01043262ec2933ac8439c90b5cf",
        "docker.io/calico/node-driver-registrar:v3.29.3"
      ],
      "sizeBytes": 15484965
    },
    {
      "names": [
        "quay.io/prometheus-operator/prometheus-config-reloader@sha256:959d47672fbff2776a04ec62b8afcec89e8c036af84dc5fade50019dab212746",
        "quay.io/prometheus-operator/prometheus-config-reloader:v0.81.0"
      ],
      "sizeBytes": 14433657
    },
    {
      "names": [
        "quay.io/prometheus/node-exporter@sha256:d00a542e409ee618a4edc67da14dd48c5da66726bbd5537ab2af9c1dfc442c8a",
        "quay.io/prometheus/node-exporter:v1.9.1"
      ],
      "sizeBytes": 12955907
    },
    {
      "names": [
        "registry.k8s.io/sig-storage/csi-node-driver-registrar@sha256:2cddcc716c1930775228d56b0d2d339358647629701047edfdad5fcdfaf4ebcb",
        "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1"
      ],
      "sizeBytes": 10755082
    }
  ],
  "nodeInfo": {
    "architecture": "amd64",
    "bootID": "8800d241-74de-4846-8d7b-2b8c3cac3ab6",
    "containerRuntimeVersion": "containerd://1.7.26-d2iq.1",
    "kernelVersion": "5.15.0-144-generic",
    "kubeProxyVersion": "v1.32.3",
    "kubeletVersion": "v1.32.3",
    "machineID": "a833d33e380440fabe004f553674800b",
    "operatingSystem": "linux",
    "osImage": "Ubuntu 22.04.5 LTS",
    "systemUUID": "a833d33e-3804-40fa-be00-4f553674800b"
  }
}

Environment:

  • Kubernetes version (use kubectl version):
$ kubectl version
Client Version: v1.32.2
Kustomize Version: v5.5.0
Server Version: v1.32.3
  • Hardware configuration:
    • Network adapter model and firmware version:
Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
Driver Version: mlx5_core:24.10-1.1.4
Firmware Version: 22.43.2026 (MT_0000000437)
  • OS (e.g: cat /etc/os-release):
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
  • Kernel (e.g. uname -a): 5.15.0-151-generic

network-operator-controller-logs.txt

  • Others:

Labels: bug (Something isn't working)
