HostDeviceNetwork maps InfiniBand device to Pod, but interface state is DOWN #1646

@calmyan

Description

When HostDeviceNetwork is used to map an InfiniBand device into a Pod, the IB network interface inside the Pod reports state DOWN with NO-CARRIER, and ibv_devinfo shows the port in PORT_INIT. As a result, there is no active link and no InfiniBand communication inside the Pod.

Environment:

Network Operator version: 23.7.0

Kubernetes version: (e.g., 1.27)

MLNX_OFED version: 5.17.0.1.MLNX20240219.0eca20cc

OpenSM status: active and running

rdma-core package: installed

Networking mode: HostDeviceNetwork

Pod interface status (inside the Pod):

root@test-sriov-podi:/etc/apt# ibv_devinfo
hca_id: mlx5_7
    transport:          InfiniBand (0)
    fw_ver:             28.39.3004
    node_guid:          0000:0000:0000:0000
    sys_image_guid:     a088:c203:005b:370c
    vendor_id:          0x02c9
    vendor_part_id:     4126
    hw_ver:             0x0
    board_id:           MT_0000000838
    phys_port_cnt:      1
        port: 1
            state:          PORT_INIT (2)
            max_mtu:        4096 (5)
            active_mtu:     4096 (5)
            sm_lid:         4
            port_lid:       65535
            port_lmc:       0x00
            link_layer:     InfiniBand
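The fields in this output that usually matter for a port that never becomes active can be checked mechanically. The following is a minimal sketch, not part of the Network Operator or rdma-core: `parse_ibv_devinfo` and `diagnose` are hypothetical helper names, the parser assumes a single-port device (as in the output above), and the sample text is copied from this report.

```python
# Hypothetical helpers for reading `ibv_devinfo` text; assumes one device with one port.
SAMPLE = """\
hca_id: mlx5_7
    transport: InfiniBand (0)
    node_guid: 0000:0000:0000:0000
    phys_port_cnt: 1
        port: 1
            state: PORT_INIT (2)
            sm_lid: 4
            port_lid: 65535
            link_layer: InfiniBand
"""


def parse_ibv_devinfo(text):
    """Return a flat dict of the 'key: value' fields found in the output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so GUID values keep their colons.
        key, sep, value = line.strip().partition(":")
        if sep and key:
            fields[key.strip()] = value.strip()
    return fields


def diagnose(fields):
    """List the symptoms visible in the parsed fields."""
    findings = []
    guid = fields.get("node_guid", "")
    if guid and set(guid.replace(":", "")) == {"0"}:
        findings.append("node_guid is all zeros")
    state = fields.get("state", "")
    if state and not state.startswith("PORT_ACTIVE"):
        findings.append(f"port state is {state}, not PORT_ACTIVE")
    if fields.get("port_lid") == "65535":
        findings.append("port_lid is 65535 (0xFFFF), the permissive LID rather than a unicast LID")
    return findings


for finding in diagnose(parse_ibv_devinfo(SAMPLE)):
    print(finding)
```

On the sample above, the sketch flags all three conditions: the all-zero node GUID, the PORT_INIT state, and the 0xFFFF port LID.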

root@test-sriov-podi:/etc/apt# ip link show net1
15: net1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc mq state DOWN mode DEFAULT group default qlen 256
    link/infiniband 00:00:00:88:fe:80:00:00:00:00:00:00:00:00:00:00 brd 00:ff:ff:ff:12:40:1b:ff:
    alias ibs9v3
    altname ibp56s0v3
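To make the reading of the `ip link` header line explicit: the `UP` flag means the interface is administratively up, while `NO-CARRIER` and `state DOWN` mean the link layer reports no carrier. The sketch below separates those two pieces of information. `parse_ip_link_header` and `summarize` are hypothetical helper names, and the sample line is the one from this report.

```python
import re

# Header line copied from the `ip link show net1` output in this report.
HEADER = ("15: net1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc mq "
          "state DOWN mode DEFAULT group default qlen 256")


def parse_ip_link_header(line):
    """Split the first line of `ip link show` output into name, flags, and operstate."""
    match = re.match(
        r"\d+:\s+(?P<name>[^:@\s]+)(?:@\S+)?:\s+<(?P<flags>[^>]*)>(?P<rest>.*)", line
    )
    if match is None:
        raise ValueError("not an ip-link header line")
    flags = {flag for flag in match.group("flags").split(",") if flag}
    state = re.search(r"\bstate\s+(\S+)", match.group("rest"))
    return {
        "name": match.group("name"),
        "flags": flags,
        "operstate": state.group(1) if state else None,
    }


def summarize(info):
    """Describe the administrative state and carrier state separately."""
    admin = "up" if "UP" in info["flags"] else "down"
    carrier = "no carrier" if "NO-CARRIER" in info["flags"] else "carrier not reported missing"
    return f"{info['name']}: admin {admin}, {carrier}, operstate {info['operstate']}"


print(summarize(parse_ip_link_header(HEADER)))
```

For the line above this reports that `net1` is administratively up but has no carrier, which points at the link or fabric side rather than at the interface having been left down inside the Pod.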

HostDeviceNetwork YAML
apiVersion: mellanox.com/v1alpha1
kind: HostDeviceNetwork
metadata:
  name: host-network-ib
  namespace: network-operator
spec:
  networkNamespace: network-operator
  resourceName: sriov_resource_ib
  ipam: |
    {
      "datastore": "kubernetes",
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
      },
      "log_file": "/tmp/whereabouts.log",
      "log_level": "debug",
      "type": "whereabouts",
      "range": "192.168.88.0/24"
    }

SriovNetworkNodePolicy YAML
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-ib
  namespace: network-operator
spec:
  resourceName: sriov_resource_ib
  nodeSelector:
    ibs: "true"
  numVfs: 4
  nicSelector:
    pfNames:
      - "ibsg"
  deviceType: netdevice
  isRdma: true
  needVhostNet: false
  linkType: ib

Additional Information
The physical InfiniBand port on the host is UP and functioning correctly.

The Pod is configured to use HostDeviceNetwork to map the IB device.

Questions
Is HostDeviceNetwork suitable for devices running in InfiniBand mode?

Is there any special OpenSM configuration required when using HostDeviceNetwork to map InfiniBand devices into Pods?

If yes, could you provide an example or guidance on how to configure OpenSM properly for this use case?
