Description
When using HostDeviceNetwork to map an InfiniBand device into a Pod, the IB network interface inside the Pod shows the state as DOWN with NO-CARRIER. This results in no active link and no InfiniBand communication inside the Pod.
Environment:
Network Operator version: 23.7.0
Kubernetes version: (e.g., 1.27)
MLNX_OFED version: 5.17.0.1.MLNX20240219.0eca20cc
OpenSM status: active and running
rdma-core package: installed
Networking mode: HostDeviceNetwork
Pod interface status (inside the Pod):
root@test-sriov-podi:/etc/apt# ibv_devinfo
hca_id: mlx5_7
    transport: InfiniBand (0)
    fw_ver: 28.39.3004
    node_guid: 0000:0000:0000:0000
    sys_image_guid: a088:c203:005b:370c
    vendor_id: 0x02c9
    vendor_part_id: 4126
    hw_ver: 0x0
    board_id: MT_0000000838
    phys_port_cnt: 1
        port: 1
            state: PORT_INIT (2)
            max_mtu: 4096 (5)
            active_mtu: 4096 (5)
            sm_lid: 4
            port_lid: 65535
            port_lmc: 0x00
            link_layer: InfiniBand
root@test-sriov-podi:/etc/apt# ip link show net1
15: net1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc mq state DOWN mode DEFAULT group default qlen 256
    link/infiniband 00:00:00:88:fe:80:00:00:00:00:00:00:00:00:00:00 brd 00:ff:ff:ff:12:40:1b:ff:
    alias ibs9v3
    altname ibp56s0v3
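For triage, the signals in the output above can be checked mechanically. The following is a minimal sketch, not part of the Network Operator or rdma-core: the helper names `parse_devinfo` and `findings` are hypothetical, and the sample text is abbreviated from the `ibv_devinfo` output above, using the field names `ibv_devinfo` normally prints. It flags three things: an all-zero `node_guid`, a port state other than `PORT_ACTIVE`, and a `port_lid` of 65535 (0xFFFF). An all-zero `node_guid` on an SR-IOV VF may mean that no GUID has been assigned to the VF, which could be worth ruling out. That is a guess, not a confirmed cause.

```python
SAMPLE = """\
hca_id: mlx5_7
    transport: InfiniBand (0)
    node_guid: 0000:0000:0000:0000
    phys_port_cnt: 1
        port: 1
            state: PORT_INIT (2)
            sm_lid: 4
            port_lid: 65535
            link_layer: InfiniBand
"""


def parse_devinfo(text):
    """Collect `key: value` pairs from ibv_devinfo-style text into a dict.

    Assumes a single device with a single port; with several ports,
    later values would overwrite earlier ones.
    """
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so GUID values keep their colons.
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def findings(fields):
    """Return human-readable notes about fields that look unhealthy."""
    notes = []
    guid = fields.get("node_guid", "")
    if guid and set(guid.replace(":", "")) == {"0"}:
        notes.append("node_guid is all zeros")
    state = fields.get("state", "")
    if not state.startswith("PORT_ACTIVE"):
        notes.append(f"port state is {state or 'unknown'}, not PORT_ACTIVE")
    if fields.get("port_lid") == "65535":
        notes.append("port_lid is 65535 (0xFFFF)")
    return notes


if __name__ == "__main__":
    for note in findings(parse_devinfo(SAMPLE)):
        print(note)
```

Running the same check against `ibv_devinfo` output captured on the host for the physical function would show whether the zero GUID and missing LID are specific to the device mapped into the Pod.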
HostDeviceNetwork YAML
apiVersion: mellanox.com/v1alpha1
kind: HostDeviceNetwork
metadata:
  name: host-network-ib
  namespace: network-operator
spec:
  networkNamespace: network-operator
  resourceName: sriov_resource_ib
  ipam: |
    {
      "datastore": "kubernetes",
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
      },
      "log_file": "/tmp/whereabouts.log",
      "log_level": "debug",
      "type": "whereabouts",
      "range": "192.168.88.0/24"
    }
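Because the `ipam` field carries a JSON document embedded in YAML, a stray or missing brace is easy to miss. A quick sanity check, sketched here with only the Python standard library, is to parse the JSON and confirm that `range` is a valid CIDR. The helper name `check_ipam` is hypothetical, and the check says nothing about how whereabouts itself allocates addresses from the range.

```python
import ipaddress
import json

# The ipam payload from the HostDeviceNetwork spec above.
IPAM = """
{
  "datastore": "kubernetes",
  "kubernetes": {
    "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
  },
  "log_file": "/tmp/whereabouts.log",
  "log_level": "debug",
  "type": "whereabouts",
  "range": "192.168.88.0/24"
}
"""


def check_ipam(text):
    """Parse the ipam JSON and return (plugin type, addresses in range)."""
    cfg = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    net = ipaddress.ip_network(cfg["range"])  # raises ValueError on a bad CIDR
    return cfg["type"], net.num_addresses


if __name__ == "__main__":
    plugin, count = check_ipam(IPAM)
    print(f"{plugin}: {count} addresses in range")
```

A payload with an unbalanced trailing brace fails at `json.loads` rather than passing silently.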
SriovNetworkNodePolicy YAML
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-ib
  namespace: network-operator
spec:
  resourceName: sriov_resource_ib
  nodeSelector:
    ibs: "true"
  numVfs: 4
  nicSelector:
    pfNames:
      - "ibs9"
  deviceType: netdevice
  isRdma: true
  needVhostNet: false
  linkType: ib
Additional Information
The physical InfiniBand port on the host is UP and functioning correctly.
The Pod is configured to use HostDeviceNetwork to map the IB device.
Questions
Is HostDeviceNetwork suitable for use in InfiniBand mode?
Is there any special OpenSM configuration required when using HostDeviceNetwork to map InfiniBand devices into Pods?
If yes, could you provide an example or guidance on how to configure OpenSM properly for this use case?