kernel: nv_mem nv_get_p2p_free_callback:155 nv_get_p2p_free_callback -- invalid dma_mapping #111

@susol-hjkim

Hello ~

This system rebooted unexpectedly.
I found the following log entries in /var/log/syslog from just before the unexpected reboot.

Dec 20 18:48:09 A100-42 kernel: nv_mem nv_get_p2p_free_callback:155 nv_get_p2p_free_callback -- invalid dma_mapping
Dec 20 18:48:09 A100-42 kernel: nv_mem nv_get_p2p_free_callback:155 nv_get_p2p_free_callback -- invalid dma_mapping

What do these log messages mean?
Are they related to the unexpected reboot?
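
In case it helps with triage, here is a minimal sketch for pulling these entries out of syslog so they can be compared against the reboot time. It assumes the message format is exactly the one quoted above, optionally with a bracketed kernel timestamp after `kernel:`. The function name `find_nv_mem_entries` and the regular expression are illustrative only and are not part of nv_peer_mem or any NVIDIA tool.

```python
import re

# Matches the nv_mem kernel message format quoted in this report.
# Assumption: other syslog configurations may format the prefix differently.
NV_MEM_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"   # e.g. "Dec 20 18:48:09"
    r"(?P<host>\S+)\s+kernel:\s+"                    # hostname, then "kernel:"
    r"(?:\[\s*\d+\.\d+\]\s+)?"                       # optional "[12345.678]"
    r"nv_mem\s+(?P<func>\w+):(?P<line>\d+)\s+"       # function name and line
    r"(?P<msg>.*)$"                                  # rest of the message
)


def find_nv_mem_entries(lines):
    """Return the parsed nv_mem kernel messages found in syslog lines."""
    entries = []
    for raw in lines:
        match = NV_MEM_RE.match(raw.strip())
        if match:
            entries.append(match.groupdict())
    return entries
```

Passing an open handle on /var/log/syslog (for example `open("/var/log/syslog", errors="replace")`) to `find_nv_mem_entries` returns one dictionary per matching line, with the keys `ts`, `host`, `func`, `line` and `msg`.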

[ENV]
OS: Ubuntu 20.04
Kernel: 5.4.0-42-generic
H/W: Supermicro AS-4124GO-NART (similar to DGX A100)

[GPU: 8 units]
NVIDIA A100-SXM4-80GB
Driver Version : 470.103.01
CUDA Version : 11.4

[IB: 8 units]
OFED version: OFED-5.6.0.1.6.1
nv_peer_mem: v1.0
CA 'mlx5_0'
    CA type: MT4123
    Number of ports: 1
    Firmware version: 20.32.1010
    Hardware version: 0
    Node GUID: 0x08c0eb0300c8ff40
    System image GUID: 0x08c0eb0300c8ff40
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 200
        Base lid: 173
        LMC: 0
        SM lid: 233
        Capability mask: 0x2651e848
        Port GUID: 0x08c0eb0300c8ff40
        Link layer: InfiniBand

Thanks ~
