Commit 2abd7e3
Allow NVSHMEM PE to NIC to be initialized by rank
The `nvshmemi_get_devices_by_distance` default initialization
method in NVSHMEM does not work optimally for GPU configurations
where two GPUs and two RDMA NICs share a PCIe bus, such as the
x86-based GCP A3 Ultra H200 and A4 B200 instance types:
https://cloud.google.com/compute/docs/gpus/gpu-network-bandwidth#h200-gpus.
GPU0 and GPU1 (in two independent processes) each see NIC0 and NIC1
on the same PCIe switch as equidistant, so both GPUs end up using
NIC0, halving the observed RDMA bandwidth in test_internode.py and
in vLLM wide-EP.
The alternative is a static mapping between GPU host index (PE) and
NIC index (HCA), but the NVSHMEMX_INIT_WITH_UNIQUEID
initialization method bypasses setting `mype_node` and `npes_node`.
For this initialization method, `nvshmemi_boot_handle.pg_rank` is
always 0 and `nvshmemi_boot_handle.pg_size` is always 2, which
prevents NVSHMEM_ENABLE_NIC_PE_MAPPING from leveraging a static
list of devices. In transport.cpp#nvshmemi_setup_connections, the
selection
    selected_devices[0] =
        nvshmemi_state->mype_node % (tcurr->n_devices > 0
                                     ? tcurr->n_devices : 1);
is evaluated with mype_node = 0 on every PE, so every PE selects
device 0.
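
The effect of that modulo can be illustrated with a short Python sketch. It only mirrors the arithmetic quoted above; `select_device` is a hypothetical helper, not an NVSHMEM function, and the count of 8 devices is an assumed example value.

```python
def select_device(mype_node: int, n_devices: int) -> int:
    """Mirror of the quoted expression: mype_node % max(n_devices, 1)."""
    return mype_node % (n_devices if n_devices > 0 else 1)


# Assumed example: 8 PEs on a node with 8 visible NICs.
n_devices = 8

# If mype_node carried the node-local rank, each PE would get its own NIC.
with_local_rank = [select_device(pe, n_devices) for pe in range(8)]

# With NVSHMEMX_INIT_WITH_UNIQUEID, mype_node is 0 on every PE,
# so the modulo always yields index 0.
with_zero_rank = [select_device(0, n_devices) for _ in range(8)]

print(with_local_rank)
print(with_zero_rank)
```

With a populated node-local rank the indices spread across the device list; with `mype_node` fixed at 0 they all collapse onto the first entry, which is the behavior described above.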
To allow static assignment, introduce a DEEP_EP_DEVICE_TO_HCA_MAPPING
environment variable, read during Buffer Python initialization, that
accepts `<cuda_device_id>:<HCA_name>:<HCA_port>` entries and resolves
`torch.cuda.current_device()` to set NVSHMEM_HCA_LIST to the
appropriate value, or raises an error.
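
A minimal sketch of how such a lookup might work follows. It is not the code from this commit: the helper names `resolve_hca` and `apply_mapping`, the comma separator between entries, and the `mlx5_*` device names are all assumptions made for illustration. The device index is passed in as a plain integer so the sketch runs without PyTorch; in the real flow it would come from `torch.cuda.current_device()`.

```python
import os
from typing import MutableMapping, Optional


def resolve_hca(mapping: str, device_id: int) -> str:
    """Return '<HCA_name>:<HCA_port>' for device_id.

    Assumes entries of the form '<cuda_device_id>:<HCA_name>:<HCA_port>'
    separated by commas (the separator is an assumption).
    """
    table = {}
    for entry in mapping.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parts = entry.split(":")
        if len(parts) != 3:
            raise ValueError(f"malformed mapping entry: {entry!r}")
        dev, hca, port = parts
        table[int(dev)] = f"{hca}:{port}"
    if device_id not in table:
        raise ValueError(f"no HCA mapping for CUDA device {device_id}")
    return table[device_id]


def apply_mapping(device_id: int,
                  env: MutableMapping[str, str] = os.environ) -> Optional[str]:
    """Set NVSHMEM_HCA_LIST in env when the mapping variable is present."""
    mapping = env.get("DEEP_EP_DEVICE_TO_HCA_MAPPING")
    if mapping is None:
        return None  # variable unset: leave NVSHMEM defaults untouched
    hca = resolve_hca(mapping, device_id)
    env["NVSHMEM_HCA_LIST"] = hca
    return hca


# Usage with a hypothetical two-GPU, two-NIC mapping.
example_env = {"DEEP_EP_DEVICE_TO_HCA_MAPPING": "0:mlx5_0:1,1:mlx5_1:1"}
print(apply_mapping(1, example_env))
```

Failing loudly when the current device has no entry avoids silently falling back to the distance-based selection that the mapping is meant to override.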
Co-Authored-By: Keon Jang <[email protected]>
Signed-off-by: Clayton Coleman <[email protected]>
1 parent bfded34 commit 2abd7e3
1 file changed, 25 insertions(+), 0 deletions(-)
[Diff body not captured: additions at new lines 104-105 and 138-160.]