Commit 49c45eb

[None][fix] change logging for weight loading on unified memory (#9177)
Signed-off-by: Faraz Khoubsirat <[email protected]>
Signed-off-by: Simeng Liu <[email protected]>
Co-authored-by: Simeng Liu <[email protected]>
1 parent 1eae941 commit 49c45eb

File tree

1 file changed: +3 −3 lines changed

tensorrt_llm/_torch/modules/linear.py

Lines changed: 3 additions & 3 deletions
@@ -76,9 +76,9 @@ def load_weight_shard(
         # For integrated GPU systems (e.g., DGX Spark), CPU and GPU share limited physical memory.
         # Avoiding device transfers reduces memory consumption and unnecessary data copies,
         # enabling support for larger models on memory-constrained systems.
-        logger.warning(
-            f"[load_weight_shard] Skipping device transfer from {weight.device} to {device} on integrated GPU to conserve shared memory."
-        )
+        logger.warning_once(
+            f"[load_weight_shard] Skipping device transfer from {weight.device} to {device} on integrated GPU to conserve shared memory.",
+            key="load_weight_shard_skip_device_transfer_with_integrated_gpu")
         device = weight.device
     if isinstance(weight, torch.Tensor):
         tensor_shape = weight.shape
