Commit 2128f73

bo-nv and jthomson04 authored
[TRTLLM-9247][infra] Upgrade NIXL to 0.7.1 (#9055)
Signed-off-by: Bo Deng <[email protected]>
Signed-off-by: jthomson04 <[email protected]>
Co-authored-by: jthomson04 <[email protected]>
1 parent 46dccb5 commit 2128f73

3 files changed: 21 additions and 10 deletions

docker/common/install_nixl.sh

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ set -ex
 GITHUB_URL="https://github.com"
 UCX_INSTALL_PATH="/usr/local/ucx/"
 CUDA_PATH="/usr/local/cuda"
-NIXL_VERSION="0.5.0"
+NIXL_VERSION="0.7.1"
 NIXL_REPO="https://github.com/ai-dynamo/nixl.git"
 OLD_LD_LIBRARY_PATH=$LD_LIBRARY_PATH

jenkins/current_image_tags.properties

Lines changed: 4 additions & 4 deletions
@@ -13,7 +13,7 @@
 # images are adopted from PostMerge pipelines, the abbreviated commit hash is used instead.
 IMAGE_NAME=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm

-LLM_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.10-py3-x86_64-ubuntu24.04-trt10.13.3.9-skip-tritondevel-202511131803-8929
-LLM_SBSA_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.10-py3-aarch64-ubuntu24.04-trt10.13.3.9-skip-tritondevel-202511131803-8929
-LLM_ROCKYLINUX8_PY310_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:cuda-13.0.2-devel-rocky8-x86_64-rocky8-py310-trt10.13.3.9-skip-tritondevel-202511131803-8929
-LLM_ROCKYLINUX8_PY312_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:cuda-13.0.2-devel-rocky8-x86_64-rocky8-py312-trt10.13.3.9-skip-tritondevel-202511131803-8929
+LLM_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.10-py3-x86_64-ubuntu24.04-trt10.13.3.9-skip-tritondevel-202511200955-9055
+LLM_SBSA_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.10-py3-aarch64-ubuntu24.04-trt10.13.3.9-skip-tritondevel-202511200955-9055
+LLM_ROCKYLINUX8_PY310_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:cuda-13.0.2-devel-rocky8-x86_64-rocky8-py310-trt10.13.3.9-skip-tritondevel-202511200955-9055
+LLM_ROCKYLINUX8_PY312_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:cuda-13.0.2-devel-rocky8-x86_64-rocky8-py312-trt10.13.3.9-skip-tritondevel-202511200955-9055
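
Reviewer note: all four tags in this file move from the suffix 202511131803-8929 to 202511200955-9055. The sketch below is one way to check that every image entry in a properties file carries the same suffix. It assumes the tag ends in a 12-digit build timestamp followed by a numeric PR number; that layout is inferred from the values in this diff, not from any documented scheme, and per the file's own comment a PostMerge image would end in an abbreviated commit hash instead, which this pattern would not match. The names `TAG_SUFFIX` and `parse_image_properties` are hypothetical.

```python
import re

# Assumed layout, inferred from this diff only:
#   <registry>/<repo>:<base-description>-<YYYYMMDDHHMM>-<PR number>
TAG_SUFFIX = re.compile(r"-(?P<stamp>\d{12})-(?P<pr>\d+)$")


def parse_image_properties(text):
    """Map each KEY=VALUE image line to (build_stamp, pr_number).

    Comment lines, blank lines, and values without the assumed suffix
    (such as IMAGE_NAME) are skipped.
    """
    parsed = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        match = TAG_SUFFIX.search(value)
        if match:
            parsed[key] = (match.group("stamp"), int(match.group("pr")))
    return parsed


SAMPLE = """\
# images are adopted from PostMerge pipelines, the abbreviated commit hash is used instead.
IMAGE_NAME=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm
LLM_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.10-py3-x86_64-ubuntu24.04-trt10.13.3.9-skip-tritondevel-202511200955-9055
LLM_SBSA_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.10-py3-aarch64-ubuntu24.04-trt10.13.3.9-skip-tritondevel-202511200955-9055
"""

info = parse_image_properties(SAMPLE)
# True when every parsed image shares one (timestamp, PR) suffix.
all_same_suffix = len(set(info.values())) == 1
```

A check like this could flag a partial update where only some of the image variables were bumped.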

tests/integration/defs/llmapi/test_llm_api_connector.py

Lines changed: 16 additions & 5 deletions
@@ -356,11 +356,15 @@ def test_connector_disagg_prefill(enforce_single_worker, model_with_connector,
                                   save_async):
     model_fn, scheduler, worker = model_with_connector

-    model = model_fn(
+    prefill_worker = model_fn(
         disable_overlap_scheduler=True,
         cache_transceiver_config=CacheTransceiverConfig(backend="DEFAULT"))

-    sampling_params = SamplingParams(ignore_eos=True)
+    decode_worker = model_fn(
+        cache_transceiver_config=CacheTransceiverConfig(backend="DEFAULT"),
+        kv_connector_config=None)
+
+    sampling_params = SamplingParams(ignore_eos=True, max_tokens=16)

     disaggregated_params = DisaggregatedParams(request_type="context_only")

@@ -375,9 +379,16 @@ def test_connector_disagg_prefill(enforce_single_worker, model_with_connector,
     scheduler.request_finished.return_value = False
     worker.get_finished.return_value = [], []

-    model.generate([0] * 48,
-                   sampling_params=sampling_params,
-                   disaggregated_params=disaggregated_params)
+    result = prefill_worker.generate([0] * 48,
+                                     sampling_params=sampling_params,
+                                     disaggregated_params=disaggregated_params)
+
+    gen_disagg_params = result.disaggregated_params
+    gen_disagg_params.request_type = "generation_only"
+
+    result = decode_worker.generate([0] * 48,
+                                    sampling_params=sampling_params,
+                                    disaggregated_params=gen_disagg_params)

     assert scheduler.build_connector_meta.call_count == 1
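
Reviewer note: the updated test now runs both halves of a disaggregated request: a context-only pass on `prefill_worker`, then a generation-only pass on `decode_worker` that reuses the returned `disaggregated_params`. The sketch below mirrors that handoff with stub classes so the control flow can be read without a GPU or the real library. `StubDisaggParams`, `StubResult`, `StubWorker`, and the `handle` field are all hypothetical stand-ins, not the TensorRT-LLM API, and the stub's behavior (prefill returns no tokens, decode returns `max_tokens` zeros) is a simplification chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StubDisaggParams:
    """Hypothetical stand-in for DisaggregatedParams."""
    request_type: str
    handle: Optional[int] = None  # hypothetical reference to the prefilled cache


@dataclass
class StubResult:
    tokens: List[int]
    disaggregated_params: StubDisaggParams


class StubWorker:
    """Hypothetical worker; in the test the real objects come from model_fn."""

    def __init__(self):
        self.calls = []  # request types seen, in order

    def generate(self, prompt, disaggregated_params, max_tokens=16):
        self.calls.append(disaggregated_params.request_type)
        if disaggregated_params.request_type == "context_only":
            # Stubbed prefill: emit no tokens, hand back a cache reference.
            params = StubDisaggParams("context_only", handle=len(prompt))
            return StubResult([], params)
        # Stubbed decode: requires the reference produced by prefill.
        if disaggregated_params.handle is None:
            raise ValueError("generation_only request without prefill handle")
        return StubResult([0] * max_tokens, disaggregated_params)


prefill_worker, decode_worker = StubWorker(), StubWorker()
prompt = [0] * 48

# Phase 1: context-only request on the prefill worker.
result = prefill_worker.generate(prompt, StubDisaggParams("context_only"))

# Phase 2: flip the request type and reuse the params on the decode worker.
gen_disagg_params = result.disaggregated_params
gen_disagg_params.request_type = "generation_only"
result = decode_worker.generate(prompt, gen_disagg_params)
```

The point of the pattern is that the decode worker never sees a `context_only` request; it only consumes the parameters produced by the prefill pass.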
