Weekly release: 0.19.0rc0 #3588
kaiyux
announced in
Announcements
Hi,
The TensorRT-LLM team is pleased to announce that we have published the weekly release 0.19.0rc0 and pushed an update to the Triton backend on April 15, 2025.

The 0.19.0rc0 release includes:

- Supported Gemma 3, documented in `examples/gemma/README.md` (feat: Support gemma-3-1b-it #3247)
- Registered `ENABLE_MULTI_DEVICE` and `ENABLE_UCX` as CMake options (feat: register ENABLE_MULTI_DEVICE and ENABLE_UCX as CMake options #3343)
- Ran `PyExecutor`'s inference flow to estimate `max_num_tokens` for `kv_cache_manager` (feat: Run PyExecutor's inference flow to estimate max_num_tokens for kv_cache_manager #3092)
- Supported the `TLLM_OVERRIDE_LAYER_NUM` and `TLLM_TRACE_MODEL_FORWARD` environment variables for debugging (feat: Support TLLM_OVERRIDE_LAYER_NUM and TLLM_TRACE_MODEL_FORWARD for debugging #3417)
- Applied the new torch-flow-compatible `AutoTuner` to both the Fused MoE and NVFP4 Linear operators (feat: Apply the new torch-flow compatible AutoTuner to both Fused MoE and NVFP4 Linear operators. #3151)
- Introduced the `UserBuffers` allocator for the PyTorch flow (feat: Introduce UB allocator for pytorch flow #3257)
- Enhanced the integrated robustness of scaffolding with `__init__.py` (feat: Enhance the integrated robustness of scaffolding with __init__.… #3312)
- Added `numNodes` to `ParallelConfig` (feat: Add numNodes to ParallelConfig #3346)
- Added Qwen2 MoE to the torch flow and fixed the wrongly imported `KvCacheConfig` in `examples/gpqa_llmapi.py` (feat: add qwen2 moe to torch flow; fix wrong imported KvCacheConfig in gpqa… #3369)
- Fixed `max_seq_len` in `executor_config` (fix: fix max_seq_len in executor_config #3487)
- Allowed the `context_and_generation` request type in disaggregated overlap (fix: Allow context_and_generation request type in disagg overlap #3489)
- Fixed the `py_decoding_iter` update in the decoder (fix: fix the py_decoding_iter update in decoder #3297)
- Fixed a missing bias add for `FP4Linear` (fix [NVBUG 5208255] Fix missing bias add for FP4Linear. #3361)
- Fixed a runtime error in `test_deepseek_allreduce.py` (fix: runtime error in test_deepseek_allreduce.py #3226)
- Fixed torch nvsmall through `PyExecutor` and improved its TP support (Fix torch nvsmall through pyexecutor and fix its TP support #3238)

The cut-off commit for this release is 258ae9c. The code changes can be seen here: 5aeef6d...258ae9c.
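As a rough sketch of how the new debugging environment variables and CMake options from this release might be used together; the values shown here are illustrative assumptions, not documented defaults, so please check the TensorRT-LLM documentation for the exact semantics:

```shell
# Assumed usage: cap the model at a reduced layer count for faster
# debug iterations (the value 2 is an arbitrary example).
export TLLM_OVERRIDE_LAYER_NUM=2
# Assumed usage: turn on tracing of the model forward pass.
export TLLM_TRACE_MODEL_FORWARD=1

# The newly registered CMake options can be toggled at configure time,
# for example (paths and flags are assumptions about a typical build):
#   cmake -S cpp -B build -DENABLE_MULTI_DEVICE=ON -DENABLE_UCX=ON

echo "TLLM_OVERRIDE_LAYER_NUM=$TLLM_OVERRIDE_LAYER_NUM"
echo "TLLM_TRACE_MODEL_FORWARD=$TLLM_TRACE_MODEL_FORWARD"
```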
Thanks,
The TensorRT-LLM Engineering Team