TensorRT-LLM Release 0.17.0 #2726
zeroepoch announced in Announcements
Hi,
We are very pleased to announce the 0.17.0 version of TensorRT-LLM. This update includes:
Key Features and Enhancements
- … the LLM API and `trtllm-bench` command.
- … `tensorrt_llm._torch`. The following is a list of supported infrastructure, models, and features that can be used with the PyTorch workflow.
- … the LLM API.
- … `examples/multimodal/README.md`.
- … `userbuffer`-based AllReduce-Norm fusion kernel.
- … the `executor` API.

API Changes
- … when `paged_context_fmha` is enabled.
- Added `--concurrency` support for the `throughput` subcommand of `trtllm-bench`.

Fixed Issues
- … `cluster_key` for the auto parallelism feature. ([feature request] Can we add H200 in infer_cluster_key() method? #2552)
- … `__post_init__` function of the `LlmArgs` class. Thanks for the contribution from @topenkoff in Fix kwarg name (#2691).

Infrastructure Changes
- The base Docker image for TensorRT-LLM is updated to `nvcr.io/nvidia/pytorch:25.01-py3`.
- The base Docker image for TensorRT-LLM Backend is updated to `nvcr.io/nvidia/tritonserver:25.01-py3`.

Known Issues
- Need to add `--extra-index-url https://pypi.nvidia.com` when running `pip install tensorrt-llm` due to new third-party dependencies.
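As a quick sketch of that known issue, the install command with the extra index URL would look like the following (the package name and flag come from the note above; version pinning is omitted):

```shell
# Pull NVIDIA-hosted wheels in addition to PyPI, as required by the
# new third-party dependencies noted above.
pip install tensorrt-llm --extra-index-url https://pypi.nvidia.com
```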