[feature] Add layerwise NVTX support #11870

kyleliang-nv · 2025-10-20T16:37:38Z

Motivation

Enable automatic insertion of NVTX markers that contain the layer name and input tensor shapes.

Modifications

Add pytorch_hooks.py, which registers pre- and post-forward hooks with PyTorch.
Add the enable-layerwise-nvtx-marker server arg, which registers the loaded model with pytorch_hooks.py.
Add a unit test for /start_profile + nsys profile.
Move the Prefill log from before to after run_batch, to be consistent with the Decode log.
Add documentation explaining how to use /start_profile with start_step and num_steps.
Add documentation explaining how to use enable-layerwise-nvtx-marker with the /start_profile endpoint.
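For readers unfamiliar with the hook mechanism, here is a minimal sketch of how pre- and post-forward hooks can wrap each layer in an NVTX range. It is not the code from this PR: the function names, the marker format, and the injectable `push`/`pop` callables are assumptions made for illustration. Only `named_modules`, `register_forward_pre_hook`, `register_forward_hook`, and `torch.cuda.nvtx.range_push`/`range_pop` are real PyTorch APIs.

```python
from typing import Any, Callable, Iterable, List, Optional


def describe_shapes(args: Iterable[Any]) -> str:
    """Render the shapes of tensor-like inputs, skipping non-tensor arguments."""
    shapes = [list(a.shape) for a in args if hasattr(a, "shape")]
    return ", ".join(str(s) for s in shapes)


def attach_layerwise_nvtx(
    model: Any,
    push: Optional[Callable[[str], Any]] = None,
    pop: Optional[Callable[[], Any]] = None,
) -> List[Any]:
    """Register pre/post-forward hooks on every submodule of `model`.

    `push` and `pop` default to torch.cuda.nvtx.range_push / range_pop.
    They are injectable (a choice made for this sketch) so the logic can
    be exercised without a CUDA build.
    """
    if push is None or pop is None:
        # Imported lazily so that importing this sketch does not require CUDA.
        import torch.cuda.nvtx as nvtx

        push = push or nvtx.range_push
        pop = pop or nvtx.range_pop

    handles = []
    for name, module in model.named_modules():
        # The root module is reported with an empty name; fall back to its class.
        label = name or type(module).__name__

        def pre_hook(mod, args, _label=label):
            # Hypothetical marker format: "<module path> input_shapes=[...]".
            push(f"{_label} input_shapes=[{describe_shapes(args)}]")

        def post_hook(mod, args, output):
            # Close the range opened by the matching pre-hook.
            pop()

        handles.append(module.register_forward_pre_hook(pre_hook))
        handles.append(module.register_forward_hook(post_hook))
    return handles
```

With a real model, `handles = attach_layerwise_nvtx(model)` would install the hooks, and calling `remove()` on each returned handle would detach them. This sketch does not close the range if `forward` raises, which a production version would probably need to handle.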

Accuracy Tests

Benchmarking and Profiling

With enable-layerwise-nvtx-marker set, the nsys output file now contains NVTX markers that correspond to the PyTorch nn.Module path and input tensor shapes.
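As a rough illustration of how an opt-in flag like this can gate hook registration, here is a small standalone sketch. It does not reproduce SGLang's server_args.py: the parser, the help text, and the `maybe_register` helper are hypothetical, and only the flag name is taken from the PR description.

```python
import argparse
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for a server argument parser.
    parser = argparse.ArgumentParser(description="illustrative launcher")
    parser.add_argument(
        "--enable-layerwise-nvtx-marker",
        action="store_true",  # off by default, so normal runs pay no hook overhead
        help="Wrap each layer's forward pass in an NVTX range.",
    )
    return parser


def maybe_register(args: argparse.Namespace, register: Callable[[], None]) -> bool:
    """Call `register` only when the flag is set; return whether it ran."""
    if args.enable_layerwise_nvtx_marker:
        register()
        return True
    return False
```

Keeping the feature behind a boolean flag means the hooks are only installed when a profiling run explicitly asks for them.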

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.

gemini-code-assist · 2025-10-20T16:37:58Z

Summary of Changes

Hello @kyleliang-nv, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances SGLang's profiling capabilities by integrating layer-wise NVTX support for PyTorch models. This allows developers to gain granular insights into the performance of individual model layers using tools like Nsight Systems, facilitating bottleneck identification and optimization. The changes include a new server argument to enable this feature, a dedicated utility for PyTorch hooks, and updated documentation to guide users through the new profiling workflows, alongside minor improvements to logging for better step tracking.

Highlights

Layer-wise NVTX Profiling: Introduced a new feature to automatically insert NVTX markers for each layer in a PyTorch model, including layer names and input tensor shapes, for detailed profiling with Nsight Systems.
New Server Argument: Added --enable-layerwise-nvtx to activate this new profiling capability.
Enhanced Profiling Documentation: Updated the benchmark_and_profiling.md documentation with comprehensive guides on using the new layer-wise NVTX profiling and the existing /start_profile and /end_profile HTTP API endpoints, including num_steps and start_step parameters.
Logging Improvements: Incorporated step # into prefill and decode log messages for better tracking of model execution steps.
PyTorch Hooks Utility: Added a new utility file (pytorch_hooks.py) to manage PyTorch forward and pre-forward hooks for NVTX annotation.
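To make the /start_profile parameters concrete, here is a hedged sketch that builds (but does not send) such a request. The endpoint path and the num_steps and start_step parameter names come from the summary above. That they are passed as a JSON POST body, and the example base URL, are assumptions of this sketch and should be checked against benchmark_and_profiling.md.

```python
import json
import urllib.request


def build_start_profile_request(
    base_url: str, start_step: int, num_steps: int
) -> urllib.request.Request:
    """Build a POST request for /start_profile without sending it."""
    # Assumption: both parameters travel in a JSON body.
    payload = {"start_step": start_step, "num_steps": num_steps}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/start_profile",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The base URL below is a placeholder, not a documented default.
request = build_start_profile_request("http://localhost:30000", start_step=3, num_steps=10)
```

Passing `request` to `urllib.request.urlopen` would issue the call against a running server. Presumably start_step selects where capture begins and num_steps bounds how many steps are recorded, but the documentation added in this PR is the authority on the exact semantics.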

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces layer-wise NVTX profiling support, a valuable feature for performance analysis. The changes include adding a PytHooks module for PyTorch hooks, a new server argument to enable this functionality, and comprehensive documentation updates. The implementation is well-done, but I have a few suggestions for the new pytorch_hooks.py file to improve code style and fix a minor bug in debug code. The documentation is thorough, though some minor formatting adjustments would enhance consistency.

docs/developer_guide/benchmark_and_profiling.md

python/sglang/srt/utils/pytorch_hooks.py

test/srt/test_start_profile.py

python/sglang/srt/server_args.py

python/sglang/srt/utils/nvtx_pytorch_hooks.py

test/srt/run_suite.py

Fridge003

Nice job

kyleliang-nv · 2025-11-16T06:22:41Z

Nice job

Thanks for reviewing it and merging it in @Fridge003 !

kyleliang-nv requested review from Ying1123, hnyls2002, ispobock, merrymercy, xiezhq-hermann and zhyncs as code owners October 20, 2025 16:37

gemini-code-assist bot reviewed Oct 20, 2025

kyleliang-nv requested a review from ByronHsu as a code owner October 20, 2025 22:10

kyleliang-nv force-pushed the feature/layerwise_nvtx branch from 615e3e9 to c17442d Compare October 27, 2025 04:49

kyleliang-nv force-pushed the feature/layerwise_nvtx branch from c17442d to fa618d5 Compare November 5, 2025 05:19

kyleliang-nv and others added 19 commits November 13, 2025 21:30

Add layerwise-nvtx and dummy register model function

27b4756

Add pyt_hooks.py

271db95

Cleanup pyt_hooks

5211d15

Enable pyt_hooks

4108f0a

Guard laywerwise nvtx

e699e00

Code cleanup

c63456b

Rename pyt_hooks.py to pytorch_hooks.py

22fd4b9

Add doc to describle how /start_profile and /end_profile works

79150f2

Add doc for layerwise nvtx profiling

13e5faf

Improve wording of start profile

874ebae

Fix nvtx prefix string

5484dda

Fix bad merge

873325e

Rename arg to enable-layerwise-nvtx-marker

dae27d7

Move printing prefill log from before to after run_batch

4a45355

Add unittest for /start_profile + nsys profile

2dff0f5

Cleanup debug and comments in pytorch_hooks.py

22f7989

Apply suggestions from code review

410b91b

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Small fix/cleanup

74939bc

Revert "Move printing prefill log from before to after run_batch"

f3d023e

This reverts commit 46ecb6e.

kyleliang-nv force-pushed the feature/layerwise_nvtx branch from fa618d5 to f3d023e Compare November 14, 2025 05:30

kyleliang-nv requested a review from Fridge003 as a code owner November 14, 2025 05:30

github-actions bot added the documentation label (Improvements or additions to documentation) Nov 14, 2025

Fridge003 reviewed Nov 15, 2025

test/srt/test_start_profile.py (resolved)

python/sglang/srt/server_args.py (resolved)

python/sglang/srt/utils/nvtx_pytorch_hooks.py (resolved)

python/sglang/srt/utils/nvtx_pytorch_hooks.py (resolved)

kyleliang-nv and others added 5 commits November 14, 2025 21:47

Add test_start_profiler into 1-gpu nightly test

80de452

Update server arg md

080ed50

Rename pytorch_hooks.py to nvtx_pytorch_hooks.py

d14c259

Remove LSTM and ReflectionPad in nvtx hooks

a0b9030

Merge branch 'main' into feature/layerwise_nvtx

fc99921

Fridge003 reviewed Nov 15, 2025

test/srt/run_suite.py (outdated, resolved)

Update test/srt/run_suite.py

32944bd

Fridge003 approved these changes Nov 15, 2025

Fridge003 added the run-ci label Nov 15, 2025

Fridge003 merged commit 597d416 into sgl-project:main Nov 16, 2025
131 of 164 checks passed

2 participants
