[feature] Add layerwise NVTX support #11870
Summary of Changes
Hello @kyleliang-nv, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances SGLang's profiling capabilities by integrating layer-wise NVTX support for PyTorch models. This allows developers to gain granular insights into the performance of individual model layers using tools like Nsight Systems, facilitating bottleneck identification and optimization. The changes include a new server argument to enable this feature, a dedicated utility for PyTorch hooks, and updated documentation to guide users through the new profiling workflows, alongside minor improvements to logging for better step tracking.
Code Review
This pull request introduces layer-wise NVTX profiling support, a valuable feature for performance analysis. The changes include adding a PytHooks module for PyTorch hooks, a new server argument to enable this functionality, and comprehensive documentation updates. The implementation is well-done, but I have a few suggestions for the new pytorch_hooks.py file to improve code style and fix a minor bug in debug code. The documentation is thorough, though some minor formatting adjustments would enhance consistency.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
This reverts commit 46ecb6e.
Fridge003 left a comment
Nice job
Thanks for reviewing it and merging it in, @Fridge003!
Motivation
Enable auto-inserting NVTX markers that contain the layer name and input tensor shapes.
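For reviewers unfamiliar with the technique, here is a minimal sketch of how such markers could be produced with forward hooks. It is not the code in this PR: `format_marker` and `attach_layerwise_markers` are hypothetical names, the label format is an assumption, and the push/pop callables are passed in so the sketch runs without a GPU.

```python
from typing import Any, Callable, List, Tuple


def format_marker(layer_name: str, inputs: Tuple[Any, ...]) -> str:
    """Build a label from the layer path and the shapes of tensor-like inputs."""
    # Non-tensor arguments (ints, None, ...) have no .shape and are skipped.
    shapes = [tuple(x.shape) for x in inputs if hasattr(x, "shape")]
    return f"{layer_name} {shapes}"


def attach_layerwise_markers(
    model: Any,
    push: Callable[[str], Any],
    pop: Callable[[], Any],
) -> List[Any]:
    """Register a pre/post-forward hook pair on every submodule.

    `model` only needs named_modules(), register_forward_pre_hook() and
    register_forward_hook(), all of which torch.nn.Module provides.
    """
    handles = []
    for name, module in model.named_modules():
        # named_modules() yields "" for the root module; fall back to its class name.
        label = name or type(module).__name__

        def pre_hook(mod, inputs, _label=label):
            # Open a range just before forward() runs.
            push(format_marker(_label, inputs))

        def post_hook(mod, inputs, output):
            # Close the range right after forward() returns.
            pop()

        handles.append(module.register_forward_pre_hook(pre_hook))
        handles.append(module.register_forward_hook(post_hook))
    return handles
```

With PyTorch installed, this could be called as `attach_layerwise_markers(model, torch.cuda.nvtx.range_push, torch.cuda.nvtx.range_pop)`, and calling `.remove()` on each returned handle undoes the registration. Binding the label through a default argument avoids the late-binding closure problem inside the loop.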
Modifications
- pytorch_hooks.py, which will add pre/post-forward hooks with pytorch.
- enable-layerwise-nvtx-marker server arg. This will register the loaded model with pytorch_hooks.py
- /start_profile + nsys profile
- start_profile with start_steps and num_steps
- enable-layerwise-nvtx-marker with /start_profile endpoint
Accuracy Tests
Benchmarking and Profiling
With enable-layerwise-nvtx-marker, the nsys output file will now contain NVTX markers that correspond to the PyTorch nn.Module path and input tensor shapes.
Checklist