[chore]: weekly bump of uv.lock on main (2026-03-30) #4
Open
github-actions[bot] wants to merge 1 commit into main
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
noeyy-mino pushed a commit that referenced this pull request Mar 31, 2026
…ns (NVIDIA#1117)

### What does this PR do?

Type of change: BugFix

- ModelOpt was not placing Q/DQ nodes between Conv and LayerNormalization, causing TensorRT to select slower i8f16 kernels instead of faster i8i8 kernels for Conv layers whose output feeds into LayerNorm (e.g., ConvNext models).

### Changes:

- Register LayerNormalization in ORT's QDQ registry (ort_utils.py)
- Add find_conv_to_layernorm_nodes() to detect Conv→(copy ops)→LayerNorm patterns (graph_utils.py)
- Add detected LayerNorm nodes to quantizable nodes list so Q/DQ pairs are inserted on Conv output (int8.py)

### Usage

```python
# No API change; existing quantize() call now automatically handles Conv->LayerNorm patterns
import modelopt.onnx.quantization as moq

moq.quantize("convnext.onnx", quantize_mode="int8")
# Output model will now have: Conv -> Transpose -> Q -> DQ -> LayerNorm
```

### Testing

- Added unit test test_conv_layernorm_quantization with a ConvNext-like test model (Conv→Transpose→LayerNorm→Transpose→Conv)
- All 237 unit tests pass (236 existing + 1 new)
- Validated on ConvNext-tiny (opset 17): 23/23 LayerNorm nodes get Q/DQ on activation input
- Built TRT engine in nvcr.io/nvidia/tensorrt:26.02-py3; Conv layers after LayerNorm select i8i8 kernels

### Before your PR is "*Ready for review*"

Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md) and your commits are signed (`git commit -s -S`).

Make sure you read and follow the [Security Best Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors) (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).

- Is this change backward compatible?: ✅
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
- Did you write any new necessary tests?: ✅
- Did you update Changelog?: N/A

### Additional Information

- NVBug: 5271237
- JIRA: OMNIML-2380
- Note: The residual Add output quantization discussed in comments #4-9 of the bug is not addressed here and will be handled in a separate PR.
- Accuracy regression observed for the ConvNext model. Please have a look at this [comment](NVIDIA#1117 (comment))

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * Expanded quantization to include LayerNormalization in Q/DQ flows and detect Conv→LayerNormalization patterns for quantization.
* **Refactor**
  * Improved internal graph-analysis helpers to make detection logic reusable and more reliable.
* **Tests**
  * Added a model builder for Conv→LayerNorm graphs and a unit test validating end-to-end Conv→LayerNormalization quantization.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
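
For readers unfamiliar with the Conv→(copy ops)→LayerNorm pattern named in the referenced commit, the idea can be sketched on a toy graph. This is an illustrative sketch only, not ModelOpt's `find_conv_to_layernorm_nodes()` implementation: the `Node` class, the `COPY_OPS` set, and the `find_conv_to_layernorm` function are hypothetical names, and which ops count as "copy ops" is an assumption.

```python
from dataclasses import dataclass

# Assumption: ops that only move or reshape data are treated as "copy ops".
COPY_OPS = {"Transpose", "Reshape", "Squeeze", "Unsqueeze", "Flatten", "Identity"}


@dataclass
class Node:
    """Hypothetical stand-in for an ONNX node: op type plus tensor names."""
    name: str
    op: str
    inputs: list
    outputs: list


def find_conv_to_layernorm(nodes):
    """Return names of LayerNormalization nodes whose activation input
    traces back to a Conv through copy ops only."""
    # Map each tensor name to the node that produces it.
    producer = {out: n for n in nodes for out in n.outputs}
    matches = []
    for node in nodes:
        if node.op != "LayerNormalization" or not node.inputs:
            continue
        # Walk upstream from the activation input (input 0), skipping copy ops.
        current = producer.get(node.inputs[0])
        while current is not None and current.op in COPY_OPS:
            current = producer.get(current.inputs[0]) if current.inputs else None
        if current is not None and current.op == "Conv":
            matches.append(node.name)
    return matches


# ConvNext-like toy graph: Conv -> Transpose -> LayerNorm -> Transpose -> Conv
graph = [
    Node("conv0", "Conv", ["x", "w0"], ["c0"]),
    Node("tr0", "Transpose", ["c0"], ["t0"]),
    Node("ln0", "LayerNormalization", ["t0", "scale", "bias"], ["l0"]),
    Node("tr1", "Transpose", ["l0"], ["t1"]),
    Node("conv1", "Conv", ["t1", "w1"], ["c1"]),
]
print(find_conv_to_layernorm(graph))
```

In this sketch the matched LayerNorm nodes would be the ones added to the quantizable list, so that a Q/DQ pair lands on their activation input, as the commit message describes.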
Summary
Automated weekly update of the uv.lock file for nSpect Scanning:
uv.lock: upgraded all transitive dependencies to the latest compatible versions