
[chore]: weekly bump of uv.lock on main (2026-03-30)#4

Open
github-actions[bot] wants to merge 1 commit into main from auto/bump-uv-lock-main-2026-03-30

Conversation

@github-actions

Summary

Automated weekly update of uv.lock file for nSpect Scanning:

  • uv.lock — upgraded all transitive dependencies to latest compatible versions

Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
noeyy-mino pushed a commit that referenced this pull request Mar 31, 2026
…ns (NVIDIA#1117)

### What does this PR do?

Type of change: BugFix

- ModelOpt was not placing Q/DQ nodes between Conv and
LayerNormalization, causing TensorRT to select slower i8f16 kernels
instead of faster i8i8 kernels for Conv layers whose output feeds into
LayerNorm (e.g., ConvNext models).

### Changes:
- Register LayerNormalization in ORT's QDQ registry (ort_utils.py)
- Add find_conv_to_layernorm_nodes() to detect Conv→(copy ops)→LayerNorm patterns (graph_utils.py)
- Add detected LayerNorm nodes to the quantizable nodes list so Q/DQ pairs are inserted on the Conv output (int8.py)
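The pattern search described in the second bullet can be sketched as a forward walk from each Conv output through shape-only ops. This is a simplified, hypothetical illustration using plain dicts; the real helper in graph_utils.py operates on ONNX graph protos and its exact copy-op set may differ.

```python
# Simplified sketch of the Conv -> (copy ops) -> LayerNormalization
# pattern search. Nodes are plain dicts here for illustration only;
# the COPY_OPS set is an assumption, not ModelOpt's actual list.
COPY_OPS = {"Transpose", "Reshape", "Squeeze", "Unsqueeze"}

def find_conv_to_layernorm_nodes(nodes):
    """Return names of LayerNormalization nodes reachable from a Conv
    output through zero or more shape-only (copy) ops.
    Assumes the graph is a DAG, as ONNX graphs are."""
    # Map each tensor name to the nodes that consume it.
    consumers = {}
    for node in nodes:
        for inp in node["inputs"]:
            consumers.setdefault(inp, []).append(node)

    found = []
    for node in nodes:
        if node["op"] != "Conv":
            continue
        # Walk forward through copy ops until a non-copy node appears.
        frontier = list(node["outputs"])
        while frontier:
            tensor = frontier.pop()
            for nxt in consumers.get(tensor, []):
                if nxt["op"] in COPY_OPS:
                    frontier.extend(nxt["outputs"])
                elif nxt["op"] == "LayerNormalization":
                    found.append(nxt["name"])
    return found
```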

### Usage

```python
# No API change — existing quantize() call now automatically handles Conv->LayerNorm patterns
import modelopt.onnx.quantization as moq

moq.quantize("convnext.onnx", quantize_mode="int8")
# Output model will now have: Conv -> Transpose -> Q -> DQ -> LayerNorm
```
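One way to sanity-check the expected node chain in the output model is to scan the graph's op types for the quantized pattern. The helper below is hypothetical (not part of ModelOpt) and operates on a plain list of op-type strings for illustration; a real check would walk the ONNX graph proto.

```python
# Hypothetical sanity check: does the quantized graph contain the
# Conv -> Transpose -> QuantizeLinear -> DequantizeLinear ->
# LayerNormalization chain as a contiguous run of op types?
EXPECTED_CHAIN = ["Conv", "Transpose", "QuantizeLinear",
                  "DequantizeLinear", "LayerNormalization"]

def contains_chain(op_types, chain=EXPECTED_CHAIN):
    """True if `chain` appears in `op_types` as a contiguous run."""
    n = len(chain)
    return any(op_types[i:i + n] == chain
               for i in range(len(op_types) - n + 1))
```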

### Testing

- Added unit test `test_conv_layernorm_quantization` with a ConvNext-like test model (Conv→Transpose→LayerNorm→Transpose→Conv)
- All 237 unit tests pass (236 existing + 1 new)
- Validated on ConvNext-tiny (opset 17): 23/23 LayerNorm nodes get Q/DQ on activation input
- Built TRT engine in nvcr.io/nvidia/tensorrt:26.02-py3 — Conv layers after LayerNorm select i8i8 kernels

### Before your PR is "*Ready for review*"

Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).

Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors)
(e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(...,
weights_only=False)`, `pickle`, etc.).

- Is this change backward compatible?: ✅
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md?: N/A
- Did you write any new necessary tests?: ✅
- Did you update Changelog?: N/A

### Additional Information
- NVBug: 5271237
- JIRA: OMNIML-2380
- Note: The residual Add output quantization discussed in comments #4-9 of the bug is not addressed here and will be handled in a separate PR.
- An accuracy regression was observed for the ConvNext model; see this [comment](NVIDIA#1117 (comment)).


## Summary by CodeRabbit

* **New Features**
  * Expanded quantization to include LayerNormalization in Q/DQ flows and detect Conv→LayerNormalization patterns for quantization.
* **Refactor**
  * Improved internal graph-analysis helpers to make detection logic reusable and more reliable.
* **Tests**
  * Added a model builder for Conv→LayerNorm graphs and a unit test validating end-to-end Conv→LayerNormalization quantization.

---------

Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>