
Conversation

@gcunhase (Contributor) commented Nov 26, 2025

What does this PR do?

Type of change: Bug fix

Overview: The precision of some tensors was not set in the ONNX graph (value_info) when converting the model with Autocast. This caused ONNX-GraphSurgeon to misinterpret those precisions in gs.import_onnx(ModelProto), which in turn caused quantization of the converted model to fail.
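For context, a minimal sketch (not the PR's code) of how the missing precisions surface after conversion; the model file name is a hypothetical placeholder:

```python
# Sketch: inspect which intermediate tensors lack a recorded precision.
# Assumes a converted model at "model.fp16.onnx" (hypothetical path).
import onnx
import onnx_graphsurgeon as gs

model = onnx.load("model.fp16.onnx")

# value_info entries whose element type was never filled in.
undefined = [
    vi.name
    for vi in model.graph.value_info
    if vi.type.tensor_type.elem_type == onnx.TensorProto.UNDEFINED
]
print("value_info entries with undefined dtype:", undefined)

# ONNX-GraphSurgeon then imports such tensors with dtype None, and
# downstream passes like quantization that rely on tensor.dtype can fail.
graph = gs.import_onnx(model)
untyped = [t.name for t in graph.tensors().values() if t.dtype is None]
print("untyped tensors after gs.import_onnx:", untyped)
```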

Usage

$ python -m modelopt.onnx.autocast --onnx_path=$MODEL_NAME.onnx --keep_io_types
$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_NAME.fp16.onnx --calibration_eps cpu
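As a quick sanity check between the two commands above (a sketch under the same hypothetical file name, not part of the PR), every value_info entry of the converted model should carry a concrete element type before quantization:

```python
# Verify that the autocast output records a precision for every
# intermediate tensor listed in value_info.
import onnx

model = onnx.load("model.fp16.onnx")  # hypothetical output name
assert all(
    vi.type.tensor_type.elem_type != onnx.TensorProto.UNDEFINED
    for vi in model.graph.value_info
), "some intermediate tensors still have no recorded precision"
print("all value_info entries carry a dtype")
```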

Testing

See bug 5680954@12.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: No
  • Did you add or update any necessary documentation?: No
  • Did you update Changelog?: No

@gcunhase gcunhase requested a review from a team as a code owner November 26, 2025 20:50
@gcunhase gcunhase requested review from ajrasane and galagam November 26, 2025 20:50
codecov bot commented Nov 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.68%. Comparing base (7edf59c) to head (815ebf7).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #611      +/-   ##
==========================================
+ Coverage   74.66%   74.68%   +0.01%     
==========================================
  Files         183      183              
  Lines       18550    18552       +2     
==========================================
+ Hits        13851    13856       +5     
+ Misses       4699     4696       -3     

☔ View full report in Codecov by Sentry.
@gcunhase gcunhase force-pushed the dev/gcunhasergio/5680954_autocast_cast_out_type_issue branch 2 times, most recently from b3dd817 to 4503e1a on December 3, 2025 20:24
@gcunhase gcunhase force-pushed the dev/gcunhasergio/5680954_autocast_cast_out_type_issue branch from 4503e1a to 815ebf7 on December 4, 2025 19:39
@gcunhase gcunhase enabled auto-merge (squash) December 4, 2025 19:39
@gcunhase gcunhase merged commit 097037d into NVIDIA:main Dec 4, 2025
36 checks passed
kevalmorabia97 pushed a commit that referenced this pull request Dec 7, 2025
…aph (#611)

soodoshll pushed a commit to soodoshll/TensorRT-Model-Optimizer that referenced this pull request Dec 8, 2025
…aph (NVIDIA#611)
