feat: add onnxslim support #478
base: main
Conversation
Signed-off-by: inisis <[email protected]>
setup.py
Outdated
| "onnxruntime-gpu~=1.22.0 ; platform_machine != 'aarch64' and platform_system != 'Darwin' and platform_system != 'Windows'", # noqa: E501 | ||
| "onnxruntime-directml==1.20.0; platform_system == 'Windows'", | ||
| "onnxscript", # For test_onnx_dynamo_export unit test | ||
| "onnxsim ; python_version < '3.12' and platform_machine != 'aarch64'", |
Please remove onnxsim installation if it's no longer being used, thanks.
done
Signed-off-by: inisis <[email protected]>
@gcunhase Hi, any update here? Thanks.
@inisis I'm still validating. Specifically, please check that the following CLI is still functional and performant:

```bash
$ python -m modelopt.onnx.quantization --onnx_path=/mnt/models/bevformer_tiny_epoch_24_cp2_op13.onnx \
    --trt_plugins=$PLUGIN_PATH \
    --op_types_to_exclude MatMul \
    --calibration_data_path=/workspace/BEVFormer_tensorrt/data/nuscenes/calib_data.npz \
    --simplify
```

Thanks!
Hi @gcunhase, it took me some time to run bevformer-int8-eq, but everything is working fine. Here are the results:

Env

Without simplify

With onnxsim

With onnxslim

To conclude:
Well, in terms of GPU Compute Time (median, ms), onnxsim is slightly faster. I compared the two models: onnxslim merges MatMul + Add into Gemm, which is undesirable when using --op_types_to_exclude MatMul (the merged Gemm nodes are no longer excluded).
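For reference, a minimal sketch of this kind of comparison: counting op types in the original and slimmed graphs (file paths are placeholders; assumes the onnx package is installed):

```python
# Hypothetical paths; count MatMul/Add/Gemm nodes before and after slimming
# to see the MatMul + Add -> Gemm merges mentioned above.
from collections import Counter

import onnx

for name, path in [("original", "model.onnx"), ("onnxslim", "model_slim.onnx")]:
    graph = onnx.load(path).graph
    counts = Counter(node.op_type for node in graph.node)
    print(name, {op: counts[op] for op in ("MatMul", "Add", "Gemm")})
```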
Hi @inisis, thanks for validating this functionality. Were you also able to validate the numerical accuracy? I will also do some investigation on the MatMul+Add vs Gemm substitution on my end in the meanwhile. Thanks!
@gcunhase I didn't use the full dataset from nuscenes, it's too big; I used the mini one to do the calibration. If this counts, I can verify it on the mini one.
No problem, let me try to verify the accuracy on my end. Thank you!
Signed-off-by: inisis <[email protected]>
Hi @gcunhase, is there any update? Thanks.
@inisis we appreciate your contribution and wanted to make sure that there are no regressions before merging this PR. We've investigated potential risks in ~150 models and compiled a list of issues, divided into 3 categories, that would need to be solved before merging. All mentioned models and scripts are in the zip file: repro.zip

1. Functional failures

Error logs:
- Error 1: repro_io_tensors_shape_dtype.onnx
- Error 2: repro_mode_error_mobilenetv1.onnx

How to repro:

```python
import onnx
import onnxslim

model = onnx.load(input_model_path)
simplified_model = onnxslim.slim(model)
```

2. ORT inference failures

Error logs:
- Error 1: repro_mul_incompatible_dimensions.onnx
- Error 2: repro_gemm_invalid_shape.onnx

How to repro: run the repro script included in the zip file.

3. ORT numerical accuracy failures

Error logs:
The simplified versions of the models listed in the zip file do not produce the same outputs as the original model for the same input data.

How to repro: run the repro script included in the zip file.
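For reference, a minimal sketch of the kind of ORT inference/accuracy check described in categories 2 and 3 (model paths, input dtype, and tolerances are placeholders; assumes onnxruntime and numpy are installed):

```python
# Hypothetical check: run the original and slimmed models on the same random
# input and compare outputs. Inference errors correspond to category 2,
# output mismatches to category 3.
import numpy as np
import onnxruntime as ort

def run(path, feed):
    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    return sess.run(None, feed)

ref_sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
inp = ref_sess.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]  # pick 1 for dynamic dims
feed = {inp.name: np.random.rand(*shape).astype(np.float32)}  # assumes a float32 input

reference = run("model.onnx", feed)
simplified = run("model_slim.onnx", feed)  # category 2: this call fails for invalid graphs
for r, s in zip(reference, simplified):
    np.testing.assert_allclose(r, s, rtol=1e-3, atol=1e-3)  # category 3: mismatch raises
```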
```python
try:
    model_simp, check = onnxsim.simplify(onnx_model)
    if check:
        model_simp = onnxslim.slim(onnx_model)
```
I was able to verify that BEVFormer is compatible with onnxSLIM as long as we skip Gemm Fusion optimizations (skip_fusion_patterns=["FusionGemm"]). Otherwise, perf and accuracy degradation is observed.
Please update this line accordingly, thanks.
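For reference, a minimal sketch of the requested change, assuming onnxslim.slim accepts the skip_fusion_patterns argument named above (the model path is a placeholder):

```python
import onnx
import onnxslim

onnx_model = onnx.load("model.onnx")  # placeholder path
# Skip Gemm fusion so MatMul + Add pairs are kept as-is (see comment above).
model_simp = onnxslim.slim(onnx_model, skip_fusion_patterns=["FusionGemm"])
```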
done
Signed-off-by: inisis <[email protected]>
@gcunhase So much appreciation for your comprehensive testing, which has helped us improve onnxslim. All the issues you mentioned have been resolved in version 0.1.75 of onnxslim, and these models have also been added to onnxslim's daily CI. Many thanks again. Here are some details on how the issues were solved:

1. Functional failures
If a model's output is produced by a custom operator, onnxslim is unable to do symbolic shape inference for it, so the output dtype and shape were lost; we improved this by reusing the info already stored in the original model.

2. ORT inference failures
In onnxslim, shape inference for the outputs of the Resize node followed the official ONNX documentation, whereas onnxruntime rounds the output size, so there was a mismatch and, in some cases, an incompatible_dimensions issue; we are now aligned with ORT.

3. ORT numerical accuracy failures
There was a precision issue with issue3_repro_conv_resize_issue.onnx; the np.array_equal check now passes.
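For reference, a minimal sketch of how to confirm the fixed version before re-running the repro models (the model name is one of those shipped in the zip above):

```python
# Check the installed onnxslim version; per this comment, the fixes landed in 0.1.75.
from importlib.metadata import version

import onnx
import onnxslim

print(version("onnxslim"))  # expect >= 0.1.75
model = onnx.load("repro_io_tensors_shape_dtype.onnx")
simplified = onnxslim.slim(model)
print([(o.name, o.type) for o in simplified.graph.output])  # output dtype/shape should be preserved
```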



What does this PR do?
Type of change:
Add onnxslim support
Overview: onnxslim is under active development and committed to long-term support; it is easy to use and depends on very few packages.
Usage
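A minimal usage sketch, reusing the CLI from the discussion above (model and calibration-data paths are placeholders):

```bash
python -m modelopt.onnx.quantization \
    --onnx_path=/path/to/model.onnx \
    --calibration_data_path=/path/to/calib_data.npz \
    --simplify
```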
Testing
Before your PR is "Ready for review"
Additional Information