[Relax][ONNX] Add frontend support for QuantizeLinear, DequantizeLinear, and DynamicQuantizeLinear #19391
Conversation
Code Review
This pull request implements the ONNX operators QuantizeLinear, DequantizeLinear, and DynamicQuantizeLinear in the Relax frontend. It also enhances the legalization and struct info inference for quantization operations to correctly handle singleton tensors (shape-[1]) as scalars and expands the supported data types for the zero_point parameter. The review feedback indicates that the axis attribute in the v10 implementations of QuantizeLinear and DequantizeLinear is currently hardcoded to 0, whereas it should default to 1 and be retrieved from the operator attributes to comply with the ONNX specification.
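For context, the static quantization these converters lower can be sketched in NumPy. This is a reference sketch of the ONNX QuantizeLinear semantics (y = saturate(round(x / scale) + zero_point) for uint8), not the Relax lowering itself; the function name is illustrative. A shape-[1] scale or zero_point behaves like a scalar, which is the singleton-tensor case this PR handles.

```python
import numpy as np

def quantize_linear(x, scale, zero_point=None):
    """Reference uint8 QuantizeLinear: round to nearest-even, add zero_point,
    saturate to [0, 255]. A missing zero_point defaults to 0 (uint8)."""
    zp = 0 if zero_point is None else int(zero_point)
    y = np.round(x / scale) + zp  # np.round is round-half-to-even, as ONNX requires
    return np.clip(y, 0, 255).astype(np.uint8)

x = np.array([-1.0, 0.0, 1.5, 300.0], dtype=np.float32)
print(quantize_linear(x, scale=np.float32(2.0), zero_point=np.uint8(128)))
# → [128 128 129 255]  (300.0 saturates at the uint8 maximum)
```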
```python
x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
out_dtype = "uint8" if zp is None else zp.struct_info.dtype
if zp is None:
    zp = relax.const(0, out_dtype)
return relax.op.quantize(x, scale, zp, axis=0, out_dtype=out_dtype)
```
The axis for quantization is hardcoded to 0. According to the ONNX specification for QuantizeLinear opset 10, there is an axis attribute that defaults to 1. This implementation should handle the axis attribute from attr, similar to how it's done in _impl_v13.
The current _impl_v10:

```python
x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
out_dtype = "uint8" if zp is None else zp.struct_info.dtype
if zp is None:
    zp = relax.const(0, out_dtype)
return relax.op.quantize(x, scale, zp, axis=0, out_dtype=out_dtype)
```

For comparison, _impl_v13 already retrieves the axis from the attributes:

```python
x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
axis = attr.get("axis", 1)
if hasattr(x.struct_info, "ndim") and x.struct_info.ndim <= 1 and axis == 1:
    axis = 0
out_dtype = "uint8" if zp is None else zp.struct_info.dtype
if zp is None:
    zp = relax.const(0, out_dtype)
return relax.op.quantize(x, scale, zp, axis=axis, out_dtype=out_dtype)
```
```python
x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
if zp is None:
    zp = relax.const(0, x.struct_info.dtype)
return relax.op.dequantize(x, scale, zp, axis=0, out_dtype="float32")
```
The axis for dequantization is hardcoded to 0. The ONNX specification for DequantizeLinear opset 10 includes an axis attribute that defaults to 1. This should be handled from the attr dictionary, similar to the implementation in _impl_v13.
The current _impl_v10:

```python
x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
if zp is None:
    zp = relax.const(0, x.struct_info.dtype)
return relax.op.dequantize(x, scale, zp, axis=0, out_dtype="float32")
```

For comparison, _impl_v13 already retrieves the axis from the attributes:

```python
x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
axis = attr.get("axis", 1)
if hasattr(x.struct_info, "ndim") and x.struct_info.ndim <= 1 and axis == 1:
    axis = 0
if zp is None:
    zp = relax.const(0, x.struct_info.dtype)
return relax.op.dequantize(x, scale, zp, axis=axis, out_dtype="float32")
```
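The axis concern above matters as soon as the quantization parameters are per-channel. A NumPy sketch of per-axis DequantizeLinear semantics (illustrative names, not the Relax implementation) shows that axis=0 and axis=1 broadcast a 1-D scale differently, so hardcoding axis=0 silently changes results for 2-D inputs:

```python
import numpy as np

def dequantize_linear(x, scale, zero_point, axis):
    """Reference per-axis DequantizeLinear: y = (x - zero_point) * scale,
    with 1-D scale/zero_point broadcast along `axis`."""
    shape = [1] * x.ndim
    shape[axis] = -1  # reshape the 1-D parameters for broadcasting
    s = np.asarray(scale, np.float32).reshape(shape)
    zp = np.asarray(zero_point, np.int32).reshape(shape)
    return ((x.astype(np.int32) - zp) * s).astype(np.float32)

x = np.array([[130, 130], [130, 130]], dtype=np.uint8)
scale = np.array([0.5, 2.0], dtype=np.float32)
zp = np.array([128, 128], dtype=np.uint8)
# axis=1 scales the columns by 0.5 and 2.0; axis=0 would scale the
# rows instead, producing a different tensor from the same inputs.
print(dequantize_linear(x, scale, zp, axis=1))
```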
Summary
This PR adds Relax ONNX frontend support for:

- QuantizeLinear
- DequantizeLinear
- DynamicQuantizeLinear

The implementation follows existing TVM ONNX frontend patterns and keeps QDQ handling consistent for singleton quantization parameters and optional zero-point inputs.
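Unlike the static variants, DynamicQuantizeLinear derives scale and zero_point from the input at runtime. A NumPy sketch of the computation as specified by ONNX (illustrative, and assuming a non-constant input so the scale is nonzero):

```python
import numpy as np

def dynamic_quantize_linear(x):
    """Reference uint8 DynamicQuantizeLinear per the ONNX spec: the observed
    range is extended to include 0 so that zero is exactly representable,
    then scale and zero_point are computed from the data."""
    x_min = min(float(x.min()), 0.0)
    x_max = max(float(x.max()), 0.0)
    scale = (x_max - x_min) / 255.0  # assumes x is not all zeros
    zp = np.clip(round(0.0 - x_min / scale), 0, 255)  # 0.0 maps to zp exactly
    y = np.clip(np.round(x / scale) + zp, 0, 255).astype(np.uint8)
    return y, np.float32(scale), np.uint8(zp)

x = np.array([0.0, 2.0, -1.0, 5.1], dtype=np.float32)
y, scale, zp = dynamic_quantize_linear(x)
print(y.tolist(), int(zp))  # the input 0.0 quantizes exactly to zp
```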
Changes
- Added converters for QuantizeLinear, DequantizeLinear, and DynamicQuantizeLinear.

Tests
Added or updated tests in tests/python/relax/test_frontend_onnx.py to cover:

- QuantizeLinear in opset 10
- DequantizeLinear in opset 10
- QuantizeLinear in opset 13
- DynamicQuantizeLinear in opset 11

Validation
Validated with:
```shell
python -m pytest -n 1 tests/python/relax/test_frontend_onnx.py -k "quantizelinear or dequantizelinear or dynamicquantizelinear" -v
```

Result:
4 passed