
Conversation

@ruro ruro commented Nov 21, 2025

Changes

  • added ONNXAttentionMetatype for the opset 23 ONNX Attention node
  • fixed scaled_dot_product_attention quantization in torch2 for the case where Q, K and V are parallel edges coming from the same input node (see the sketch after this list)
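
To make the second bullet concrete, the sketch below shows the pattern in question; the module itself is an assumption modelled on the unbind_scaled_dot_product_attention_model test name, not code from this PR. A single projection is split by one unbind node, so Q, K and V arrive at scaled_dot_product_attention as parallel edges from the same node.

```python
import torch
import torch.nn.functional as F


class UnbindSDPA(torch.nn.Module):
    """Hypothetical toy module: Q, K and V reach SDPA as parallel edges from one unbind node."""

    def __init__(self, embed_dim: int = 8) -> None:
        super().__init__()
        self.embed_dim = embed_dim
        self.qkv = torch.nn.Linear(embed_dim, 3 * embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Single projection, then one unbind producing all three SDPA inputs.
        q, k, v = self.qkv(x).reshape(*x.shape[:-1], 3, self.embed_dim).unbind(dim=-2)
        return F.scaled_dot_product_attention(q, k, v)


# Tracing/quantizing this module yields the graph shape the fix targets.
out = UnbindSDPA()(torch.randn(2, 4, 8))
```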

Reason for changes

See #3750

Related tickets

Fixes #3750

Tests

  • tests/onnx/quantization/test_graphs.py::test_synthetic_models_graph[AttentionModel] (reference graph: attention_model.dot)

  • tests/torch2/function_hook/quantization/test_quantized_graphs.py::test_quantized_graphs[unbind_scaled_dot_product_attention_model] (reference graph: unbind_scaled_dot_product_attention_model.dot)

@ruro ruro force-pushed the fix_onnx_attention_torch_sdpa_handling branch from 734ba64 to 791f962 on November 21, 2025 09:26
@github-actions github-actions bot added the NNCF ONNX label Nov 21, 2025
@ruro ruro marked this pull request as ready for review November 21, 2025 09:36
@ruro ruro requested a review from a team as a code owner November 21, 2025 09:36
@ruro ruro (Author) commented Nov 21, 2025

Hm. iirc onnx added support for opset 23 in version 1.18.0. So the new test is currently failing in CI due to

onnx==1.17.0; python_version < '3.13'
onnx==1.18.0; python_version >= '3.13'

Do you have a preference on whether I should mark this test as

import onnx
import pytest
from packaging import version

@pytest.mark.skipif(
    version.parse(onnx.__version__) < version.parse("1.18.0"),
    reason="Opset 23 was added in onnx 1.18.0",
)

or bump the version or something else?
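
For reference, the bump alternative would roughly amount to collapsing the pins quoted above into a single constraint along these lines (a hypothetical sketch, assuming onnx 1.18.0 provides wheels for every Python version the project supports):

onnx==1.18.0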

@andrey-churkin andrey-churkin self-assigned this Nov 25, 2025

Development

Successfully merging this pull request may close these issues.

MULTIHEAD_ATTENTION_OUTPUT ignored patterns don't match "proper" SDPA / Attention
