[Inductor] Support scaled mm on inductor #2411

shiyang-weng · 2025-06-19T01:48:15Z

Blocked by #2379

Fuse the following pattern into scaled_mm:

    #   + - - - - | - - - - - -  | - - - -  +
    #   |    dq_per_tensor  dq_per_tensor   |
    #   |         |              |          |
    #   |    OPT(to_bf16)    OPT(to_bf16)   |
    #   |         |             |           |
    #   |    OPT(reshape)     permute       |
    #   |          \           /            |
    #   |             addmm/mm              |
    #   |                |                  |
    #   |      OPT(quant_per_tensor)        |
    #   |                |                  |
    #   |          OPT(reshape)             |
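
The reason this pattern can collapse into one scaled matmul is that a per-tensor scale is a single scalar, so it factors out of the matrix product. The sketch below checks that identity in plain Python. It is only an illustration under stated assumptions: the helper names (`dq_per_tensor`, `permute`, `mm`, `unfused`, `scaled_mm_reference`) and the sample values are hypothetical, no float8 dtype is involved, and the optional `to_bf16`, `reshape`, bias (`addmm`), and output `quant_per_tensor` steps are left out. It is not the Inductor pass from this PR.

```python
# Illustrative only: plain-Python stand-ins, not torchao or Inductor code.

def dq_per_tensor(q, scale):
    """Dequantize: multiply every quantized value by one per-tensor scale."""
    return [[v * scale for v in row] for row in q]

def permute(m):
    """Transpose a 2-D matrix (the weight-side permute in the diagram)."""
    return [list(col) for col in zip(*m)]

def mm(a, b):
    """Naive matrix multiply of a (n x k) by b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def unfused(x_q, x_scale, w_q, w_scale):
    """dq_per_tensor on both inputs, permute on the weight, then mm."""
    x = dq_per_tensor(x_q, x_scale)
    w_t = permute(dq_per_tensor(w_q, w_scale))
    return mm(x, w_t)

def scaled_mm_reference(x_q, x_scale, w_q, w_scale):
    """Fused form: multiply the quantized values, then apply both scales once."""
    acc = mm(x_q, permute(w_q))
    s = x_scale * w_scale
    return [[v * s for v in row] for row in acc]

# Hypothetical quantized inputs; power-of-two scales keep the float math exact.
x_q = [[1.0, -2.0], [3.0, 4.0]]               # activations, 2 x 2
w_q = [[2.0, 0.0], [-1.0, 1.0], [4.0, 2.0]]   # weights, 3 x 2
x_scale, w_scale = 0.5, 0.25

out_unfused = unfused(x_q, x_scale, w_q, w_scale)
out_fused = scaled_mm_reference(x_q, x_scale, w_q, w_scale)
assert out_unfused == out_fused
```

With real float8 data and arbitrary scales the two paths would generally agree only up to rounding, so a test of the actual fusion would more likely compare within a tolerance than with exact equality.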


pytorch-bot · 2025-06-19T01:48:19Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2411

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2025-06-19T02:40:24Z

test/float8/test_compile.py

@@ -392,5 +392,59 @@ def test_dynamic_scale_numeric_parity(
    assert torch.equal(float8_eager._data, float8_compile._data)


+@pytest.mark.parametrize(


I believe this is the training float8 test file, float8 inference is using https://github.com/pytorch/ao/blob/main/test/dtypes/test_affine_quantized_float.py


OK. I changed the unit test (UT) path in the previous PR, #2379.

shiyang-weng added 6 commits June 18, 2025 15:22

quantize_affine_float8/dequantize_affine_float8 not decomposed on ind…

a840ef5

…uctor

remove redundant unittest.skipIf

02d045b

fix rebase issue

9860c56

change dispatch key to a flag decomposed

ca662f3

support scaled_mm on inductor

f51a5be

fix rebase issue

719793c

shiyang-weng marked this pull request as draft June 19, 2025 01:48

facebook-github-bot added the CLA Signed label Jun 19, 2025

jerryzh168 reviewed Jun 19, 2025
