@kkHuang-amd kkHuang-amd commented Nov 12, 2025

Motivation

Support FP8 KV cache in the AMD aiter backend.

The aiter backend currently supports FP8 computation only for MLA decode.

The other attention functions still compute in BF16.
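As an illustrative sketch (not the actual aiter kernel code; all names here are hypothetical), an FP8 KV cache typically stores keys and values with a per-tensor scale: the scale maps the tensor's max absolute value onto the FP8 e4m3 range, and attention kernels dequantize by multiplying back. A minimal stdlib-only model of that round trip:

```python
# Hedged sketch of per-tensor scaled quantization, as commonly used for
# FP8 (e4m3) KV caches. Real kernels also round values to fp8 precision;
# here we only model the scaling and clamping.
FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3


def compute_scale(values):
    """Scale so the largest |value| maps onto the fp8 e4m3 range."""
    amax = max(abs(v) for v in values)
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0


def quantize(values, scale):
    """Divide by the scale and clamp into the representable fp8 range."""
    return [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in values]


def dequantize(qvalues, scale):
    """Recover approximate BF16-domain values for attention compute."""
    return [q * scale for q in qvalues]


kv = [0.5, -100.0, 300.0]
scale = compute_scale(kv)
restored = dequantize(quantize(kv, scale), scale)
```

This is why backends that lack FP8 kernels (the non-MLA attention paths above) can still read the same cache: they dequantize to BF16 before computing.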

Modifications

Modified the aiter backend and the model runner.

Next actions

  1. Support FP8 computation in the other attention functions as well.
  2. Investigate the accuracy issue when running MTP without SGLANG_AITER_MLA_PERSIST.

Accuracy Tests

Benchmarking and Profiling

Checklist

