Add FusedSDPA QKV slice support for APC #393

ikurtchen · 2025-11-18T08:32:48Z

The feature is controlled by the following environment variables:

VLLM_FUSEDSDPA_QKV_SLICE_SEQ_LEN_THLD
- QKV slicing is enabled when the query or context length exceeds this threshold.
- The default threshold is 8192.
- Set to 0 to disable the feature.
VLLM_FUSEDSDPA_Q_SLICE_CHUNK_SIZE
- The Q chunk size for the full attention part.
VLLM_FUSEDSDPA_KV_SLICE_CHUNK_SIZE
- The KV chunk size for the full attention part.
VLLM_FUSEDSDPA_CAUSAL_QKV_SLICE_CHUNK_SIZE
- The QKV chunk size for the causal attention part.
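
For illustration only, here is a minimal Python sketch of how these variables could be read and applied. It is not the code in this PR: the helper names (`read_slice_config`, `should_slice`, `chunk_ranges`) are hypothetical, "exceeds" is assumed to mean strictly greater than, and the chunk sizes are left as `None` when unset because the description does not state their defaults.

```python
import os

# Default stated in the PR description.
DEFAULT_THRESHOLD = 8192


def read_slice_config(env=None):
    """Read the four variables from a mapping (defaults to os.environ).

    Hypothetical helper: chunk sizes stay None when unset, since the
    PR description does not give their default values.
    """
    env = os.environ if env is None else env

    def _int_or_none(name):
        raw = env.get(name)
        return int(raw) if raw not in (None, "") else None

    threshold = _int_or_none("VLLM_FUSEDSDPA_QKV_SLICE_SEQ_LEN_THLD")
    return {
        "threshold": DEFAULT_THRESHOLD if threshold is None else threshold,
        "q_chunk": _int_or_none("VLLM_FUSEDSDPA_Q_SLICE_CHUNK_SIZE"),
        "kv_chunk": _int_or_none("VLLM_FUSEDSDPA_KV_SLICE_CHUNK_SIZE"),
        "causal_chunk": _int_or_none("VLLM_FUSEDSDPA_CAUSAL_QKV_SLICE_CHUNK_SIZE"),
    }


def should_slice(query_len, context_len, threshold):
    """Return True when slicing would be enabled under the stated rule.

    Assumption: a threshold of 0 disables the feature, and "exceeds"
    means strictly greater than the threshold.
    """
    if threshold <= 0:
        return False
    return query_len > threshold or context_len > threshold


def chunk_ranges(total_len, chunk_size):
    """Split [0, total_len) into consecutive (start, end) chunks."""
    return [
        (start, min(start + chunk_size, total_len))
        for start in range(0, total_len, chunk_size)
    ]
```

With the variables unset, `read_slice_config({})` yields a threshold of 8192, so a 9000-token query would be sliced while a 4096-token query against a 4096-token context would not. Setting `VLLM_FUSEDSDPA_QKV_SLICE_SEQ_LEN_THLD=0` turns slicing off regardless of length.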


ikurtchen requested review from afierka-intel, jikunshang, kzawora-intel, madamczyk-intel, mgawarkiewicz-intel, michalkuligowski, mswiniarsk, tzielinski-habana and xuechendi as code owners November 18, 2025 08:32

ikurtchen mentioned this pull request Nov 20, 2025

Add function to split FusedSDPA in APC case for perf improvement #392

Closed

ikurtchen added 2 commits November 21, 2025 14:40

Do not use APC QKV slice with it's INC FP8 mode

6c17418

ikurtchen force-pushed the kurt/fusedsdpa_qkv_slice branch from 8c152d3 to 6c17418 Compare November 21, 2025 14:44
