Commit c5f20d5

fix fa2 wrapper
Signed-off-by: Lucas Wilkinson <[email protected]>
1 parent b99f8c8

File tree

1 file changed: +2 -1

vllm_flash_attn/flash_attn_interface.py

Lines changed: 2 additions & 1 deletion
@@ -226,6 +226,8 @@ def flash_attn_varlen_func(
                 "FA2 does not support scheduler_metadata, q_descale, "
                 "k_descale, v_descale"
             )
+        if s_aux is not None:
+            raise NotImplementedError("FA2 does not support s_aux")
         if num_splits > 1:
             raise NotImplementedError("FA2 does not support num_splits > 1")
         out, softmax_lse = torch.ops._vllm_fa2_C.varlen_fwd(
@@ -250,7 +252,6 @@ def flash_attn_varlen_func(
             softcap,
             return_softmax_lse and dropout_p > 0,
             None,
-            s_aux=s_aux,
         )
     elif fa_version == 3:
         assert alibi_slopes is None, "Alibi is not supported in FA3"
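
The change itself is small: the FA2 branch previously forwarded s_aux=s_aux into the torch.ops._vllm_fa2_C.varlen_fwd call, which does not support it, so the keyword is removed and s_aux is now rejected up front with a NotImplementedError, matching the existing guards for scheduler_metadata and num_splits. Below is a minimal, self-contained sketch of that guard pattern; the function and kernel names (varlen_attention, _fa2_kernel, _fa3_kernel) are hypothetical stand-ins, not the actual vLLM symbols.

# Sketch of the guard pattern (illustrative names only, not the actual vLLM code):
# an argument that only one backend understands (s_aux) is rejected early on the
# FA2 path instead of being forwarded to a kernel entry point that cannot take it.

def _fa2_kernel(q, k, v):                        # stand-in for the FA2 varlen kernel
    return q + k + v

def _fa3_kernel(q, k, v, s_aux=None):            # stand-in for the FA3 varlen kernel
    return q + k + v + (s_aux if s_aux is not None else 0.0)

def varlen_attention(q, k, v, *, fa_version, s_aux=None, num_splits=1):
    if fa_version == 2:
        # Fail fast with a clear message rather than passing an unexpected
        # keyword argument into the FA2 kernel binding.
        if s_aux is not None:
            raise NotImplementedError("FA2 does not support s_aux")
        if num_splits > 1:
            raise NotImplementedError("FA2 does not support num_splits > 1")
        return _fa2_kernel(q, k, v)
    elif fa_version == 3:
        return _fa3_kernel(q, k, v, s_aux=s_aux)  # FA3 accepts the extra argument
    raise ValueError(f"unsupported fa_version: {fa_version}")

# The FA2 path now raises immediately instead of failing inside the kernel call:
print(varlen_attention(1.0, 2.0, 3.0, fa_version=3, s_aux=0.5))    # works
try:
    varlen_attention(1.0, 2.0, 3.0, fa_version=2, s_aux=0.5)
except NotImplementedError as exc:
    print(exc)                                                     # FA2 does not support s_aux

Failing fast in the Python wrapper gives callers a clear message about the unsupported combination instead of an opaque error from the compiled kernel binding.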
