Commit 00ab75e
fix(benchmarks): correct sdpa_backend inconsistency and attn_implementation for continuous batching (#42339)
This commit fixes two bugs in BenchmarkConfig reported in issue #42211:
1. **sdpa_backend inconsistency (line 105)**: The warning message states
"sdpa_backend must be None" but the code was setting it to "math".
Changed to None to match the warning message. This allows PyTorch to
auto-select the appropriate SDPA backend rather than forcing one globally,
which is correct for continuous batching with custom attention masks.
2. **Invalid attn_implementation (line 243)**: Changed from "paged|sdpa" to
"sdpa". Using "paged|sdpa" directly bypassed the validation logic at
lines 91-105 since it only checks for exactly "sdpa". The "paged|" prefix
is automatically added by init_continuous_batching() in continuous_api.py,
so the config should use plain "sdpa" for consistency with other configs.
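The interaction between the two fixes can be sketched with a self-contained stand-in (the class, field names, and helper below are hypothetical simplifications for illustration, not the actual transformers code):

```python
# Hypothetical sketch of the validation logic described above; field names
# mirror the commit message but the class is a stand-in, not BenchmarkConfig.
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkConfigSketch:
    attn_implementation: str = "sdpa"
    sdpa_backend: Optional[str] = None
    continuous_batching: bool = False

    def __post_init__(self):
        # The check matches exactly "sdpa", so a config written as
        # "paged|sdpa" would bypass this branch entirely (bug 2).
        if self.continuous_batching and self.attn_implementation == "sdpa":
            if self.sdpa_backend is not None:
                print("Warning: sdpa_backend must be None for continuous batching")
            # Fix 1: reset to None so PyTorch auto-selects the SDPA backend,
            # instead of forcing "math" globally as the old code did.
            self.sdpa_backend = None


def init_continuous_batching(config: BenchmarkConfigSketch) -> str:
    # The "paged|" prefix is added here, which is why the config itself
    # should use plain "sdpa" (fix 2).
    return f"paged|{config.attn_implementation}"


cfg = BenchmarkConfigSketch(
    attn_implementation="sdpa", sdpa_backend="math", continuous_batching=True
)
assert cfg.sdpa_backend is None
assert init_continuous_batching(cfg) == "paged|sdpa"
```

With `attn_implementation="paged|sdpa"`, the `__post_init__` branch never fires and a stale `sdpa_backend` survives, which is exactly the bypass the commit closes.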
Both bugs were introduced in commit 069684e (PR #41916).
Fixes #42211

parent 3410ba9
1 file changed: +2, −2 (lines 105 and 243 modified)