Commit baa60de
fix finding bucket for context length (#2118)
- Works with HabanaAI/vllm-hpu-extension#385 to enable a padding ratio
limit for context length bucketing, which reduces the number of
buckets.
- Truncate the context length based on the bucketing in the APC block
manager.
- Add assertion for `max_num_prefill_seqs==` when APC is enabled.
---------
Signed-off-by: Youlei Yang <[email protected]>
1 parent 3bd00ce commit baa60de
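The diff content is not shown here, so the following is only a minimal sketch of the two ideas named in the commit message: limiting the padding ratio when choosing context length buckets, and truncating a cached context length down to a bucket. All names (`build_buckets`, `find_bucket`, `truncate_to_bucket`, `max_padding_ratio`) are hypothetical and are not taken from vLLM or vllm-hpu-extension; the greedy selection rule is an assumption, not the extension's actual algorithm.

```python
from bisect import bisect_left, bisect_right


def build_buckets(lo: int, hi: int, step: int, max_padding_ratio: float) -> list[int]:
    """Pick a sparse subset of range(lo, hi + 1, step) as buckets (hypothetical).

    A candidate is skipped while the worst-case padding of rounding up to the
    next candidate stays within max_padding_ratio.
    """
    candidates = list(range(lo, hi + 1, step))
    buckets = [candidates[0]]
    for cur, nxt in zip(candidates[1:], candidates[2:] + [None]):
        if nxt is None:
            buckets.append(cur)  # always keep the largest candidate
            break
        # Worst case if `cur` is skipped: a length just above the last kept
        # bucket is padded all the way up to `nxt`.
        worst_padding = (nxt - (buckets[-1] + 1)) / nxt
        if worst_padding > max_padding_ratio:
            buckets.append(cur)
    return buckets


def find_bucket(length: int, buckets: list[int]) -> int:
    """Round up to the smallest bucket >= length (clamped to the largest)."""
    idx = min(bisect_left(buckets, length), len(buckets) - 1)
    return buckets[idx]


def truncate_to_bucket(length: int, buckets: list[int]) -> int:
    """Round down to the largest bucket <= length, or 0 if none fits."""
    idx = bisect_right(buckets, length) - 1
    return buckets[idx] if idx >= 0 else 0


buckets = build_buckets(lo=1, hi=16, step=1, max_padding_ratio=0.25)
print(buckets)                         # [1, 2, 4, 6, 9, 13, 16]
print(find_bucket(7, buckets))         # 9
print(truncate_to_bucket(7, buckets))  # 6
```

In this sketch, rounding down trades a small amount of reusable cached context for a context length that lands exactly on an existing bucket, which is one plausible reading of "truncate the context length based on the bucketing".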
File tree
3 files changed: +39 -5 lines changed
- vllm
- core/block
- worker
Changed line ranges per file, in page order (diff line content not preserved):

| File | Original lines removed | New lines added |
|---|---|---|
| 1 | none | 4589-4596 |
| 2 | none | 18 |
| 2 | none | 1079-1089 |
| 2 | none | 1092-1093 |
| 3 | none | 1966-1968 |
| 3 | 2839-2841 | 2842-2843 |
| 3 | 4058 | 4060 |
| 3 | none | 4149-4151 |
| 3 | none | 4297-4302 |
| 3 | 4293 | 4304-4305 |