
Optimize scheduler for chunk prefill #7454

Open
liyonghua0910 wants to merge 1 commit into PaddlePaddle:release/2.4 from liyonghua0910:release/2.4+20260416_opt_prefill

Conversation

@liyonghua0910
Collaborator

Motivation

💡 If this PR is a cherry pick, the PR title must follow the required format: add the [Cherry-Pick] label at the very beginning and append the original PR ID at the end, e.g. [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You may add new tags based on the PR content, but their meaning must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests, or explain in this PR why none are included.
  • Provide accuracy results.
  • If this PR targets a release branch, make sure it has already been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Apr 17, 2026

Thanks for your contribution!


@PaddlePaddle-bot left a comment


🤖 AI Code Review | 2026-04-17 11:01 CST

📋 Review Summary

PR overview: optimizes the scheduling policy for RUNNING requests in the chunk prefill scenario, removing the preemption logic in favor of directly interrupting the request and blocking WAITING requests from being scheduled
Scope of change: scheduler logic (resource_manager_v1.py)
Affected tags: Scheduler, KVCache

📝 PR Compliance Check

The PR title is missing a required tag, and the Motivation and Modifications sections of the description are empty.

Suggested title (ready to copy):

  • [Scheduler][Optimization] Optimize scheduler for chunk prefill

Description suggestion: please fill in Motivation (why this optimization is needed) and Modifications (what exactly was changed).

Issues

Severity  File  Summary
🔴 Bug  resource_manager_v1.py:698  the call to _get_can_schedule_prefill_threshold_block is missing the request argument and raises a TypeError at runtime
🔴 Bug  resource_manager_v1.py:702  allocate_gpu_blocks is passed an extra request_id argument and raises a TypeError at runtime

Overall Assessment

The direction of the scheduling-policy optimization is sound, but two method calls have mismatched arguments. Both crash with a runtime TypeError and must be fixed before this PR can be merged.
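For context, the control flow the review asks for can be sketched as below. This is a toy reconstruction, not the PR's actual code: CacheManager, free_blocks, and try_schedule_prefill are all invented stand-ins for the real FastDeploy classes and methods.

```python
class CacheManager:
    """Toy GPU-block pool standing in for the real cache manager (assumption)."""

    def __init__(self, free_blocks):
        self.free_blocks = free_blocks

    def can_allocate_gpu_blocks(self, num_blocks):
        return self.free_blocks >= num_blocks

    def allocate_gpu_blocks(self, num_blocks):
        # One argument only, matching the signature cited in the review.
        allocated = list(range(self.free_blocks - num_blocks, self.free_blocks))
        self.free_blocks -= num_blocks
        return allocated


def try_schedule_prefill(block_tables, cache_manager, num_new_block, threshold):
    """Extend the request's block tables only when the threshold is satisfied.

    Mirrors the corrected snippet in the review: check the threshold first,
    then allocate exactly num_new_block blocks (no request_id argument).
    """
    if not cache_manager.can_allocate_gpu_blocks(threshold):
        # Interrupt this prefill rather than preempting other requests.
        return False
    block_tables.extend(cache_manager.allocate_gpu_blocks(num_new_block))
    return True
```

With 8 free blocks, a threshold of 6, and 4 new blocks per chunk, the first call succeeds and the second is refused, since only 4 free blocks remain.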

num_new_tokens = self._get_num_new_tokens(request, token_budget)
num_new_block = self.get_new_block_nums(request, num_new_tokens)
can_schedule_block_num_threshold = self._get_can_schedule_prefill_threshold_block(num_new_block)

🔴 Bug: _get_can_schedule_prefill_threshold_block has the signature (self, request, num_chunk_new_block) and requires two arguments, but only num_new_block is passed here; the request argument is missing. At runtime this raises TypeError: _get_can_schedule_prefill_threshold_block() missing 1 required positional argument.

Note: the calls in the WAITING-request scheduling path of the same file (lines 762 and 810) are correct and pass both arguments, (request, num_new_block).

Suggested fix:

can_schedule_block_num_threshold = self._get_can_schedule_prefill_threshold_block(request, num_new_block)
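A minimal repro of this failure mode (the class below is a stub with the two-argument signature cited in the review; its body is a placeholder, not the real implementation):

```python
class ResourceManagerSketch:
    """Stub reproducing the two-argument signature cited in the review."""

    def _get_can_schedule_prefill_threshold_block(self, request, num_chunk_new_block):
        # Placeholder body; the real method derives a block threshold.
        return num_chunk_new_block + 1


mgr = ResourceManagerSketch()

# Correct: both positional arguments supplied.
threshold = mgr._get_can_schedule_prefill_threshold_block("req-1", 4)

# Buggy call from the diff: `request` omitted, so Python raises TypeError.
try:
    mgr._get_can_schedule_prefill_threshold_block(4)
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```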

request.block_tables.extend(self.cache_manager.allocate_gpu_blocks(num_new_block))
if self.cache_manager.can_allocate_gpu_blocks(can_schedule_block_num_threshold):
    request.block_tables.extend(
        self.cache_manager.allocate_gpu_blocks(num_new_block, request.request_id)

🔴 Bug: allocate_gpu_blocks has the signature (self, num_blocks) and accepts only one argument, but an extra request.request_id is passed here. At runtime this raises TypeError: allocate_gpu_blocks() takes 2 positional arguments but 3 were given.

Note: all other allocate_gpu_blocks calls in the same file (lines 620, 633, 648, 662, 768, and 816) pass exactly one argument.

Suggested fix:

request.block_tables.extend(
    self.cache_manager.allocate_gpu_blocks(num_new_block)
)
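A minimal repro of this second failure mode (the stub below only mimics the one-argument signature cited in the review; its body is a placeholder):

```python
class CacheManagerStub:
    """Stub reproducing the one-argument allocate_gpu_blocks signature."""

    def allocate_gpu_blocks(self, num_blocks):
        # Placeholder: return dummy block ids.
        return list(range(num_blocks))


cm = CacheManagerStub()

# Correct: a single argument.
blocks = cm.allocate_gpu_blocks(3)  # [0, 1, 2]

# Buggy call from the diff: extra request_id argument raises TypeError.
try:
    cm.allocate_gpu_blocks(3, "req-1")
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```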

3 participants