[Interrupt reasoning] Add interrupt_requests control command support#7445
lonelygsh wants to merge 1 commit into PaddlePaddle:develop
Thanks for your contribution!
PaddlePaddle-bot
left a comment
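🤖 AI Code Review | 2026-04-16 21:15 CST

## 📋 Review Summary

PR overview: Adds an `interrupt_requests` control command in `internal_adapter_utils.py`, allowing inference for the specified requests to be interrupted through the control channel.
Change scope: `fastdeploy/splitwise/internal_adapter_utils.py`
Impact tag: `Engine`

### 📝 PR Convention Check

The tag `[Interrupt reasoning]` in the PR title is not in the official tag list; using an official tag is recommended.

Title suggestion (can be copied directly):
- `[Feature] Add interrupt_requests control command support`

### Issues

| Severity | File | Summary |
|------|------|------|
| 🟡 Suggestion | `internal_adapter_utils.py:109` | No defensive validation of the `req_ids` field; on an exception the caller receives no response |
| 🟡 Suggestion | `internal_adapter_utils.py:108` | No `logger.debug` log, which is inconsistent with the other command branches |

### Overall Assessment

The change is logically clear and follows the existing control-command handling pattern. Adding defensive validation and a debug log is recommended to improve robustness and observability.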
        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

elif task["cmd"] == "interrupt_requests":
    self.engine.resource_manager.add_abort_req_ids(task["req_ids"])
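🟡 Suggestion: defensive validation of the `req_ids` field is missing.
If the incoming task has no `req_ids` key, the current code raises a `KeyError`; if the value is an empty list, it makes a pointless empty call. The outer `except Exception` catches the exception, so the thread does not crash, but the caller receives no response, which may cause the request to time out.
Suggested validation: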
elif task["cmd"] == "interrupt_requests":
    req_ids = task.get("req_ids", [])
    if not req_ids:
        result = {
            "task_id": task_id_str,
            "result": {"success": False, "message": "req_ids is empty or missing"},
        }
    else:
        self.engine.resource_manager.add_abort_req_ids(req_ids)
        result = {
            "task_id": task_id_str,
            "result": {"success": True, "interrupted_req_ids": req_ids},
        }
    with self.response_lock:
        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

    with self.response_lock:
        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

elif task["cmd"] == "interrupt_requests":
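🟡 Suggestion: the other command branches (such as `get_payload` and `check_health`) all call `logger.debug(f"Response for task: {task_id_str}")` before responding. Adding the same call here is recommended for consistency and easier troubleshooting.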
elif task["cmd"] == "interrupt_requests":
    self.engine.resource_manager.add_abort_req_ids(task["req_ids"])
    result = {
        "task_id": task_id_str,
        "result": {"success": True, "interrupted_req_ids": task["req_ids"]},
    }
    logger.debug(f"Response for task: {task_id_str}")
    with self.response_lock:
        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

…7402, #7445 to release/online/20260415 (#7447)

* [Speculate Decoding] Fix step_idx semantics in limit_thinking and set_stop_value kernels (#7166)
  - speculate_limit_thinking_content_length: update current_base_step to step_idx+1 (step_idx now records history count before current round); remove incorrect step_idx decrement on accept_num truncation; mark step_idx param as const.
  - speculate_set_stop_value_multi_seqs: fix can_stop gate to use step_idx_now+accept_num>=min_token_limit; fix skip check and pre_ids_idx formula (remove stale -accept_num offset); use <= condition so accept_idx maps directly to the accepted token that ends the stop sequence; fix accept_tokens index (remove -1).
  - Update unit tests for speculate_set_stop_value_multi_seqs kernel.
* [Speculate Decoding] Fix bug of reasoning_phase_token_constraint kernel (#7349)
  Co-authored-by: guanshihui] <guanshihui@baidu.com>
* [Speculate Decoding] Fix reasoning_phase_token_constraint call args in SpeculativeSampler (#7402)
* [Interrupt reasoning] Add interrupt_requests control command support

---------

Co-authored-by: guanshihui] <guanshihui@baidu.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #7445 +/- ##
==========================================
Coverage ? 73.89%
==========================================
Files ? 398
Lines ? 54948
Branches ? 8608
==========================================
Hits ? 40603
Misses ? 11630
Partials ? 2715
Motivation
Support accepting a command that interrupts inference.
Modifications
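In the `handle_control_cmd` method of `internal_adapter_utils.py`, add a handling branch for the `interrupt_requests` control command:
- On receiving an `interrupt_requests` command, extract the list of request IDs to interrupt from `task["req_ids"]`.
- Call `self.engine.resource_manager.add_abort_req_ids()` to mark those requests as pending abort.
- Build a response containing the `success` status and the list of interrupted request IDs, and send it back over the control command channel.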
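The steps above, combined with the two review suggestions (validating `req_ids` and logging before responding), can be sketched in isolation as follows. This is only an illustrative sketch, not the code in the PR: `StubResourceManager` and `handle_interrupt_requests` are hypothetical names, the stub merely records IDs, and reading `task_id` directly from the task is an assumption based on the usage payload.

```python
import logging

logger = logging.getLogger(__name__)


class StubResourceManager:
    """Hypothetical stand-in for engine.resource_manager; it only records IDs."""

    def __init__(self):
        self.abort_req_ids = set()

    def add_abort_req_ids(self, req_ids):
        # Assumption: the real method marks these requests as pending abort.
        self.abort_req_ids.update(req_ids)


def handle_interrupt_requests(task, resource_manager):
    """Build the response dict for an interrupt_requests task (hypothetical helper)."""
    task_id_str = task["task_id"]  # assumption: task_id arrives inside the task
    req_ids = task.get("req_ids") or []
    if not req_ids:
        # Respond explicitly so the caller does not wait until a timeout.
        result = {"success": False, "message": "req_ids is empty or missing"}
    else:
        resource_manager.add_abort_req_ids(req_ids)
        result = {"success": True, "interrupted_req_ids": req_ids}
    logger.debug(f"Response for task: {task_id_str}")
    return {"task_id": task_id_str, "result": result}
```

In the real handler the returned dict would still be sent under `self.response_lock` via `response_for_control_cmd`; returning it here just keeps the sketch testable without the control channel.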
Usage or Command
{
  "cmd": "interrupt_requests",
  "task_id": "<task_id>",
  "req_ids": ["req_1", "req_2"]
}
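A caller might assemble this payload with a small helper like the one below. This is a sketch under assumptions: `build_interrupt_cmd` is a hypothetical name, and how the JSON is actually delivered to the control channel is not described in this PR, so transport is left out.

```python
import json


def build_interrupt_cmd(task_id, req_ids):
    """Serialize an interrupt_requests command (hypothetical helper)."""
    req_ids = list(req_ids)
    if not req_ids:
        # Fail early on the client side rather than sending an empty command.
        raise ValueError("req_ids must contain at least one request ID")
    return json.dumps(
        {"cmd": "interrupt_requests", "task_id": task_id, "req_ids": req_ids}
    )


payload = build_interrupt_cmd("task_1", ["req_1", "req_2"])
```

Rejecting an empty list before sending mirrors the server-side validation suggested in the review, so an invalid command is caught at the earliest point.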
Accuracy Tests
This change does not involve model forward computation or kernel modifications, so no accuracy tests are needed.
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.