[Interrupt reasoning] Add interrupt_requests control command support by lonelygsh · Pull Request #7445 · PaddlePaddle/FastDeploy

lonelygsh · 2026-04-16T13:09:02Z

Motivation

Support accepting a command that interrupts inference.

Modifications

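In the handle_control_cmd method of internal_adapter_utils.py, add a handling branch for the interrupt_requests control command:

On receiving the interrupt_requests command, extract the list of request IDs to interrupt from task["req_ids"]
Call self.engine.resource_manager.add_abort_req_ids() to mark these requests as pending abort
Build a response containing the success status and the list of interrupted request IDs, and return it through the control command channel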

Usage or Command

{
"cmd": "interrupt_requests",
"task_id": "<task_id>",
"req_ids": ["req_1", "req_2"]
}
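
A caller could assemble the payload above as in the following minimal sketch. `build_interrupt_cmd` is a hypothetical helper and not part of FastDeploy; the assumption that any unique string works as a task ID is mine, and this PR does not show how the payload reaches the control command channel, so transport is left out.

```python
import json
import uuid
from typing import List, Optional


def build_interrupt_cmd(req_ids: List[str], task_id: Optional[str] = None) -> str:
    """Serialize an interrupt_requests control command (hypothetical helper)."""
    if not req_ids:
        # Reject empty input on the caller side rather than sending a no-op.
        raise ValueError("req_ids must contain at least one request ID")
    payload = {
        "cmd": "interrupt_requests",
        # Assumption: any unique string is acceptable as a task ID.
        "task_id": task_id or uuid.uuid4().hex,
        "req_ids": list(req_ids),
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_interrupt_cmd(["req_1", "req_2"], task_id="task_0"))
```

Rejecting an empty list before sending avoids a round trip that would interrupt nothing.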

Accuracy Tests

This change does not touch model forward computation or kernels, so no accuracy tests are needed.

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results. Not applicable: this change does not affect accuracy.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-04-16T13:09:13Z

Thanks for your contribution!

PaddlePaddle-bot

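🤖 AI Code Review | 2026-04-16 21:15 CST

## 📋 Review Summary

PR overview: Adds an interrupt_requests control command in internal_adapter_utils.py, allowing inference for specified requests to be interrupted through the control channel.
Change scope: fastdeploy/splitwise/internal_adapter_utils.py
Impact tag: Engine

### 📝 PR Convention Check

The tag [Interrupt reasoning] in the PR title is not in the official tag list; using an official tag is recommended.

Suggested title (can be copied directly):
- [Feature] Add interrupt_requests control command support

### Issues

| Level | File | Summary |
|------|------|------|
| 🟡 Suggestion | internal_adapter_utils.py:109 | Missing defensive validation of the req_ids field; if an exception occurs, the caller receives no response |
| 🟡 Suggestion | internal_adapter_utils.py:108 | Missing logger.debug call, inconsistent with the other command branches |

### Overall Assessment

The change is logically clear and follows the existing control command handling pattern. Adding defensive validation and a debug log is recommended to improve robustness and observability.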

PaddlePaddle-bot · 2026-04-16T13:15:40Z

                        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

+                elif task["cmd"] == "interrupt_requests":
+                    self.engine.resource_manager.add_abort_req_ids(task["req_ids"])


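🟡 Suggestion: Missing defensive validation of the req_ids field.

If the incoming task lacks the req_ids key, the current code raises KeyError; if the value is an empty list, it makes a pointless empty call. The outer except Exception catches the error, so the thread does not crash, but the caller receives no response, which can lead to a request timeout.

Suggested validation: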

    elif task["cmd"] == "interrupt_requests":
        req_ids = task.get("req_ids", [])
        if not req_ids:
            result = {
                "task_id": task_id_str,
                "result": {"success": False, "message": "req_ids is empty or missing"},
            }
        else:
            self.engine.resource_manager.add_abort_req_ids(req_ids)
            result = {
                "task_id": task_id_str,
                "result": {"success": True, "interrupted_req_ids": req_ids},
            }
        with self.response_lock:
            self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)
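
The failure mode described in this comment can be reproduced in isolation. The sketch below is not FastDeploy code: `FakeResourceManager`, `FakeControlServer`, and `handle_interrupt` are hypothetical stand-ins that mimic only the two methods named in the PR, and the `validate` flag switches between the merged behavior and the suggested one.

```python
import threading
from typing import Any, Dict, List


class FakeResourceManager:
    """Stand-in for engine.resource_manager; records IDs marked for abort."""

    def __init__(self) -> None:
        self.aborted: List[str] = []

    def add_abort_req_ids(self, req_ids: List[str]) -> None:
        self.aborted.extend(req_ids)


class FakeControlServer:
    """Stand-in for recv_control_cmd_server; records responses sent back."""

    def __init__(self) -> None:
        self.responses: List[Dict[str, Any]] = []

    def response_for_control_cmd(self, task_id: str, result: Dict[str, Any]) -> None:
        self.responses.append(result)


response_lock = threading.Lock()


def handle_interrupt(task: Dict[str, Any], resource_manager: FakeResourceManager,
                     server: FakeControlServer, validate: bool) -> None:
    """Handle one interrupt_requests task, with or without req_ids validation."""
    task_id_str = task["task_id"]
    try:
        if validate:
            req_ids = task.get("req_ids", [])
        else:
            req_ids = task["req_ids"]  # raises KeyError when the key is missing
        if validate and not req_ids:
            body = {"success": False, "message": "req_ids is empty or missing"}
        else:
            resource_manager.add_abort_req_ids(req_ids)
            body = {"success": True, "interrupted_req_ids": req_ids}
        with response_lock:
            server.response_for_control_cmd(task_id_str, {"task_id": task_id_str, "result": body})
    except Exception:
        # Mirrors the outer `except Exception` the review mentions:
        # the error is swallowed and no response is sent.
        pass
```

With `validate=False`, a task that has no `req_ids` key produces no response at all; with `validate=True`, the same task produces an explicit failure response that the caller can act on.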

PaddlePaddle-bot · 2026-04-16T13:15:40Z

                    with self.response_lock:
                        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

+                elif task["cmd"] == "interrupt_requests":


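🟡 Suggestion: Other command branches (such as get_payload and check_health) call logger.debug(f"Response for task: {task_id_str}") before responding. Adding it here as well would keep the branches consistent and make troubleshooting easier.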

    elif task["cmd"] == "interrupt_requests":
        self.engine.resource_manager.add_abort_req_ids(task["req_ids"])
        result = {
            "task_id": task_id_str,
            "result": {"success": True, "interrupted_req_ids": task["req_ids"]},
        }
        logger.debug(f"Response for task: {task_id_str}")
        with self.response_lock:
            self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

…7402, #7445 to release/online/20260415 (#7447)

* [Speculate Decoding] Fix step_idx semantics in limit_thinking and set_stop_value kernels (#7166)

- speculate_limit_thinking_content_length: update current_base_step to step_idx+1 (step_idx now records history count before current round); remove incorrect step_idx decrement on accept_num truncation; mark step_idx param as const.
- speculate_set_stop_value_multi_seqs: fix can_stop gate to use step_idx_now+accept_num>=min_token_limit; fix skip check and pre_ids_idx formula (remove stale -accept_num offset); use <= condition so accept_idx maps directly to the accepted token that ends the stop sequence; fix accept_tokens index (remove -1).
- Update unit tests for speculate_set_stop_value_multi_seqs kernel.

* [Speculate Decoding] Fix bug of reasoning_phase_token_constraint kernel (#7349)

Co-authored-by: guanshihui] <guanshihui@baidu.com>

* [Speculate Decoding] Fix reasoning_phase_token_constraint call args in SpeculativeSampler (#7402)

* [Interrupt reasoning] Add interrupt_requests control command support

---------

Co-authored-by: guanshihui] <guanshihui@baidu.com>

codecov-commenter · 2026-04-16T14:48:56Z

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@f7a2418). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
fastdeploy/splitwise/internal_adapter_utils.py	0.00%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7445   +/-   ##
==========================================
  Coverage           ?   73.89%           
==========================================
  Files              ?      398           
  Lines              ?    54948           
  Branches           ?     8608           
==========================================
  Hits               ?    40603           
  Misses             ?    11630           
  Partials           ?     2715

Flag	Coverage Δ
GPU	`73.89% <0.00%> (?)`
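
The five uncovered lines are the new interrupt_requests branch. A unit test along the following lines could cover that logic; this is a sketch under assumptions, not the project's test. `handle_interrupt_requests` is a hypothetical free function that mirrors the branch shown in the review comments, and the adapter is a `MagicMock` rather than the real class, whose constructor is not shown in this PR.

```python
import threading
import unittest
from unittest.mock import MagicMock


def handle_interrupt_requests(adapter, task, task_id_str):
    """Hypothetical stand-in mirroring the branch added in this PR."""
    adapter.engine.resource_manager.add_abort_req_ids(task["req_ids"])
    result = {
        "task_id": task_id_str,
        "result": {"success": True, "interrupted_req_ids": task["req_ids"]},
    }
    with adapter.response_lock:
        adapter.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)


class InterruptRequestsTest(unittest.TestCase):
    def test_marks_requests_and_responds(self):
        adapter = MagicMock()
        # Use a real lock so the `with` statement behaves as in production code.
        adapter.response_lock = threading.Lock()
        task = {"cmd": "interrupt_requests", "task_id": "t1", "req_ids": ["req_1", "req_2"]}

        handle_interrupt_requests(adapter, task, "t1")

        # The request IDs are handed to the resource manager exactly once.
        adapter.engine.resource_manager.add_abort_req_ids.assert_called_once_with(
            ["req_1", "req_2"]
        )
        # The caller gets one response echoing the interrupted IDs.
        adapter.recv_control_cmd_server.response_for_control_cmd.assert_called_once_with(
            "t1",
            {"task_id": "t1", "result": {"success": True, "interrupted_req_ids": ["req_1", "req_2"]}},
        )
```

Saved to a file, this runs with `python -m unittest <file>`. A real test would drive handle_control_cmd itself so that the patch lines are the ones executed.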


[Interrupt reasoning] Add interrupt_requests control command support

d198537

lonelygsh temporarily deployed to Metax_ci April 16, 2026 13:09 — with GitHub Actions Inactive

paddle-bot bot added the contributor External developers label Apr 16, 2026

PaddlePaddle-bot reviewed Apr 16, 2026


lonelygsh mentioned this pull request Apr 16, 2026

[Cherry-Pick][Speculate Decoding][Engine] Cherry-pick #7166, #7349, #7402, #7445 to release/online/20260415 #7447

Merged

5 tasks

lonelygsh wants to merge 1 commit into PaddlePaddle:develop from lonelygsh:feature/interrupt-requests-control-cmd
