[Interrupt reasoning] Add interrupt_requests control command support #7445

Open
lonelygsh wants to merge 1 commit into PaddlePaddle:develop from lonelygsh:feature/interrupt-requests-control-cmd

@lonelygsh lonelygsh commented Apr 16, 2026

Motivation

Support accepting a command that interrupts inference.

Modifications

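Adds a handling branch for the interrupt_requests control command to the handle_control_cmd method in internal_adapter_utils.py:

  • On receiving an interrupt_requests command, extract the list of request IDs to interrupt from task["req_ids"].
  • Call self.engine.resource_manager.add_abort_req_ids() to mark those requests as pending abort.
  • Build a response containing the success status and the list of interrupted request IDs, and send it back over the control command channel.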
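
The three steps above can be sketched in isolation as follows. This is a minimal illustration, not the FastDeploy implementation: the class `ControlCmdHandlerSketch` and its method `handle_interrupt_requests` are hypothetical names, the engine and control server are replaced by `MagicMock` stand-ins, and deriving `task_id_str` from `task["task_id"]` is an assumption, since the PR does not show how that variable is set.

```python
import threading
from unittest.mock import MagicMock


class ControlCmdHandlerSketch:
    """Hypothetical stand-in for the adapter class; not FastDeploy code."""

    def __init__(self, engine, recv_control_cmd_server):
        self.engine = engine
        self.recv_control_cmd_server = recv_control_cmd_server
        self.response_lock = threading.Lock()

    def handle_interrupt_requests(self, task):
        # Assumption: the task ID string comes straight from the payload.
        task_id_str = task["task_id"]
        # Step 1: extract the request IDs to interrupt.
        req_ids = task["req_ids"]
        # Step 2: mark those requests as pending abort.
        self.engine.resource_manager.add_abort_req_ids(req_ids)
        # Step 3: build the response and send it back on the control channel.
        result = {
            "task_id": task_id_str,
            "result": {"success": True, "interrupted_req_ids": req_ids},
        }
        with self.response_lock:
            self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)
        return result


# Mocks replace the real engine and control server for this sketch.
engine = MagicMock()
server = MagicMock()
handler = ControlCmdHandlerSketch(engine, server)
result = handler.handle_interrupt_requests(
    {"cmd": "interrupt_requests", "task_id": "t-1", "req_ids": ["req_1", "req_2"]}
)
```

Holding `response_lock` only around the response call mirrors the snippets quoted in the review comments on this PR.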

Usage or Command

{
"cmd": "interrupt_requests",
"task_id": "<task_id>",
"req_ids": ["req_1", "req_2"]
}
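
A caller might assemble this payload programmatically. The helper below is a hypothetical convenience function, not part of FastDeploy, and it only covers serialization; how the payload is delivered to the control command channel is not described in this PR and is left out.

```python
import json


def build_interrupt_command(task_id, req_ids):
    """Hypothetical helper: build the interrupt_requests command body."""
    return {
        "cmd": "interrupt_requests",
        "task_id": task_id,
        # Copy into a list so any iterable of IDs serializes as a JSON array.
        "req_ids": list(req_ids),
    }


payload = json.dumps(build_interrupt_command("task-42", ("req_1", "req_2")))
```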

Accuracy Tests

This change does not touch model forward computation or kernels, so no accuracy tests are needed.

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results. No accuracy change is involved.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.


paddle-bot bot commented Apr 16, 2026

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Apr 16, 2026

@PaddlePaddle-bot PaddlePaddle-bot left a comment


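🤖 AI Code Review | 2026-04-16 21:15 CST

## 📋 Review Summary

PR overview: Adds an interrupt_requests control command in internal_adapter_utils.py so that inference for specified requests can be interrupted through the control channel.
Scope of change: fastdeploy/splitwise/internal_adapter_utils.py
Impact tag: Engine

### 📝 PR Convention Check

The tag [Interrupt reasoning] in the PR title is not in the official tag list; using an official tag is recommended.

Suggested title (can be copied directly):
- [Feature] Add interrupt_requests control command support

### Issues

| Level | File | Summary |
|------|------|------|
| 🟡 Suggestion | internal_adapter_utils.py:109 | No defensive validation of the req_ids field; if an exception occurs, the caller receives no response |
| 🟡 Suggestion | internal_adapter_utils.py:108 | No logger.debug call, which is inconsistent with the other command branches |

### Overall Assessment

The change is logically clear and follows the existing pattern for handling control commands. Adding defensive validation and debug logging is recommended to improve robustness and observability.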

        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

elif task["cmd"] == "interrupt_requests":
    self.engine.resource_manager.add_abort_req_ids(task["req_ids"])

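🟡 Suggestion: No defensive validation of the req_ids field.

If the incoming task has no req_ids key, the current code raises KeyError; if the value is an empty list, it makes a pointless empty call. The outer except Exception catches the error, so the thread does not crash, but the caller never receives a response, which may cause the request to time out.

Suggested validation: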

elif task["cmd"] == "interrupt_requests":
    req_ids = task.get("req_ids", [])
    if not req_ids:
        result = {
            "task_id": task_id_str,
            "result": {"success": False, "message": "req_ids is empty or missing"},
        }
    else:
        self.engine.resource_manager.add_abort_req_ids(req_ids)
        result = {
            "task_id": task_id_str,
            "result": {"success": True, "interrupted_req_ids": req_ids},
        }
    with self.response_lock:
        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)
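
The validation suggested above can be checked in isolation by pulling the decision logic into a pure function. This is a sketch under assumptions: `build_interrupt_result` is a hypothetical helper, not a function in FastDeploy, and the resource manager is a `MagicMock` stand-in. It covers the three cases the comment describes: a missing key, an empty list, and a valid list.

```python
from unittest.mock import MagicMock


def build_interrupt_result(task, task_id_str, resource_manager):
    """Hypothetical helper mirroring the suggested branch, minus the response call."""
    req_ids = task.get("req_ids", [])
    if not req_ids:
        # Missing key or empty list: report failure instead of raising KeyError.
        return {
            "task_id": task_id_str,
            "result": {"success": False, "message": "req_ids is empty or missing"},
        }
    resource_manager.add_abort_req_ids(req_ids)
    return {
        "task_id": task_id_str,
        "result": {"success": True, "interrupted_req_ids": req_ids},
    }


resource_manager = MagicMock()  # stand-in for engine.resource_manager
missing = build_interrupt_result({"cmd": "interrupt_requests"}, "t-1", resource_manager)
empty = build_interrupt_result({"cmd": "interrupt_requests", "req_ids": []}, "t-2", resource_manager)
valid = build_interrupt_result({"cmd": "interrupt_requests", "req_ids": ["req_1"]}, "t-3", resource_manager)
```

Because every path returns a result dictionary, the caller can always send a response, which addresses the timeout concern raised in the comment.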

    with self.response_lock:
        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

elif task["cmd"] == "interrupt_requests":

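🟡 Suggestion: Other command branches (such as get_payload and check_health) call logger.debug(f"Response for task: {task_id_str}") before responding. Adding the same call here would keep the branches consistent and make troubleshooting easier.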

elif task["cmd"] == "interrupt_requests":
    self.engine.resource_manager.add_abort_req_ids(task["req_ids"])
    result = {
        "task_id": task_id_str,
        "result": {"success": True, "interrupted_req_ids": task["req_ids"]},
    }
    logger.debug(f"Response for task: {task_id_str}")
    with self.response_lock:
        self.recv_control_cmd_server.response_for_control_cmd(task_id_str, result)

freeliuzc pushed a commit that referenced this pull request Apr 16, 2026
…7402, #7445 to release/online/20260415 (#7447)

* [Speculate Decoding] Fix step_idx semantics in limit_thinking and set_stop_value kernels (#7166)

- speculate_limit_thinking_content_length: update current_base_step to
  step_idx+1 (step_idx now records history count before current round);
  remove incorrect step_idx decrement on accept_num truncation; mark
  step_idx param as const.
- speculate_set_stop_value_multi_seqs: fix can_stop gate to use
  step_idx_now+accept_num>=min_token_limit; fix skip check and pre_ids_idx
  formula (remove stale -accept_num offset); use <= condition so accept_idx
  maps directly to the accepted token that ends the stop sequence; fix
  accept_tokens index (remove -1).
- Update unit tests for speculate_set_stop_value_multi_seqs kernel.

* [Speculate Decoding] Fix bug of reasoning_phase_token_constraint kernel (#7349)

Co-authored-by: guanshihui] <guanshihui@baidu.com>

* [Speculate Decoding] Fix reasoning_phase_token_constraint call args in SpeculativeSampler (#7402)

* [Interrupt reasoning] Add interrupt_requests control command support

---------

Co-authored-by: guanshihui] <guanshihui@baidu.com>

codecov-commenter commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@f7a2418). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|------|------|------|
| fastdeploy/splitwise/internal_adapter_utils.py | 0.00% | 5 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7445   +/-   ##
==========================================
  Coverage           ?   73.89%           
==========================================
  Files              ?      398           
  Lines              ?    54948           
  Branches           ?     8608           
==========================================
  Hits               ?    40603           
  Misses             ?    11630           
  Partials           ?     2715           
| Flag | Coverage Δ |
|------|------|
| GPU | 73.89% <0.00%> (?) |


Labels

contributor External developers


3 participants