[BugFix][APIServer] Support max_completion_tokens in CompletionRequest for OpenAI API compat#7459

Open
ZhijunLStudio wants to merge 1 commit into PaddlePaddle:develop from ZhijunLStudio:fix/issue-2697-max-completion-tokens
Conversation


@ZhijunLStudio ZhijunLStudio commented Apr 17, 2026

Fix #2697: OpenAI deprecated max_tokens in favor of max_completion_tokens for both the chat and completion endpoints. ChatCompletionRequest already supports max_completion_tokens (protocol.py line 688), but CompletionRequest was missing the field entirely. As a result, requests from OpenAI SDK clients that pass max_completion_tokens to /v1/completions have the parameter silently ignored.

Motivation

The OpenAI API specification requires max_completion_tokens as the preferred parameter for controlling output length on both /v1/chat/completions and /v1/completions endpoints. FastDeploy's /v1/completions endpoint only accepts max_tokens, breaking compatibility with clients that follow the current OpenAI API spec (e.g., official OpenAI Python SDK v1.x+).

Related issues: #2697, #2815, #2816 (previous fixes for max_completion_tokens in finish_reason logic were merged to release/2.0.2 but the CompletionRequest field was never added to develop).

Modifications

  • fastdeploy/entrypoints/openai/protocol.py:
    • Added max_completion_tokens: Optional[int] = None field to CompletionRequest
    • Marked max_tokens as deprecated with Field(deprecated=...), consistent with ChatCompletionRequest
    • Added priority logic in to_dict_for_infer(): max_completion_tokens if max_completion_tokens is not None else max_tokens (handles edge case where max_completion_tokens=0 should not fall back to max_tokens)
    • Applied same is not None fix to ChatCompletionRequest.to_dict_for_infer() for consistency
  • tests/entrypoints/test_completion_max_completion_tokens.py: Added 12 unit tests covering:
    • Field existence and default values
    • Backward compatibility with max_tokens
    • max_completion_tokens priority in to_dict_for_infer()
    • Edge case: max_completion_tokens=0 does not fall back to max_tokens
    • Source code verification
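The priority rule described above can be sketched with a dependency-free stand-in (the real CompletionRequest in protocol.py is a pydantic model with many more fields; the class name here is a hypothetical simplification):

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CompletionRequestSketch:
    # Stand-in for the pydantic CompletionRequest in protocol.py;
    # only the fields relevant to the priority rule are shown.
    prompt: str = ""
    max_tokens: Optional[int] = None            # deprecated alias
    max_completion_tokens: Optional[int] = None

    def to_dict_for_infer(self) -> dict:
        # Collect the set fields, then apply the override after the loop
        # so it always wins.
        req = {k: v for k, v in asdict(self).items() if v is not None}
        req.pop("max_completion_tokens", None)
        # "is not None" rather than truthiness, so max_completion_tokens=0
        # does not silently fall back to max_tokens.
        if self.max_completion_tokens is not None:
            req["max_tokens"] = self.max_completion_tokens
        return req
```

With both parameters set, max_completion_tokens wins even when it is 0; with only the deprecated max_tokens set, behavior is unchanged.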

Usage or Command

# Before: max_completion_tokens was silently ignored on /v1/completions
import requests
requests.post("http://localhost:8000/v1/completions", json={
    "model": "default",
    "prompt": "Hello",
    "max_completion_tokens": 100  # This was ignored
})

# After: max_completion_tokens is properly handled
# It maps to internal max_tokens, taking priority over max_tokens if both set

Run tests:

python -m pytest tests/entrypoints/test_completion_max_completion_tokens.py -v

Accuracy Tests

N/A — this is an API parameter change that does not affect model inference logic.

Checklist

  • Add at least a tag in the PR title: [BugFix][APIServer]
  • Format your code, run pre-commit before commit.
  • Add unit tests (12 tests, all passing).
  • Provide accuracy results. (N/A for this fix — does not affect model inference.)
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag. (N/A — submitting to develop.)

@paddle-bot

paddle-bot bot commented Apr 17, 2026

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Apr 17, 2026
@luotao1 luotao1 self-assigned this Apr 17, 2026
@luotao1 luotao1 added the HappyOpenSource 快乐开源活动issue与PR label Apr 17, 2026
@ZhijunLStudio ZhijunLStudio force-pushed the fix/issue-2697-max-completion-tokens branch from f66cdf4 to 3d7bb7b on April 17, 2026 at 04:33
…PI compat

Fix PaddlePaddle#2697: ChatCompletionRequest already supports max_completion_tokens
but CompletionRequest was missing the field. Add max_completion_tokens
with deprecated max_tokens annotation, consistent with ChatCompletionRequest.
Ensure max_completion_tokens takes priority in to_dict_for_infer().
@ZhijunLStudio ZhijunLStudio force-pushed the fix/issue-2697-max-completion-tokens branch from 3d7bb7b to 80a5d92 on April 17, 2026 at 05:39

@PaddlePaddle-bot PaddlePaddle-bot left a comment

🤖 AI Code Review | 2026-04-17 13:43 CST

📋 Review Summary

PR overview: Adds max_completion_tokens field support to CompletionRequest, aligning with the OpenAI API specification, and fixes a bug in ChatCompletionRequest where max_completion_tokens=0 was incorrectly falling back to max_tokens.
Scope of change: entrypoints/openai/protocol.py, plus a new test file
Impact tag: APIServer

Issues

Level | File | Summary
🟡 Suggestion | tests/entrypoints/test_completion_max_completion_tokens.py:40 | The test uses a local mock class instead of importing the production code, so it cannot truly verify the production code's behavior

Overall assessment

The change logic is correct and consistent with ChatCompletionRequest. In CompletionRequest.to_dict_for_infer(), the priority logic is placed after the self.dict() loop, ensuring it correctly overrides the collected fields; the fix in ChatCompletionRequest replacing or with is not None is also correct (it handles the max_completion_tokens=0 edge case). The main suggestion is that the tests should import the production class directly to improve their reliability.

# ---------------------------------------------------------------------------


class CompletionRequest(BaseModel):

🟡 Suggestion: the test uses a locally defined CompletionRequest instead of importing the production class

The current test verifies the logic by redefining a minimal CompletionRequest class inside the test file, which means the test could pass even if the production code has a bug. Meanwhile, TestSourceCodeVerification verifies the source via string matching, which is fragile and cannot verify runtime behavior.

It is recommended to import the production class directly:

from fastdeploy.entrypoints.openai.protocol import CompletionRequest

This verifies the behavior of the production code itself, and the tests will notice when the production code changes. The TestSourceCodeVerification class can then be removed, since string-matching verification is no longer necessary once the production class is imported directly.

@ZhijunLStudio
Author

Reopening to re-trigger CI checks after updating PR body.

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@91b8bf2). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7459   +/-   ##
==========================================
  Coverage           ?   73.86%           
==========================================
  Files              ?      398           
  Lines              ?    54977           
  Branches           ?     8613           
==========================================
  Hits               ?    40608           
  Misses             ?    11653           
  Partials           ?     2716           
Flag Coverage Δ
GPU 73.86% <100.00%> (?)


Development

Successfully merging this pull request may close these issues.

Feature Request: Add Support for max_completion_tokens Parameter (OpenAI API Deprecation)

4 participants