[Bug]: trtllm-serve: Harmony-format tokens and reasoning fields emitted in responses for gpt-oss-120B #9256

@jsinghchauhan1

Description

System Info

During tool calls, OpenAI Chat Completions responses sometimes include Harmony-format control tokens in the assistant content, and Harmony-only fields (e.g., reasoning) in the JSON body, when hosting the gpt-oss-120b model via TensorRT-LLM (trtllm-serve).

Example:
{
  "model": "gpt-oss-20b",
  "messages": [
    {"role": "user", "content": "Search for latest policy doc title"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "search",
        "description": "How is the weather today",
        "parameters": {
          "type": "object",
          "properties": {"q": {"type": "string"}},
          "required": ["q"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
This can result in a response like:
<|channel|>commentary<|message|>{ "q": "search the web" }

The Harmony tokens leak intermittently during tool calls, mostly in the commentary channel and occasionally in the analysis channel as well.
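Until a server-side fix lands, a client-side scrubber can strip the leaked markers before content reaches downstream consumers. This is a minimal sketch, not a trtllm-serve feature; the marker set (<|channel|>, <|message|>, and the analysis/commentary/final channel names) is inferred from the examples in this report:

```python
import re

# Marker set is an assumption inferred from the leaked output in this
# report; extend the pattern if other Harmony control tokens appear.
HARMONY_MARKER = re.compile(
    r"<\|channel\|>\s*(?:analysis|commentary|final)\s*"  # channel header + name
    r"|<\|[a-z_]+\|>"                                    # any other control token
)

def strip_harmony_markers(text: str) -> str:
    """Remove Harmony control tokens, keeping only the payload text."""
    return HARMONY_MARKER.sub("", text).strip()

print(strip_harmony_markers('<|channel|>commentary<|message|>{ "q": "search the web" }'))
```

Applied to the leaked content above, this recovers the bare tool-argument JSON; it is a workaround only and does not address the reasoning field appearing in non-streaming JSON.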

Triton Information
Version: TensorRT-LLM OpenAI server (trtllm-serve).
TensorRT-LLM OpenAI server image versions: 1.2.0rc0 and 1.2.0rc0.post1.
Container vs. build: using the official container images (no custom build).

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run the TensorRT-LLM OpenAI HTTP server (trtllm-serve) with gpt-oss-120b. No custom stop tokens or output filters are configured.
POST to /v1/chat/completions with:

  • Messages that trigger tool planning/execution.
  • A tools schema (tools / tool_choice) to enable tool calling.

Observe responses (both streaming and non-streaming):

  • Assistant content contains Harmony markers (e.g., <|channel|>commentary<|message|>{...}).
  • Non-streaming JSON sometimes includes a reasoning field.

Here is an example request:
{
  "model": "gpt-oss-20b",
  "messages": [
    {"role": "user", "content": "Search for latest policy doc title"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "search",
        "description": "How is the weather today",
        "parameters": {
          "type": "object",
          "properties": {"q": {"type": "string"}},
          "required": ["q"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
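For scripted reproduction, the request above can be expressed as a Python payload. The endpoint path follows this report; the base URL in the comment is an assumption for a default local trtllm-serve instance:

```python
import json

# The example request from this report as a Python dict.
payload = {
    "model": "gpt-oss-20b",
    "messages": [
        {"role": "user", "content": "Search for latest policy doc title"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "search",
            "description": "How is the weather today",
            "parameters": {
                "type": "object",
                "properties": {"q": {"type": "string"}},
                "required": ["q"],
            },
        },
    }],
    "tool_choice": "auto",
}

# POST with any HTTP client, e.g. (assumed local default URL):
#   requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(json.dumps(payload, indent=2))
```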

Expected behavior

Expected response (clean JSON, no Harmony markers):
{
  "q": "search the web for weather"
}

Actual behavior

Actual response:
<|channel|>commentary<|message|>{ "q": "search the web for weather" }
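For triage, the leaked string can be split into its Harmony channel and JSON payload. A defensive parse sketch, assuming the <|channel|>...<|message|> shape shown above:

```python
import json
import re

# The shape of the leaked content is taken from the observed output above;
# the parsing approach itself is an assumption, not a documented format.
leaked = '<|channel|>commentary<|message|>{ "q": "search the web for weather" }'

match = re.match(r"<\|channel\|>(\w+)<\|message\|>(.*)", leaked, re.DOTALL)
channel, body = match.group(1), match.group(2)
args = json.loads(body)  # the tool-call arguments that should have been returned
print(channel, args)
```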

Additional notes

Model description:
Models: openai/gpt-oss-120b, openai/gpt-oss-20b.
Served via TensorRT-LLM OpenAI server; downloaded at container start; no request/response mutation layer.
Inputs: OpenAI Chat Completions payloads with messages and tools.
Outputs: OpenAI Chat Completions (expected clean JSON and content).

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Assignees

Labels

  • LLM API<NV>: High-level LLM Python API & tools (e.g., trtllm-llmapi-launch) for TRTLLM inference/workflows.
  • Triton backend<NV>: Related to NVIDIA Triton Inference Server backend.
  • bug: Something isn't working.
