[Bug]: Reasoning tokens are not passed through for Gemini #974

@skavans

Prerequisites

  • I have searched existing issues and discussions to avoid duplicates
  • I am using the latest version (or have tested against main/nightly)

Description

When streaming a chat completion through the Bifrost gateway, the delta.reasoning tokens sent by the provider are not passed through to the client.

Steps to reproduce

If I request the provider directly, I see the following:

~ % curl -X POST https://provider/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer token-here" \
-d '{
  "model": "gemini-3-pro-preview",
  "stream": true,
  "messages": [
    {"role": "user", "content": "tell me fun 10 word story"}
  ]
}'
data: {"id":"chatcmpl-gemini-1764531905290","object":"chat.completion.chunk","created":1764531905,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{"content":"","reasoning":"**Exploring Word Count Limits**\n\nI'm currently dwelling on the constraint of a strict 10-word limit for this \"fun\" story. My mind is cycling through potential themes, primarily animals like cats, dogs, and penguins, and also considering food as a possible subject.\n\n\n","reasoning_details":[]},"finish_reason":null}]}

data: {"id":"chatcmpl-gemini-1764531908469","object":"chat.completion.chunk","created":1764531908,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{"content":"","reasoning":"**Analyzing Story Elements**\n\nI've been analyzing the recent drafts, and I'm leaning toward the \"Taco Mars\" story for its quirky, amusing concept. Its whimsical nature and simplicity truly resonate. Comparing the \"hat-wearing cat\" with the \"Mars tacos\" narrative, the latter wins.\n\n\n","reasoning_details":[]},"finish_reason":null}]}

data: {"id":"chatcmpl-gemini-1764531911101","object":"chat.completion.chunk","created":1764531911,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{"content":"","reasoning":"**Crafting the Narrative**\n\nI've decided to refine the \"Mars Tacos\" concept, and it feels quite solid now. I've successfully met the 10-word constraint with \"We went to Mars and found they only eat tacos.\" I'm also content with the word count and the whimsical nature. I'm satisfied now, and ready to finalize the details and declare this effort finished.\n\n\n","reasoning_details":[]},"finish_reason":null}]}

data: {"id":"chatcmpl-gemini-1764531911122","object":"chat.completion.chunk","created":1764531911,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{"content":"We went to Mars and found they only eat tacos."},"finish_reason":null}]}

data: {"id":"chatcmpl-gemini-1764531911131","object":"chat.completion.chunk","created":1764531911,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

The delta.reasoning tokens are present in the first three chunks. But when I request the same provider through Bifrost, the chunks carry no reasoning field at all:

~ % curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
  "model": "navy/gemini-3-pro-preview",
  "stream": true,
  "messages": [
    {"role": "user", "content": "tell me fun 10 word story"}
  ]
}'
data: {"id":"chatcmpl-gemini-1764531872043","choices":[{"index":0,"delta":{"content":""}}],"created":1764531872,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":0,"chunk_index":0}}

data: {"id":"chatcmpl-gemini-1764531876106","choices":[{"index":0,"delta":{"content":""}}],"created":1764531876,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":4059,"chunk_index":1}}

data: {"id":"chatcmpl-gemini-1764531880027","choices":[{"index":0,"delta":{"content":""}}],"created":1764531880,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":3945,"chunk_index":2}}

data: {"id":"chatcmpl-gemini-1764531882551","choices":[{"index":0,"delta":{"content":""}}],"created":1764531882,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":2499,"chunk_index":3}}

data: {"id":"chatcmpl-gemini-1764531882570","choices":[{"index":0,"delta":{"content":"My cat learned to type; he just bought a boat."}}],"created":1764531882,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":18,"chunk_index":4}}

data: {"id":"chatcmpl-gemini-1764531872043","choices":[{"index":0,"finish_reason":"stop","delta":{}}],"created":0,"model":"","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":{"total_tokens":0},"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":10548,"chunk_index":5}}

data: [DONE]
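
To make the difference between the two streams easy to check, here is a minimal sketch that parses an SSE body and lists the delta.reasoning value of each chunk. The function name `reasoning_per_chunk` and the two sample strings are illustrative only; the samples are shortened stand-ins for the outputs above, with the reasoning text truncated and most fields omitted.

```python
import json


def reasoning_per_chunk(sse_text):
    """Return delta.reasoning (or None if absent) for every choice in an SSE body."""
    found = []
    for line in sse_text.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            found.append(choice.get("delta", {}).get("reasoning"))
    return found


# Shortened stand-in for the direct provider stream (assumption: fields trimmed).
DIRECT = '''data: {"choices":[{"index":0,"delta":{"content":"","reasoning":"**Exploring Word Count Limits**"},"finish_reason":null}]}

data: {"choices":[{"index":0,"delta":{"content":"We went to Mars and found they only eat tacos."},"finish_reason":null}]}

data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
'''

# Shortened stand-in for the same request sent through the gateway.
BIFROST = '''data: {"choices":[{"index":0,"delta":{"content":""}}],"extra_fields":{"provider":"navy","chunk_index":0}}

data: {"choices":[{"index":0,"delta":{"content":"My cat learned to type; he just bought a boat."}}]}

data: {"choices":[{"index":0,"finish_reason":"stop","delta":{}}]}

data: [DONE]
'''

print(any(reasoning_per_chunk(DIRECT)))   # True: at least one chunk has reasoning
print(any(reasoning_per_chunk(BIFROST)))  # False: no chunk has reasoning
```

Saving the real curl output to a file and passing its contents to the same function should give the same picture for the full streams.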

Expected behavior

I expect the reasoning content (delta.reasoning) to be passed through the Bifrost gateway unchanged.
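
As a rough illustration of that expectation, the sketch below shows a pass-through step that adds gateway metadata to a chunk while leaving every delta untouched. It is not Bifrost's implementation (the core is written in Go, and its internals are not shown in this report); `forward_chunk` is a hypothetical helper, and the `extra_fields` keys are copied from the gateway output above.

```python
import copy


def forward_chunk(upstream_chunk, provider, chunk_index):
    """Hypothetical gateway step: annotate a chunk without altering its deltas."""
    out = copy.deepcopy(upstream_chunk)  # never mutate the upstream payload
    out["extra_fields"] = {
        "request_type": "chat_completion_stream",
        "provider": provider,
        "chunk_index": chunk_index,
    }
    return out


# Trimmed version of a chunk from the direct provider output.
upstream = {
    "choices": [
        {
            "index": 0,
            "delta": {
                "content": "",
                "reasoning": "**Crafting the Narrative**",
                "reasoning_details": [],
            },
            "finish_reason": None,
        }
    ],
}

forwarded = forward_chunk(upstream, provider="navy", chunk_index=0)
print(forwarded["choices"][0]["delta"]["reasoning"])
```

The point of the sketch is only that unknown or provider-specific delta fields survive the hop; how that is best achieved in the gateway's typed Go structs is left to the maintainers.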

Actual behavior

The reasoning content is missing: the streamed deltas contain only an empty content string until the final answer arrives.

Affected area(s)

Core (Go)

Version

v1.3.37

Environment


Relevant logs/output

Regression?

No response

Severity

High (major functionality broken)

Metadata

Labels

bug (Something isn't working)

Projects

Status

In progress
