Prerequisites
- I have searched existing issues and discussions to avoid duplicates
- I am using the latest version (or have tested against main/nightly)
Description
When streaming a chat completion, the Bifrost gateway does not appear to pass the delta.reasoning tokens through to the client.
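For context, here is a minimal sketch of how a client might consume these tokens, assuming OpenAI-style SSE lines of the form shown in this report. The helper name split_stream is hypothetical and is not part of Bifrost or any SDK; "reasoning" is the non-standard delta field this issue is about.

```python
import json


def split_stream(lines):
    """Accumulate reasoning text and answer text from OpenAI-style SSE lines.

    `lines` is any iterable of strings such as 'data: {...}' or 'data: [DONE]'.
    Returns a (reasoning, content) tuple of concatenated strings.
    """
    reasoning, content = [], []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            # A missing key and an empty string are treated the same here.
            reasoning.append(delta.get("reasoning") or "")
            content.append(delta.get("content") or "")
    return "".join(reasoning), "".join(content)


if __name__ == "__main__":
    sample = [
        'data: {"choices":[{"index":0,"delta":{"content":"","reasoning":"thinking..."}}]}',
        'data: {"choices":[{"index":0,"delta":{"content":"We went to Mars."}}]}',
        "data: [DONE]",
    ]
    print(split_stream(sample))
```

With a client like this, a stream that lacks delta.reasoning yields an empty reasoning string even though the answer text still arrives.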
Steps to reproduce
If I send the request to the provider directly, I see the following:
~ % curl -X POST https://provider/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer token-here" \
  -d '{
    "model": "gemini-3-pro-preview",
    "stream": true,
    "messages": [
      {"role": "user", "content": "tell me fun 10 word story"}
    ]
  }'
data: {"id":"chatcmpl-gemini-1764531905290","object":"chat.completion.chunk","created":1764531905,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{"content":"","reasoning":"**Exploring Word Count Limits**\n\nI'm currently dwelling on the constraint of a strict 10-word limit for this \"fun\" story. My mind is cycling through potential themes, primarily animals like cats, dogs, and penguins, and also considering food as a possible subject.\n\n\n","reasoning_details":[]},"finish_reason":null}]}
data: {"id":"chatcmpl-gemini-1764531908469","object":"chat.completion.chunk","created":1764531908,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{"content":"","reasoning":"**Analyzing Story Elements**\n\nI've been analyzing the recent drafts, and I'm leaning toward the \"Taco Mars\" story for its quirky, amusing concept. Its whimsical nature and simplicity truly resonate. Comparing the \"hat-wearing cat\" with the \"Mars tacos\" narrative, the latter wins.\n\n\n","reasoning_details":[]},"finish_reason":null}]}
data: {"id":"chatcmpl-gemini-1764531911101","object":"chat.completion.chunk","created":1764531911,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{"content":"","reasoning":"**Crafting the Narrative**\n\nI've decided to refine the \"Mars Tacos\" concept, and it feels quite solid now. I've successfully met the 10-word constraint with \"We went to Mars and found they only eat tacos.\" I'm also content with the word count and the whimsical nature. I'm satisfied now, and ready to finalize the details and declare this effort finished.\n\n\n","reasoning_details":[]},"finish_reason":null}]}
data: {"id":"chatcmpl-gemini-1764531911122","object":"chat.completion.chunk","created":1764531911,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{"content":"We went to Mars and found they only eat tacos."},"finish_reason":null}]}
data: {"id":"chatcmpl-gemini-1764531911131","object":"chat.completion.chunk","created":1764531911,"model":"gemini-3-pro-preview","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
You can see the delta.reasoning tokens here. But when I send the same request to the same provider through Bifrost, the reasoning is missing:
~ % curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "navy/gemini-3-pro-preview",
    "stream": true,
    "messages": [
      {"role": "user", "content": "tell me fun 10 word story"}
    ]
  }'
data: {"id":"chatcmpl-gemini-1764531872043","choices":[{"index":0,"delta":{"content":""}}],"created":1764531872,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":0,"chunk_index":0}}
data: {"id":"chatcmpl-gemini-1764531876106","choices":[{"index":0,"delta":{"content":""}}],"created":1764531876,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":4059,"chunk_index":1}}
data: {"id":"chatcmpl-gemini-1764531880027","choices":[{"index":0,"delta":{"content":""}}],"created":1764531880,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":3945,"chunk_index":2}}
data: {"id":"chatcmpl-gemini-1764531882551","choices":[{"index":0,"delta":{"content":""}}],"created":1764531882,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":2499,"chunk_index":3}}
data: {"id":"chatcmpl-gemini-1764531882570","choices":[{"index":0,"delta":{"content":"My cat learned to type; he just bought a boat."}}],"created":1764531882,"model":"gemini-3-pro-preview","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":null,"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":18,"chunk_index":4}}
data: {"id":"chatcmpl-gemini-1764531872043","choices":[{"index":0,"finish_reason":"stop","delta":{}}],"created":0,"model":"","object":"chat.completion.chunk","service_tier":"","system_fingerprint":"","usage":{"total_tokens":0},"extra_fields":{"request_type":"chat_completion_stream","provider":"navy","model_requested":"gemini-3-pro-preview","latency":10548,"chunk_index":5}}
data: [DONE]
Expected behavior
I expect the reasoning content (delta.reasoning) to be passed through the Bifrost gateway.
Actual behavior
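The reasoning content is missing: no streamed chunk from the gateway contains delta.reasoning, and the chunks that carried reasoning upstream arrive with only an empty delta.content.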
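One way to confirm the difference from saved curl output is to compare which delta keys each stream contains. This is a minimal sketch, assuming each capture is the raw SSE text printed by the commands in this report; delta_keys and missing_delta_keys are hypothetical helper names.

```python
import json


def delta_keys(sse_text):
    """Return the set of keys seen in any choices[].delta of an SSE capture."""
    keys = set()
    for line in sse_text.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore prompts, blank lines, and other output
        payload = line[len("data:"):].strip()
        if not payload.startswith("{"):
            continue  # skips the [DONE] sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            keys.update((choice.get("delta") or {}).keys())
    return keys


def missing_delta_keys(direct_capture, gateway_capture):
    """Keys present in the direct provider stream but absent via the gateway."""
    return delta_keys(direct_capture) - delta_keys(gateway_capture)
```

Applied to the two outputs in this report, the direct stream carries content, reasoning, and reasoning_details in its deltas, while the gateway stream carries only content, so the difference would be reasoning and reasoning_details.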
Affected area(s)
Core (Go)
Version
v1.3.37
Environment
Relevant logs/output
Regression?
No response
Severity
High (major functionality broken)
Metadata
Assignees
Labels
bug: Something isn't working
Type
Projects
Status
In progress