Summary
OpenCode does not clear reasoning_content from previous conversation turns when sending messages to DeepSeek models, which violates the DeepSeek Thinking Mode API specification and causes unnecessary token usage, increased costs, and slower responses.
Problem
According to DeepSeek's official documentation for V3.2 models with thinking mode:
"In each turn of the conversation, the model outputs the CoT (reasoning_content) and the final answer (content). In the next turn of the conversation, the CoT from previous turns is not concatenated into the context"
The spec explicitly shows a clear_reasoning_content() function that should be called before Turn 2:
```python
def clear_reasoning_content(messages):
    for message in messages:
        if hasattr(message, 'reasoning_content'):
            message.reasoning_content = None
```

Current Behavior
OpenCode's logic in packages/opencode/src/provider/transform.ts currently:
- ✅ Correctly adds `reasoning_content` for tool call continuations within the same turn
- ✅ Strips reasoning from individual messages without tool calls
- ❌ Does not clear `reasoning_content` from ALL assistant messages in history when a new user turn begins
Result: reasoning chains from previous turns accumulate in the context window, being repeatedly sent in each new turn.
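To make the accumulation concrete, here is a minimal sketch, assuming simplified message shapes (these are illustrative, not OpenCode's exact internal types):

```typescript
// Hypothetical message shapes, illustrating how prior-turn reasoning
// parts ride along in the history (not OpenCode's exact types).
type Part = { type: "text" | "reasoning" | "tool-call"; text?: string }
type Msg = { role: "user" | "assistant"; content: Part[] }

const history: Msg[] = [
  { role: "user", content: [{ type: "text", text: "question 1" }] },
  {
    role: "assistant",
    content: [
      { type: "reasoning", text: "turn-1 chain of thought" }, // stale in turn 2
      { type: "text", text: "answer 1" },
    ],
  },
  { role: "user", content: [{ type: "text", text: "question 2" }] },
]

// Without a turn-boundary strip, every reasoning part in the history
// is serialized and resent on the new turn:
const staleReasoning = history
  .filter((m) => m.role === "assistant")
  .flatMap((m) => m.content.filter((p) => p.type === "reasoning"))

console.log(staleReasoning.length) // → 1
```

Per the DeepSeek spec, that count should be zero once a new user turn begins.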
Impact
- Wasted tokens: You pay for repeated `reasoning_content` from prior turns
- Higher costs: DeepSeek charges per token
- Slower responses: Larger context = more processing time
- Context window pressure: Fills up context limits faster with redundant data
- Spec violation: Not following DeepSeek's documented API contract
Concrete Example
Turn 1:
- User asks question
- Model reasons: 500 tokens
- Makes tool call
- Model reasons more: 300 tokens
- Gives answer
- Total reasoning: 800 tokens
Turn 2 (new user question):
- OpenCode sends Turn 1's 800 reasoning tokens again
- Model generates new reasoning: 600 tokens
- API receives: 800 (old) + 600 (new) = 1,400 tokens
- Should only send: 600 tokens
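The waste also compounds: without clearing, turn N resends the reasoning of all N-1 prior turns. A quick sketch using an illustrative figure of ~600 reasoning tokens per turn:

```typescript
// Illustrative compounding: assume each turn produces ~600 reasoning tokens.
// Without clearing, turn N resends all prior turns' reasoning.
const perTurn = 600
let resent = 0
for (let turn = 2; turn <= 5; turn++) {
  resent += (turn - 1) * perTurn // stale tokens resent in this turn
}
console.log(resent) // → 6000 stale tokens resent over turns 2 through 5
```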
Expected Behavior
When a new user message starts a fresh turn, OpenCode should:
- Detect the turn boundary (new user message)
- Strip `reasoning_content` from ALL previous assistant messages before sending to the API
- Send only current-turn reasoning to DeepSeek
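The steps above can be sketched as a standalone helper (the types and the `stripStaleReasoning` name are illustrative, not OpenCode's actual API):

```typescript
// Illustrative standalone version of the turn-boundary rule, for unit testing.
type Part = { type: string; text?: string }
type Msg = { role: string; content: Part[] | string }

function stripStaleReasoning(msgs: Msg[]): Msg[] {
  // The last user message marks the start of the current turn.
  const lastUserIndex = msgs.map((m) => m.role).lastIndexOf("user")
  return msgs.map((msg, index) => {
    if (msg.role !== "assistant" || !Array.isArray(msg.content)) return msg
    // Reasoning at or before the turn boundary is stale: drop it.
    if (index <= lastUserIndex) {
      return { ...msg, content: msg.content.filter((p) => p.type !== "reasoning") }
    }
    return msg
  })
}

const out = stripStaleReasoning([
  { role: "user", content: [{ type: "text", text: "q1" }] },
  {
    role: "assistant",
    content: [
      { type: "reasoning", text: "turn-1 cot" },
      { type: "text", text: "a1" },
    ],
  },
  { role: "user", content: [{ type: "text", text: "q2" }] },
])
console.log(JSON.stringify(out[1].content)) // → [{"type":"text","text":"a1"}]
```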
Reproduction
- Start a conversation with a DeepSeek V3.2 model (`deepseek-chat` with thinking enabled, or `deepseek-reasoner`)
- Make a multi-step conversation that outputs reasoning with tool calls
- On the second and subsequent user messages, inspect the request payload: previous turns' reasoning is sent again
Suggested Fix
Add turn boundary detection in packages/opencode/src/provider/transform.ts:
```typescript
if (model.providerID === "deepseek" || model.api.id.toLowerCase().includes("deepseek")) {
  // Find last user message (start of current turn)
  let lastUserIndex = -1
  for (let i = msgs.length - 1; i >= 0; i--) {
    if (msgs[i].role === "user") {
      lastUserIndex = i
      break
    }
  }
  return msgs.map((msg, index) => {
    if (msg.role === "assistant" && Array.isArray(msg.content)) {
      const reasoningParts = msg.content.filter((part: any) => part.type === "reasoning")
      const hasToolCalls = msg.content.some((part: any) => part.type === "tool-call")
      const reasoningText = reasoningParts.map((part: any) => part.text).join("")
      const filteredContent = msg.content.filter((part: any) => part.type !== "reasoning")
      // Only include reasoning_content for messages in the current turn (after lastUserIndex)
      if (hasToolCalls && reasoningText && index > lastUserIndex) {
        return {
          ...msg,
          content: filteredContent,
          providerOptions: {
            ...msg.providerOptions,
            openaiCompatible: {
              ...(msg.providerOptions as any)?.openaiCompatible,
              reasoning_content: reasoningText,
            },
          },
        }
      }
      // Strip reasoning from all other messages
      return {
        ...msg,
        content: filteredContent,
      }
    }
    return msg
  })
}
```

References
- DeepSeek Thinking Mode API
- DeepSeek Multi-turn Conversation Section
- DeepSeek Tool Calls with Thinking Mode
- Current transform code location:
packages/opencode/src/provider/transform.ts
OpenCode Version: 1.0.x (current)
Affected Models: All DeepSeek V3.2 models with interleaved thinking support (deepseek-chat when thinking enabled, deepseek-reasoner)