DeepSeek V3.2: reasoning_content not cleared from message history on new turns, causing excess token usage and violating API spec #5577

Summary

OpenCode does not clear reasoning_content from previous conversation turns when sending messages to DeepSeek models, which violates the DeepSeek Thinking Mode API specification and causes unnecessary token usage, increased costs, and slower responses.

Problem

According to DeepSeek's official documentation for V3.2 models with thinking mode:

"In each turn of the conversation, the model outputs the CoT (reasoning_content) and the final answer (content). In the next turn of the conversation, the CoT from previous turns is not concatenated into the context"

The spec explicitly shows a clear_reasoning_content() function that should be called before Turn 2:

def clear_reasoning_content(messages):
    for message in messages:
        if hasattr(message, 'reasoning_content'):
            message.reasoning_content = None
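
On the wire (OpenAI-compatible chat format), the equivalent step is to drop the reasoning_content field from the assistant messages in the history before starting a new turn. A minimal TypeScript sketch, assuming the standard OpenAI-compatible message shape rather than OpenCode's internal types (the interface and function names here are hypothetical):

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool"
  content: string
  reasoning_content?: string
}

// Mirror of the spec's clear_reasoning_content(): drop prior-turn CoT
// from the history before it is sent for the next turn.
function clearReasoningContent(messages: ChatMessage[]): ChatMessage[] {
  return messages.map(({ reasoning_content, ...rest }) => rest)
}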

Current Behavior

OpenCode's logic in packages/opencode/src/provider/transform.ts currently:

  • ✅ Correctly adds reasoning_content for tool call continuations within the same turn
  • ✅ Strips reasoning from individual messages without tool calls
  • ❌ Does not clear reasoning_content from ALL assistant messages in history when a new user turn begins

Result: reasoning chains from previous turns accumulate in the context window and are re-sent with every new turn.

Impact

  • Wasted tokens: You pay for repeated reasoning_content from prior turns
  • Higher costs: DeepSeek charges per token
  • Slower responses: Larger context = more processing time
  • Context window pressure: Fills up context limits faster with redundant data
  • Spec violation: Not following DeepSeek's documented API contract

Concrete Example

Turn 1:

  • User asks question
  • Model reasons: 500 tokens
  • Makes tool call
  • Model reasons more: 300 tokens
  • Gives answer
  • Total reasoning: 800 tokens

Turn 2 (new user question):

  • OpenCode re-sends Turn 1's 800 reasoning tokens in the request
  • Model generates new reasoning: 600 tokens
  • Reasoning tokens in context: 800 (stale) + 600 (new) = 1,400
  • Should be: 600 tokens (current turn only)
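
A rough sketch of what the Turn 2 request body looks like under the current behavior (the contents and field placement are illustrative, not a captured payload):

// Illustrative Turn 2 request as currently assembled (values invented)
const turn2Request = {
  model: "deepseek-reasoner",
  messages: [
    { role: "user", content: "First question" },
    {
      role: "assistant",
      content: "Answer to the first question",
      // ~800 tokens of Turn 1 CoT that the spec says must not be re-sent
      reasoning_content: "…Turn 1 reasoning…",
    },
    { role: "user", content: "Second question" },
  ],
}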

Expected Behavior

When a new user message starts a fresh turn, OpenCode should:

  1. Detect the turn boundary (new user message)
  2. Strip reasoning_content from ALL previous assistant messages before sending to the API
  3. Send only current-turn reasoning to DeepSeek

Reproduction

  1. Start a conversation with a DeepSeek V3.2 model (deepseek-chat with thinking enabled, or deepseek-reasoner)
  2. Have a multi-step exchange in which the model produces reasoning alongside tool calls
  3. On the second and subsequent user messages, inspect the request payload: reasoning from previous turns is sent again

Suggested Fix

Add turn boundary detection in packages/opencode/src/provider/transform.ts:

if (model.providerID === "deepseek" || model.api.id.toLowerCase().includes("deepseek")) {
  // Find last user message (start of current turn)
  let lastUserIndex = -1
  for (let i = msgs.length - 1; i >= 0; i--) {
    if (msgs[i].role === "user") {
      lastUserIndex = i
      break
    }
  }

  return msgs.map((msg, index) => {
    if (msg.role === "assistant" && Array.isArray(msg.content)) {
      const reasoningParts = msg.content.filter((part: any) => part.type === "reasoning")
      const hasToolCalls = msg.content.some((part: any) => part.type === "tool-call")
      const reasoningText = reasoningParts.map((part: any) => part.text).join("")
      const filteredContent = msg.content.filter((part: any) => part.type !== "reasoning")

      // Only include reasoning_content for messages in the current turn (after lastUserIndex)
      if (hasToolCalls && reasoningText && index > lastUserIndex) {
        return {
          ...msg,
          content: filteredContent,
          providerOptions: {
            ...msg.providerOptions,
            openaiCompatible: {
              ...(msg.providerOptions as any)?.openaiCompatible,
              reasoning_content: reasoningText,
            },
          },
        }
      }
      // Strip reasoning from all other messages
      return {
        ...msg,
        content: filteredContent,
      }
    }
    return msg
  })
}
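
With this change, reasoning_content would only be preserved for assistant messages after the most recent user message (the current turn's tool-call continuations); every earlier assistant message would be sent with its reasoning stripped, matching the spec's clear_reasoning_content() behavior while keeping interleaved thinking intact within a turn.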

Environment

OpenCode Version: 1.0.x (current)
Affected Models: All DeepSeek V3.2 models with interleaved thinking support (deepseek-chat when thinking enabled, deepseek-reasoner)
