Fix Qwen3 thinking mode (#224) #225

Merged
DenisovAV merged 2 commits into main from fix/issue-224-qwen3-thinking on Apr 18, 2026

@DenisovAV
Owner

Summary

  • Qwen3 generates <think> blocks by default; these are now stripped when isThinking: false
  • Separate Qwen filter (starts with insideThinking=false and detects the opening <think> tag), which is safe for Qwen2.5 because that model doesn't generate thinking
  • Always apply the thinking filter for Qwen/DeepSeek/Gemma4, and discard ThinkingResponse when the user didn't request it
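
The third point can be illustrated with a minimal sketch. This is not the code from this PR: it is written in Python purely for illustration, and `TextResponse`, `apply_thinking_policy`, and the field names are hypothetical stand-ins. Only the `ThinkingResponse` and `isThinking` names come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass
class TextResponse:
    """Hypothetical stand-in for a visible answer chunk."""
    token: str


@dataclass
class ThinkingResponse:
    """Stand-in for a reasoning chunk produced by the thinking filter."""
    content: str


Response = Union[TextResponse, ThinkingResponse]


def apply_thinking_policy(
    responses: Iterable[Response], is_thinking: bool
) -> Iterator[Response]:
    """Drop reasoning chunks unless the caller asked for thinking mode."""
    for response in responses:
        if isinstance(response, ThinkingResponse) and not is_thinking:
            continue  # the filter still ran, but its thinking output is discarded
        yield response
```

Under this assumption the filter always runs for models that may emit thinking, and the flag only decides whether the reasoning chunks reach the caller.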

Tested on 6 models × 2 modes: all 12 tests passed (Qwen3, Qwen2.5, Gemma4 E2B, Gemma3 1B, Gemma3n E2B, FunctionGemma 270M)

Closes #224

- Add separate Qwen thinking filter (insideThinking=false, detects <think> opening tag)
- Always apply thinking filter for models that may generate thinking (Qwen, DeepSeek, Gemma 4)
- Strip ThinkingResponse when isThinking=false
- Add Qwen3 thinking support to README and model config
- Bump version to 0.13.5
- Rewrite DeepSeek and Qwen stream filters with buffer pattern (like Gemma 4)
- Handle partial <think>/</think> tags split across token boundaries
- Add unit tests for partial tag split cases (DeepSeek + Qwen)
- Add Qwen2.5 passthrough test (no thinking tags)
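
The buffer pattern and the partial-tag handling listed above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the filter shipped in this PR: `split_thinking` and `_partial_tag_suffix_len` are hypothetical names, and the sketch starts outside a thinking block, as the Qwen filter is described as doing.

```python
from typing import Iterable, Iterator, Tuple

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def _partial_tag_suffix_len(buffer: str, tag: str) -> int:
    """Length of the longest buffer suffix that is a proper prefix of tag."""
    for n in range(min(len(buffer), len(tag) - 1), 0, -1):
        if buffer.endswith(tag[:n]):
            return n
    return 0


def split_thinking(tokens: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("thinking", text) or ("text", text) pieces from a token stream."""
    buffer = ""
    inside = False  # assumption: start outside a thinking block
    for token in tokens:
        buffer += token
        # Consume every complete tag currently in the buffer.
        while True:
            tag = CLOSE_TAG if inside else OPEN_TAG
            idx = buffer.find(tag)
            if idx == -1:
                break
            if idx > 0:
                yield ("thinking" if inside else "text", buffer[:idx])
            buffer = buffer[idx + len(tag):]
            inside = not inside
        # Emit what is safe; hold back a tail that might be the start of a tag.
        tag = CLOSE_TAG if inside else OPEN_TAG
        hold = _partial_tag_suffix_len(buffer, tag)
        safe = buffer[:len(buffer) - hold]
        if safe:
            yield ("thinking" if inside else "text", safe)
        buffer = buffer[len(buffer) - hold:]
    if buffer:  # flush a dangling partial tag as ordinary content
        yield ("thinking" if inside else "text", buffer)
```

For example, the tokens `"<th"`, `"ink>plan</th"`, `"ink>Hel"`, `"lo"` come out as thinking content `plan` and visible text `Hello`, while a stream with no tags at all (the Qwen2.5 case) passes through unchanged as text.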
@DenisovAV merged commit 20c909c into main on Apr 18, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

Bug: Thinking mode not correctly handled in litert-community/Qwen3-0.6B
