Problem Description
When using Qwen3 models (specifically qwen3-4b, qwen3-8b, qwen3-14b, and qwen3-32b) through ContextGem, we encounter the following error:

    contextgem.internal.exceptions.LLMAPIError: Exception occurred while calling LLM API (after 3 retries) - Original error: litellm.BadRequestError: OpenAIException - parameter.enable_thinking must be set to false for non-streaming calls LiteLLM Retried: 3 times
This error occurs because:
- Qwen3 series models support thinking mode (reasoning capabilities)
- In non-streaming calls, these models require enable_thinking=false to be explicitly set
- ContextGem currently doesn't handle this parameter automatically
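A minimal sketch of the failure mode at the LiteLLM layer (the model name is taken from this report; passing the flag via extra_body is an assumption about how the DashScope endpoint accepts provider-specific parameters):

    import litellm

    messages = [{"role": "user", "content": "Hello"}]

    # Fails for Qwen3 via DashScope: non-streaming call without the flag
    # litellm.completion(model="dashscope/qwen3-4b", messages=messages, stream=False)
    # -> litellm.BadRequestError: parameter.enable_thinking must be set to false ...

    # Works once the flag is forwarded to the provider
    response = litellm.completion(
        model="dashscope/qwen3-4b",
        messages=messages,
        stream=False,
        extra_body={"enable_thinking": False},
    )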
Expected Behavior
ContextGem should automatically handle the enable_thinking parameter for Qwen models, similar to how it handles other model-specific parameters.
Current Workaround
Currently, we need to monkey-patch the _build_request_config method:

    def apply_qwen_thinking_patch():
        from contextgem import DocumentLLM

        original_build_request_config = DocumentLLM._build_request_config

        def patched_build_request_config(self):
            # Build the request config as usual, then disable thinking
            # mode for models not being used in reasoning mode
            request_config = original_build_request_config(self)
            if not self._supports_reasoning:
                request_config["enable_thinking"] = False
            return request_config

        DocumentLLM._build_request_config = patched_build_request_config
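The patch only needs to run once, before any DocumentLLM is constructed; for example:

    apply_qwen_thinking_patch()

    from contextgem import DocumentLLM
    llm = DocumentLLM(model="dashscope/qwen3-4b", api_key="...")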
Proposed Solution
Add support for the enable_thinking parameter in the _build_request_config method:
    # In _build_request_config, before returning request_config:
    if not self._supports_reasoning:
        # For models that support thinking but are used in non-reasoning
        # mode (like the Qwen3 series), disable thinking mode
        request_config["enable_thinking"] = False
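With that change in place, a plain non-streaming call should work out of the box, with no patching required (hypothetical usage, model name from this report):

    from contextgem import DocumentLLM

    # enable_thinking=False is now injected automatically
    llm = DocumentLLM(model="dashscope/qwen3-4b", api_key="YOUR_DASHSCOPE_API_KEY")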
Environment
- ContextGem version: [latest]
- Python version: 3.11+
- Model: dashscope/qwen3-4b, dashscope/qwen3-8b, etc.
- API: DashScope (Alibaba Cloud)
Additional Context
This issue affects users who want to use Qwen3 series models with ContextGem. The models work fine when enable_thinking=false is set, but ContextGem doesn't handle this automatically, requiring manual workarounds.
Related
- This is similar to how ContextGem handles other model-specific parameters
- The fix should be backward compatible and not affect other models (see the sketch below)
- The solution should work for all Qwen3 series models (4b, 8b, 14b, 32b)
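One way to keep the change scoped is to inject the flag only for Qwen3 models, so other providers never receive an unfamiliar parameter. A sketch, assuming the model identifier is available as self.model (the _is_qwen3_model helper is hypothetical, not part of ContextGem):

    def _is_qwen3_model(model: str) -> bool:
        # Hypothetical helper: matches identifiers like "dashscope/qwen3-4b"
        return "qwen3" in model.lower()

    # In _build_request_config, before returning request_config:
    if _is_qwen3_model(self.model) and not self._supports_reasoning:
        request_config["enable_thinking"] = False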
Priority
This is a blocking issue for users wanting to use Qwen3 models with ContextGem. A simple fix would greatly improve the user experience.