
fix: Fix model invocation failures and the warmup mechanism #2075

Open

Bright-Chengliang wants to merge 13 commits into lbjlaq:main from
Bright-Chengliang:custom/warmup-fix-and-model-updates

Conversation


Bright-Chengliang commented Feb 22, 2026

Summary

  • Fix Claude Sonnet 4.6 model invocation failures (should_enable_thinking_by_default did not match sonnet-4-6)
  • Fix Claude Opus 4.6 Thinking warmup 400 errors (ThinkingConfig budget conflicting with max_tokens)
  • Fix the warmup mechanism incorrectly calling GPT-OSS models (non-Google models)
  • Fix Gemini model warmup 500 errors (wrap_request overriding maxOutputTokens)
  • Fix the quota monitor displaying raw i18n keys instead of short labels
  • Fix Claude models not appearing in the quota watch list
  • Update the client version to 1.18.4 to support Gemini 3.1 Pro

Test plan

  • Claude Sonnet 4.6 calls succeed
  • Claude Opus 4.6 Thinking calls succeed
  • Gemini 3.1 Pro calls succeed
  • The quota monitor correctly displays model short labels

🤖 Generated with Claude Code

Bright-Chengliang and others added 13 commits February 22, 2026 10:34
…ay fix

- Rewrite warmup handler: send minimal 1-token requests with full req/resp logging
- Fix scheduler: add scheduled_warmup.enabled check
- Enhance quota.rs warmup logging with complete request/response bodies
- Merge upstream v4.1.22 model mappings with local GPT-OSS & Claude 4.6 models
- Fix PinnedQuotaModels: remove thinking filter that hid all Claude models
- Update modelConfig.ts with i18n fields, sortModels export, and IconComponent type fix
- Adopt dynamic MODEL_CONFIG-driven approach in useProxyModels

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
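The "minimal 1-token requests" mentioned above can be sketched as follows; the function name and JSON field layout here are assumptions for illustration, not the PR's actual warmup handler:

```rust
// Hypothetical sketch of a minimal warmup payload: 1 output token keeps the
// warmup cheap while still exercising the upstream route end to end.
fn warmup_body(model: &str) -> String {
    format!(
        r#"{{"model":"{model}","max_tokens":1,"messages":[{{"role":"user","content":"hi"}}]}}"#
    )
}

fn main() {
    let body = warmup_body("gemini-3-flash");
    assert!(body.contains("\"max_tokens\":1"));
    assert!(body.contains("gemini-3-flash"));
}
```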
- AccountCard: remove hardcoded thinking variant filter that hid all Claude models,
  replace with shortLabel-based dedup
- AccountTable: remove `id.includes('thinking')` filter that blocked Claude display
- Add missing `proxy.model.gpt_oss` translation to zh.json and en.json

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AccountTable now shows concise model keywords (e.g. "Opus 4.6 TK",
"G3 Flash", "OSS 120B") instead of verbose i18n descriptions
("Strongest Thinking", "Blazing-Fast Preview", "Open-Source Large Model").

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root causes and fixes:
- Claude models (400/404): Added missing `anthropic-beta` header by
  switching from call_v1_internal to call_v1_internal_with_headers
- GPT-OSS models (500): Skip warmup for non-Google models (gpt-oss,
  gpt-4, gpt-3) since they can't be sent to v1internal API
- Gemini models (500): wrap_request() auto-injects thinkingConfig and
  overrides maxOutputTokens to 32768+; now force-reset to 1 and remove
  thinkingConfig after wrapping to keep warmup minimal

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
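The GPT-OSS fix above amounts to a gate in front of the warmup call. A minimal sketch, assuming a prefix check (the function name and prefix list are illustrative, not the PR's identifiers):

```rust
// Only Google-routed models can be warmed up via the v1internal API;
// GPT-OSS and OpenAI-style models would return 500, so skip them.
fn should_warmup(model_id: &str) -> bool {
    const NON_GOOGLE_PREFIXES: [&str; 3] = ["gpt-oss", "gpt-4", "gpt-3"];
    !NON_GOOGLE_PREFIXES.iter().any(|p| model_id.starts_with(p))
}

fn main() {
    assert!(!should_warmup("gpt-oss-120b"));
    assert!(should_warmup("gemini-3-flash"));
    assert!(should_warmup("claude-opus-4-6-thinking"));
}
```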
Root cause: transform_claude_request_in() auto-injects ThinkingConfig
with budget=10000 for thinking models, but warmup's max_tokens=1 is
less than the budget, causing Google v1internal API to return 400.

Fix:
- Explicitly set thinking.type="disabled" in warmup ClaudeRequest to
  prevent auto-injection of ThinkingConfig
- After transform, force remove thinkingConfig and reset maxOutputTokens
  to 1 as double safety measure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
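The "double safety measure" described above can be sketched with a simplified config struct (field names are assumptions standing in for the nested thinkingConfig JSON, not the actual request types):

```rust
// Simplified stand-in for the post-transform request config.
struct GenerationConfig {
    max_output_tokens: u32,
    thinking_budget: Option<u32>, // stands in for the nested thinkingConfig
}

// After the Claude->Google transform, drop any injected thinking budget and
// pin output to 1 token so budget > max_tokens can never trigger a 400.
fn sanitize_warmup_config(mut cfg: GenerationConfig) -> GenerationConfig {
    cfg.thinking_budget = None;
    cfg.max_output_tokens = 1;
    cfg
}

fn main() {
    let cfg = GenerationConfig { max_output_tokens: 32768, thinking_budget: Some(10000) };
    let out = sanitize_warmup_config(cfg);
    assert_eq!(out.max_output_tokens, 1);
    assert!(out.thinking_budget.is_none());
}
```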
Google v1internal API naming rules differ between Sonnet and Opus:
- Sonnet: `claude-sonnet-4-6` (NO -thinking suffix)
- Opus: `claude-opus-4-6-thinking` (WITH -thinking suffix)

Updated all references across:
- model_mapping.rs: core model + all alias mappings
- opencode_sync.rs: ModelDef + ANTIGRAVITY_MODEL_IDS
- config.rs: default_pinned_models
- modelConfig.ts: frontend MODEL_CONFIG key and labels

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
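The asymmetric naming rule can be encoded as a lookup; this is an illustrative sketch, not the actual model_mapping.rs table:

```rust
// Sonnet drops the -thinking suffix; Opus keeps it.
fn canonical_model_id(alias: &str) -> &str {
    match alias {
        "claude-sonnet-4-6" | "claude-sonnet-4-6-thinking" => "claude-sonnet-4-6",
        "claude-opus-4-6" | "claude-opus-4-6-thinking" => "claude-opus-4-6-thinking",
        other => other,
    }
}

fn main() {
    assert_eq!(canonical_model_id("claude-sonnet-4-6-thinking"), "claude-sonnet-4-6");
    assert_eq!(canonical_model_id("claude-opus-4-6"), "claude-opus-4-6-thinking");
}
```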
Google v1internal API checks x-client-version header and rejects
Gemini 3.1 Pro requests from clients reporting version < 1.18.x
("Gemini 3.1 Pro is not available on this version").

Updated KNOWN_STABLE constants to match Antigravity 1.18.4:
- Version: 1.16.5 -> 1.18.4
- Chrome: 132.0.6834.160 -> 142.0.7444.175
- Electron: 39.2.3 (unchanged)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
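The bump above boils down to updated constants feeding the client-identity headers; constant and function names here are assumptions mirroring the commit message, not the repo's actual identifiers:

```rust
// Values from the commit message: Antigravity 1.18.4's client identity.
const KNOWN_STABLE_VERSION: &str = "1.18.4";
const KNOWN_STABLE_CHROME: &str = "142.0.7444.175";
const KNOWN_STABLE_ELECTRON: &str = "39.2.3";

// Illustrative header builder: v1internal checks x-client-version and
// rejects Gemini 3.1 Pro when the reported version is below 1.18.x.
fn client_version_header() -> String {
    format!("x-client-version: {KNOWN_STABLE_VERSION}")
}

fn main() {
    assert!(client_version_header().ends_with("1.18.4"));
    assert!(KNOWN_STABLE_CHROME.starts_with("142."));
    assert_eq!(KNOWN_STABLE_ELECTRON, "39.2.3");
}
```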
Add sonnet-4-6 variants to should_enable_thinking_by_default() so that
ThinkingConfig is auto-injected for claude-sonnet-4-6. Without this, the
Google v1internal API rejects requests because Claude models require
thinking configuration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
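A minimal sketch of the matcher fix, assuming substring matching (the non-sonnet arm stands in for whatever matches existed before this commit):

```rust
// sonnet-4-6 variants are added by this commit so ThinkingConfig gets
// auto-injected; explicit -thinking names were presumably already matched.
fn should_enable_thinking_by_default(model: &str) -> bool {
    model.contains("sonnet-4-6") || model.ends_with("-thinking")
}

fn main() {
    assert!(should_enable_thinking_by_default("claude-sonnet-4-6"));
    assert!(should_enable_thinking_by_default("claude-opus-4-6-thinking"));
    assert!(!should_enable_thinking_by_default("gemini-3-flash"));
}
```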
…t-4-6-thinking

- Map gemini-3.1-flash -> gemini-3-flash (invalid model ID from clients)
- Map claude-sonnet-4-6-thinking -> claude-sonnet-4-6 (deprecated name)

Both were returning 429 due to missing mapping entries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
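The two mappings can be sketched as an alias-resolution step applied before routing; function name and fall-through behavior are assumptions for illustration:

```rust
// Without these entries, both aliases fell through unmapped and surfaced
// as 429s upstream.
fn resolve_alias(model: &str) -> &str {
    match model {
        "gemini-3.1-flash" => "gemini-3-flash",              // invalid client ID
        "claude-sonnet-4-6-thinking" => "claude-sonnet-4-6", // deprecated name
        other => other,
    }
}

fn main() {
    assert_eq!(resolve_alias("gemini-3.1-flash"), "gemini-3-flash");
    assert_eq!(resolve_alias("claude-sonnet-4-6-thinking"), "claude-sonnet-4-6");
    assert_eq!(resolve_alias("claude-opus-4-6-thinking"), "claude-opus-4-6-thinking");
}
```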
- Change system instruction role from "Antigravity" to "Aether"
- Use concise unrestricted assistant prompt across all 3 mappers (Claude, OpenAI, Gemini)
- Update duplicate detection strings and tests accordingly
- Bump known stable version to 1.107.0 with version validation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Prevents "thinking.signature: Field required" and similar upstream API
rejections by adding three layers of defense:

1. Pre-deserialization JSON sanitizer that fills missing/null fields
   with defaults (signature, thinking, text, tool_use, tool_result)
2. #[serde(default)] on thinking text field to prevent deser failures
3. Strip thinking blocks with None signature instead of passing through
   (None was being omitted by skip_serializing_if, triggering upstream error)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
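Defense layer 3 can be sketched with a simplified content-block enum (a std-only stand-in for the serde types; the real code presumably uses serde-derived structs):

```rust
// Simplified content blocks; Thinking.signature mirrors the Option field
// whose None was being omitted by skip_serializing_if.
enum Block {
    Thinking { text: String, signature: Option<String> },
    Text(String),
}

// Drop thinking blocks with no signature rather than serializing them
// without the field the upstream API requires.
fn strip_unsigned_thinking(blocks: Vec<Block>) -> Vec<Block> {
    blocks
        .into_iter()
        .filter(|b| !matches!(b, Block::Thinking { signature: None, .. }))
        .collect()
}

fn main() {
    let blocks = vec![
        Block::Thinking { text: "a".into(), signature: None },
        Block::Thinking { text: "b".into(), signature: Some("sig".into()) },
        Block::Text("hello".into()),
    ];
    assert_eq!(strip_unsigned_thinking(blocks).len(), 2);
}
```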
…i requests

- Add capabilities (vision, function_calling) to /v1/models endpoint so
  clients like Cherry Studio correctly detect multimodal support
- Preserve images in tool_result for current turn instead of stripping all
  images indiscriminately (Claude protocol path)
- Inject warning text instead of silently dropping unreadable file:// images
  (OpenAI protocol path)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
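The capability flags added to /v1/models might look like the following; field names and the sample values are illustrative assumptions, not the endpoint's actual schema:

```rust
// Assumed shape: per-model flags clients like Cherry Studio read to decide
// whether to offer image attachments and tool calling.
struct ModelEntry {
    id: &'static str,
    vision: bool,
    function_calling: bool,
}

fn models_list() -> Vec<ModelEntry> {
    vec![
        ModelEntry { id: "claude-opus-4-6-thinking", vision: true, function_calling: true },
        ModelEntry { id: "gpt-oss-120b", vision: false, function_calling: true },
    ]
}

fn main() {
    let models = models_list();
    assert!(models.iter().any(|m| m.id == "gpt-oss-120b" && !m.vision));
    assert!(models.iter().all(|m| m.function_calling));
}
```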
…sion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>