Implement per-model reasoning fixups for Anthropic models #3067
amitksingh1490 merged 6 commits into tailcallhq:main
This is great, but there is a problem: we would need to implement this transformer for every provider that serves Anthropic models (Bedrock, Vertex-anthropic, Opencode Go, Requesty, etc.). If we do it in orch.rs instead, as in #3031, it lives in one place and every provider works.
@amitksingh1490 agreed, this should be extended to all providers. But #3031 doesn't really solve the Anthropic conundrum either: there are 4 model "thinking tiers", and each of them needs some special handling to make sure it isn't broken. For example, a remap from 'thinking: enabled' to 'thinking: adaptive' is a must for opus-4.7 to work; otherwise it gets so stupid it cannot even understand how to use the built-in tools in forge.
Yeah, that PR was only meant as a reference implementation. It solves a different issue.
Yeah, in general, you're right that broken Opus 4.7 is a "further off" issue. Either way, I've moved the fix further upstream so it will apply to Bedrock and any other provider with Anthropic models. One note: the Bedrock documentation doesn't cover Opus 4.7 at all, so that part is a bit speculative.
Anthropic models are currently in a state of flux with respect to their thinking configs.
For example, Opus 4.5 supports neither 'xhigh' nor 'max' effort and has no adaptive thinking. Opus 4.6 adds 'max' effort and adaptive thinking. Opus 4.7 adds 'xhigh' and drops everything but adaptive thinking.
This gives us the following "tiers":

| Tier | Models | thinking | effort | xhigh | max | temperature/top_p/top_k |
| --- | --- | --- | --- | --- | --- | --- |
| AdaptiveOnly | opus-4-7 | | | | | |
| AdaptiveFriendly | opus-4-6, sonnet-4-6 | | | max | | |
| LegacyWithEffort | opus-4-5 | | | high | high | |
| LegacyNoEffort | | | | | | |

On top of that, on Opus 4.7 adaptive thinking hides reasoning content by default. Forge opts into visible reasoning unless the caller sets `reasoning.exclude: true`, which means the following mapping has to be applied:

| `ReasoningConfig.exclude` | `display` |
| --- | --- |
| None (unset) | "summarized" |
| Some(false) | "summarized" |
| Some(true) | "omitted" |

Furthermore, Opus 4.7 adds `task_budget`. This doesn't map cleanly to any pre-existing setting, and remapping the reasoning budget to this new API concept doesn't seem quite right.

This is fairly lightly tested, but I'm running opus-4-7 with max effort in my coding session and with this fix it feels much better: I assume 'max' effort is respected now.
I also ran some smoke tests (such as the "car wash problem"), and the model now consistently answers correctly on max effort (it failed in 2.11.3 without this fix), which suggests thinking/effort was broken before.
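
For illustration, the tier rules and the `display` mapping described above could be sketched roughly like this. It is not the code from this PR: the `Tier` variants reuse the tier names from the description, but `Effort`, `tier_for`, `clamp_effort`, `display_for`, the substring matching on the model id, the fallback to `LegacyNoEffort`, and the exact effort remaps are assumptions made for the example.

```rust
/// Thinking tiers, named as in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    AdaptiveOnly,
    AdaptiveFriendly,
    LegacyWithEffort,
    LegacyNoEffort,
}

/// Effort levels; only the three mentioned in the description are modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Effort {
    High,
    XHigh,
    Max,
}

/// Hypothetical classifier. Substring matching on the model id and the
/// fallback to `LegacyNoEffort` are assumptions, not the PR's logic.
fn tier_for(model_id: &str) -> Tier {
    if model_id.contains("opus-4-7") {
        Tier::AdaptiveOnly
    } else if model_id.contains("opus-4-6") || model_id.contains("sonnet-4-6") {
        Tier::AdaptiveFriendly
    } else if model_id.contains("opus-4-5") {
        Tier::LegacyWithEffort
    } else {
        Tier::LegacyNoEffort
    }
}

/// Returns the effort to send, or `None` when the tier takes no effort field.
fn clamp_effort(tier: Tier, requested: Effort) -> Option<Effort> {
    match (tier, requested) {
        (Tier::LegacyNoEffort, _) => None,
        // Opus 4.7 accepts both 'xhigh' and 'max'.
        (Tier::AdaptiveOnly, e) => Some(e),
        // 4.6 has 'max' but not 'xhigh' (assumed remap: xhigh -> max).
        (Tier::AdaptiveFriendly, Effort::XHigh) => Some(Effort::Max),
        (Tier::AdaptiveFriendly, e) => Some(e),
        // 4.5 has neither (assumed remap: xhigh/max -> high).
        (Tier::LegacyWithEffort, Effort::XHigh | Effort::Max) => Some(Effort::High),
        (Tier::LegacyWithEffort, e) => Some(e),
    }
}

/// Maps `ReasoningConfig.exclude` to the `display` value: reasoning stays
/// visible unless the caller explicitly excluded it.
fn display_for(exclude: Option<bool>) -> &'static str {
    match exclude {
        Some(true) => "omitted",
        None | Some(false) => "summarized",
    }
}
```

With these helpers, a request for 'xhigh' on a sonnet-4-6 model would be sent as 'max', and an unset `exclude` would resolve to "summarized". Keeping the tier lookup separate from the per-field fixups means a new model only needs a new classifier branch.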
Fixes #3066
As it turns out, this also fixes #3030 and is an alternative to #3031.