feat(runtime-fallback): automatic model switching on API errors #1408
youming-ai wants to merge 13 commits into code-yeongyu:dev
All contributors have signed the CLA. Thank you! ✅
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5ba495f53a
src/hooks/runtime-fallback/index.ts
Outdated
    if (agent && pluginConfig.agents?.[agent as keyof typeof pluginConfig.agents]) {
      const agentConfig = pluginConfig.agents[agent as keyof typeof pluginConfig.agents]
      if (agentConfig?.fallback_models) {
        return normalizeFallbackModels(agentConfig.fallback_models)
      }
Use category fallback_models when resolving fallbacks
This resolver only looks at pluginConfig.agents (and a sessionID heuristic) to find fallback models, so any fallback_models configured under categories never take effect. If an agent inherits its model via a category (which is the common configuration path), the hook will still log “No fallback models configured” and skip fallback even though the category has them. Consider resolving the agent’s category (from agent config or event info) and falling back to pluginConfig.categories[category].fallback_models when agent-specific overrides are absent.
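The lookup order suggested above (agent-level overrides first, then the agent's category) could be sketched roughly as follows. The config shapes, the `category` field on an agent, and the name `resolveFallbackModels` are illustrative assumptions, not the plugin's actual schema or API; `fallback_models` is also assumed to be already normalized to an array.

```typescript
// Hypothetical, simplified config shapes; the real plugin schema has more fields.
interface AgentConfig {
  category?: string // assumption: an agent can name the category it inherits from
  fallback_models?: string[] // assumption: already normalized to an array
}

interface CategoryConfig {
  fallback_models?: string[]
}

interface PluginConfig {
  agents?: Record<string, AgentConfig>
  categories?: Record<string, CategoryConfig>
}

// Agent-level fallback_models win; otherwise use the category's list, if any.
function resolveFallbackModels(config: PluginConfig, agent: string): string[] {
  const agentConfig = config.agents?.[agent]
  const agentLevel = agentConfig?.fallback_models ?? []
  if (agentLevel.length > 0) return agentLevel

  const category = agentConfig?.category
  if (category === undefined) return []
  return config.categories?.[category]?.fallback_models ?? []
}
```

An agent that sets no `fallback_models` of its own would then pick up the list configured on its category instead of hitting the "No fallback models configured" path.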
src/hooks/runtime-fallback/index.ts
Outdated
    if (!state.failedModels.has(model)) return false

    const cooldownMs = cooldownSeconds * 1000
    const timeSinceLastFallback = Date.now() - state.lastFallbackTime
Track cooldown per failed model to avoid global lockout
Cooldown is computed from a single lastFallbackTime for all failed models, so any fallback resets the cooldown window for every model in failedModels. In sessions where multiple fallbacks happen quickly, older models can stay blocked longer than the configured cooldown_seconds, and the list can be exhausted even though some models should be eligible again. Store per-model failure timestamps (e.g., a map of model → lastFailedAt) so the cooldown applies to each model independently.
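A minimal sketch of the per-model bookkeeping described above, assuming the state keeps a map of model to last-failure timestamp. The names `FallbackState`, `markModelFailed`, and `isModelInCooldown` are hypothetical and chosen only for illustration; `now` is passed in so the behavior is easy to check.

```typescript
// Hypothetical state shape: model id -> timestamp (ms) of its most recent failure.
interface FallbackState {
  failedModels: Map<string, number>
}

// Record a failure for one model without touching the others' timestamps.
function markModelFailed(state: FallbackState, model: string, now: number = Date.now()): void {
  state.failedModels.set(model, now)
}

// A model is in cooldown only while its own failure is younger than the window.
function isModelInCooldown(
  state: FallbackState,
  model: string,
  cooldownSeconds: number,
  now: number = Date.now(),
): boolean {
  const lastFailedAt = state.failedModels.get(model)
  if (lastFailedAt === undefined) return false
  return now - lastFailedAt < cooldownSeconds * 1000
}
```

With this shape, a later failure of one model no longer extends the cooldown window of a model that failed earlier.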
4 issues found across 8 files
Confidence score: 3/5
- Category-level `fallback_models` are ignored in `src/hooks/runtime-fallback/index.ts`, so agents inheriting category config may not fall back as intended, which could reduce reliability under failure.
- Cooldown tracking in `src/hooks/runtime-fallback/index.ts` uses a single `lastFallbackTime` for all models, potentially extending cooldowns incorrectly when multiple models fail, which can skew fallback behavior.
- Overall risk is moderate due to multiple runtime-fallback logic issues that could affect retry/fallback decisions in production flows.
- Pay close attention to `src/hooks/runtime-fallback/index.ts` and `src/hooks/runtime-fallback/constants.ts`: fallback selection and error classification logic may misbehave under certain conditions.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/hooks/runtime-fallback/constants.ts">
<violation number="1" location="src/hooks/runtime-fallback/constants.ts:32">
P2: Unanchored numeric regexes (`/429/`, `/503/`, `/529/`) will match any occurrence of those digits inside larger numbers, causing false retry/fallback classification on unrelated error messages.</violation>
</file>
<file name="src/hooks/runtime-fallback/index.ts">
<violation number="1" location="src/hooks/runtime-fallback/index.ts:97">
P2: Fallback agent detection is limited to a hardcoded regex list, so custom agents in config won’t be recognized from sessionID when the event lacks an agent, preventing configured fallback models from being used.</violation>
<violation number="2" location="src/hooks/runtime-fallback/index.ts:108">
P2: The fallback model resolver only checks agent-specific `fallback_models` but ignores category-level fallback configurations. Since `CategoryConfigSchema` supports `fallback_models` and agents commonly inherit settings from categories, this function should also resolve the agent's category and check `pluginConfig.categories[category].fallback_models` when agent-specific fallbacks are not defined.</violation>
<violation number="3" location="src/hooks/runtime-fallback/index.ts:115">
P2: Cooldown tracking uses a single `lastFallbackTime` timestamp for all models, which causes incorrect cooldown behavior. When multiple models fail in sequence, earlier failures have their cooldown window extended by later failures. Consider storing per-model failure timestamps (e.g., `failedModels: Map<string, number>` instead of `Set<string>`) so each model's cooldown is tracked independently.</violation>
</file>
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/hooks/runtime-fallback/index.test.ts">
<violation number="1" location="src/hooks/runtime-fallback/index.test.ts:476">
P2: Test assertion permits the "No fallback models configured" path, so the new tests can pass without exercising any fallback switching logic; with the provided mock config (no fallback_models), these tests become ineffective and mask real failures.</violation>
</file>
2 issues found across 7 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/tools/delegate-task/executor.ts">
<violation number="1" location="src/tools/delegate-task/executor.ts:552">
P2: `executeSyncTask` registers the session category but never removes it, leaving `SessionCategoryRegistry` entries to accumulate for each sync session. Consider removing the session from the registry on completion/error (similar to `subagentSessions.delete`).</violation>
</file>
<file name="src/agents/utils.ts">
<violation number="1" location="src/agents/utils.ts:307">
P3: Duplicated fallback_models normalization logic is repeated in four places. This should be centralized (e.g., a helper) to avoid inconsistent changes later.</violation>
</file>
2787659 to 78bb36a
a0c1a63 to adae0f7
Add configuration schemas for runtime model fallback feature:
- RuntimeFallbackConfigSchema with enabled, retry_on_errors, max_fallback_attempts, cooldown_seconds, notify_on_fallback
- FallbackModelsSchema for init-time fallback model selection
- Add fallback_models to AgentOverrideConfigSchema and CategoryConfigSchema
- Export types and schemas from config/index.ts

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add Category-level fallback_models support in getFallbackModelsForSession()
  - Try agent-level fallback_models first
  - Then try agent's category fallback_models
  - Support all builtin agents including hephaestus, sisyphus-junior, build, plan
- Expand agent name recognition regex to include:
  - hephaestus, sisyphus-junior, build, plan, multimodal-looker
- Add comprehensive test coverage (6 new tests, total 24):
  - Model switching via chat.message hook
  - Agent-level fallback_models configuration
  - SessionID agent pattern detection
  - Cooldown mechanism validation
  - Max attempts limit enforcement

All 24 tests passing

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Implement full fallback_models support across all integration points:

1. Model Resolution Pipeline (src/shared/model-resolution-pipeline.ts)
   - Add userFallbackModels to ModelResolutionRequest
   - Process user fallback_models before hardcoded fallback chain
   - Support both connected provider and availability checking modes
2. Agent Utils (src/agents/utils.ts)
   - Update applyModelResolution to accept userFallbackModels
   - Inject fallback_models for all builtin agents (sisyphus, oracle, etc.)
   - Support both single string and array formats
3. Model Resolver (src/shared/model-resolver.ts)
   - Add userFallbackModels to ExtendedModelResolutionInput type
   - Pass through to resolveModelPipeline
4. Delegate Task Executor (src/tools/delegate-task/executor.ts)
   - Extract category fallback_models configuration
   - Pass to model resolution pipeline
   - Register session category for runtime-fallback hook
5. Session Category Registry (src/shared/session-category-registry.ts)
   - New module: maps sessionID -> category
   - Used by runtime-fallback to lookup category fallback_models
   - Auto-cleanup support
6. Runtime Fallback Hook (src/hooks/runtime-fallback/index.ts)
   - Check SessionCategoryRegistry first for category fallback_models
   - Fallback to agent-level configuration
   - Import and use SessionCategoryRegistry

Test Results:
- runtime-fallback: 24/24 tests passing
- model-resolver: 46/46 tests passing

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
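For readers unfamiliar with the registry idea in item 5, here is a rough sketch of a sessionID to category map with explicit cleanup. The class shape, the method names, and the `withSessionCategory` helper are assumptions made for illustration; the actual module in `src/shared/session-category-registry.ts` may look quite different.

```typescript
// Hypothetical registry: remembers which category a session was started under.
class SessionCategoryRegistry {
  private readonly categories = new Map<string, string>()

  register(sessionID: string, category: string): void {
    this.categories.set(sessionID, category)
  }

  get(sessionID: string): string | undefined {
    return this.categories.get(sessionID)
  }

  remove(sessionID: string): void {
    this.categories.delete(sessionID)
  }

  get size(): number {
    return this.categories.size
  }
}

// Illustrative helper: register for the duration of a task and always clean up,
// so entries do not accumulate when the task completes or throws.
async function withSessionCategory<T>(
  registry: SessionCategoryRegistry,
  sessionID: string,
  category: string,
  task: () => Promise<T>,
): Promise<T> {
  registry.register(sessionID, category)
  try {
    return await task()
  } finally {
    registry.remove(sessionID)
  }
}
```

Wrapping the task in `try`/`finally` is one way to address the review note about entries never being removed after a sync session ends.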
adae0f7 to cea5f43
…ching

Replace word-boundary regex with stricter patterns that match status codes only at start/end of string or surrounded by whitespace. Prevents false matches like '1429' or '4290'.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
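One way to express "only at start/end of string or surrounded by whitespace" is sketched below. The helper names are hypothetical and the pattern is an illustration of the idea in the commit message, not necessarily the regex the commit actually uses.

```typescript
// Build a pattern that matches `code` only when it is delimited by
// whitespace or by the start/end of the string.
function statusCodePattern(code: number): RegExp {
  return new RegExp(`(?:^|\\s)${code}(?:\\s|$)`)
}

// True if the message mentions any of the given status codes as a standalone token.
function mentionsStatusCode(message: string, codes: number[]): boolean {
  return codes.some((code) => statusCodePattern(code).test(message))
}

const retryable = [429, 503, 529]
console.log(mentionsStatusCode("HTTP 429 Too Many Requests", retryable)) // true
console.log(mentionsStatusCode("request id 1429 failed", retryable)) // false
console.log(mentionsStatusCode("used 4290 tokens", retryable)) // false
```

A pattern this strict also rejects forms such as `status=429` or `(429)`, so whether it suits a given provider's error messages is worth checking.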
Add shared utility to normalize fallback_models config values. Handles both single string and array inputs consistently.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
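A normalizer of this kind might look like the sketch below. The function name `normalizeFallbackModels` appears in the reviewed diff, but its real signature and its return value for a missing config are not shown on this page; returning an empty array for `undefined` is an assumption here.

```typescript
// Accepted config forms: a single model id, a list of ids, or nothing at all.
type FallbackModelsInput = string | string[] | undefined

// Always hand callers an array so they never branch on the input form.
// Assumption: a missing value normalizes to an empty list.
function normalizeFallbackModels(models: FallbackModelsInput): string[] {
  if (models === undefined) return []
  return typeof models === "string" ? [models] : [...models]
}
```

Copying the array keeps callers from mutating the parsed config by accident.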
Replace 5 instances of inline fallback_models normalization with the shared normalizeFallbackModels() utility function. Eliminates code duplication and ensures consistent behavior.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Resolved conflicts in:
- src/config/schema.ts (kept both hooks)
- src/hooks/index.ts (exported both hooks)
- src/index.ts (imported both hooks)
- src/shared/index.ts (exported both utilities)
Summary
Implements runtime model fallback that automatically switches to backup models when the primary model encounters transient errors (rate limits, overload, etc.).
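As a rough illustration of the behavior summarized above, the following sketch picks the next model after a retryable error. The field names `retry_on_errors` and `max_fallback_attempts` come from this PR's configuration example; the state shape and the function `pickNextModel` are hypothetical, and cooldown handling is left out for brevity.

```typescript
// Subset of the runtime_fallback config used by this sketch.
interface RuntimeFallbackConfig {
  retry_on_errors: number[]
  max_fallback_attempts: number
}

// Hypothetical per-session state.
interface SessionFallbackState {
  attempts: number
  failedModels: Set<string>
}

// Returns the next model to try, or undefined when no switch should happen.
function pickNextModel(
  statusCode: number,
  currentModel: string,
  fallbackModels: string[],
  config: RuntimeFallbackConfig,
  state: SessionFallbackState,
): string | undefined {
  if (!config.retry_on_errors.includes(statusCode)) return undefined // not a transient error
  if (state.attempts >= config.max_fallback_attempts) return undefined // attempt budget used up

  state.failedModels.add(currentModel)
  const next = fallbackModels.find((model) => !state.failedModels.has(model))
  if (next === undefined) return undefined // every fallback has already failed

  state.attempts += 1
  return next
}
```

The caller would apply the returned model on the next request, which matches the "switch on the next request" behavior described for the hook.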
Background
This is a cleaned-up version of #1237. The original PR contained both:
The original PR had merge conflicts due to the already-merged feature. This new PR contains only the unique runtime-fallback functionality.
Changes
Merge resolution note
- `assets/oh-my-opencode.schema.json`
- `script/build-schema.ts` to use Zod v4 `toJSONSchema()` (`zod-to-json-schema` returned `{}` with Zod v4)
Configuration Example
    {
      "runtime_fallback": {
        "enabled": true,
        "retry_on_errors": [429, 503, 529],
        "max_fallback_attempts": 3,
        "cooldown_seconds": 60,
        "notify_on_fallback": true
      },
      "agents": {
        "sisyphus": {
          "model": "anthropic/claude-opus-4-5",
          "fallback_models": ["openai/gpt-5.2", "google/gemini-3-pro"]
        }
      }
    }
Testing
Supersedes #1237
Summary by cubic
Automatic runtime model fallback switches to backup models on transient API errors to keep conversations running. Adds config and a hook that detects rate limits and applies the switch on the next request.
New Features
Migration
Written for commit bd7f2be. Summary will update on new commits.