feat(runtime-fallback): automatic model switching on API errors #1408

Open
youming-ai wants to merge 13 commits into code-yeongyu:dev from youming-ai:feat/runtime-fallback-only
@youming-ai youming-ai commented Feb 3, 2026

Summary

Implements runtime model fallback that automatically switches to backup models when the primary model encounters transient errors (rate limits, overload, etc.).

Background

This is a cleaned-up version of #1237. The original PR contained both:

  1. ✅ (init-time) - Already merged to dev via commit 81db76f
  2. ✅ (runtime) - This PR

The original PR had merge conflicts due to the already-merged feature. This new PR contains only the unique runtime-fallback functionality.

Changes

  • Add runtime_fallback configuration with customizable error codes, cooldown, and notifications
  • Implement runtime-fallback hook that intercepts API errors (429, 503, 529)
  • Support fallback_models from agent/category configuration
  • Full TypeScript types and comprehensive tests

Merge resolution note

  • Rebased onto latest dev and regenerated assets/oh-my-opencode.schema.json
  • Fixed script/build-schema.ts to use Zod v4 toJSONSchema() (zod-to-json-schema returned {} with Zod v4)

Configuration Example

{
  "runtime_fallback": {
    "enabled": true,
    "retry_on_errors": [429, 503, 529],
    "max_fallback_attempts": 3,
    "cooldown_seconds": 60,
    "notify_on_fallback": true
  },
  "agents": {
    "sisyphus": {
      "model": "anthropic/claude-opus-4-5",
      "fallback_models": ["openai/gpt-5.2", "google/gemini-3-pro"]
    }
  }
}

Testing

  • ✅ bun run typecheck
  • ✅ bun test

Supersedes #1237


Summary by cubic

Automatic runtime model fallback switches to backup models on transient API errors to keep conversations running. Adds config and a hook that detects rate limits and applies the switch on the next request.

  • New Features

    • Runtime fallback hook handles session.error and assistant message errors; switches model for the next request.
    • New config: runtime_fallback (enabled by default) with retry_on_errors, max_fallback_attempts, cooldown_seconds, and notify_on_fallback.
    • Supports agent/category fallback_models at init-time and runtime; model resolution prioritizes user-configured fallbacks before built-in chains and picks the next available model while honoring cooldowns and attempt limits.
    • Optional toast notification when a fallback is triggered.
  • Migration

    • Add fallback_models to agents or categories to enable switching.
    • Optionally tune runtime_fallback settings in the config.

Written for commit bd7f2be. Summary will update on new commits.

github-actions bot commented Feb 3, 2026

All contributors have signed the CLA. Thank you! ✅
Posted by the CLA Assistant Lite bot.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5ba495f53a

Comment on lines 90 to 94
if (agent && pluginConfig.agents?.[agent as keyof typeof pluginConfig.agents]) {
  const agentConfig = pluginConfig.agents[agent as keyof typeof pluginConfig.agents]
  if (agentConfig?.fallback_models) {
    return normalizeFallbackModels(agentConfig.fallback_models)
  }

P2: Use category fallback_models when resolving fallbacks

This resolver only looks at pluginConfig.agents (and a sessionID heuristic) to find fallback models, so any fallback_models configured under categories never take effect. If an agent inherits its model via a category (which is the common configuration path), the hook will still log “No fallback models configured” and skip fallback even though the category has them. Consider resolving the agent’s category (from agent config or event info) and falling back to pluginConfig.categories[category].fallback_models when agent-specific overrides are absent.
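The resolution order suggested above (agent-level fallback_models first, then the agent's category) can be sketched as follows. This is a minimal illustration, not the PR's implementation: the config shapes, the `category` field on the agent config, the `sessionCategory` parameter, and the function names are all assumptions.

```typescript
// Hypothetical config shapes; the plugin's real types may differ.
type FallbackModels = string | string[]

interface AgentConfig {
  category?: string // assumed: the agent names the category it inherits from
  fallback_models?: FallbackModels
}

interface CategoryConfig {
  fallback_models?: FallbackModels
}

interface PluginConfig {
  agents?: Record<string, AgentConfig>
  categories?: Record<string, CategoryConfig>
}

// Accept either a single model string or a list, always return a list.
function toArray(value: FallbackModels | undefined): string[] {
  if (value === undefined) return []
  return typeof value === "string" ? [value] : value
}

// Agent-level fallbacks win; otherwise use the category's fallbacks.
// The category comes from the agent config or, failing that, from
// session information supplied by the caller.
function resolveFallbackModels(
  config: PluginConfig,
  agent: string | undefined,
  sessionCategory?: string,
): string[] {
  const agentConfig = agent ? config.agents?.[agent] : undefined
  const fromAgent = toArray(agentConfig?.fallback_models)
  if (fromAgent.length > 0) return fromAgent

  const category = agentConfig?.category ?? sessionCategory
  if (!category) return []
  return toArray(config.categories?.[category]?.fallback_models)
}
```

With this ordering, an agent that defines no fallback_models of its own still picks up the list configured on its category, which is the case the review says is currently skipped.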

Comment on lines 112 to 139
if (!state.failedModels.has(model)) return false

const cooldownMs = cooldownSeconds * 1000
const timeSinceLastFallback = Date.now() - state.lastFallbackTime

P2: Track cooldown per failed model to avoid global lockout

Cooldown is computed from a single lastFallbackTime for all failed models, so any fallback resets the cooldown window for every model in failedModels. In sessions where multiple fallbacks happen quickly, older models can stay blocked longer than the configured cooldown_seconds, and the list can be exhausted even though some models should be eligible again. Store per-model failure timestamps (e.g., a map of model → lastFailedAt) so the cooldown applies to each model independently.
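A minimal sketch of the per-model approach suggested above, assuming a state object that stores a failure timestamp for each model. The `FallbackState` shape and the helper names are hypothetical and are not taken from the PR.

```typescript
// Hypothetical state shape: model id -> time (ms since epoch) of its last failure.
interface FallbackState {
  failedAt: Map<string, number>
}

// Record a failure for one model without touching any other model's timestamp.
function markFailed(state: FallbackState, model: string, now: number = Date.now()): void {
  state.failedAt.set(model, now)
}

// A model is cooling down only if it failed less than cooldownSeconds ago.
function isInCooldown(
  state: FallbackState,
  model: string,
  cooldownSeconds: number,
  now: number = Date.now(),
): boolean {
  const failedAt = state.failedAt.get(model)
  if (failedAt === undefined) return false
  return now - failedAt < cooldownSeconds * 1000
}

// Return the first candidate that is not cooling down, or undefined if none is eligible.
function pickNextModel(
  state: FallbackState,
  candidates: readonly string[],
  cooldownSeconds: number,
  now: number = Date.now(),
): string | undefined {
  return candidates.find((model) => !isInCooldown(state, model, cooldownSeconds, now))
}
```

Because each model carries its own timestamp, a later failure of one model no longer extends the cooldown window of a model that failed earlier.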

@cubic-dev-ai cubic-dev-ai bot left a comment

4 issues found across 8 files

Confidence score: 3/5

  • Category-level fallback_models are ignored in src/hooks/runtime-fallback/index.ts, so agents inheriting category config may not fall back as intended, which could reduce reliability under failure.
  • Cooldown tracking in src/hooks/runtime-fallback/index.ts uses a single lastFallbackTime for all models, potentially extending cooldowns incorrectly when multiple models fail, which can skew fallback behavior.
  • Overall risk is moderate due to multiple runtime-fallback logic issues that could affect retry/fallback decisions in production flows.
  • Pay close attention to src/hooks/runtime-fallback/index.ts, src/hooks/runtime-fallback/constants.ts - fallback selection and error classification logic may misbehave under certain conditions.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/hooks/runtime-fallback/constants.ts">

<violation number="1" location="src/hooks/runtime-fallback/constants.ts:32">
P2: Unanchored numeric regexes (`/429/`, `/503/`, `/529/`) will match any occurrence of those digits inside larger numbers, causing false retry/fallback classification on unrelated error messages.</violation>
</file>

<file name="src/hooks/runtime-fallback/index.ts">

<violation number="1" location="src/hooks/runtime-fallback/index.ts:97">
P2: Fallback agent detection is limited to a hardcoded regex list, so custom agents in config won’t be recognized from sessionID when the event lacks an agent, preventing configured fallback models from being used.</violation>

<violation number="2" location="src/hooks/runtime-fallback/index.ts:108">
P2: The fallback model resolver only checks agent-specific `fallback_models` but ignores category-level fallback configurations. Since `CategoryConfigSchema` supports `fallback_models` and agents commonly inherit settings from categories, this function should also resolve the agent's category and check `pluginConfig.categories[category].fallback_models` when agent-specific fallbacks are not defined.</violation>

<violation number="3" location="src/hooks/runtime-fallback/index.ts:115">
P2: Cooldown tracking uses a single `lastFallbackTime` timestamp for all models, which causes incorrect cooldown behavior. When multiple models fail in sequence, earlier failures have their cooldown window extended by later failures. Consider storing per-model failure timestamps (e.g., `failedModels: Map<string, number>` instead of `Set<string>`) so each model's cooldown is tracked independently.</violation>
</file>

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/hooks/runtime-fallback/index.test.ts">

<violation number="1" location="src/hooks/runtime-fallback/index.test.ts:476">
P2: Test assertion permits the "No fallback models configured" path, so the new tests can pass without exercising any fallback switching logic; with the provided mock config (no fallback_models), these tests become ineffective and mask real failures.</violation>
</file>

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 7 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/tools/delegate-task/executor.ts">

<violation number="1" location="src/tools/delegate-task/executor.ts:552">
P2: `executeSyncTask` registers the session category but never removes it, leaving `SessionCategoryRegistry` entries to accumulate for each sync session. Consider removing the session from the registry on completion/error (similar to `subagentSessions.delete`).</violation>
</file>

<file name="src/agents/utils.ts">

<violation number="1" location="src/agents/utils.ts:307">
P3: Duplicated fallback_models normalization logic is repeated in four places. This should be centralized (e.g., a helper) to avoid inconsistent changes later.</violation>
</file>


Rebase Bot and others added 9 commits February 4, 2026 19:41
Add configuration schemas for runtime model fallback feature:
- RuntimeFallbackConfigSchema with enabled, retry_on_errors,
  max_fallback_attempts, cooldown_seconds, notify_on_fallback
- FallbackModelsSchema for init-time fallback model selection
- Add fallback_models to AgentOverrideConfigSchema and CategoryConfigSchema
- Export types and schemas from config/index.ts

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

- Add category-level fallback_models support in getFallbackModelsForSession()
  - Try agent-level fallback_models first
  - Then try agent's category fallback_models
  - Support all builtin agents including hephaestus, sisyphus-junior, build, plan

- Expand agent name recognition regex to include:
  - hephaestus, sisyphus-junior, build, plan, multimodal-looker

- Add comprehensive test coverage (6 new tests, total 24):
  - Model switching via chat.message hook
  - Agent-level fallback_models configuration
  - SessionID agent pattern detection
  - Cooldown mechanism validation
  - Max attempts limit enforcement

All 24 tests passing

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Implement full fallback_models support across all integration points:

1. Model Resolution Pipeline (src/shared/model-resolution-pipeline.ts)
   - Add userFallbackModels to ModelResolutionRequest
   - Process user fallback_models before hardcoded fallback chain
   - Support both connected provider and availability checking modes

2. Agent Utils (src/agents/utils.ts)
   - Update applyModelResolution to accept userFallbackModels
   - Inject fallback_models for all builtin agents (sisyphus, oracle, etc.)
   - Support both single string and array formats

3. Model Resolver (src/shared/model-resolver.ts)
   - Add userFallbackModels to ExtendedModelResolutionInput type
   - Pass through to resolveModelPipeline

4. Delegate Task Executor (src/tools/delegate-task/executor.ts)
   - Extract category fallback_models configuration
   - Pass to model resolution pipeline
   - Register session category for runtime-fallback hook

5. Session Category Registry (src/shared/session-category-registry.ts)
   - New module: maps sessionID -> category
   - Used by runtime-fallback to lookup category fallback_models
   - Auto-cleanup support

6. Runtime Fallback Hook (src/hooks/runtime-fallback/index.ts)
   - Check SessionCategoryRegistry first for category fallback_models
   - Fallback to agent-level configuration
   - Import and use SessionCategoryRegistry

Test Results:
- runtime-fallback: 24/24 tests passing
- model-resolver: 46/46 tests passing

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
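The session category registry described in item 5 above, together with the cleanup concern raised in review for `executeSyncTask`, might look roughly like this. It is a sketch under assumptions: the class name, method names, and the `withSessionCategory` helper are hypothetical, and the real module in src/shared/session-category-registry.ts may expose a different API.

```typescript
// Hypothetical sketch of a sessionID -> category registry.
class SessionCategoryRegistrySketch {
  private readonly categories = new Map<string, string>()

  register(sessionID: string, category: string): void {
    this.categories.set(sessionID, category)
  }

  get(sessionID: string): string | undefined {
    return this.categories.get(sessionID)
  }

  remove(sessionID: string): boolean {
    return this.categories.delete(sessionID)
  }

  get size(): number {
    return this.categories.size
  }
}

// Register the category for the duration of a task and always remove it
// afterwards, on success or error, so entries do not accumulate.
async function withSessionCategory<T>(
  registry: SessionCategoryRegistrySketch,
  sessionID: string,
  category: string,
  run: () => Promise<T>,
): Promise<T> {
  registry.register(sessionID, category)
  try {
    return await run()
  } finally {
    registry.remove(sessionID)
  }
}
```

Wrapping the task in try/finally is one way to guarantee removal; an explicit delete on the completion and error paths, as the review suggests, would serve the same purpose.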
youming-ai and others added 4 commits February 5, 2026 23:15
…ching

Replace word-boundary regex with stricter patterns that match status codes only at start/end of string or surrounded by whitespace.

Prevents false matches like '1429' or '4290'.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
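One way to read the stricter matching described in the commit above is a pattern that accepts a status code only when it is delimited by whitespace or by the start or end of the string. The sketch below illustrates that reading; the function names are hypothetical and the actual patterns in src/hooks/runtime-fallback/constants.ts are not shown in this conversation.

```typescript
// Build a pattern that matches `code` only when it stands alone:
// preceded by start-of-string or whitespace, followed by whitespace or end-of-string.
function buildStatusPattern(code: number): RegExp {
  return new RegExp(`(?:^|\\s)${code}(?:\\s|$)`)
}

// True if the error message mentions any of the retryable status codes.
function matchesRetryableStatus(message: string, codes: readonly number[]): boolean {
  return codes.some((code) => buildStatusPattern(code).test(message))
}

const retryOnErrors = [429, 503, 529]
matchesRetryableStatus("429 Too Many Requests", retryOnErrors)  // true
matchesRetryableStatus("request id 1429 failed", retryOnErrors) // false
```

A trade-off of this reading is that a code followed directly by punctuation, such as `529:` or `status=429`, would not match, so the delimiter set may need widening depending on the error formats providers actually return.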
Add shared utility to normalize fallback_models config values.

Handles both single string and array inputs consistently.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
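The normalization this commit describes (a single string or an array, handled consistently) could be sketched as below. The signature and the choice to return an empty array for a missing value are assumptions; the real normalizeFallbackModels() is not shown in this conversation and may behave differently, so the sketch uses a distinct name.

```typescript
// Assumed behavior: undefined -> [], "model" -> ["model"], array -> shallow copy.
function normalizeFallbackModelsSketch(value: string | string[] | undefined): string[] {
  if (value === undefined) return []
  if (typeof value === "string") return [value]
  return [...value] // copy so callers cannot mutate the config array
}

normalizeFallbackModelsSketch("openai/gpt-5.2")
normalizeFallbackModelsSketch(["openai/gpt-5.2", "google/gemini-3-pro"])
```

Centralizing this in one helper means every call site treats the string and array forms of fallback_models the same way.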
Replace 5 instances of inline fallback_models normalization with the shared normalizeFallbackModels() utility function.

Eliminates code duplication and ensures consistent behavior.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Resolved conflicts in:

- src/config/schema.ts (kept both hooks)
- src/hooks/index.ts (exported both hooks)
- src/index.ts (imported both hooks)
- src/shared/index.ts (exported both utilities)