SDK should support generic local inference for subagent/tool-call tasks (BYOM limited to Anthropic only) #1544

@ian-morgan99

Problem

The Copilot SDK does not expose a configuration path for routing subagent and tool-call tasks through generic local inference endpoints (e.g., LM Studio, Ollama). BYOM (bring-your-own-model) providers are supported, but only for Anthropic; there is no mechanism for routing to an arbitrary local model.

What BYOM Does Today

  • BYOM supports Anthropic provider configuration
  • Subagents inherit the session model by design
  • The multiplier guard prevents subagent model escalation (see #3565)

What BYOM Does Not Do

  • Register generic local inference servers as providers
  • Route subagent tasks through LM Studio or other local endpoints
  • Support non-Anthropic external model providers
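
To make the gap concrete, here is a minimal sketch of what registering a generic local server could look like. Everything in it is hypothetical: `LocalProvider`, `ProviderRegistry`, and the model names are illustrations, not existing Copilot SDK APIs. The only outside facts it relies on are that LM Studio and Ollama serve OpenAI-compatible endpoints, by default on ports 1234 and 11434 respectively.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LocalProvider:
    """Hypothetical description of an OpenAI-compatible local server."""
    name: str
    base_url: str  # root of the server's OpenAI-compatible API
    model: str     # placeholder model identifier

    def chat_url(self) -> str:
        # OpenAI-compatible servers expose chat completions under this path.
        return self.base_url.rstrip("/") + "/chat/completions"


class ProviderRegistry:
    """Hypothetical registry the SDK could consult for subagent routing."""

    def __init__(self) -> None:
        self._providers: Dict[str, LocalProvider] = {}

    def register(self, provider: LocalProvider) -> None:
        if provider.name in self._providers:
            raise ValueError(f"provider already registered: {provider.name}")
        self._providers[provider.name] = provider

    def get(self, name: str) -> LocalProvider:
        return self._providers[name]


registry = ProviderRegistry()
# Default local ports; the model names are placeholders.
registry.register(LocalProvider("lmstudio", "http://localhost:1234/v1", "local-model"))
registry.register(LocalProvider("ollama", "http://localhost:11434/v1", "llama3"))
```

The point of the sketch is that a provider is identified by a base URL rather than by a vendor, so any OpenAI-compatible endpoint could be registered the same way.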

Impact

  • Privacy: workspace context leaks to the cloud when local routing is unavailable
  • Security: sensitive code is transmitted to third-party inference services when no local path exists
  • Cost: subagent tasks that could run locally incur untracked cloud API usage
  • Trust: silent cloud routing violates local-only expectations

Requested Behavior

  1. SDK configuration for generic local model routing in subagent/tool-call paths (not just Anthropic BYOM)
  2. Per-request cost/privacy warnings when cloud fallback triggers
  3. Tiered privacy levels with explicit opt-in for cloud routing
  4. Default to local-first routing where available
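
The four requests above could combine into a single routing decision, sketched below. This is one possible shape under stated assumptions, not a proposed SDK API: the names `PrivacyLevel`, `RouteDecision`, `LocalUnavailableError`, and `route_subagent_task` are all hypothetical, and the three tiers are one illustrative reading of "tiered privacy levels".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PrivacyLevel(Enum):
    """Hypothetical tiers; cloud routing requires choosing a non-default tier."""
    LOCAL_ONLY = "local_only"            # default: never leave the machine
    CLOUD_FALLBACK = "cloud_fallback"    # opt-in: cloud only if local is down
    CLOUD_PREFERRED = "cloud_preferred"  # opt-in: cloud even if local is up


class LocalUnavailableError(RuntimeError):
    """Raised instead of silently falling back to the cloud."""


@dataclass
class RouteDecision:
    target: str                          # "local" or "cloud"
    warnings: List[str] = field(default_factory=list)


def route_subagent_task(
    local_available: bool,
    level: PrivacyLevel = PrivacyLevel.LOCAL_ONLY,
) -> RouteDecision:
    # Local-first: use the local endpoint unless the user prefers cloud.
    if local_available and level is not PrivacyLevel.CLOUD_PREFERRED:
        return RouteDecision("local")
    # Without explicit opt-in, fail loudly rather than route to the cloud.
    if level is PrivacyLevel.LOCAL_ONLY:
        raise LocalUnavailableError("no local endpoint and cloud routing not enabled")
    # Every cloud-routed request carries a cost/privacy warning.
    return RouteDecision(
        "cloud",
        ["Request sent to cloud inference: workspace context leaves this machine "
         "and may incur API cost."],
    )
```

With the default `LOCAL_ONLY` tier, an unavailable local endpoint raises an error rather than routing silently, which addresses the trust concern listed under Impact.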
