FEATURE: SDK should support generic local inference endpoints as model providers #1545

@ian-morgan99

Problem Statement

The Copilot SDK does not expose a configuration path for generic local inference endpoints, that is, BYOM beyond the Anthropic-specific configurations. This prevents extensions and integrations from routing agent tasks through local models.

Current State

  • BYOM support exists but is limited to Anthropic-specific configurations
  • No mechanism to register arbitrary OpenAI-compatible endpoints as model providers
  • Subagent dispatch has no local routing path
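For reference, "OpenAI-compatible" here is taken to mean an endpoint that accepts the chat completions request shape (a POST to `/chat/completions` under some base URL). The sketch below only builds such a request and does not send it. The helper name `buildChatRequest`, the base URL `http://localhost:1234/v1/`, and the model name `local-model` are illustrative assumptions, not part of the Copilot SDK or of any particular local server's documented setup.

```typescript
// Minimal request shape for an OpenAI-compatible chat completions endpoint.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds (but does not send) a chat completions request.
function buildChatRequest(baseUrl: string, model: string, prompt: string): PreparedRequest {
  const body: ChatCompletionRequest = {
    model,
    messages: [{ role: "user", content: prompt }],
  };
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Assumed local endpoint and model name, for illustration only.
const exampleRequest = buildChatRequest("http://localhost:1234/v1/", "local-model", "Hello");
```

The result could be passed to `fetch(exampleRequest.url, exampleRequest.init)`. The point is that nothing in the request is provider-specific beyond the base URL and model name, which is what a generic provider configuration would need to carry.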

Impact on Extensions

Extensions like VSCode-LMStudio-Bridge cannot expose local models to the Copilot ecosystem because:

  1. No SDK API for registering custom model providers
  2. No configuration schema for generic inference endpoints
  3. Subagent tasks bypass any local routing entirely

How Other Agents Handle This

Opencode (anomalyco/opencode)

  • ProviderConfig abstraction with explicit provider registration
  • Supports llama.cpp, Ollama, and generic OpenAI-compatible endpoints
  • Provider configuration flows through all dispatch paths including subagents

Codex (openai/codex)

  • ModelProvider abstraction with routing layer (models_endpoint.rs)
  • Clean separation between tool execution and model inference
  • Supports multiple providers including local endpoints

What We Need in the SDK

  1. Model provider registration API: Allow extensions to register custom model providers
  2. Generic endpoint support: Support arbitrary OpenAI-compatible inference endpoints
  3. Provider priority/fallback: Support tiered routing (local first, cloud fallback)
  4. Subagent model inheritance: Ensure subagent tasks respect the session model
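One possible shape for items 1 to 3 is sketched below. Every name in it (`ModelProviderConfig`, `ProviderRegistry`, `register`, `resolve`, the `priority` and `isLocal` fields) is hypothetical and does not exist in the Copilot SDK today; the endpoints and model names are placeholders. It is meant only to make the request concrete, not to prescribe an implementation.

```typescript
// Hypothetical provider description; not an existing SDK type.
interface ModelProviderConfig {
  id: string;
  baseUrl: string; // e.g. an OpenAI-compatible endpoint
  model: string;
  priority: number; // lower number is tried first
  isLocal: boolean;
}

// Caller-supplied check, e.g. "did the endpoint respond recently?"
type HealthCheck = (provider: ModelProviderConfig) => boolean;

class ProviderRegistry {
  private providers = new Map<string, ModelProviderConfig>();

  // Item 1: explicit registration of custom providers.
  register(config: ModelProviderConfig): void {
    if (this.providers.has(config.id)) {
      throw new Error(`Provider already registered: ${config.id}`);
    }
    this.providers.set(config.id, config);
  }

  // Providers sorted by ascending priority value.
  ordered(): ModelProviderConfig[] {
    return [...this.providers.values()].sort((a, b) => a.priority - b.priority);
  }

  // Item 3: pick the first healthy provider in priority order.
  // With localOnly set, cloud providers are never returned, so there is
  // no silent cloud fallback.
  resolve(isHealthy: HealthCheck, localOnly = false): ModelProviderConfig | undefined {
    return this.ordered().find((p) => (!localOnly || p.isLocal) && isHealthy(p));
  }
}

// Illustrative setup: local first, cloud as fallback.
const registry = new ProviderRegistry();
registry.register({
  id: "cloud",
  baseUrl: "https://cloud.example.invalid/v1",
  model: "cloud-model",
  priority: 10,
  isLocal: false,
});
registry.register({
  id: "lmstudio",
  baseUrl: "http://localhost:1234/v1",
  model: "local-model",
  priority: 0,
  isLocal: true,
});
```

The `localOnly` flag is one way to make fallback an explicit opt-in: a caller that passes `true` gets `undefined` when no local provider is healthy, instead of being routed to the cloud.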

Security Implications

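Because subagent dispatch has no local routing path, tasks can reach cloud inference even when the user expects local-only operation. This creates:

  • Data leakage: Workspace context is transmitted to the cloud without user consent
  • Cost opacity: Cloud API usage from subagent tasks goes untracked
  • Trust erosion: Silent cloud routing violates local-only expectations
  • Compliance risk: Sensitive code may be processed by third-party inference services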

Acceptance Criteria

  1. SDK exposes API for registering custom model providers
  2. Generic OpenAI-compatible endpoints can be configured as providers
  3. Provider priority/fallback configuration is supported
  4. Subagent tasks respect session model when it is a local endpoint
  5. Documentation updated with provider registration and routing behavior
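Criterion 4 could be checked against behavior like the following sketch. The types `ModelRef`, `Session`, and `SubagentTask` and the function `resolveSubagentModel` are hypothetical names invented for illustration. Rejecting a cloud override under a local session is one assumed policy, chosen here to avoid silent cloud routing; the SDK could equally surface a consent prompt instead.

```typescript
// Hypothetical types; none of these exist in the Copilot SDK today.
interface ModelRef {
  providerId: string;
  model: string;
  isLocal: boolean;
}

interface Session {
  model: ModelRef;
}

interface SubagentTask {
  name: string;
  model?: ModelRef; // optional explicit override
}

// A subagent inherits the session model unless it names its own.
// Assumed policy: a local session never routes a subagent to a
// non-local model without failing loudly.
function resolveSubagentModel(session: Session, task: SubagentTask): ModelRef {
  const chosen = task.model ?? session.model;
  if (session.model.isLocal && !chosen.isLocal) {
    throw new Error(
      `Subagent "${task.name}" requested non-local model "${chosen.model}" in a local-only session`
    );
  }
  return chosen;
}

// Illustrative session bound to a local endpoint.
const localSession: Session = {
  model: { providerId: "lmstudio", model: "local-model", isLocal: true },
};
const inherited = resolveSubagentModel(localSession, { name: "explore" });
```

Here `inherited` is the session's own local model, since the task specifies no override.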