FEATURE: SDK should support generic local inference endpoints as model providers #1545

## Problem Statement

The Copilot SDK does not expose a configuration path for **generic local inference endpoints** (non-Anthropic BYOM). This blocks extensions and integrations from routing agent tasks through local models.

### Current State
- BYOM support exists but is limited to Anthropic-specific configurations
- No mechanism to register arbitrary OpenAI-compatible endpoints as model providers
- Subagent dispatch has no local routing path

### Impact on Extensions
Extensions like VSCode-LMStudio-Bridge cannot expose local models to the Copilot ecosystem because:
1. No SDK API for registering custom model providers
2. No configuration schema for generic inference endpoints
3. Subagent tasks bypass any local routing entirely
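To make point 2 concrete, a configuration schema for a generic endpoint might look roughly like the sketch below. Every name here (`GenericEndpointConfig`, `validateEndpointConfig`, `chatCompletionsUrl`) is hypothetical and does not exist in the Copilot SDK; the only outside facts assumed are that OpenAI-compatible servers serve chat requests at `/chat/completions` under their base URL and that LM Studio's local server listens on `http://localhost:1234/v1` by default.

```typescript
// Hypothetical schema sketch; none of these names exist in the Copilot SDK.
interface GenericEndpointConfig {
  id: string;       // stable identifier chosen by the extension
  baseUrl: string;  // OpenAI-compatible base URL, e.g. "http://localhost:1234/v1"
  model: string;    // model name as the endpoint reports it
  apiKey?: string;  // optional; many local servers ignore it
  local: boolean;   // true if traffic must stay on this machine
}

const LOOPBACK_HOSTS = ["localhost", "127.0.0.1", "[::1]"];

// Returns a list of human-readable problems; an empty list means the config is usable.
function validateEndpointConfig(cfg: GenericEndpointConfig): string[] {
  const errors: string[] = [];
  if (cfg.id.trim() === "") errors.push("id must not be empty");
  if (cfg.model.trim() === "") errors.push("model must not be empty");

  // Capture groups: scheme, host (bracketed IPv6 or name), optional port, optional path.
  const match = /^(https?):\/\/(\[[^\]]+\]|[^\/:\s]+)(:\d+)?(\/.*)?$/.exec(cfg.baseUrl);
  if (match === null) {
    errors.push("baseUrl must be an http(s) URL");
  } else if (cfg.local && LOOPBACK_HOSTS.indexOf(match[2]) === -1) {
    // An endpoint marked local must not point at a remote host.
    errors.push("local endpoints must use a loopback host");
  }
  return errors;
}

// Builds the chat completions URL, tolerating trailing slashes on baseUrl.
function chatCompletionsUrl(cfg: GenericEndpointConfig): string {
  return cfg.baseUrl.replace(/\/+$/, "") + "/chat/completions";
}

const lmStudio: GenericEndpointConfig = {
  id: "lmstudio",
  baseUrl: "http://localhost:1234/v1",
  model: "local-model",
  local: true,
};
```

The `local` flag is the part that matters for the bridge use case: it lets the SDK reject a config that claims to be local but points off-machine, instead of trusting the label.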

## How Other Agents Handle This

### Opencode (anomalyco/opencode)
- `ProviderConfig` abstraction with explicit provider registration
- Supports llama.cpp, Ollama, and generic OpenAI-compatible endpoints
- Provider configuration flows through all dispatch paths including subagents

### Codex (openai/codex)
- `ModelProvider` abstraction with routing layer (`models_endpoint.rs`)
- Clean separation between tool execution and model inference
- Supports multiple providers including local endpoints

## What We Need in the SDK

1. **Model provider registration API**: Allow extensions to register custom model providers
2. **Generic endpoint support**: Support arbitrary OpenAI-compatible inference endpoints
3. **Provider priority/fallback**: Support tiered routing (local first, cloud fallback)
4. **Subagent model inheritance**: Ensure subagent tasks respect the session model
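As a starting point for discussion, items 1 and 3 could be shaped something like the sketch below. This is not an existing SDK API: `ModelProvider`, `RoutingPolicy`, and `ProviderRegistry` are hypothetical names, and the availability check is a stand-in for whatever health probe the SDK would actually use. The one design choice it tries to show is that fallback to cloud is an explicit opt-in, so resolution fails closed rather than routing to cloud silently.

```typescript
// Hypothetical API sketch; not part of the Copilot SDK.
interface ModelProvider {
  id: string;
  kind: "local" | "cloud";
  priority: number;        // lower values are tried first
  isAvailable(): boolean;  // stand-in for a real health check
}

interface RoutingPolicy {
  allowCloudFallback: boolean; // false means local-only
}

class ProviderRegistry {
  private providers: ModelProvider[] = [];

  // Registers a provider; duplicate ids are rejected so routing stays unambiguous.
  register(provider: ModelProvider): void {
    if (this.providers.some((p) => p.id === provider.id)) {
      throw new Error(`provider already registered: ${provider.id}`);
    }
    this.providers.push(provider);
  }

  // Picks the highest-priority available provider that the policy permits.
  // Throws instead of falling back to a provider the policy forbids.
  resolve(policy: RoutingPolicy): ModelProvider {
    const candidates = this.providers
      .filter((p) => p.kind === "local" || policy.allowCloudFallback)
      .filter((p) => p.isAvailable())
      .sort((a, b) => a.priority - b.priority);
    if (candidates.length === 0) {
      throw new Error("no permitted provider is available");
    }
    return candidates[0];
  }
}

const registry = new ProviderRegistry();
// Simulate a local server that is currently down and a cloud provider that is up.
registry.register({ id: "lmstudio", kind: "local", priority: 0, isAvailable: () => false });
registry.register({ id: "cloud-default", kind: "cloud", priority: 10, isAvailable: () => true });
```

With this setup, `registry.resolve({ allowCloudFallback: true })` returns the cloud provider because the local one is down, while `registry.resolve({ allowCloudFallback: false })` throws, which is the behavior a local-only user would expect.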

## Security Implications

Without a local routing path, tasks that users expect to stay local can be routed to cloud inference instead. The risks:

- **Data leakage**: Workspace context is transmitted to the cloud without user consent
- **Cost opacity**: Subagent tasks incur untracked cloud API usage
- **Trust erosion**: Silent cloud routing violates local-only expectations
- **Compliance risk**: Sensitive code may be processed by third-party inference services

## Related Issues

- github/copilot-cli#3565 — multiplier guard silently downgrades subagent model
- ian-morgan99/VSCode-LMStudio-Bridge#337 — bridge-level gap analysis
- ian-morgan99/VSCode-LMStudio-Bridge#336 — zero BYOM integration in bridge

## Acceptance Criteria

1. [ ] SDK exposes API for registering custom model providers
2. [ ] Generic OpenAI-compatible endpoints can be configured as providers
3. [ ] Provider priority/fallback configuration is supported
4. [ ] Subagent tasks respect session model when it is a local endpoint
5. [ ] Documentation updated with provider registration and routing behavior
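
Criterion 4 could be verified against behavior like the following sketch. The types and the `planSubagentDispatch` function are hypothetical and only illustrate the expected rule: a subagent uses the session's provider and model unless the caller explicitly overrides them, and the plan records which of the two happened so it can be logged or audited.

```typescript
// Hypothetical types; the Copilot SDK does not define these today.
interface SessionModel {
  providerId: string;
  model: string;
}

interface SubagentTask {
  description: string;
  override?: SessionModel; // explicit opt-out of inheritance
}

interface DispatchPlan {
  description: string;
  providerId: string;
  model: string;
  inherited: boolean; // true when the session model was used
}

// A subagent inherits the session model unless the task carries an explicit override.
function planSubagentDispatch(session: SessionModel, task: SubagentTask): DispatchPlan {
  const inherited = task.override === undefined;
  const target = task.override !== undefined ? task.override : session;
  return {
    description: task.description,
    providerId: target.providerId,
    model: target.model,
    inherited,
  };
}

const localSession: SessionModel = { providerId: "lmstudio", model: "local-model" };
const inheritedPlan = planSubagentDispatch(localSession, { description: "summarize diff" });
```

Here `inheritedPlan` targets `lmstudio` with `inherited` set to `true`; nothing in the dispatch path substitutes a different model on its own.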
