feat: Model-selection policy schema for agentic workflows #2309

## Summary

Define a model-selection policy object that captures how agentic workflows choose, validate, and fall back between models at runtime. The schema lives in this repo (gh-aw-firewall) alongside other policy primitives (network ACLs, seccomp profiles). The gh-aw compiler parses it at compile time, and AWF enforces it at execution time.

Companion to [gh-aw#29191](https://github.com/github/gh-aw/issues/29191) (model fallback feature request).

## Proposed Policy Schema

```jsonc
{
  "$schema": "https://github.com/github/gh-aw-firewall/schemas/model-policy.v1.json",
  "version": "1",

  // Primary model selection
  "model": {
    "id": "gpt-5.2",                    // requested model
    "reasoning_effort": "medium",        // optional engine-specific param
    "provider": "copilot"                // copilot | anthropic | openai | custom
  },

  // Fallback chain — tried in order when primary is unavailable
  "fallback": [
    { "id": "gpt-4.1", "provider": "copilot" },
    { "id": "claude-sonnet-4-20250514", "provider": "anthropic" },
    { "strategy": "auto" }              // sentinel: pick best available
  ],

  // Constraints applied to ALL model selections (primary + fallbacks)
  "constraints": {
    "capabilities": ["tool-use", "vision"],  // required capabilities
    "max_context_window": null,              // null = no limit
    "min_context_window": 128000,            // minimum tokens
    "cost_tier": "standard"                  // standard | premium | economy
  },

  // Behavior when no model satisfies constraints
  "on_unavailable": "fail",  // "fail" | "warn-and-use-best" | "queue"

  // Audit/observability
  "audit": {
    "log_selection": true,          // log which model was selected and why
    "log_fallback_reason": true     // log why primary was skipped
  }
}
```
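For readers working in `src/model-policy.ts`, the example above could be mirrored by types along these lines. This is a sketch, not the agreed schema: the type names (`ModelPolicy`, `ModelRef`, `AutoSentinel`, `isAuto`) are hypothetical, and which fields are optional is an assumption, since the proposal does not say.

```typescript
// Hypothetical type names; field names and literal values are taken from the
// JSONC example. Optionality of each field is an assumption.
type Provider = "copilot" | "anthropic" | "openai" | "custom";
type CostTier = "standard" | "premium" | "economy";
type OnUnavailable = "fail" | "warn-and-use-best" | "queue";

interface ModelRef {
  id: string;
  provider: Provider;
  reasoning_effort?: string; // optional engine-specific parameter
}

interface AutoSentinel {
  strategy: "auto"; // sentinel: pick the best available model
}

type FallbackEntry = ModelRef | AutoSentinel;

interface ModelPolicy {
  $schema?: string;
  version: "1";
  model: ModelRef;
  fallback?: FallbackEntry[];
  constraints?: {
    capabilities?: string[];
    max_context_window?: number | null; // null = no limit
    min_context_window?: number | null;
    cost_tier?: CostTier;
  };
  on_unavailable?: OnUnavailable;
  audit?: { log_selection?: boolean; log_fallback_reason?: boolean };
}

// Type guard that separates the "auto" sentinel from concrete model entries.
function isAuto(entry: FallbackEntry): entry is AutoSentinel {
  return "strategy" in entry;
}

// The policy from the example above, checked against the sketch types.
const examplePolicy: ModelPolicy = {
  version: "1",
  model: { id: "gpt-5.2", reasoning_effort: "medium", provider: "copilot" },
  fallback: [
    { id: "gpt-4.1", provider: "copilot" },
    { id: "claude-sonnet-4-20250514", provider: "anthropic" },
    { strategy: "auto" },
  ],
  constraints: {
    capabilities: ["tool-use", "vision"],
    max_context_window: null,
    min_context_window: 128000,
    cost_tier: "standard",
  },
  on_unavailable: "fail",
  audit: { log_selection: true, log_fallback_reason: true },
};
```

Modeling the fallback chain as a union keeps the `auto` sentinel distinct from concrete model entries, so resolver code can branch on it without string comparisons against model IDs.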

## Integration Points

### 1. Compiler (`gh-aw compile`)

The compiler reads the model-selection policy from the workflow frontmatter and serializes it into the lock file. The frontmatter uses a shorthand in which fallback entries are bare model IDs or `auto`:

```yaml
# In .md workflow frontmatter
model: gpt-5.2
model-policy:
  fallback: [gpt-4.1, claude-sonnet-4-20250514, auto]
  constraints:
    capabilities: [tool-use]
    min_context_window: 128000
  on_unavailable: fail
```
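The frontmatter lists fallbacks as bare strings, while the policy schema uses objects. One plausible expansion step is sketched below. It assumes that the string `auto` maps to the `{ "strategy": "auto" }` sentinel and every other string is a model ID. The function name `expandFallback` is hypothetical, and how `provider` would be inferred for each entry is not specified in this proposal, so the sketch leaves it out.

```typescript
// Simplified fallback entry: provider is omitted because the shorthand does
// not carry it and the proposal does not say how it is inferred.
type FallbackEntry = { id: string } | { strategy: "auto" };

// Hypothetical helper: expand frontmatter shorthand into schema-style entries.
// Assumption: "auto" is reserved for the sentinel; anything else is a model ID.
function expandFallback(shorthand: string[]): FallbackEntry[] {
  return shorthand.map((item): FallbackEntry =>
    item === "auto" ? { strategy: "auto" } : { id: item }
  );
}

const expanded = expandFallback(["gpt-4.1", "claude-sonnet-4-20250514", "auto"]);
```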

The compiler validates that:
- Model IDs appear in a known registry (unknown IDs produce a warning, not an error)
- Constraint fields are well-formed
- The fallback chain does not exceed a maximum depth (e.g., 5)
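A minimal sketch of these checks follows. The names `validatePolicy`, `KNOWN_MODELS`, and `MAX_FALLBACK_DEPTH` are hypothetical, the registry is a hard-coded stand-in, and the specific well-formedness rules (positive integers, minimum not above maximum) are assumptions about what "well-formed" might mean here.

```typescript
type FallbackEntry = { id: string } | { strategy: "auto" };

interface PolicyInput {
  model: { id: string };
  fallback?: FallbackEntry[];
  constraints?: {
    min_context_window?: number | null;
    max_context_window?: number | null;
  };
}

interface ValidationResult {
  errors: string[];   // would block compilation
  warnings: string[]; // reported, but do not block
}

const MAX_FALLBACK_DEPTH = 5; // the example value given above
// Stand-in registry for illustration; a real one would be loaded elsewhere.
const KNOWN_MODELS = new Set(["gpt-5.2", "gpt-4.1", "claude-sonnet-4-20250514"]);

function validatePolicy(policy: PolicyInput): ValidationResult {
  const errors: string[] = [];
  const warnings: string[] = [];
  const fallback = policy.fallback ?? [];

  // 1. Unknown model IDs only warn, so new models do not break compilation.
  const ids = [policy.model.id];
  for (const entry of fallback) {
    if ("id" in entry) ids.push(entry.id);
  }
  for (const id of ids) {
    if (!KNOWN_MODELS.has(id)) warnings.push(`unknown model ID "${id}"`);
  }

  // 2. Constraint fields must be well-formed (assumed rules, see lead-in).
  const min = policy.constraints?.min_context_window ?? null;
  const max = policy.constraints?.max_context_window ?? null;
  const checkWindow = (name: string, value: number | null): void => {
    if (value !== null && (!Number.isInteger(value) || value <= 0)) {
      errors.push(`${name} must be a positive integer or null`);
    }
  };
  checkWindow("min_context_window", min);
  checkWindow("max_context_window", max);
  if (min !== null && max !== null && min > max) {
    errors.push("min_context_window exceeds max_context_window");
  }

  // 3. The fallback chain must not exceed the maximum depth.
  if (fallback.length > MAX_FALLBACK_DEPTH) {
    errors.push(`fallback chain has ${fallback.length} entries; max is ${MAX_FALLBACK_DEPTH}`);
  }

  return { errors, warnings };
}
```

Keeping errors and warnings in separate lists lets the compiler fail only on structural problems while still surfacing unrecognized model IDs.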

### 2. AWF Runtime Enforcement

At container startup, AWF:

1. **Reads** the model policy from the workflow metadata (passed via env var `AWF_MODEL_POLICY_B64` or similar)
2. **Queries** available models via the API proxy sidecar (`GET /models`)
3. **Resolves** the effective model by walking: primary → fallback[0] → fallback[1] → ... → `auto`
4. **Applies constraints** to filter candidates (capabilities, context window, cost tier)
5. **Sets** `AWF_RESOLVED_MODEL` env var in the agent container
6. **Emits** audit log entries for observability
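Steps 3, 4, and 6 could be sketched roughly as below for `src/model-resolver.ts`. Several things here are assumptions rather than part of the proposal: the shape of the `GET /models` response (`AvailableModel`), the decision to check constraints on each candidate during the walk, exact-match semantics for `cost_tier`, and `auto` meaning "first available model that passes the constraints" (the `auto` strategy is still an open question). All function and type names are hypothetical.

```typescript
// Assumed shape of one entry returned by the proxy's GET /models endpoint.
interface AvailableModel {
  id: string;
  capabilities: string[];
  context_window: number;
  cost_tier: string;
}

type Candidate = { id: string } | { strategy: "auto" };

interface Constraints {
  capabilities?: string[];
  min_context_window?: number | null;
  max_context_window?: number | null;
  cost_tier?: string;
}

interface Resolution {
  resolved: string | null; // value for AWF_RESOLVED_MODEL, or null if none fits
  skipped: { candidate: string; reason: string }[]; // audit trail
}

// Returns why a model violates the constraints, or null if it satisfies them.
function violation(m: AvailableModel, c: Constraints): string | null {
  const missing = (c.capabilities ?? []).filter((cap) => !m.capabilities.includes(cap));
  if (missing.length > 0) return `missing capabilities: ${missing.join(", ")}`;
  if (c.min_context_window != null && m.context_window < c.min_context_window) {
    return "context window below minimum";
  }
  if (c.max_context_window != null && m.context_window > c.max_context_window) {
    return "context window above maximum";
  }
  // Assumption: cost_tier must match exactly.
  if (c.cost_tier !== undefined && m.cost_tier !== c.cost_tier) {
    return `cost tier "${m.cost_tier}" does not match "${c.cost_tier}"`;
  }
  return null;
}

// Walks primary -> fallback[0] -> ... -> auto and records why entries were skipped.
function resolveModel(chain: Candidate[], available: AvailableModel[], c: Constraints): Resolution {
  const skipped: Resolution["skipped"] = [];
  for (const cand of chain) {
    if ("strategy" in cand) {
      // Assumption: "auto" takes the first available model that passes.
      const pick = available.find((m) => violation(m, c) === null);
      if (pick) return { resolved: pick.id, skipped };
      skipped.push({ candidate: "auto", reason: "no available model satisfies constraints" });
      continue;
    }
    const model = available.find((m) => m.id === cand.id);
    if (!model) {
      skipped.push({ candidate: cand.id, reason: "not available" });
      continue;
    }
    const why = violation(model, c);
    if (why !== null) {
      skipped.push({ candidate: cand.id, reason: why });
      continue;
    }
    return { resolved: cand.id, skipped };
  }
  return { resolved: null, skipped };
}
```

A `resolved` value of `null` is where the policy's `on_unavailable` setting would take over, and the `skipped` list carries the information that `log_fallback_reason` asks for.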

### 3. API Proxy Sidecar

The api-proxy can enforce the resolved model:
- If the agent requests a model other than `AWF_RESOLVED_MODEL`, the proxy either:
  - **Rewrites** the request to use the resolved model (transparent enforcement), or
  - **Rejects** it with HTTP 400 and guidance (strict enforcement)
- Configuration: `"enforcement": "rewrite" | "reject" | "passthrough"`
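The three enforcement modes can be expressed as a small pure function, sketched here. The `Decision` type and the `enforce` name are hypothetical, and the reading of `passthrough` as "forward the request unchanged" is an assumption, since the proposal lists the mode without describing it.

```typescript
type Enforcement = "rewrite" | "reject" | "passthrough";

// Hypothetical decision type describing what the proxy does with a request.
type Decision =
  | { action: "forward"; model: string }
  | { action: "reject"; status: 400; message: string };

function enforce(requested: string, resolved: string, mode: Enforcement): Decision {
  // Matching requests are always forwarded. Assumption: "passthrough"
  // forwards mismatched requests unchanged as well.
  if (requested === resolved || mode === "passthrough") {
    return { action: "forward", model: requested };
  }
  if (mode === "rewrite") {
    // Transparent enforcement: swap in the resolved model.
    return { action: "forward", model: resolved };
  }
  // Strict enforcement: reject with guidance naming the permitted model.
  return {
    action: "reject",
    status: 400,
    message: `model "${requested}" is not permitted by policy; use "${resolved}"`,
  };
}
```

Keeping the decision separate from the HTTP handling makes each mode easy to unit-test without starting the sidecar.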

## Schema Location

```
gh-aw-firewall/
├── schemas/
│   └── model-policy.v1.json          # JSON Schema definition
├── src/
│   ├── model-policy.ts               # Parser + validator
│   └── model-resolver.ts             # Resolution logic (primary → fallback → auto)
└── docs/
    └── model-selection-policy.md      # Specification document
```

## Design Principles

1. **Declarative over imperative** — Policy describes intent, not implementation
2. **Fail-safe defaults** — Without explicit policy, current behavior (hard fail) is preserved
3. **Auditable** — Every model selection decision is logged with reasoning
4. **Composable** — Org-level policies can constrain repo-level policies (future: org policy inheritance)
5. **Engine-agnostic** — Works across Copilot, Claude, Codex, and custom engines

## Open Questions

- [ ] Should `auto` strategy use capability matching or just pick the "best" available model?
- [ ] How does org-level model governance interact with repo-level policy? (e.g., org bans certain models)
- [ ] Should the policy support time-based selection? (e.g., use cheaper model for scheduled jobs)
- [ ] Is `AWF_MODEL_POLICY_B64` the right transport, or should it be a file mount?

## Related

- [gh-aw#29191](https://github.com/github/gh-aw/issues/29191) — Model fallback feature request (upstream)
- [sweagentd#11264](https://github.com/github/sweagentd/issues/11264) — CCA jobs failing with `model_not_supported` (motivating incident)
- Network policy precedent: `--allow-domains` → Squid ACL (same pattern: declarative policy → runtime enforcement)
