Structured outputs in conclaves: schema-validated agent results #57

@BeinerChes

Motivation

Lab journals 09 and 10 found two recurring classes of conclave failure that simple structural validation would have caught at no extra cost:

  • Writer summary hallucinates blocker counts (journal 09 run 150, journal 10 run 157 — "3 blockers" / "4 blockers" in the summary string while the body lists different numbers).
  • Writer passes through specialist contradictions (journal 10: Security flagged absolute-path traversal as a blocker while Conventions praised the same code as correctly sandboxed, in the same run).

Both are structural defects the conclave can't detect today because agents emit free-form text and the Writer has no schema to validate against. See .notes/09-runtime-ts-review-matrix-and-kb-paradox.md and .notes/10-builtin-tools-review-matrix.md for full context.

What Anthropic gives us

Anthropic shipped structured outputs GA across Opus 4.6 / Sonnet 4.6 / Sonnet 4.5 / Haiku 4.5 (Claude API + Bedrock). Two complementary features:

  1. JSON outputs (output_config.format on the core Messages API; outputFormat on query() in the Claude Agent SDK). The model physically cannot emit tokens that violate the schema — constrained decoding. Agent can still use tools during reasoning; the final result is the schema-valid JSON.

  2. Strict tool use (strict: true on tool definitions). When the model calls a tool, parameters are guaranteed to match the tool's input_schema.

Both are schema-driven. Both remove retries and parsing failures. TypeScript has first-class helpers: zodOutputFormat() and jsonSchemaOutputFormat() from the SDK, plus z.toJSONSchema(), which is part of Zod 4 itself rather than the SDK.

Docs:

Goal

Let conclave authors define structured outputs on agent nodes via a form in the editor, with no TypeScript written by the author. The engine threads the schema through to the Claude SDK's outputFormat option, returns validated JSON to downstream nodes, and — for multiple sibling specialists feeding one downstream node — runs a cheap comparison check to detect contradictions.

This reuses existing primitives: no new node type, no new edge semantics, no new framework. The agent inspector gets one new section, the Claude path reads one new config field, and the MCP layer is untouched.

Out of scope (explicitly)

  • No custom "add_finding tool" per-agent. The Agent SDK's native outputFormat subsumes that design — agents use existing tools freely during reasoning, the SDK enforces the final shape.
  • No new "Gate" or "Blocker" node type. Contradiction detection is an engine feature wired automatically when sibling specialists feed one downstream node, not a node authors manually add.
  • No LangChain-style code-first framework. Authors never open a code editor to use this.

Phase 0: Verification (before shipping anything)

  • Confirm @anthropic-ai/claude-agent-sdk@0.2.91 supports options.outputFormat on query(). If not, bump the SDK. Inspect the Options type in node_modules/@anthropic-ai/claude-agent-sdk/dist/index.d.ts.
  • Confirm the SDK's tool() helper passes strict: true through (secondary priority — phase 2+ only).
  • Note: SDK declares zod ^4.0.0 as peer dep; we're on zod ^3.24. Pre-existing warning, not blocking here. File separately if it bites.

Phase 1: UI — Structured Output as a droppable internal tool

Scope: packages/client only. No runtime behavior change yet.

UX model: structured output is opt-in, added by dragging a "Structured Output" entry from the node palette onto an agent node — same gesture as adding a built-in tool (Read, Write, Bash) today. Agents that don't need structured output don't see any schema config cluttering their inspector. When the "Structured Output" chip is present on an agent, the inspector surfaces a field builder; when it's absent, the inspector is unchanged.

This is not a graph node — it's a chip/slot that lives inside an agent node, like the existing built-in-tool chips.

  • Add outputSchema?: JsonSchemaObject to ResolvedAgentConfig in packages/shared/src/schemas/agent.ts. Stays undefined when no Structured Output chip is attached; becomes the user-built schema when one is.
  • Add "Structured Output" as a new entry in the node palette (packages/client/src/components/editor/node-palette.tsx), in the same section as built-in tools. Icon: something schema-ish (ListChecks or Braces from Lucide).
  • Dragging it onto an agent node sets outputSchema: { type: "object", properties: {} } (empty schema, valid-but-useless starting state). The agent inspector grows a new section revealing the field builder. Removing the chip sets outputSchema: undefined.
  • Field builder UI (in the agent inspector, only visible when outputSchema is set):
    • Mode toggle: Single output / List of items
    • For each field: name · type (string/integer/number/boolean/enum) · required checkbox · description · constraints (min/max/minLength/maxLength/enum values)
    • "Add field" / "Remove field" / drag to reorder
    • Live preview of the generated JSON Schema
  • The agent node renders a small chip/badge (e.g. "⚙ Structured Output" with field count) so the graph view shows which agents emit structured data at a glance.
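
The field builder described above has to turn its rows into the JSON Schema stored in outputSchema. A minimal sketch of that conversion follows. FieldRow, buildOutputSchema, and the "results" wrapper key used for "List of items" mode are all hypothetical names chosen for illustration, not existing code in packages/client.

```typescript
// Hypothetical row shape for the field builder; names are illustrative only.
type FieldType = "string" | "integer" | "number" | "boolean" | "enum";

interface FieldRow {
  name: string;
  type: FieldType;
  required: boolean;
  description?: string;
  min?: number;          // integer/number only
  max?: number;          // integer/number only
  minLength?: number;    // string only
  maxLength?: number;    // string only
  enumValues?: string[]; // enum only
}

// Just the subset of JSON Schema this sketch emits.
interface JsonSchema {
  type?: string;
  enum?: string[];
  description?: string;
  minimum?: number;
  maximum?: number;
  minLength?: number;
  maxLength?: number;
  properties?: Record<string, JsonSchema>;
  items?: JsonSchema;
  required?: string[];
}

// Convert one builder row into a JSON Schema property definition.
function fieldToProperty(row: FieldRow): JsonSchema {
  const prop: JsonSchema = {};
  if (row.type === "enum") {
    prop.enum = row.enumValues ?? [];
  } else {
    prop.type = row.type;
  }
  if (row.description) prop.description = row.description;
  if (row.type === "integer" || row.type === "number") {
    if (row.min !== undefined) prop.minimum = row.min;
    if (row.max !== undefined) prop.maximum = row.max;
  }
  if (row.type === "string") {
    if (row.minLength !== undefined) prop.minLength = row.minLength;
    if (row.maxLength !== undefined) prop.maxLength = row.maxLength;
  }
  return prop;
}

// Build the value stored in outputSchema. Row order is preserved so that
// drag-to-reorder shows up in the live preview.
function buildOutputSchema(rows: FieldRow[], mode: "single" | "list"): JsonSchema {
  const properties: Record<string, JsonSchema> = {};
  for (const row of rows) properties[row.name] = fieldToProperty(row);

  const item: JsonSchema = { type: "object", properties };
  const required = rows.filter((r) => r.required).map((r) => r.name);
  if (required.length > 0) item.required = required;
  if (mode === "single") return item;

  // Assumption: list mode keeps an object at the root and nests the array
  // under a "results" key, so downstream nodes always receive an object.
  return {
    type: "object",
    properties: { results: { type: "array", items: item } },
    required: ["results"],
  };
}
```

With no rows in single mode this returns { type: "object", properties: {} }, the same empty starting state the chip drop sets.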

Rationale for chip-based UX over always-on tab: matches existing palette-driven composition (built-in tools, MCP servers already work this way); keeps the inspector uncluttered for agents that emit prose; makes structured output visible in the graph itself, not buried in a tab. No new concepts — it's a capability you add, just like adding Read or Bash.

Acceptance: user drops a Structured Output chip on a conclave-28 specialist via the palette, defines a few fields in the inspector, saves the conclave, and sees outputSchema populated in oc-dev get_conclave output. Removing the chip clears the schema. Engine still ignores the schema at this phase — validate the UX is sound before the backend lights up.

Phase 2: Server — wire into the Claude path

Scope: packages/server/src/agent/runtime.ts only.

  • In runClaudeAgent, if config.outputSchema is present, pass it through:
const agentQuery = query({
  prompt,
  options: {
    ...existingOptions,
    outputFormat: { type: "json_schema", schema: config.outputSchema },
  },
});
  • In the result-message handling loop, read result.structured_output (the validated JSON). If present, set resultOutput to the structured JSON serialized as a string (edges currently carry strings — keep the data model unchanged for now).
  • Handle the new error subtype error_max_structured_output_retries — treat as agent failure with a clear error message. Emit via onOutput for visibility.
  • When structured output is on, the free-text resultOutput prose is no longer propagated; only the structured JSON is. Document this in the inspector UI.
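
The result-handling rules in the bullets above could be sketched as one pure function. ResultMessage here is a hypothetical, trimmed-down stand-in for the SDK's result message that models only the fields named in this issue, and resolveResult / AgentOutcome are illustrative names; check the real types during Phase 0 before relying on this shape.

```typescript
// Hypothetical subset of the SDK result message (assumption, not the real type).
interface ResultMessage {
  type: "result";
  subtype: string;             // e.g. "success" or "error_max_structured_output_retries"
  result?: string;             // free-text result prose
  structured_output?: unknown; // validated JSON when outputFormat was set
}

interface AgentOutcome {
  ok: boolean;
  resultOutput: string; // edges carry strings, so JSON is serialized
  error?: string;
}

function resolveResult(
  msg: ResultMessage,
  hasSchema: boolean,
  onOutput: (line: string) => void,
): AgentOutcome {
  // The SDK gave up trying to produce schema-valid output: fail the agent loudly.
  if (msg.subtype === "error_max_structured_output_retries") {
    const error =
      "Agent could not produce output matching its schema (structured-output retries exhausted).";
    onOutput(error);
    return { ok: false, resultOutput: "", error };
  }
  // Schema configured and validated JSON present: propagate only the JSON.
  if (hasSchema && msg.structured_output !== undefined) {
    return { ok: true, resultOutput: JSON.stringify(msg.structured_output) };
  }
  // No schema (or no structured output): keep today's free-text behavior.
  return { ok: true, resultOutput: msg.result ?? "" };
}
```

Other error subtypes are deliberately left to the existing handling; this sketch covers only the branches this phase adds.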

Acceptance: a conclave-28 specialist with a schema defined in Phase 1 runs, and the Writer receives structured JSON from each specialist instead of free-text review prose.

Phase 3: Writer schema + default templates

Scope: config-only. No code change beyond seeding a schema into conclave #28's Writer node.

  • Define a canonical CodeReviewFindings schema:
{
  "type": "object",
  "properties": {
    "findings": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "severity": { "enum": ["blocker", "major", "minor", "nit"] },
          "file": { "type": "string" },
          "line": { "type": "integer", "minimum": 1 },
          "description": { "type": "string" },
          "raisedBy": { "type": "string" }
        },
        "required": ["severity", "file", "line", "description", "raisedBy"]
      }
    },
    "counts": {
      "type": "object",
      "properties": {
        "blocker": { "type": "integer", "minimum": 0 },
        "major":   { "type": "integer", "minimum": 0 },
        "minor":   { "type": "integer", "minimum": 0 },
        "nit":     { "type": "integer", "minimum": 0 }
      },
      "required": ["blocker", "major", "minor", "nit"]
    }
  },
  "required": ["findings", "counts"]
}
  • Wire conclave #28 light_code_review specialists AND the Writer to use this schema.
  • After the Writer node, add a tiny Code node (or engine-level check) that asserts counts.blocker === findings.filter(f => f.severity === "blocker").length, and likewise for major, minor, and nit. If any count mismatches, emit a warning. The schema guarantees the shape of counts but not that it agrees with findings, so this check is belt-and-suspenders against a miscounting model.
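
The count assertion from the last bullet could look roughly like this. The types mirror the CodeReviewFindings schema above; checkCounts is a hypothetical name, and whether it lives in a Code node or in the engine is left open.

```typescript
type Severity = "blocker" | "major" | "minor" | "nit";

interface Finding {
  severity: Severity;
  file: string;
  line: number;
  description: string;
  raisedBy: string;
}

interface CodeReviewFindings {
  findings: Finding[];
  counts: Record<Severity, number>;
}

const SEVERITIES: Severity[] = ["blocker", "major", "minor", "nit"];

// Returns one warning per severity whose declared count disagrees with the
// number of findings actually listed. An empty array means the summary is
// consistent with the body.
function checkCounts(review: CodeReviewFindings): string[] {
  const warnings: string[] = [];
  for (const sev of SEVERITIES) {
    const actual = review.findings.filter((f) => f.severity === sev).length;
    const declared = review.counts[sev];
    if (declared !== actual) {
      warnings.push(`counts.${sev} is ${declared} but findings lists ${actual}`);
    }
  }
  return warnings;
}
```

Returning warnings instead of throwing matches the "emit a warning" wording: the run continues and the mismatch stays visible.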

Acceptance: rerun the journal-09 and journal-10 experiments. The Writer summary can no longer say "3 blockers" while the body has 0. Goal: zero count-hallucinations across at least 5 runs.

Phase 4: Cross-specialist contradiction detection

Scope: packages/server/src/engine/agent-executor.ts + a small utility module.

  • When multiple agent nodes route into a single downstream node AND all have outputSchema defined with compatible shapes, the engine accumulates their structured outputs into a shared bucket keyed by the downstream node.
  • Before the downstream node runs, engine runs a comparison check over the combined findings:
    • Same (file, line) key with different severity → contradiction → emit event; mark the input with a contradictions: [...] field so the downstream agent sees it as data.
    • Optional: cosine-sim on description strings when (file, line) matches loosely — flags "almost-same finding, different wording".
  • Do NOT block execution on contradictions; surface them. The downstream Writer or a reviewer agent decides how to handle them. (Blocking would fight the "no tool-nudging" memory and the user's preference for soft gates.)
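
A minimal sketch of the exact-match (file, line) comparison described above, assuming every sibling specialist emits findings shaped like the Phase 3 schema. findContradictions and the Contradiction shape are hypothetical; the optional cosine-similarity pass is not covered.

```typescript
interface Finding {
  severity: string;
  file: string;
  line: number;
  description: string;
  raisedBy: string;
}

// Hypothetical shape of one entry in the contradictions: [...] field.
interface Contradiction {
  file: string;
  line: number;
  claims: { raisedBy: string; severity: string }[];
}

// Takes one findings array per sibling specialist and reports every
// (file, line) location where the specialists disagree on severity.
function findContradictions(outputs: Finding[][]): Contradiction[] {
  const byLocation: Record<string, Finding[]> = {};
  for (const findings of outputs) {
    for (const f of findings) {
      // JSON-encode the pair so file names containing ":" cannot collide.
      const key = JSON.stringify([f.file, f.line]);
      const bucket = byLocation[key] ?? [];
      bucket.push(f);
      byLocation[key] = bucket;
    }
  }

  const contradictions: Contradiction[] = [];
  for (const key of Object.keys(byLocation)) {
    const bucket = byLocation[key] ?? [];
    const first = bucket[0];
    if (!first) continue;
    // Same location, at least two different severities: a contradiction.
    if (bucket.some((f) => f.severity !== first.severity)) {
      contradictions.push({
        file: first.file,
        line: first.line,
        claims: bucket.map((f) => ({ raisedBy: f.raisedBy, severity: f.severity })),
      });
    }
  }
  return contradictions;
}
```

The result is plain data, so the engine can attach it to the downstream input and emit an event without blocking. One design note: a specialist that praises code emits no finding for it, so this key-based check fires only when both specialists report on the same location.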

Acceptance: rerun journal-10 builtin-tools matrix. The contradiction between Security ("path traversal blocker") and Conventions ("workspace path resolution prevents path traversal") surfaces as an explicit contradictions entry the Writer sees — not silently accepted.

Phase 5: Ollama / OpenAI parity

Scope: packages/server/src/agent/{ollama,openai-chat,openai-responses}.ts and AgentBase.

  • Ollama: supports structured output via format: jsonSchema in newer versions. Translate config.outputSchema to the Ollama request shape. No constrained-decoding guarantee — validate the response with Zod post-hoc, retry up to N times with the validation error fed back.
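  • OpenAI Chat + Responses: Chat Completions supports response_format: { type: "json_schema", json_schema: {...} }; the Responses API takes the equivalent schema under text.format rather than response_format. Analogous implementation.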
  • Document in the editor: "Claude models: guaranteed. Ollama/OpenAI: best-effort with post-hoc validation + retry." Users should know the degradation.
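
The "validate post-hoc, retry up to N times with the validation error fed back" loop could be sketched independently of any provider. CallModel and Validate are hypothetical injection points: in the real code the first would wrap the Ollama or OpenAI request and the second would wrap a Zod parse. The feedback wording is also just a placeholder.

```typescript
// Hypothetical provider call: takes a prompt, resolves to the raw reply text.
type CallModel = (prompt: string) => Promise<string>;
// Hypothetical validator: returns null when valid, else an error message.
type Validate = (value: unknown) => string | null;

async function generateValidated(
  prompt: string,
  callModel: CallModel,
  validate: Validate,
  maxAttempts = 3,
): Promise<unknown> {
  let currentPrompt = prompt;
  let lastError = "no attempts made";

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel(currentPrompt);

    let parsed: unknown = undefined;
    let error: string | null;
    try {
      parsed = JSON.parse(raw);
      error = null;
    } catch {
      error = "response was not valid JSON";
    }
    // Only run schema validation once the reply has parsed as JSON.
    if (error === null) error = validate(parsed);
    if (error === null) return parsed;

    // Feed the failure back so the next attempt can correct itself.
    lastError = error;
    currentPrompt =
      `${prompt}\n\nYour previous reply was rejected: ${error}. ` +
      "Reply again with JSON only, matching the schema.";
  }

  throw new Error(`Structured output failed after ${maxAttempts} attempts: ${lastError}`);
}
```

Throwing after the last attempt lets the caller treat exhaustion as an agent failure, in the same way Phase 2 treats error_max_structured_output_retries on the Claude path.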

Acceptance: same Phase-3 experiment works with an Ollama-backed specialist. May have higher error rate; that's expected and documented.

What this does not solve

  • Semantic errors: journal-10's "Bash Command Injection is a blocker" false positive was a threat-model misread, not a schema violation. Structured outputs don't catch wrong judgments — only wrong shapes. That still needs a cross-specialist critic pass (out of scope here, tracked separately).

Links
