
fix(llm): add disableResponseFormat option for proxies that reject response_format #67

Open

laurentftech wants to merge 2 commits into clay-good:main from laurentftech:fix/disable-response-format
Conversation

@laurentftech
Contributor

Problem

Some OpenAI-compatible proxy endpoints return {"detail":"There was an error parsing the body"} when the request includes response_format — they support the chat completions API but not structured-output extensions. This causes all LLM requests to fail silently: the client retries with empty error messages.

Root cause: spec-gen sends response_format: {type: "json_schema", ...} or response_format: {type: "json_object"} for JSON completions. Endpoints like vLLM or custom gateways reject this field entirely.

Diagnosis: confirmed by testing with/without response_format via PowerShell — the request without it succeeds, the one with it fails.

Fix

Add disableResponseFormat: boolean option:

  • LLMServiceOptions + Required<LLMServiceOptions>
  • OpenAICompatibleProvider constructor (4th arg, default false)
  • GenerationConfig in types/index.ts
  • Wired through generate.ts from config

When true, response_format is omitted. The LLM still produces JSON via system prompt instructions; completeJSON() handles free-form JSON as it already does for models without schema support.
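The conditional omission can be sketched as follows. This is an illustrative reconstruction, not the actual spec-gen code: the interface and function names here (ChatRequest, buildRequestBody) are hypothetical, while disableResponseFormat and the response_format field come from the PR description.

```typescript
// Illustrative sketch: attach response_format only when the endpoint
// supports structured output. Names other than disableResponseFormat
// and response_format are assumptions.
interface ChatMessage {
  role: string;
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  response_format?: { type: string };
}

function buildRequestBody(
  model: string,
  messages: ChatMessage[],
  disableResponseFormat: boolean
): ChatRequest {
  const body: ChatRequest = { model, messages };
  if (!disableResponseFormat) {
    // Default behavior unchanged: structured-output hint is included.
    body.response_format = { type: "json_object" };
  }
  // When disableResponseFormat is true, the field is absent entirely,
  // so proxies that reject it never see it.
  return body;
}
```

With the flag set, the model is still steered toward JSON purely through the system prompt, and the free-form response is parsed downstream.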

Also improves error messages: all provider errors now include the HTTP status code (HTTP 422: ...) instead of just the raw body — making retry log lines actionable.

Config

generation:
  provider: openai-compat
  openaiCompatBaseUrl: https://your-proxy/v1
  disableResponseFormat: true
  model: your-model
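In TypeScript terms, the GenerationConfig addition might look roughly like this. The exact interface is not shown in the PR, so this is a sketch inferred from the config sample above; only disableResponseFormat is the new field.

```typescript
// Hypothetical shape of GenerationConfig after this change; field names
// mirror the YAML config sample, but the real interface may differ.
interface GenerationConfig {
  provider: string;                  // e.g. "openai-compat"
  openaiCompatBaseUrl?: string;      // e.g. "https://your-proxy/v1"
  disableResponseFormat?: boolean;   // new: defaults to false when omitted
  model: string;
}

const config: GenerationConfig = {
  provider: "openai-compat",
  openaiCompatBaseUrl: "https://your-proxy/v1",
  disableResponseFormat: true,
  model: "your-model",
};
```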

Tests

Two new unit tests on OpenAICompatibleProvider:

  • disableResponseFormat=true → response_format absent from request body
  • disableResponseFormat=false → response_format present (default behavior unchanged)

🤖 Generated with Claude Code

laurentftech and others added 2 commits April 16, 2026 21:50
…sponse_format

Some OpenAI-compatible proxy endpoints (e.g. vLLM, custom gateways) return
{"detail":"There was an error parsing the body"} when the request includes
response_format — they support the chat completions API but not structured
output extensions.

Add disableResponseFormat: boolean to:
- LLMServiceOptions / Required<LLMServiceOptions>
- OpenAICompatibleProvider constructor (4th arg, default false)
- GenerationConfig in types/index.ts

When true, response_format is omitted entirely from requests. The LLM still
produces JSON via the system prompt instructions; completeJSON() handles
free-form JSON parsing as it already does for models without schema support.

Also improves error messages across all providers: errors now include the
HTTP status code (e.g. "HTTP 422: ...") instead of just the raw body,
making retry logs actionable.

Config usage:
  generation:
    provider: openai-compat
    disableResponseFormat: true

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…esponse_format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
