fix(llm): add disableResponseFormat option for proxies that reject response_format #67
Open
laurentftech wants to merge 2 commits into clay-good:main from
Conversation
…sponse_format
Some OpenAI-compatible proxy endpoints (e.g. vLLM, custom gateways) return
{"detail":"There was an error parsing the body"} when the request includes
response_format — they support the chat completions API but not structured
output extensions.
Add disableResponseFormat: boolean to:
- LLMServiceOptions / LLMServiceOptions.Required
- OpenAICompatibleProvider constructor (4th arg, default false)
- GenerationConfig in types/index.ts
When true, response_format is omitted entirely from requests. The LLM still
produces JSON via the system prompt instructions; completeJSON() handles
free-form JSON parsing as it already does for models without schema support.
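In sketch form, the omission logic could look like the following (a minimal illustration, not the actual spec-gen code; `buildRequestBody` and its signature are assumptions):

```typescript
type Message = { role: string; content: string };

// Hypothetical sketch: assemble a chat completions request body, attaching
// response_format only when the endpoint is known to accept it.
function buildRequestBody(
  model: string,
  messages: Message[],
  jsonSchema: object | undefined,
  disableResponseFormat: boolean
): Record<string, unknown> {
  const body: Record<string, unknown> = { model, messages };
  if (!disableResponseFormat) {
    // Structured output when a schema is available, plain JSON mode otherwise.
    body.response_format = jsonSchema
      ? { type: "json_schema", json_schema: jsonSchema }
      : { type: "json_object" };
  }
  // With disableResponseFormat=true the field is never sent; the system
  // prompt alone instructs the model to emit JSON.
  return body;
}
```

The key property is that the field is absent, not null or empty — proxies that reject the extension typically fail on the field's mere presence.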
Also improves error messages across all providers: errors now include the
HTTP status code (e.g. "HTTP 422: ...") instead of just the raw body,
making retry logs actionable.
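The status-prefixing described above might be sketched like this (illustrative only; `raiseForStatus` is a hypothetical helper, not the providers' actual code):

```typescript
// Wrap a failed fetch response so the thrown error carries the HTTP status,
// e.g. 'HTTP 422: {"detail":"There was an error parsing the body"}'.
async function raiseForStatus(res: Response): Promise<Response> {
  if (!res.ok) {
    const raw = await res.text();
    throw new Error(`HTTP ${res.status}: ${raw}`);
  }
  return res;
}
```

Including the status up front lets a retry loop log one line per attempt that already distinguishes 4xx rejections from transient 5xx failures.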
Config usage:

```yaml
generation:
  provider: openai-compat
  disableResponseFormat: true
```
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…esponse_format Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem
Some OpenAI-compatible proxy endpoints return `{"detail":"There was an error parsing the body"}` when the request includes `response_format` — they support the chat completions API but not structured output extensions. This causes all LLM requests to fail silently (retrying with empty error messages).

Root cause: spec-gen sends `response_format: {type: "json_schema", ...}` or `response_format: {type: "json_object"}` for JSON completions. Endpoints like vLLM or custom gateways reject this field entirely.

Diagnosis: confirmed by testing with/without `response_format` via PowerShell — the request without it succeeds, the one with it fails.

Fix

Add a `disableResponseFormat: boolean` option, threaded through:
- `LLMServiceOptions` + `Required<LLMServiceOptions>`
- `OpenAICompatibleProvider` constructor (4th arg, default `false`)
- `GenerationConfig` in `types/index.ts`
- `generate.ts` (read from config)

When `true`, `response_format` is omitted. The LLM still produces JSON via system prompt instructions; `completeJSON()` handles free-form JSON as it already does for models without schema support.

Also improves error messages: all provider errors now include the HTTP status code (`HTTP 422: ...`) instead of just the raw body — making retry log lines actionable.

Config
Tests
Two new unit tests on `OpenAICompatibleProvider`:
- `disableResponseFormat=true` → `response_format` absent from request body
- `disableResponseFormat=false` → `response_format` present (default behavior unchanged)

🤖 Generated with Claude Code