GPT-5 models configured with `reasoning.summary: "auto"` return condensed reasoning summaries that may omit important reasoning details. #55
Open
82deutschmark wants to merge 2 commits into arcprize:main
Conversation
Add `_ensure_verbosity()` helper that automatically sets `text.verbosity` to `"high"` for Responses API calls unless it is explicitly configured. This ensures GPT-5/o-series reasoning models return detailed reasoning logs.
Changes `reasoning.summary` from `"auto"` to `"detailed"` for all GPT-5 variants to ensure comprehensive reasoning traces are returned.

Rationale:
- `"auto"` returns condensed summaries that may omit reasoning details
- `"detailed"` provides full reasoning traces for better interpretability
- For benchmarking/research, complete reasoning logs are essential
- Affects 12 GPT-5 model configs (gpt-5, gpt-5-mini, gpt-5-nano variants)
Author
@gkamradt is this the wrong fix for the issue? I've been curious about this for a while.
Problem
GPT-5 models configured with `reasoning.summary: "auto"` return condensed reasoning summaries that may omit important reasoning details. For a benchmarking harness, this means valuable interpretability data is lost even though reasoning tokens are being generated and paid for.

Real-world impact: an example from the gpt-5-1-2025-11-13-thinking-high results shows ~60K-70K reasoning tokens charged but zero reasoning content returned, because the `text.verbosity` parameter was not configured (addressed in a separate PR) and `reasoning.summary: "auto"` provides only condensed summaries.

Solution
Changed `reasoning.summary` from `"auto"` to `"detailed"` for all GPT-5 model variants.

Difference between settings:
- `"auto"` (old): condensed summaries, the model decides what to include
- `"detailed"` (new): comprehensive reasoning traces with full chain-of-thought

Related
This PR complements the code-level fix that ensures `text.verbosity: "high"` is set programmatically. Together, these changes ensure maximum reasoning content is captured from GPT-5 models.
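Put together, a request from the harness might carry both settings. This is an illustrative sketch only: the model name and input prompt are placeholders, and the exact call site in the harness is an assumption.

```python
# Illustrative Responses API parameters combining both changes; these kwargs
# would be passed to client.responses.create(**request_kwargs) with the
# official openai client. Model name and prompt are placeholders.
request_kwargs = {
    "model": "gpt-5",
    "input": "Solve the task...",
    "reasoning": {"summary": "detailed"},  # was "auto" before this PR
    "text": {"verbosity": "high"},         # set by the companion code-level fix
}
```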