
fix: improve prompt cache compatibility for custom Codex endpoints #113

Open
jujinqian162 wants to merge 3 commits into MadAppGang:main from jujinqian162:fix/codex-cache-headers

Conversation

@jujinqian162

Summary

When using a third-party Codex-compatible endpoint via an overridden OpenAI Codex base URL, prompt cache
hit rates were much lower than when talking to Codex directly.

This change improves cache compatibility for that setup by:

  • attaching a stable prompt_cache_key to Codex Responses API requests
  • sending Codex-compatible request metadata headers for custom Codex transports
  • stripping volatile cache-busting fields from routed Anthropic request payloads before forwarding them
    upstream

Background

During investigation, one source of instability was request metadata that could vary across otherwise
equivalent requests.

In particular, Claude Code injects an x-anthropic-billing-header string that may include:

  • a volatile cch=... field
  • a more specific cc_version value than the upstream service appears to need for cache reuse

Those values are useful for billing and client attribution, but they can also make semantically identical
requests look different at the payload level. For third-party Codex-compatible backends, that reduces the
chance of matching the cache behavior seen when using Codex directly.
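
As a rough illustration of the normalization this implies, the sketch below assumes a semicolon-delimited key=value grammar for the billing header; the helper name and exact grammar are assumptions, not the PR's actual implementation in packages/cli/src/request-sanitizer.ts:

```typescript
// Sketch: normalize a billing-header string so semantically identical
// requests produce identical payloads. Field names cch / cc_version come
// from the PR description; the semicolon-delimited grammar is an assumption.
function normalizeBillingHeader(header: string): string {
  return header
    .split(";")
    .map((field) => field.trim())
    // Drop the volatile cache-busting field entirely.
    .filter((field) => !field.startsWith("cch="))
    // Truncate cc_version to major.minor, dropping patch/build suffixes.
    .map((field) => {
      const match = field.match(/^cc_version=(\d+)\.(\d+)/);
      return match ? `cc_version=${match[1]}.${match[2]}` : field;
    })
    .join("; ");
}
```

With this shape, `normalizeBillingHeader("org=foo; cch=abc123; cc_version=2.10.7-build4")` yields `"org=foo; cc_version=2.10"`, so repeated requests fingerprint identically regardless of the volatile fields.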

What changed

To make request fingerprints more stable:

  • volatile cch=... entries are removed from routed Anthropic payloads
  • cc_version values are normalized to a shorter stable form instead of preserving extra patch/build
    suffixes
  • Codex Responses requests now include a stable prompt_cache_key
  • Codex-compatible metadata headers are sent on custom Codex transport requests

Together, these changes make repeated requests look closer to the direct Codex path and reduce avoidable
cache misses caused by transport-level request variation.
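
The prompt_cache_key attachment could be sketched as follows; the session-id source, the hashing scheme, and the payload shape are assumptions for illustration, not the PR's actual implementation:

```typescript
// Sketch: derive a stable prompt_cache_key for a Codex Responses request
// from stable identifiers only, so repeated requests in one session reuse
// the same server-side cache entry. Hash scheme is an assumption.
import { createHash } from "node:crypto";

interface ResponsesPayload {
  model: string;
  input: unknown;
  prompt_cache_key?: string;
  [key: string]: unknown;
}

function withStablePromptCacheKey(
  payload: ResponsesPayload,
  sessionId: string,
): ResponsesPayload {
  const key = createHash("sha256")
    .update(`${sessionId}:${payload.model}`)
    .digest("hex")
    .slice(0, 32);
  // Preserve a caller-supplied key if one is already present.
  return { ...payload, prompt_cache_key: payload.prompt_cache_key ?? key };
}
```

Because the key depends only on the session id and model, two otherwise identical requests in the same session carry the same prompt_cache_key instead of a fresh random value.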

Testing

  • added transport tests for stable prompt_cache_key and Codex-compatible headers
  • added sanitizer coverage for volatile cache-buster removal and billing header normalization
  • ran:
    • bun test packages/cli/src/providers/transport/openai.test.ts packages/cli/src/request-sanitizer.test.ts

JustJuicer and others added 2 commits April 27, 2026 22:02
Attach a stable prompt_cache_key to Codex Responses payloads and send Codex-compatible session headers so prompt caching can hit consistently.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sanitize Anthropic request payloads before routing so volatile billing-header cch values do not perturb prompt cache keys, and cover the sanitizer with focused tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c73b416ebe

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread: packages/cli/src/request-sanitizer.ts (Outdated)
Comment on lines +7 to +9
function sanitizeValue(value: unknown): unknown {
if (typeof value === "string") return sanitizeString(value);
if (Array.isArray(value)) return value.map((item) => sanitizeValue(item));


P2: Restrict billing-header rewriting to non-user text

stripVolatileCacheBusters recursively sanitizes every string in the request, and sanitizeString rewrites any string that merely contains x-anthropic-billing-header. In /v1/messages and /v1/messages/count_tokens, this can alter user prompt content (for example, debugging prompts that include x-anthropic-billing-header: ...; cch=...) before forwarding upstream, which is a request-corruption regression for those inputs. Sanitization should target only the injected billing-header metadata location rather than arbitrary message text.
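
A minimal sketch of the targeted approach the reviewer suggests, walking only the injected system text rather than every string in the payload. The `system` field shape follows the Anthropic Messages API; `sanitizeString` here is a hypothetical stand-in for the PR's normalizer:

```typescript
// Hypothetical stand-in for the PR's string normalizer: drop volatile
// cch=... fields from an injected billing-header line.
function sanitizeString(text: string): string {
  return text.replace(/;\s*cch=[^;\n]*/g, "");
}

interface MessagesPayload {
  system?: string | Array<{ type: string; text: string }>;
  messages: Array<{ role: string; content: unknown }>;
}

// Sanitize only the injected system text; user-authored `messages`
// content is deliberately left untouched, avoiding the request-corruption
// regression described in the review comment.
function sanitizeSystemOnly(payload: MessagesPayload): MessagesPayload {
  const { system } = payload;
  if (system === undefined) return payload;
  const sanitized =
    typeof system === "string"
      ? sanitizeString(system)
      : system.map((block) =>
          block.type === "text"
            ? { ...block, text: sanitizeString(block.text) }
            : block,
        );
  return { ...payload, system: sanitized };
}
```

A user prompt that happens to mention `cch=...` passes through unchanged, while the injected billing-header metadata is still normalized.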


Restrict billing-header rewriting to injected system text so user-authored request content is preserved while volatile cache-buster fields are still normalized.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
