fix(api-proxy): write placeholder token-usage record when usage extraction fails by lpcox · Pull Request #5251 · github/gh-aw-firewall

lpcox · 2026-06-18T15:24:17Z

Problem

A successful (2xx) LLM completion response from which no usage payload could be extracted previously produced no line at all in token-usage.jsonl. The hard gate is in finalizeHttpTracking():

const normalized = normalizeUsage(usage);
if (!normalized) {            // no usage extracted from the response
  if (typeof onSpanEnd === 'function') onSpanEnd(proxyRes.statusCode);
  return;                     // ← returns WITHOUT writing any record
}

When this fires, the request becomes invisible to every downstream consumer — step summaries, OTEL fan-out, and AI-credit aggregation. This is the producer-side root cause of the regression where Copilot calls stopped appearing in token-usage logs, as called out in github/gh-aw#40085:

The root cause — api-proxy no longer writing token-usage.jsonl for Copilot calls — requires an upstream fix.

I verified the parsers/injection are correct for all documented payload shapes (chat-completions with stream_options.include_usage, Responses API response.completed/response.done, non-streaming JSON). The only no-usage case is a streaming chat-completions response whose final usage chunk is absent — but whatever the exact upstream trigger, the proxy silently dropping the entire request is the brittle behavior worth fixing.

Fix

For completion-style endpoints (chat/completions, responses, messages, :generateContent, …), when a 2xx response yields no usage, write a placeholder record with zeroed token counts and usage_missing: true:

The request stays observable downstream (dedup by request_id, request counts, OTEL spans).
The extraction gap is diagnosable in artifacts (currently it leaves no trace), enabling a precise follow-up that recovers the real tokens.
The usage_missing flag clearly marks the counts as non-measured, so consumers don't treat them as a genuine zero-cost call.
Non-completion traffic (e.g. /models, health checks) is still skipped to avoid noise.
Behavior is unchanged when usage is present.

Changes

token-parsers.js: add looksLikeCompletionRequest() path heuristic.
token-tracker-http.js: add writeMissingUsageRecord(); gate the no-usage branch on it.
schemas/token-usage.schema.json: document the usage_missing field.
Unit tests for the new helper and the placeholder-record behavior (streaming + non-streaming, completion vs non-completion paths, onSpanEnd still fires).

Testing

cd containers/api-proxy && npm test → 1247 passed (54 suites), +20 new tests.

Relationship to github/gh-aw#40085

That PR hardens the consumer (honor ai_credits_this_response, dedup across both log paths, fix the .jsonl/.json artifact extension). This PR is the producer-side fix it asks for: the api-proxy will now always emit a row per completion request, so a row can never silently vanish again.

Replaces 'sleep 3' with proper health check loops that wait up to 30 seconds for proxies to be ready before running tests. Root cause: Squid/Envoy containers were not fully initialized before agent containers tried to connect, causing spurious test failures. Changes: - Squid runc test: Wait for proxy port 3128 to respond - Squid gVisor test: Wait for proxy port 3128 to respond - Envoy runc test: Wait for admin /ready endpoint - Envoy gVisor test: Wait for admin /ready endpoint - Squid perf test: Wait for proxy port 3128 to respond - Envoy perf test: Wait for admin /ready endpoint Each health check: - Retries for up to 30 seconds - Uses lightweight curl container for network checks - Shows container logs on failure - Exits with error if proxy fails to start Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

A successful (2xx) LLM completion response from which no usage payload could be extracted previously produced NO line in `token-usage.jsonl` at all: `finalizeHttpTracking()` early-returned on `!normalizeUsage(usage)`. This made the request invisible to every downstream consumer (step summaries, OTEL fan-out, AI-credit aggregation) and is the root cause of the regression where Copilot calls stopped appearing in token-usage logs (see github/gh-aw#40085). Now, for completion-style endpoints (chat/completions, responses, messages, generateContent, ...), a placeholder record is written with zeroed token counts and `usage_missing: true` so the request stays observable and the extraction gap is diagnosable, while clearly marking the counts as non-measured. Non-completion traffic (e.g. /models, health checks) is still skipped to avoid noise. Behavior is unchanged when usage is present. - Add `looksLikeCompletionRequest()` path heuristic to token-parsers - Add `writeMissingUsageRecord()` and gate the no-usage branch on it - Document `usage_missing` in schemas/token-usage.schema.json - Add unit tests for the new helper and placeholder-record behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

github-actions · 2026-06-18T15:25:57Z

✅ Coverage Check Passed

Overall Coverage

Metric	Base	PR	Delta
Lines	97.57%	97.61%	📈 +0.04%
Statements	97.50%	97.54%	📈 +0.04%
Functions	98.84%	98.84%	➡️ +0.00%
Branches	92.95%	92.98%	📈 +0.03%

📁 Per-file Coverage Changes (1 files)

File	Lines (Before → After)	Statements (Before → After)
`src/workdir-setup.ts`	92.7% → 94.5% (+1.82%)	92.7% → 94.5% (+1.82%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

lpcox and others added 2 commits June 17, 2026 22:08

Copilot AI review requested due to automatic review settings June 18, 2026 15:24

Copilot started reviewing on behalf of lpcox June 18, 2026 15:24 View session

Copilot AI reviewed Jun 18, 2026

Copilot stopped reviewing on behalf of lpcox due to an error June 18, 2026 15:26
Error performing PR review: Authorization error, you may need to run /login (Request ID: 4403:1EFCE0:7E5C10:84CD01:6A340DCD)

lpcox closed this Jun 18, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(api-proxy): write placeholder token-usage record when usage extraction fails#5251

fix(api-proxy): write placeholder token-usage record when usage extraction fails#5251
lpcox wants to merge 2 commits into
mainfrom
fix/api-proxy-token-usage-missing-record

lpcox commented Jun 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

github-actions Bot commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

lpcox commented Jun 18, 2026

Problem

Fix

Changes

Testing

Relationship to github/gh-aw#40085

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 18, 2026

✅ Coverage Check Passed

Overall Coverage

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants