Fix AIC extraction and computation from token usage JSONL files#40085
Fix AIC extraction and computation from token usage JSONL files#40085Copilot wants to merge 3 commits into
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #40085 does not have the 'implementation' label and has 0 new lines of code in business logic directories (≤100 threshold). |
|
✅ Test Quality Sentinel completed test quality analysis. |
There was a problem hiding this comment.
✅ Ready to approve
The changes are cohesive, align with the stated root cause/resilience goals, and are backed by targeted unit tests for the new AIC extraction and deduplication behavior.
Note: this review does not count toward required approvals for merging.
Pull request overview
This PR fixes AI Credits (AIC) reporting by making token-usage parsing resilient to api-proxy output differences, preferring proxy-emitted per-request AIC when available, and ensuring the expected usage artifacts are written for downstream collectors/telemetry.
Changes:
- Prefer
ai_credits_this_responsewhen present and compute totals by summing per-entry AIC to keep per-request and aggregated views consistent. - Read token usage from both primary and audit
token-usage.jsonlpaths withrequest_id-based deduplication when exportingGH_AW_AIC. - Write
agent_usage.jsonlalongsideagent_usage.json, with tests covering the new path and artifact emission behavior.
File summaries
| File | Description |
|---|---|
| actions/setup/js/parse_token_usage.cjs | Writes agent_usage.jsonl in addition to agent_usage.json so artifact collectors can find usage data consistently. |
| actions/setup/js/parse_token_usage.test.cjs | Adds/updates tests to validate the new AGENT_USAGE_JSONL_PATH constant and that the .jsonl file is written. |
| actions/setup/js/parse_mcp_gateway_log.cjs | Fixes AIC extraction by honoring ai_credits_this_response, sums totals from per-entry AIC, and reads/dedupes token usage across primary + audit paths. |
| actions/setup/js/parse_mcp_gateway_log.test.cjs | Adds tests for audit-path fallback, cross-path deduplication, and explicit per-response AIC handling. |
| actions/setup/js/daily_aic_workflow_helpers.cjs | Updates daily aggregation to recognize ai_credits_this_response as an explicit AIC source before falling back to token-based computation. |
Copilot's findings
- Files reviewed: 5/5 changed files
- Comments generated: 0
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
🧪 Test Quality Sentinel Report
📊 Metrics & Test Classification (7 tests analyzed)
JavaScript: 7 (
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose and /tdd — requesting changes on two correctness issues.
📋 Key Themes & Highlights
Key Issues
- Priority inversion (
daily_aic_workflow_helpers.cjs):ai_credits_this_responseis checked last in the alias chain, so legacyai_creditswins when both fields are present. This silently defeats the fix for daily aggregation (inline comment flagged at line 218). > 0guard discards explicit zero (both files): proxy-emittedai_credits_this_response: 0falls through to token-computed AIC instead of being honoured as an authoritative zero-cost assertion.- Missing test coverage for
sumAICFromUsageJSONLFiles: the newai_credits_this_responsepath in the daily helpers is not exercised by any test.
Positive Highlights
- ✅ The per-entry AIC refactor in
parseTokenUsageJsonlis clean — moving from model-aggregate-first to per-entry-first is the right semantic model and handles mixed explicit/computed entries correctly. - ✅ Strong test surface on
parse_mcp_gateway_log.test.cjs: audit-only, both-paths-dedup, explicit-field, and fallback cases are all covered. - ✅ Dual-path read with
request_iddeduplication mirrors the strategy already used byparse_token_usage.cjs— good consistency. - ✅ Writing
agent_usage.jsonlalongside.jsoncloses a clean data-loss path with minimal risk.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer
| continue; | ||
| } | ||
| // Prefer proxy-emitted per-request AIC over locally computed when present. | ||
| const explicitPerRequest = getNumericAliasField(usage, parsed, ["ai_credits_this_response"]); |
There was a problem hiding this comment.
[/diagnose] The inline comment says "Prefer proxy-emitted per-request AIC over locally computed" but ai_credits_this_response is checked last — after ai_credits, aiCredits, and aic. If a record contains both ai_credits and ai_credits_this_response (api-proxy v0.27.2 emits both), the older ai_credits field wins here, defeating the fix for daily aggregation.
💡 Suggested ordering fix
Move ai_credits_this_response to be checked before the legacy aliases, consistent with how parseTokenUsageJsonl in parse_mcp_gateway_log.cjs already treats it as the highest-priority field:
// Highest fidelity: proxy-emitted per-request AIC.
const explicitPerRequest = getNumericAliasField(usage, parsed, ["ai_credits_this_response"]);
if (explicitPerRequest > 0) {
total += explicitPerRequest;
continue;
}
// Legacy aggregate aliases (lower priority).
const explicitAICredits = getNumericAliasField(usage, parsed, ["ai_credits", "aiCredits"]);This keeps daily totals consistent with the step-summary values already reported by parseTokenUsageJsonl.
| const durationMs = entry.duration_ms || 0; | ||
| // When the proxy emits an explicit per-request AIC value, prefer it over | ||
| // the locally-computed value so that proxy-side pricing updates take effect. | ||
| const explicitDeltaAIC = typeof entry.ai_credits_this_response === "number" && entry.ai_credits_this_response > 0 ? entry.ai_credits_this_response : null; |
There was a problem hiding this comment.
[/diagnose] The > 0 guard silently discards an explicit ai_credits_this_response: 0 emitted by the proxy, falling through to locally-computed AIC instead. A value of 0 could be a legitimate proxy assertion (e.g., a cached or free-tier call), and overriding it with a computed cost produces an incorrect result.
💡 Suggested fix
Drop the > 0 guard — treat any numeric value from the proxy as authoritative, including zero:
const explicitDeltaAIC = typeof entry.ai_credits_this_response === "number"
? entry.ai_credits_this_response
: null;The ?? computed fallback already handles null correctly, so this change only affects the explicit-zero case.
| } | ||
| // Prefer proxy-emitted per-request AIC over locally computed when present. | ||
| const explicitPerRequest = getNumericAliasField(usage, parsed, ["ai_credits_this_response"]); | ||
| if (explicitPerRequest > 0) { |
There was a problem hiding this comment.
[/diagnose] Same > 0 guard issue as in parse_mcp_gateway_log.cjs: an explicit ai_credits_this_response: 0 from the proxy is silently discarded and falls through to token-based computation. Should use a presence check (!== undefined && !== null) rather than a magnitude check.
| total += explicitAIC; | ||
| continue; | ||
| } | ||
| // Prefer proxy-emitted per-request AIC over locally computed when present. |
There was a problem hiding this comment.
[/tdd] The new ai_credits_this_response branch in sumAICFromUsageJSONLFiles has no test coverage. The existing test in check_daily_aic_workflow_guardrail.test.cjs exercises ai_credits, aiCredits, and aic aliases but not this new field — so a future regression here would go undetected.
💡 Suggested test addition (check_daily_aic_workflow_guardrail.test.cjs)
it("sums ai_credits_this_response when present", () => {
const tmpDir = fs.mkdtempSync(...);
const filePath = path.join(tmpDir, "usage.jsonl");
fs.writeFileSync(
filePath,
[
JSON.stringify({ ai_credits_this_response: 3.5 }),
JSON.stringify({ ai_credits_this_response: 5.25 }),
].join("\n")
);
expect(exports.sumAICFromUsageJSONLFiles([filePath])).toBeCloseTo(8.75);
});| // parse_token_usage.cjs. The pattern covers standard JSON escaping (\\, \") but not | ||
| // exotic cases; api-proxy request IDs are UUIDs so this is sufficient in practice. | ||
| const idMatch = line.match(/"request_id"\s*:\s*"((?:\\.|[^"\\])*)"/); | ||
| const dedupeKey = idMatch ? `request_id:${idMatch[1]}` : line; |
There was a problem hiding this comment.
[/diagnose] When request_id is absent, the full raw line is used as the dedupe key. If the proxy serializes the same logical record to the primary and audit paths with different JSON field ordering (which is valid), both copies will have different keys and both will be counted — producing a doubled AIC total.
The comment acknowledges "api-proxy request IDs are UUIDs so this is sufficient in practice", which is reasonable given current behavior. Worth adding an explicit // TODO noting the field-order risk so it's easy to find if double-counting is ever reported.
There was a problem hiding this comment.
Review: AIC extraction fixes — non-blocking observations
The overall approach is sound: honoring ai_credits_this_response, deduplicating across both token-usage paths, and writing agent_usage.jsonl alongside .json all address real gaps. Four non-blocking issues worth addressing before this pattern propagates.
📋 Issue summary
| # | File | Severity | Issue |
|---|---|---|---|
| 1 | parse_mcp_gateway_log.cjs:210 |
Medium | Audit path processed first — "first-seen wins" dedup makes audit data take priority over the primary path, contradicting the "fallback" description |
| 2 | parse_mcp_gateway_log.cjs:91 |
Medium | ai_credits_this_response > 0 guard treats zero-cost responses as absent, falling back to computed AIC and overcounting |
| 3 | daily_aic_workflow_helpers.cjs:218 |
Medium | Comment says "prefer over locally computed" but ai_credits_this_response is actually only checked after both ai_credits/aiCredits and aic return zero; inconsistent with how parse_mcp_gateway_log.cjs handles the same field |
| 4 | parse_token_usage.cjs:177 |
Low | Non-atomic double write to .json + .jsonl; a failure on the second write leaves the two files silently diverged |
🔎 Code quality review by PR Code Quality Reviewer
| // Read from both the primary path and the audit path, deduplicating by request_id. | ||
| // The audit path may contain additional entries when the primary path is absent or | ||
| // partially written (e.g. the proxy was restarted mid-run). | ||
| const paths = [TOKEN_USAGE_AUDIT_PATH, TOKEN_USAGE_PATH]; |
There was a problem hiding this comment.
Dedup gives audit-path data priority over the primary path — first-seen wins, so audit entries shadow primary entries when the same request_id appears in both.
💡 Details
const paths = [TOKEN_USAGE_AUDIT_PATH, TOKEN_USAGE_PATH];The docstring calls TOKEN_USAGE_AUDIT_PATH a fallback (checked
| // Read from both the primary path and the audit path, deduplicating by request_id. | ||
| // The audit path may contain additional entries when the primary path is absent or | ||
| // partially written (e.g. the proxy was restarted mid-run). | ||
| const paths = [TOKEN_USAGE_AUDIT_PATH, TOKEN_USAGE_PATH]; |
There was a problem hiding this comment.
Dedup gives audit-path data priority over the primary path — first-seen wins, so the audit copy shadows the primary copy when the same request_id appears in both files.
💡 Details
const paths = [TOKEN_USAGE_AUDIT_PATH, TOKEN_USAGE_PATH];The file-list docstring calls TOKEN_USAGE_AUDIT_PATH "checked as fallback", but the dedup loop processes it first. The first occurrence of a request_id wins, so if both files carry the same entry the audit version is the one kept.
In normal operation both files should be byte-identical for the same request, so there is no practical difference. But if the primary path ever has a corrected/re-emitted record for a request_id that already appeared in the audit log (e.g. a retry with updated credits), the primary version is silently discarded.
Either:
- Swap the order to
[TOKEN_USAGE_PATH, TOKEN_USAGE_AUDIT_PATH]so the primary is authoritative and the audit is the true fallback, or - Keep the current order but update the comment to explain why audit wins (e.g. "audit is written first and is more complete when the primary is partially flushed").
parse_token_usage.cjs uses the same [AUDIT, PRIMARY] order, so if this is intentional the rationale should be documented in both files.
| const durationMs = entry.duration_ms || 0; | ||
| // When the proxy emits an explicit per-request AIC value, prefer it over | ||
| // the locally-computed value so that proxy-side pricing updates take effect. | ||
| const explicitDeltaAIC = typeof entry.ai_credits_this_response === "number" && entry.ai_credits_this_response > 0 ? entry.ai_credits_this_response : null; |
There was a problem hiding this comment.
ai_credits_this_response: 0 is treated as absent, causing AIC to be overcounted for zero-cost responses.
💡 Details
const explicitDeltaAIC =
typeof entry.ai_credits_this_response === "number" && entry.ai_credits_this_response > 0
? entry.ai_credits_this_response
: null;The > 0 guard cannot distinguish "field not present" from "field is present and explicitly zero". When a proxy emits "ai_credits_this_response": 0 (e.g. a fully-cached, free-tier, or failed request), explicitDeltaAIC is null and the fallback computeInferenceAIC runs, returning a positive value based on token counts. The proxy's explicit assertion that the request cost nothing is ignored and AIC is overcounted.
Fix: separate presence detection from value filtering:
const rawExplicit = entry.ai_credits_this_response;
const explicitDeltaAIC =
typeof rawExplicit === "number" ? rawExplicit : null;Then in the aggregation pass:
entry.deltaAIC = entry.explicitDeltaAIC !== null ? entry.explicitDeltaAIC : computed;The same > 0 pattern appears in daily_aic_workflow_helpers.cjs line 219 for the same field and should be fixed there too.
| continue; | ||
| } | ||
| // Prefer proxy-emitted per-request AIC over locally computed when present. | ||
| const explicitPerRequest = getNumericAliasField(usage, parsed, ["ai_credits_this_response"]); |
There was a problem hiding this comment.
Misleading comment: ai_credits_this_response is not preferred over locally-computed — it is only checked when ALL legacy explicit fields are absent.
💡 Details
The comment on line 217 says "Prefer proxy-emitted per-request AIC over locally computed when present", which implies ai_credits_this_response takes priority. It doesn't. The actual priority chain is:
ai_credits/aiCredits— if> 0, use it andcontinueaic— if> 0, use it andcontinueai_credits_this_response— only reached if the two checks above both returned 0computeInferenceAICfallback
If a transitional proxy record happens to emit both ai_credits and ai_credits_this_response, the legacy field wins and the more-authoritative per-request value from ai_credits_this_response is silently discarded. This is inconsistent with how parse_mcp_gateway_log.cjs handles the same data (it checks ai_credits_this_response directly and ignores ai_credits entirely).
If the intent is backward compatibility (old proxy → ai_credits, new proxy → ai_credits_this_response), the comment should say so. If ai_credits_this_response should truly take precedence over ai_credits, move it before the legacy checks.
| }; | ||
| fs.writeFileSync(AGENT_USAGE_PATH, JSON.stringify(agentUsage) + "\n"); | ||
| // Also write the .jsonl path so the usage artifact collector can find it. | ||
| fs.writeFileSync(AGENT_USAGE_JSONL_PATH, JSON.stringify(agentUsage) + "\n"); |
There was a problem hiding this comment.
Non-atomic double write: if the second writeFileSync fails, .json and .jsonl diverge silently.
💡 Details
fs.writeFileSync(AGENT_USAGE_PATH, JSON.stringify(agentUsage) + "\n");
// Also write the .jsonl path so the usage artifact collector can find it.
fs.writeFileSync(AGENT_USAGE_JSONL_PATH, JSON.stringify(agentUsage) + "\n");If the second write throws (disk full, missing parent dir, permission error), agent_usage.json contains the current run's data but agent_usage.jsonl is either absent or holds stale data from a prior run. Both the OTEL span sender (send_otlp_span.cjs reads .json) and the usage artifact collector (reads .jsonl) will now see different data with no log entry explaining why.
At minimum, wrap the second write in a try/catch and emit coreObj.warning on failure so the discrepancy is visible in the workflow log. If both files must stay in sync, write to a temp path and fs.renameSync both atomically (rename is atomic on POSIX for same-filesystem moves).
|
@copilot review all comments and address the unresolved review feedback.
|
|
\n@copilot review all comments and address unresolved review feedback.\n\npr-sous-chef: fix the unresolved AIC review threads, then push an update branch.
|
AIC values were silently reporting as 0 in workflow footers and OTEL telemetry due to several extraction bugs, compounded by an api-proxy v0.27.4 regression that stopped writing
token-usage.jsonlfor Copilot API calls.Fixes
Honor
ai_credits_this_response(parse_mcp_gateway_log.cjs—parseTokenUsageJsonl): api-proxy v0.27.2 emits this explicit per-request AIC field; we were ignoring it and recomputing from tokens instead. Now uses it directly when present.Check both token-usage.jsonl paths (
parse_mcp_gateway_log.cjs—writeStepSummaryWithTokenUsage): Previously only read the primary firewall path; now reads both primary and audit paths withrequest_id-based deduplication, matching the strategy already used byparse_token_usage.cjs.Write
agent_usage.jsonlalongside.json(parse_token_usage.cjs):send_otlp_span.cjsreadsagent_usage.json; the usage artifact collector copiesagent_usage.jsonl. Extension mismatch meant usage data never landed in artifacts.Recognize
ai_credits_this_responsein daily aggregation (daily_aic_workflow_helpers.cjs—sumAICFromUsageJSONLFiles): Added the per-response field alongside the existingai_credits/aiCredits/aicaliases.pr-sous-chef: update branch requested from run https://github.com/github/gh-aw/actions/runs/27780036106