Show tool output previews in Copilot CLI conversation rendering#40116
Conversation
Parse Wire request blocks in Copilot CLI debug logs to extract actual tool output content, enabling formatResultPreview() to show the first 2 non-empty lines of each tool call result — matching the behaviour already present for Claude runs. Wire requests contain the full conversation history including tool results from prior turns. The new extractWireRequestToolResults() function scans these blocks line-by-line (avoiding expensive full JSON parsing of the 11 MB log) and builds a Map<tool_call_id → content>. After all entries are built, non-error tool results have their content field populated from that map before rendering. Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
@copilot render the copilot cli vent conversation from the events.jsonl log file, not the raw logs. |
- Revert extractWireRequestToolResults (raw [DEBUG] Wire request: parsing) - Fix convertCopilotEventsToLegacyLogEntries to extract tool output from data.result.content (native Copilot CLI events.jsonl format) - Fix error message extraction to use data.error.message (not String(obj)) - Update copilot_sdk_session.cjs to include result in tool.execution_complete events so SDK-based sessions also get tool output previews in events.jsonl - Add test: renders tool output preview from result.content in events.jsonl Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done in the latest commit. The approach is now:
The |
|
✅ Test Quality Sentinel completed test quality analysis. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #40116 does not have the 'implementation' label and has 0 new lines of code in business logic directories (≤100 threshold). The 3 changed files are not in monitored business logic paths. |
There was a problem hiding this comment.
Pull request overview
This PR fixes missing tool-output previews when rendering Copilot CLI conversations by extracting LLM-facing tool results from the structured events.jsonl format (instead of falling back to a "success" placeholder), and by ensuring SDK-based sessions write the needed result fields into events.jsonl.
Changes:
- Update Copilot events → legacy trace normalization to extract tool output from
data.result.content, and improve error text extraction viadata.error.message. - Include the SDK tool
resultobject in serializedtool.execution_completeJSONL events so the same parser path can render previews. - Add a regression test covering tool output extraction from
result.contentin events.jsonl.
Show a summary per file
| File | Description |
|---|---|
| actions/setup/js/log_parser_shared.cjs | Extracts tool output from Copilot CLI events.jsonl result.content and improves error message extraction. |
| actions/setup/js/copilot_sdk_session.cjs | Extends SDK JSONL serialization to include tool execution result for downstream rendering. |
| actions/setup/js/parse_copilot_log.test.cjs | Adds a test intended to prevent regression in rendering tool output from events.jsonl. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 3/3 changed files
- Comments generated: 3
| } else if (data.result && typeof data.result.content === "string") { | ||
| // Native Copilot CLI events.jsonl format: result.content is the concise | ||
| // tool result text sent to the LLM (may be truncated for token efficiency). | ||
| output = data.result.content; | ||
| } else if (data.error) { |
| it("renders tool output preview from result.content in Copilot CLI events.jsonl", () => { | ||
| const eventsLog = [ | ||
| '{"type":"user.message","timestamp":"2026-06-05T00:44:01.367Z","data":{}}', | ||
| '{"type":"tool.execution_start","timestamp":"2026-06-05T00:44:04.520Z","data":{"toolName":"bash","mcpServerName":""}}', | ||
| '{"type":"tool.execution_complete","timestamp":"2026-06-05T00:44:04.700Z","data":{"toolName":"bash","mcpServerName":"","success":true,"result":{"content":"file1.txt\\nfile2.txt\\nfile3.txt"}}}', | ||
| '{"type":"assistant.message","timestamp":"2026-06-05T00:44:59.769Z","data":{"content":"Done"}}', | ||
| ].join("\n"); | ||
|
|
||
| const result = parseCopilotLog(eventsLog); | ||
|
|
||
| expect(result.markdown).toContain("bash"); | ||
| expect(result.markdown).toContain("file1.txt"); | ||
| }); |
| // Include result.content (concise LLM-facing output) so that the log | ||
| // parser can render tool output previews from events.jsonl directly. | ||
| const result = event.data?.result ?? undefined; | ||
| // max-tool-denials intentionally tracks permission denials only. | ||
| // Tool execution failures are still logged, but do not increment the guardrail counter. |
There was a problem hiding this comment.
REQUEST_CHANGES — two issues to fix before merge
The happy-path fix (rendering result.content from events.jsonl) is correct and the core logic in log_parser_shared.cjs is sound. Two things need to be addressed:
Issues
1. (Medium) Unbounded result object written to JSONL — copilot_sdk_session.cjs serializes the entire ToolExecutionCompleteResult to disk, including detailedContent (full diffs) and uiResource, even though the parser only ever reads result.content. This is a performance regression on any tool-heavy run that edits files. Fix: write { content: result.content } only.
2. (Medium) Error-extraction fix is untested — log_parser_shared.cjs line 936 changes String(data.error) to data.error.message, a meaningful behavior fix ([object Object] → real message), but no test covers a failed tool call with an Error-shaped object. A future refactor could silently revert it.
🔎 Code quality review by PR Code Quality Reviewer
| // max-tool-denials intentionally tracks permission denials only. | ||
| // Tool execution failures are still logged, but do not increment the guardrail counter. | ||
| writeEvent("tool.execution_complete", { toolName, mcpServerName, success }, event.timestamp); | ||
| writeEvent("tool.execution_complete", { toolName, mcpServerName, success, result }, event.timestamp); |
There was a problem hiding this comment.
Full ToolExecutionCompleteResult object written to JSONL, but only result.content is ever consumed: detailedContent and uiResource are serialized unnecessarily, bloating events.jsonl for tool-heavy runs.
💡 Suggested fix
Write only the fields the log parser actually reads:
// current
const result = event.data?.result ?? undefined;
writeEvent("tool.execution_complete", { toolName, mcpServerName, success, result }, event.timestamp);
// suggested
const resultContent = event.data?.result?.content;
const result = resultContent !== undefined ? { content: resultContent } : undefined;
writeEvent("tool.execution_complete", { toolName, mcpServerName, success, result }, event.timestamp);The SDK's ToolExecutionCompleteResult has four fields: content (concise, truncated), contents (structured blocks), detailedContent (the full tool output including complete diffs — explicitly documented as potentially very large), and uiResource. Only result.content is ever read in log_parser_shared.cjs. Serializing the whole object via JSON.stringify means every tool call that edits a file will write an unbounded detailedContent string to disk, and the parser has to deserialize all of it for nothing.
| output = data.result.content; | ||
| } else if (data.error) { | ||
| output = String(data.error); | ||
| output = typeof data.error === "object" && typeof data.error.message === "string" ? data.error.message : String(data.error); |
There was a problem hiding this comment.
The error-extraction fix (String(data.error) → .message) has no test coverage: this is a real behavior change — before, an Error object would render as [object Object]; after, it renders as the human-readable message — but the test suite only adds a happy-path case.
💡 Suggested test
Add a case to the existing describe block covering failed tool execution with an Error object:
it("renders error.message from a failed tool call in events.jsonl", () => {
const eventsLog = [
'{"type":"user.message","timestamp":"2026-06-05T00:44:01.367Z","data":{}}',
'{"type":"tool.execution_start","timestamp":"2026-06-05T00:44:04.520Z","data":{"toolName":"bash","mcpServerName":""}}',
'{"type":"tool.execution_complete","timestamp":"2026-06-05T00:44:04.700Z","data":{"toolName":"bash","mcpServerName":"","success":false,"error":{"message":"Permission denied"}}}',
'{"type":"assistant.message","timestamp":"2026-06-05T00:44:59.769Z","data":{"content":"Failed"}}',
].join("\n");
const result = parseCopilotLog(eventsLog);
expect(result.markdown).toContain("Permission denied");
// Must not contain the old serialization artifact
expect(result.markdown).not.toContain("[object Object]");
});Without this, a future refactor could revert the fix silently.
🧪 Test Quality Sentinel Report
📊 Metrics & Test Classification (1 test analyzed)
Go: 0 (
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose and /tdd — clean targeted fix with good PR hygiene. A few suggestions to make it more robust.
📋 Key Themes & Highlights
Key Themes
- Potential array content gap:
data.result.contentis only extracted when it's astring; if the SDK emits MCP-style content-block arrays ([{type,text}]) for some tool types, those still fall through to"success". The same pattern is already handled inlog_parser_format.cjsline 250. - Missing tests for the error fix: The
data.error.messageextraction improvement has no regression test — a future refactor could restoreString(data.error)without any warning. - Thin test assertions: The new test doesn't assert the absence of the
"success"placeholder or verify second/third output lines.
Positive Highlights
- ✅ Root cause is clearly identified and the two-layer fix (write path + parse path) is correct
- ✅ Small, focused diff (+24/-3) — easy to reason about
- ✅ Good that a regression test was added alongside the fix
- ✅ Error extraction improvement (
data.error.message) is a nice bonus fix - ✅ Inline comment explains the purpose of
result.contentfor future readers
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer
| output = data.output; | ||
| } else if (typeof data.result === "string") { | ||
| output = data.result; | ||
| } else if (data.result && typeof data.result.content === "string") { |
There was a problem hiding this comment.
[/diagnose] This branch only handles string content, but the Copilot SDK and MCP protocol can deliver result.content as an array of content blocks ([{type:"text",text:"..."}]). When that shape appears the branch silently falls through to "success" — the regression this PR aims to fix.
💡 Suggested guard
Add an array case below the string check:
} else if (data.result && Array.isArray(data.result.content)) {
// MCP-style content blocks: [{type:"text",text:"..."}]
output = data.result.content.map(c => (typeof c === "string" ? c : c.text || "")).join("\n");
}This mirrors the pattern in log_parser_format.cjs line 250.
| output = data.result.content; | ||
| } else if (data.error) { | ||
| output = String(data.error); | ||
| output = typeof data.error === "object" && typeof data.error.message === "string" ? data.error.message : String(data.error); |
There was a problem hiding this comment.
[/tdd] The error extraction fix (data.error.message instead of String(data.error)) has no test coverage. Without a test, the "[object Object]" regression could silently return.
💡 Suggested test
Add alongside the existing result.content test:
it("renders error message from data.error.message in tool.execution_complete", () => {
const eventsLog = [
'{"type":"user.message","timestamp":"2026-06-05T00:44:01.367Z","data":{}}',
'{"type":"tool.execution_start","timestamp":"2026-06-05T00:44:04.520Z","data":{"toolName":"bash","mcpServerName":""}}',
'{"type":"tool.execution_complete","timestamp":"2026-06-05T00:44:04.700Z","data":{"toolName":"bash","mcpServerName":"","success":false,"error":{"message":"Permission denied"}}}',
'{"type":"assistant.message","timestamp":"2026-06-05T00:44:59.769Z","data":{"content":"Done"}}',
].join("\n");
const result = parseCopilotLog(eventsLog);
expect(result.markdown).toContain("Permission denied");
expect(result.markdown).not.toContain("[object Object]");
});| const result = parseCopilotLog(eventsLog); | ||
|
|
||
| expect(result.markdown).toContain("bash"); | ||
| expect(result.markdown).toContain("file1.txt"); |
There was a problem hiding this comment.
[/tdd] The two assertions cover the happy path but leave important behaviour untested: (1) no check that the old "success" placeholder is absent, and (2) only the first line is verified — file2.txt / file3.txt could silently be dropped.
💡 Strengthen the test
expect(result.markdown).toContain("file1.txt");
expect(result.markdown).toContain("file2.txt"); // verify multiline output reaches markdown
expect(result.markdown).not.toContain("success"); // guard against fallthrough to placeholderA spec-style name would also help: "renders multiline tool output from result.content without a 'success' placeholder".
| const success = event.data?.success ?? !event.data?.error; | ||
| // Include result.content (concise LLM-facing output) so that the log | ||
| // parser can render tool output previews from events.jsonl directly. | ||
| const result = event.data?.result ?? undefined; |
There was a problem hiding this comment.
[/diagnose] event.data?.result ?? undefined is a no-op: optional chaining already returns undefined when the property is absent. Just event.data?.result is identical and clearer.
const result = event.data?.result;|
@copilot review all comments and address the unresolved review feedback on this PR.
|
Copilot CLI runs were rendering tool calls with no output preview, while Claude runs showed the first 2 non-empty lines. The fix reads tool results from the structured
events.jsonlfile that the Copilot CLI writes for each session, rather than scraping raw debug logs.Root cause
convertCopilotEventsToLegacyLogEntrieshandledtool.execution_completeevents but did not extract output fromdata.result.content— the concise LLM-facing tool result field present in the native Copilot CLI events.jsonl format. As a result, all successful tool calls fell through to a"success"placeholder with no content preview.Changes
log_parser_shared.cjs—convertCopilotEventsToLegacyLogEntriesnow extracts tool output fromdata.result.content(native Copilot CLI events.jsonl format,ToolExecutionCompleteResult.content). Also fixes error extraction to usedata.error.messageinstead ofString(errorObject).copilot_sdk_session.cjs—tool.execution_completeevents written to events.jsonl now include theresultobject from the SDK event, so SDK-based sessions also surface tool output previews via the same path.parse_copilot_log.test.cjs— adds a test verifying that tool output is rendered fromresult.contentin events.jsonl format.