Skip to content

Show tool output previews in Copilot CLI conversation rendering#40116

Merged
pelikhan merged 3 commits into
mainfrom
copilot/improve-agent-conversation-rendering
Jun 19, 2026
Merged

Show tool output previews in Copilot CLI conversation rendering#40116
pelikhan merged 3 commits into
mainfrom
copilot/improve-agent-conversation-rendering

Conversation

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Copilot CLI runs were rendering tool calls with no output preview, while Claude runs showed the first 2 non-empty lines. The fix reads tool results from the structured events.jsonl file that the Copilot CLI writes for each session, rather than scraping raw debug logs.

Root cause

convertCopilotEventsToLegacyLogEntries handled tool.execution_complete events but did not extract output from data.result.content — the concise LLM-facing tool result field present in the native Copilot CLI events.jsonl format. As a result, all successful tool calls fell through to a "success" placeholder with no content preview.

Changes

  • log_parser_shared.cjsconvertCopilotEventsToLegacyLogEntries now extracts tool output from data.result.content (native Copilot CLI events.jsonl format, ToolExecutionCompleteResult.content). Also fixes error extraction to use data.error.message instead of String(errorObject).

  • copilot_sdk_session.cjstool.execution_complete events written to events.jsonl now include the result object from the SDK event, so SDK-based sessions also surface tool output previews via the same path.

  • parse_copilot_log.test.cjs — adds a test verifying that tool output is rendered from result.content in events.jsonl format.

Copilot AI and others added 2 commits June 18, 2026 16:58
Parse Wire request blocks in Copilot CLI debug logs to extract actual
tool output content, enabling formatResultPreview() to show the first
2 non-empty lines of each tool call result — matching the behaviour
already present for Claude runs.

Wire requests contain the full conversation history including tool
results from prior turns. The new extractWireRequestToolResults()
function scans these blocks line-by-line (avoiding expensive full JSON
parsing of the 11 MB log) and builds a Map<tool_call_id → content>.
After all entries are built, non-error tool results have their content
field populated from that map before rendering.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title Show tool output preview lines in Copilot debug log rendering Show tool output previews in Copilot CLI debug log rendering Jun 18, 2026
Copilot AI requested a review from pelikhan June 18, 2026 17:00
@pelikhan

Copy link
Copy Markdown
Collaborator

@copilot render the copilot cli vent conversation from the events.jsonl log file, not the raw logs.

- Revert extractWireRequestToolResults (raw [DEBUG] Wire request: parsing)
- Fix convertCopilotEventsToLegacyLogEntries to extract tool output from
  data.result.content (native Copilot CLI events.jsonl format)
- Fix error message extraction to use data.error.message (not String(obj))
- Update copilot_sdk_session.cjs to include result in tool.execution_complete
  events so SDK-based sessions also get tool output previews in events.jsonl
- Add test: renders tool output preview from result.content in events.jsonl

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

@copilot render the copilot cli vent conversation from the events.jsonl log file, not the raw logs.

Done in the latest commit. The approach is now:

  1. log_parser_shared.cjsconvertCopilotEventsToLegacyLogEntries now extracts tool output from data.result.content in tool.execution_complete events. This is the concise LLM-facing output field in the native Copilot CLI events.jsonl format (per the SDK's ToolExecutionCompleteResult.content).

  2. copilot_sdk_session.cjs – the tool.execution_complete events written to events.jsonl now include the result object from the SDK event, so SDK-based sessions also surface tool output via the same path.

The extractWireRequestToolResults raw debug-log parsing has been removed entirely.

Copilot AI changed the title Show tool output previews in Copilot CLI debug log rendering Show tool output previews in Copilot CLI conversation rendering Jun 18, 2026
@pelikhan pelikhan marked this pull request as ready for review June 18, 2026 18:13
Copilot AI review requested due to automatic review settings June 18, 2026 18:13
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Test Quality Sentinel completed test quality analysis.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #40116 does not have the 'implementation' label and has 0 new lines of code in business logic directories (≤100 threshold). The 3 changed files are not in monitored business logic paths.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes missing tool-output previews when rendering Copilot CLI conversations by extracting LLM-facing tool results from the structured events.jsonl format (instead of falling back to a "success" placeholder), and by ensuring SDK-based sessions write the needed result fields into events.jsonl.

Changes:

  • Update Copilot events → legacy trace normalization to extract tool output from data.result.content, and improve error text extraction via data.error.message.
  • Include the SDK tool result object in serialized tool.execution_complete JSONL events so the same parser path can render previews.
  • Add a regression test covering tool output extraction from result.content in events.jsonl.
Show a summary per file
File Description
actions/setup/js/log_parser_shared.cjs Extracts tool output from Copilot CLI events.jsonl result.content and improves error message extraction.
actions/setup/js/copilot_sdk_session.cjs Extends SDK JSONL serialization to include tool execution result for downstream rendering.
actions/setup/js/parse_copilot_log.test.cjs Adds a test intended to prevent regression in rendering tool output from events.jsonl.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 3/3 changed files
  • Comments generated: 3

Comment on lines +931 to 935
} else if (data.result && typeof data.result.content === "string") {
// Native Copilot CLI events.jsonl format: result.content is the concise
// tool result text sent to the LLM (may be truncated for token efficiency).
output = data.result.content;
} else if (data.error) {
Comment on lines +127 to +139
it("renders tool output preview from result.content in Copilot CLI events.jsonl", () => {
const eventsLog = [
'{"type":"user.message","timestamp":"2026-06-05T00:44:01.367Z","data":{}}',
'{"type":"tool.execution_start","timestamp":"2026-06-05T00:44:04.520Z","data":{"toolName":"bash","mcpServerName":""}}',
'{"type":"tool.execution_complete","timestamp":"2026-06-05T00:44:04.700Z","data":{"toolName":"bash","mcpServerName":"","success":true,"result":{"content":"file1.txt\\nfile2.txt\\nfile3.txt"}}}',
'{"type":"assistant.message","timestamp":"2026-06-05T00:44:59.769Z","data":{"content":"Done"}}',
].join("\n");

const result = parseCopilotLog(eventsLog);

expect(result.markdown).toContain("bash");
expect(result.markdown).toContain("file1.txt");
});
Comment on lines +268 to 272
// Include result.content (concise LLM-facing output) so that the log
// parser can render tool output previews from events.jsonl directly.
const result = event.data?.result ?? undefined;
// max-tool-denials intentionally tracks permission denials only.
// Tool execution failures are still logged, but do not increment the guardrail counter.
@github-actions github-actions Bot mentioned this pull request Jun 18, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUEST_CHANGES — two issues to fix before merge

The happy-path fix (rendering result.content from events.jsonl) is correct and the core logic in log_parser_shared.cjs is sound. Two things need to be addressed:

Issues

1. (Medium) Unbounded result object written to JSONLcopilot_sdk_session.cjs serializes the entire ToolExecutionCompleteResult to disk, including detailedContent (full diffs) and uiResource, even though the parser only ever reads result.content. This is a performance regression on any tool-heavy run that edits files. Fix: write { content: result.content } only.

2. (Medium) Error-extraction fix is untestedlog_parser_shared.cjs line 936 changes String(data.error) to data.error.message, a meaningful behavior fix ([object Object] → real message), but no test covers a failed tool call with an Error-shaped object. A future refactor could silently revert it.

🔎 Code quality review by PR Code Quality Reviewer

// max-tool-denials intentionally tracks permission denials only.
// Tool execution failures are still logged, but do not increment the guardrail counter.
writeEvent("tool.execution_complete", { toolName, mcpServerName, success }, event.timestamp);
writeEvent("tool.execution_complete", { toolName, mcpServerName, success, result }, event.timestamp);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full ToolExecutionCompleteResult object written to JSONL, but only result.content is ever consumed: detailedContent and uiResource are serialized unnecessarily, bloating events.jsonl for tool-heavy runs.

💡 Suggested fix

Write only the fields the log parser actually reads:

// current
const result = event.data?.result ?? undefined;
writeEvent("tool.execution_complete", { toolName, mcpServerName, success, result }, event.timestamp);

// suggested
const resultContent = event.data?.result?.content;
const result = resultContent !== undefined ? { content: resultContent } : undefined;
writeEvent("tool.execution_complete", { toolName, mcpServerName, success, result }, event.timestamp);

The SDK's ToolExecutionCompleteResult has four fields: content (concise, truncated), contents (structured blocks), detailedContent (the full tool output including complete diffs — explicitly documented as potentially very large), and uiResource. Only result.content is ever read in log_parser_shared.cjs. Serializing the whole object via JSON.stringify means every tool call that edits a file will write an unbounded detailedContent string to disk, and the parser has to deserialize all of it for nothing.

output = data.result.content;
} else if (data.error) {
output = String(data.error);
output = typeof data.error === "object" && typeof data.error.message === "string" ? data.error.message : String(data.error);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error-extraction fix (String(data.error).message) has no test coverage: this is a real behavior change — before, an Error object would render as [object Object]; after, it renders as the human-readable message — but the test suite only adds a happy-path case.

💡 Suggested test

Add a case to the existing describe block covering failed tool execution with an Error object:

it("renders error.message from a failed tool call in events.jsonl", () => {
  const eventsLog = [
    '{"type":"user.message","timestamp":"2026-06-05T00:44:01.367Z","data":{}}',
    '{"type":"tool.execution_start","timestamp":"2026-06-05T00:44:04.520Z","data":{"toolName":"bash","mcpServerName":""}}',
    '{"type":"tool.execution_complete","timestamp":"2026-06-05T00:44:04.700Z","data":{"toolName":"bash","mcpServerName":"","success":false,"error":{"message":"Permission denied"}}}',
    '{"type":"assistant.message","timestamp":"2026-06-05T00:44:59.769Z","data":{"content":"Failed"}}',
  ].join("\n");

  const result = parseCopilotLog(eventsLog);

  expect(result.markdown).toContain("Permission denied");
  // Must not contain the old serialization artifact
  expect(result.markdown).not.toContain("[object Object]");
});

Without this, a future refactor could revert the fix silently.

@github-actions

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

⚠️ Test Quality Score: 70/100 — Acceptable

Analyzed 1 test: 1 design, 0 implementation, 0 guideline violations.

📊 Metrics & Test Classification (1 test analyzed)
Metric Value
New/modified tests analyzed 1
✅ Design tests (behavioral contracts) 1 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 0 (0%)
Duplicate test clusters 0
Test inflation detected No (1.4:1 ratio — 14 test lines vs ~10 production lines)
🚨 Coding-guideline violations 0
Test File Classification Issues Detected
renders tool output preview from result.content in Copilot CLI events.jsonl actions/setup/js/parse_copilot_log.test.cjs:127 ✅ Design Happy-path only

Go: 0 (*_test.go); JavaScript: 1 (*.test.cjs). Other languages detected but not scored.

⚠️ Flagged Tests — Non-blocking Issues (1)

renders tool output preview from result.content in Copilot CLI events.jsonl (actions/setup/js/parse_copilot_log.test.cjs:127) — ⚠️ Happy-path only: the test verifies that the tool name ("bash") and one output line ("file1.txt") appear in the rendered markdown — but does not cover the fallback paths added in the same PR. Specifically: (1) result.content being absent, null, or non-string; (2) the updated error.message extraction (data.error.message) in log_parser_shared.cjs; (3) multi-line output — only file1.txt is asserted, not file2.txt or file3.txt. Suggested fix: add a second it() passing result: undefined or error: { message: "some error" } to verify the fallback rendering path.

Verdict

Check passed. 0% implementation tests (threshold: 30%). The new test covers the behavioral contract — observable markdown output contains the tool name and tool output when result.content is present. Score reduced to 70/100 due to zero edge-case coverage; error fallback paths added in the same PR (in log_parser_shared.cjs) are untested.

References:

🧪 Test quality analysis by Test Quality Sentinel ·

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 70/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%).

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /diagnose and /tdd — clean targeted fix with good PR hygiene. A few suggestions to make it more robust.

📋 Key Themes & Highlights

Key Themes

  • Potential array content gap: data.result.content is only extracted when it's a string; if the SDK emits MCP-style content-block arrays ([{type,text}]) for some tool types, those still fall through to "success". The same pattern is already handled in log_parser_format.cjs line 250.
  • Missing tests for the error fix: The data.error.message extraction improvement has no regression test — a future refactor could restore String(data.error) without any warning.
  • Thin test assertions: The new test doesn't assert the absence of the "success" placeholder or verify second/third output lines.

Positive Highlights

  • ✅ Root cause is clearly identified and the two-layer fix (write path + parse path) is correct
  • ✅ Small, focused diff (+24/-3) — easy to reason about
  • ✅ Good that a regression test was added alongside the fix
  • ✅ Error extraction improvement (data.error.message) is a nice bonus fix
  • ✅ Inline comment explains the purpose of result.content for future readers

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer

output = data.output;
} else if (typeof data.result === "string") {
output = data.result;
} else if (data.result && typeof data.result.content === "string") {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] This branch only handles string content, but the Copilot SDK and MCP protocol can deliver result.content as an array of content blocks ([{type:"text",text:"..."}]). When that shape appears the branch silently falls through to "success" — the regression this PR aims to fix.

💡 Suggested guard

Add an array case below the string check:

} else if (data.result && Array.isArray(data.result.content)) {
  // MCP-style content blocks: [{type:"text",text:"..."}]
  output = data.result.content.map(c => (typeof c === "string" ? c : c.text || "")).join("\n");
}

This mirrors the pattern in log_parser_format.cjs line 250.

output = data.result.content;
} else if (data.error) {
output = String(data.error);
output = typeof data.error === "object" && typeof data.error.message === "string" ? data.error.message : String(data.error);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The error extraction fix (data.error.message instead of String(data.error)) has no test coverage. Without a test, the "[object Object]" regression could silently return.

💡 Suggested test

Add alongside the existing result.content test:

it("renders error message from data.error.message in tool.execution_complete", () => {
  const eventsLog = [
    '{"type":"user.message","timestamp":"2026-06-05T00:44:01.367Z","data":{}}',
    '{"type":"tool.execution_start","timestamp":"2026-06-05T00:44:04.520Z","data":{"toolName":"bash","mcpServerName":""}}',
    '{"type":"tool.execution_complete","timestamp":"2026-06-05T00:44:04.700Z","data":{"toolName":"bash","mcpServerName":"","success":false,"error":{"message":"Permission denied"}}}',
    '{"type":"assistant.message","timestamp":"2026-06-05T00:44:59.769Z","data":{"content":"Done"}}',
  ].join("\n");
  const result = parseCopilotLog(eventsLog);
  expect(result.markdown).toContain("Permission denied");
  expect(result.markdown).not.toContain("[object Object]");
});

const result = parseCopilotLog(eventsLog);

expect(result.markdown).toContain("bash");
expect(result.markdown).toContain("file1.txt");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The two assertions cover the happy path but leave important behaviour untested: (1) no check that the old "success" placeholder is absent, and (2) only the first line is verified — file2.txt / file3.txt could silently be dropped.

💡 Strengthen the test
expect(result.markdown).toContain("file1.txt");
expect(result.markdown).toContain("file2.txt");  // verify multiline output reaches markdown
expect(result.markdown).not.toContain("success"); // guard against fallthrough to placeholder

A spec-style name would also help: "renders multiline tool output from result.content without a 'success' placeholder".

const success = event.data?.success ?? !event.data?.error;
// Include result.content (concise LLM-facing output) so that the log
// parser can render tool output previews from events.jsonl directly.
const result = event.data?.result ?? undefined;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] event.data?.result ?? undefined is a no-op: optional chaining already returns undefined when the property is absent. Just event.data?.result is identical and clearer.

const result = event.data?.result;

@github-actions

Copy link
Copy Markdown
Contributor

@copilot review all comments and address the unresolved review feedback on this PR.

Generated by 👨‍🍳 PR Sous Chef · 46.5 AIC · ⌖ 0.993 AIC · ⊞ 17.2K ·

@pelikhan pelikhan merged commit 3dbc243 into main Jun 19, 2026
46 of 54 checks passed
@pelikhan pelikhan deleted the copilot/improve-agent-conversation-rendering branch June 19, 2026 00:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants