[copilot-cli-research] Copilot CLI Deep Research - 2026-05-04 #30063

2026-05-04T05:00:39Z

github-actions[bot]
Bot May 4, 2026

Analysis Date: 2026-05-04
Repository: github/gh-aw
Scope: 211 total workflows, 95 using Copilot engine (+ extended engine object form)
Previous Analysis: 2026-05-03 (Run §25270216421)

📊 Executive Summary

Research Topic: Copilot CLI Optimization Opportunities
Key Findings: 14th consecutive run confirming two critical zero-usage features (startup-timeout, tool-timeout); 5 custom agent files remain unused; max-continuations adoption stays at just 2 workflows despite being Copilot-exclusive; mcp-scripts only adopted in 1 workflow.
Primary Recommendation: Adopt startup-timeout and tool-timeout — these protect against hung workflows and have been available since early in the project with zero adoption.

This repository has 95 Copilot workflows (45% of 211 total), with solid patterns in GitHub MCP tool usage (59 workflows, 62%), strict mode (59, 62%), and network configuration (45, 47%). However, several powerful Copilot-exclusive features remain completely unused after 14 analysis runs, representing a persistent opportunity gap.

Since the previous analysis (2026-05-03), workflow count has grown by 2 (209→211), engine.version pinning showed a small uptick (0→6), and engine.agent usage increased to 7. The fundamental persistent gaps — startup-timeout, tool-timeout, api-target, engine.harness — remain unchanged.

Critical Findings

🔴 High Priority Issues

1. startup-timeout — Zero adoption (14th consecutive run)
No workflow sets a startup-timeout to limit agent startup time. If the Copilot CLI hangs during initialization (network issue, resource contention), the workflow consumes the full timeout-minutes budget — often 15–45 minutes. A startup cap of 2–5 minutes would fail fast and free resources.

2. tool-timeout — Zero adoption (14th consecutive run)
No workflow sets a tool-timeout to limit individual tool call duration. Long-running bash commands or MCP calls can silently stall without this guard.

3. max-continuations — Only 2 workflows
max-continuations is Copilot-exclusive (not available in Claude, Codex, or other engines). It enables autopilot mode where the agent can autonomously chain multiple continuation runs to complete complex, long-horizon tasks. Only test-quality-sentinel (max: 40) and smoke-copilot (max: 2) use this feature — despite many workflows handling multi-step analytical or coding tasks that would benefit.

🟡 Medium Priority Opportunities

4. Five unused custom agent files
These .github/agents/ files exist but no workflow references them:

grumpy-reviewer.agent.md — code review agent
w3c-specification-writer.agent.md — spec writing agent
create-safe-output-type.agent.md — safe output tooling
custom-engine-implementation.agent.md — engine dev tooling
interactive-agent-designer.agent.md — workflow design assistant

5. mcp-scripts — Only 1 workflow
daily-performance-summary is the only Copilot workflow using mcp-scripts. This feature provides richer scripting capabilities (Python, Node, etc.) within agent workflows and is well-suited for data analysis and transformation tasks.

1️⃣ Current State Analysis

Copilot CLI Capabilities Inventory

Copilot-Exclusive Features (not available in other engines):

engine.agent — Reference a .github/agents/*.agent.md for custom persona/instructions
max-continuations — Autopilot mode: chain multiple consecutive agent runs (--autopilot --max-autopilot-continues N)
engine.harness — Replace the built-in retry/harness script wrapping Copilot CLI
BYOK mode (COPILOT_PROVIDER_BASE_URL) — Route to external LLM provider (OpenAI, Anthropic, Azure, local)

General Engine Features (Copilot supports all):

engine.model — Model override (e.g., gpt-5-mini, gpt-5)
engine.version — Pin CLI version (e.g., "0.0.422")
engine.args — Pass raw CLI arguments
engine.env — Custom environment variables
engine.api-target — Custom API endpoint (GHEC/GHES)
engine.bare — Disable context loading (--no-custom-instructions)
startup-timeout — Limit startup phase duration
tool-timeout — Limit individual tool call duration
network.allowed — Firewall domain allowlist
sandbox.agent: awf — AWF sandbox for process isolation
cache-memory — Persist data between runs
tools.web-fetch — HTTP fetch capability
tools.mcp-scripts — Scripting via MCP
tools.playwright — Browser automation
tools.github — GitHub MCP server

Usage Statistics

Feature	Count	% of 95 Copilot workflows
`timeout-minutes`	91	96%
`strict:`	59	62%
`tools.github`	59	62%
`network:`	45	47%
`features.copilot-requests`	38	40%
`cache-memory`	30	32%
`engine.env`	12	13%
`sandbox.agent: awf`	11	12%
`tools.web-fetch`	8	8%
`engine.agent` (custom)	7	7%
`tools.playwright`	7	7%
`engine.version`	6	6%
`engine.model`	4	4%
`engine.bare`	2	2%
`max-continuations`	2	2%
`mcp-scripts`	1	1%
`startup-timeout`	0	0%
`tool-timeout`	0	0%
`engine.api-target`	0	0%
`engine.harness`	0	0%
BYOK (`COPILOT_PROVIDER_*`)	0	0%

2️⃣ Feature Usage Matrix

Feature Category	Available	Used	Not Used	Usage Rate
Core Timeout Guards	`timeout-minutes`, `startup-timeout`, `tool-timeout`	`timeout-minutes`	`startup-timeout`, `tool-timeout`	33%
Engine Config	`model`, `version`, `args`, `env`, `agent`, `bare`, `harness`, `api-target`	`model`, `version`, `env`, `agent`, `bare`	`args`, `harness`, `api-target`	63%
Tools	`github`, `bash`, `playwright`, `web-fetch`, `mcp-scripts`, `web-search`	`github`, `bash`, `playwright`, `web-fetch`, `mcp-scripts`	`web-search`	83%
Sandbox	AWF, SRT	AWF	SRT	50%
Copilot-Exclusive	`max-continuations`, `engine.agent`, `engine.harness`, BYOK	`max-continuations` (2), `engine.agent` (7)	`engine.harness`, BYOK	50%

3️⃣ Missed Opportunities

🔴 High Priority

Opportunity 1: `startup-timeout` — The Hanging Workflow Risk

What: Sets a maximum time for the agent startup phase. Without it, a hung initialization blocks the runner for the full timeout-minutes (often 15–45 min).
Why It Matters: A hung startup holds a runner until timeout-minutes expires. With a 5-min startup guard on a 30-min workflow, each such failure releases the runner up to 25 min sooner.
Where: The 91 workflows that set timeout-minutes, starting with those whose budget is 15 minutes or more
How to Implement:

timeout-minutes: 30
startup-timeout: 5   # Fail fast if agent doesn't start within 5 min

Opportunity 2: `tool-timeout` — Silent Tool Stalls

What: Caps how long a single tool call (bash, MCP, file read) can run
Why It Matters: Complex bash commands, especially those using gh api or jq on large datasets, can stall indefinitely. tool-timeout kills the stalled call and lets the agent either retry or fail gracefully.
Where: Workflows with bash: "*" or broad shell permissions (11+ workflows), data-heavy analysis workflows
How to Implement:

tool-timeout: 3   # Kill individual tool calls that exceed 3 minutes
timeout-minutes: 30

Opportunity 3: `max-continuations` for Complex Multi-Step Workflows

What: Enables Copilot autopilot mode — the agent can chain N continuation runs to complete long-horizon tasks
Why It Matters: This is Copilot-exclusive. Workflows such as daily-repo-chronicle, agent-performance-analyzer, and code-scanning-fixer do multi-phase work that would benefit from continuation chains.
Where: Any analytical workflow exceeding a single agent run, especially those with complex data gathering + analysis + reporting phases
How to Implement:

engine:
  id: copilot
  max-continuations: 5   # Allow up to 5 continuation runs

🟡 Medium Priority

Opportunity 4: Activate Unused Custom Agent Files

Five agent files in .github/agents/ have no referencing workflows:

Agent File	Purpose	Potential Workflow
`grumpy-reviewer.agent.md`	Critical code review	PR review workflow
`w3c-specification-writer.agent.md`	Technical spec writing	docs/spec generation
`create-safe-output-type.agent.md`	Safe output tooling development	internal tooling
`custom-engine-implementation.agent.md`	Engine dev	engine builder workflow
`interactive-agent-designer.agent.md`	Workflow design	workflow scaffolding

engine:
  id: copilot
  agent: grumpy-reviewer  # .github/agents/grumpy-reviewer.agent.md

Opportunity 5: Expand `mcp-scripts` Adoption

What: MCP Scripts provide Python/Node scripting capabilities beyond shell commands
Why It Matters: Workflows doing data analysis, chart generation, or complex data transformation can leverage Python libraries (pandas, matplotlib, etc.) via mcp-scripts
Where: daily-repo-chronicle, api-consumption-report, agent-performance-analyzer, ci-coach
Current: Only daily-performance-summary uses it

Opportunity 6: Selective Model Overrides

What: Use gpt-5-mini or equivalent for cost-efficient, high-frequency lightweight tasks
Current: Only 4 workflows set engine.model; auto-triage-issues correctly uses gpt-5-mini
Where: Scheduling/triage workflows, simple notification bots, summary generators that don't need top-tier reasoning

engine:
  id: copilot
  model: gpt-5-mini   # Cost-efficient for simple classification/triage

🟢 Low Priority

Opportunity 7: Version Pinning for Stability-Critical Workflows

Current: 6 workflows pin the engine version
Recommendation: Pin versions for workflows where unexpected CLI behavior changes would break output format (e.g., structured JSON output, chart generation)
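
A minimal sketch of a pinned engine block, assuming `version` sits under `engine` as the `engine.version` name in the capabilities inventory implies. The version string is the inventory's own example value, not a recommended release.

```yaml
engine:
  id: copilot
  version: "0.0.422"   # example value from the inventory; pin to the release you have validated
```

Quoting the value keeps it a string; an unquoted two-part version such as 1.10 would be read by YAML as the number 1.1.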

Opportunity 8: `engine.bare` for Prompt-Focused Workflows

What: Disables AGENTS.md, copilot-instructions.md, and .github/ context loading — agent sees only the workflow prompt
Why: For standalone utilities (poem-bot, daily-fact, constraint-solving), loading full repo context wastes tokens and can introduce confusion
Current: Only 2 Copilot workflows use bare: true
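
As a sketch, and assuming `bare` is set inside the `engine` block as the `engine.bare` name suggests, a standalone utility workflow might opt out of repository context like this:

```yaml
engine:
  id: copilot
  bare: true   # skip AGENTS.md, copilot-instructions.md, and .github/ context
```

With this set, the workflow prompt has to carry every instruction the agent needs, so it suits self-contained prompts only.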

4️⃣ Specific Workflow Recommendations

`daily-repo-chronicle.md`

Current: timeout-minutes: 45, AWF sandbox, github tool, bash: *
Recommended: Add startup-timeout: 5, tool-timeout: 5, consider max-continuations: 3 for large repo days
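
Put together, the recommended frontmatter changes could look like the sketch below. Key placement and minute units follow the snippets earlier in this report and should be checked against the current frontmatter reference; the existing sandbox and tools settings stay unchanged and are omitted here.

```yaml
timeout-minutes: 45      # existing overall budget
startup-timeout: 5       # recommended: fail fast on a hung startup
tool-timeout: 5          # recommended: cap each individual tool call
engine:
  id: copilot
  max-continuations: 3   # optional: extra continuation runs for large repo days
```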

`code-scanning-fixer.md`

Candidate for: max-continuations — fixing multiple code scanning alerts benefits from chained runs

`auto-triage-issues.md`

Good pattern: Already uses model: gpt-5-mini — this is the right approach for classification tasks
Gap: Missing startup-timeout, tool-timeout
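
One way to close this gap, sketched with the guard values from the best-practice guidance in this report. The workflow's actual engine block may be laid out differently, so treat the structure as an assumption.

```yaml
engine:
  id: copilot
  model: gpt-5-mini    # already in place
startup-timeout: 5     # proposed addition
tool-timeout: 3        # proposed addition
```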

`agent-performance-analyzer.md`

Candidate for: mcp-scripts for data analysis, max-continuations for large datasets

Future PR review workflow using `grumpy-reviewer`

The grumpy-reviewer.agent.md agent is ready — no workflow uses it. A PR trigger workflow with this agent would fill the gap.
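
A rough starting point for such a workflow is sketched below. The trigger block assumes the frontmatter accepts a GitHub Actions-style `on:` section, and the timeout values are illustrative rather than measured. Permissions and any output configuration needed to post review comments are omitted.

```yaml
on:
  pull_request:
    types: [opened, synchronize]   # assumed trigger for new and updated PRs
engine:
  id: copilot
  agent: grumpy-reviewer           # .github/agents/grumpy-reviewer.agent.md
timeout-minutes: 15                # illustrative budget
startup-timeout: 5
tool-timeout: 3
```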

5️⃣ Trends & Historical Context

Historical Trends (14 runs)

Metric	Apr 16	Apr 29	May 1	May 3	May 4	Trend
Total workflows	~195	205	205	209	211	↑ growing
Copilot workflows	~85	110*	110*	91	95	→ stable
`startup-timeout`	0	0	0	0	0	🔴 no change
`tool-timeout`	0	0	0	0	0	🔴 no change
`max-continuations`	2	2	2	2	2	🟡 stable
`engine.agent` (custom)	~4	6	6	6	7	↑ growing
`cache-memory`	~15	79*	62*	19	30	~ methodology varies
`mcp-scripts`	0	6*	1	1	1	→ stable
`sandbox AWF`	11	17	17	15	11	~ fluctuates
Unused agent files	5	5	5	5	5	🔴 no change

*Count methodology varied across runs (some counts included all engine forms)

Key observations:

startup-timeout and tool-timeout have been at 0% for 14 consecutive analysis runs spanning ~18 days — this is the longest-running persistent gap
5 unused custom agent files have persisted across all 14 runs
Workflow count is steadily growing (+16 since first analysis)
max-continuations frozen at exactly 2 workflows since the feature was first tracked

6️⃣ Best Practice Guidelines

Based on this research:

Always set startup-timeout and tool-timeout: Protect against hung workflows. Recommended: startup-timeout: 5, tool-timeout: 3–5 for typical workflows.
Use model: gpt-5-mini for simple tasks: Triage, labeling, classification, and simple summary workflows don't need top-tier models. See auto-triage-issues.md for the reference pattern.
Leverage max-continuations for multi-phase work: When a workflow needs to gather data → analyze → report → act, chain continuations instead of cramming everything in one run.
Reference custom agent files for specialized personas: Don't embed agent persona in the workflow markdown — use .github/agents/ files. 5 agent files await workflows.
Use engine.bare for utilities and standalone bots: When the workflow prompt is self-contained (poem-bot, daily-fact, schedulers), bare: true saves tokens and avoids repo-context confusion.

7️⃣ Action Items

Immediate Actions (this week):

Add startup-timeout: 5 and tool-timeout: 5 to workflows with timeout-minutes ≥ 15
Create a workflow using grumpy-reviewer.agent.md for PR code review

Short-term (this month):

Add max-continuations: 3–5 to complex multi-phase analytical workflows (daily-repo-chronicle, agent-performance-analyzer, code-scanning-fixer)
Expand mcp-scripts to at least 3 data-analysis workflows
Apply model: gpt-5-mini to all triage/classification/simple-summary workflows

Long-term (this quarter):

Create workflows for the remaining unused agent files (e.g., w3c-specification-writer, interactive-agent-designer)
Pin engine versions for stability-critical output-format workflows
Investigate engine.bare for all standalone/utility workflows

📚 References

Research Methodology

Analysis performed on all .md files in .github/workflows/ using grep pattern matching on frontmatter fields. Copilot workflow count uses simple engine: copilot form only (excludes engine:\n id: copilot object form). Feature presence detected via substring match. Historical data from repo-memory branch tracking 14 runs.

Generated by Copilot CLI Deep Research Agent (Run: §25301640113)

expires on May 5, 2026, 5:00 AM UTC

2026-05-05T04:53:28Z

github-actions[bot]
Bot May 5, 2026
Author

This discussion has been marked as outdated by Copilot CLI Deep Research Agent.

A newer discussion is available at Discussion #30270.
