
Add build-duty skill for PR triage across dotnet repos#53311

Open
marcpopMSFT wants to merge 5 commits into main from marcpopMSFT-buildreport

@marcpopMSFT
Member

New version of #52678 using the new skills that were introduced.

Creates a new skill under .claude/skills/build-duty/ that helps build duty engineers triage automated PRs across dotnet/sdk, dotnet/installer, dotnet/templating, and dotnet/dotnet repositories.

The skill includes:

  • Get-BuildDutyReport.ps1: PowerShell script that queries GitHub via gh CLI for PRs from monitored authors (dotnet-maestro[bot], github-actions[bot] merge PRs, vseanreesermsft, dotnet-bot), classifies them into categories (Ready to Merge, Branch Lockdown, Changes Requested, Failing/Blocked), and outputs both human-readable tables and structured JSON.
  • SKILL.md: Skill definition with workflow instructions for running the script, interpreting results, investigating failures via the ci-analysis skill, and generating a formatted triage report.

Key design decisions:

  • Uses gh CLI for deterministic, testable queries (no LLM drift)
  • Uses GraphQL for efficient batching of mergeStateStatus and statusCheckRollup in a single call per PR
  • Delegates CI failure analysis to the existing ci-analysis skill
  • Script handles filtering/classification; agent handles presentation
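
As a rough illustration of the single-call-per-PR design, the sketch below builds a GraphQL query and the matching `gh api graphql` argument list for one PR. The field names come from GitHub's public GraphQL schema, but the exact selection, the constant `PR_STATUS_QUERY`, and the helper `build_gh_args` are assumptions made for this example; this is not the query in Get-BuildDutyReport.ps1, and it is written in Python only to keep the sketch short.

```python
# Hypothetical sketch: not the query used by Get-BuildDutyReport.ps1.
PR_STATUS_QUERY = """
query($owner: String!, $name: String!, $number: Int!) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $number) {
      mergeStateStatus
      mergeable
      reviewDecision
      changedFiles
      isDraft
      labels(first: 100) { nodes { name } }
      commits(last: 1) {
        nodes {
          commit {
            statusCheckRollup {
              state
              contexts(first: 100) {
                nodes {
                  ... on CheckRun { name conclusion status }
                }
              }
            }
          }
        }
      }
    }
  }
}
"""


def build_gh_args(repo: str, number: int) -> list[str]:
    """Return the argv for one `gh api graphql` call covering a single PR."""
    owner, name = repo.split("/", 1)
    return [
        "gh", "api", "graphql",
        "-f", f"query={PR_STATUS_QUERY}",
        "-f", f"owner={owner}",    # -f sends the value as a string
        "-f", f"name={name}",
        "-F", f"number={number}",  # -F lets gh convert the value to an integer
    ]
```

The returned list can be handed to `subprocess.run(..., capture_output=True, text=True)` and the JSON on stdout parsed with `json.loads`; keeping the query in one constant makes it easy to test without touching the network.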

Supersedes the draft approach in PR #52678 which used a pure prompt-based approach that was unreliable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 6, 2026 18:30
@marcpopMSFT
Member Author

Example report. It is a good first step but will need to utilize the helix/CI skills better as some of the next step details are not great.

🔧 Build Duty Triage Report

Date: March 6, 2026
Repositories: dotnet/sdk, dotnet/installer, dotnet/templating, dotnet/dotnet


✅ Ready to Merge (0)

No PRs are currently ready to merge.


🔒 Branch Lockdown (0)

No branches are currently locked down.


⚠️ Changes Requested (0)

No PRs have pending change requests.


❌ Failing / Blocked (17)

dotnet/sdk — Codeflow PRs (4)

| # | Title | Target | Age | Checks | Issue |
|---|---|---|---|---|---|
| #53250 | Source code updates from dotnet/dotnet | main | 2d | FAILURE | StaticWebAssets tests failing cross-platform (Windows, macOS, Linux). Dependency update likely broke SWA tests. |
| #53284 | Source code updates from dotnet/dotnet | release/10.0.2xx | <1d | FAILURE | 425+ test failures across NetAnalyzers, EndToEnd, Watch, dotnet.Tests. Broad dependency/version issue. Known issue #40006 (watch tests) partially applies. |
| #53267 | Source code updates from dotnet/dotnet | release/11.0.1xx-preview2 | 1d | FAILURE | 7/17 jobs failing: StaticWebAssets, Blazor WASM AoT, FullFramework. Cascading from Version.Details/NuGet.config changes. |
| #53310 | Source code updates from dotnet/dotnet | release/10.0.1xx | <1d | PENDING | CI still running. |

dotnet/sdk — Merge PRs (5)

| # | Title | Target | Age | Checks | Issue |
|---|---|---|---|---|---|
| #53288 | Merge release/10.0.3xx => main | main | <1d | PENDING | CI still running. Blocked on upstream merge chain. |
| #53236 | Merge release/9.0.3xx => release/10.0.1xx | release/10.0.1xx | 2d | PENDING | CI pending. |
| #53223 | Merge release/10.0.2xx => release/10.0.3xx | release/10.0.3xx | 3d | PENDING | CI pending; likely waiting on upstream. |
| #53197 | Merge release/10.0.1xx => release/10.0.2xx | release/10.0.2xx | 6d | PENDING | CI pending; likely waiting on upstream. |
| #53175 ⚠️ | Merge main => release/dnup | release/dnup | 7d | FAILURE | Razor, Blazor WASM, watch tests, EndToEnd, Build, Pack, Publish tests failing. Changes to eng/common infra + MSBuildSdkResolver correlate. Known issue #40006 (watch) partially applies. |

dotnet/templating — Codeflow PRs (2)

| # | Title | Target | Age | Checks | Issue |
|---|---|---|---|---|---|
| #9934 | Source code updates from dotnet/dotnet | release/11.0.1xx-preview2 | <1d | FAILURE | CI rerun triggered; templating tests can be flaky. |
| #9935 | Source code updates from dotnet/dotnet | release/10.0.3xx | <1d | FAILURE | CI rerun triggered; templating tests can be flaky. |

dotnet/dotnet (VMR) — SDK-owned PRs (6)

| # | Title | Target | Age | Checks | Issue |
|---|---|---|---|---|---|
| #5268 | Source code updates from dotnet/sdk | main | <1d | FAILURE | Not yet investigated. |
| #5177 | Source code updates from dotnet/sdk | release/10.0.3xx | 3d | FAILURE | Not yet investigated. |
| #5208 | Source code updates from dotnet/source-build-reference-packages | release/10.0.1xx | 2d | SUCCESS | ✅ Checks pass; needs review approval only. |
| #5207 | Source code updates from dotnet/source-build-reference-packages | release/10.0.2xx | 2d | SUCCESS | ✅ Checks pass; needs review approval only. |
| #5179 | Source code updates from dotnet/sdk | release/10.0.1xx | 3d | SUCCESS | ✅ Checks pass; needs review approval only. |
| #5175 | Source code updates from dotnet/templating | release/10.0.3xx | 3d | SUCCESS | ✅ Checks pass; needs review approval only. |

dotnet/installer (0)

No open PRs from monitored authors.


📊 Summary

| Category | Count |
|---|---|
| Ready to Merge | 0 |
| Branch Lockdown | 0 |
| Changes Requested | 0 |
| Failing/Blocked | 17 |
| Stale (>7d) | 1 |
| Total | 17 |

📋 Recommended Actions

  1. 🟢 Quick wins — VMR PRs needing review only: #5208, #5207, #5179, #5175 all have passing CI but need a review approval. Approving and merging these 4 PRs would clear a chunk of the queue.

  2. 🔴 Common root cause (StaticWebAssets test failures): PRs #53250, #53267, and #53284 all show StaticWebAssets test failures correlated with dependency version changes flowing from dotnet/dotnet. Investigate whether a specific package version bump broke the SWA tests. Fixing this at the source will unblock multiple PRs.

  3. 🔴 Broad test failures on #53284 (release/10.0.2xx): 425+ test failures across NetAnalyzers, EndToEnd, Watch, etc. suggest a more fundamental dependency compatibility issue on the 10.0.2xx branch. This needs priority investigation.

  4. 🟡 Merge chain is stalled: The merge chain (9.0.3xx → 10.0.1xx → 10.0.2xx → 10.0.3xx → main) has all 5 PRs in PENDING state. These are likely blocked because upstream codeflow PRs haven't merged yet. Unblocking the codeflow PRs (#53250, #53267, #53284) should unstall the chain.

  5. ⚠️ Stale PR #53175 (7 days): Merge main → release/dnup has widespread failures (Razor, Blazor, watch, EndToEnd). Known issue #40006 (dotnet-watch tests failing) partially applies. Investigate whether this branch is still actively maintained; if not, consider closing this PR.

  6. 🟡 Templating PRs (#9934, #9935): CI reruns triggered. If they pass on retry, merge them. If they fail again, investigate further.

Contributor

Copilot AI left a comment

Pull request overview

Adds a new .claude “build-duty” skill intended to help build duty engineers triage automated PRs across key dotnet repos using deterministic gh CLI queries plus a structured JSON summary for downstream reporting.

Changes:

  • Introduces Get-BuildDutyReport.ps1 to query monitored PR authors across dotnet/sdk, dotnet/installer, dotnet/templating, and dotnet/dotnet, then classify and emit tables + a JSON summary block.
  • Adds SKILL.md documenting how to run the script and how to turn results into a build duty triage report (including using ci-analysis for failures).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 12 comments.

| File | Description |
|---|---|
| .claude/skills/build-duty/scripts/Get-BuildDutyReport.ps1 | New PowerShell script that queries PRs via gh, classifies them, prints human-readable tables, and emits a structured JSON summary. |
| .claude/skills/build-duty/SKILL.md | New skill documentation defining intended classification rules and the recommended triage workflow. |

marcpopMSFT and others added 3 commits March 6, 2026 11:50
…y recommendations

- Detect PRs with 0 changed files and recommend closing (CLOSE_EMPTY_PR)
- Detect merge PRs with CONFLICTING mergeable state (FIX_MERGE_CONFLICTS)
- Detect templating PRs with single failed CI leg (RETRY_SINGLE_LEG)
- Add per-check-run details to GraphQL query (name, conclusion, status)
- Add Quick Actions section to human-readable output
- Add recommendation column to Failing/Blocked table
- Update SKILL.md with recommendation codes documentation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- dotnet-bot PRs with 0 changes: recommend merging (MERGE_EMPTY_CODEFLOW)
  as these are inter-branch codeflow with merge commits; completing them
  reduces churn in the next codeflow PR
- maestro PRs with 0 changes: check last 5 comments for darc conflict
  indicator (FIX_DARC_CONFLICT) and direct engineer to run
  'darc vmr resolve-conflict' per PR comment instructions
- Other empty PRs: keep existing CLOSE_EMPTY_PR behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
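
The empty-PR rules from the commit above can be read as a small decision function. The Python sketch below is one way to express them, not the script's actual code: the function name `recommend_for_empty_pr`, the author strings, and the assumption that the "darc conflict indicator" is a comment mentioning `darc vmr resolve-conflict` are all illustrative guesses.

```python
from typing import Optional

# Assumption: the conflict comment mentions the command the engineer should run.
DARC_CONFLICT_MARKER = "darc vmr resolve-conflict"


def recommend_for_empty_pr(author: str, changed_files: int,
                           recent_comments: list[str]) -> Optional[str]:
    """Return a recommendation code for a PR with 0 changed files, else None."""
    if changed_files != 0:
        return None  # not an empty PR; other rules apply
    if author == "dotnet-bot":
        # Inter-branch codeflow with merge commits: completing it reduces churn.
        return "MERGE_EMPTY_CODEFLOW"
    if author == "dotnet-maestro[bot]":
        # Only the last 5 comments are inspected, as in the commit message.
        if any(DARC_CONFLICT_MARKER in body for body in recent_comments[-5:]):
            return "FIX_DARC_CONFLICT"
    return "CLOSE_EMPTY_PR"
```

Ordering the checks from most specific to least specific keeps `CLOSE_EMPTY_PR` as the fallback, which matches the "other empty PRs keep existing behavior" line.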
- Remove dead $variables code in Get-PrDetails
- Remove unused -SkipCIDetails parameter and its help text
- Bump labels GraphQL fetch from 20 to 100 to avoid missing labels
- Fix Get-Date -AsUTC portability: use [DateTimeOffset]::UtcNow
- Fix Sort-Object to use per-key sort direction (repo asc, age desc)
- Clarify Failing/Blocked doc to match actual mergeStateStatus logic

Declined feedback #2: Get-PrCategory correctly uses mergeStateStatus
which already accounts for required checks (CLEAN = all required pass).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lbussell
Member

Since this is cross-repo, should it go in https://github.com/dotnet/arcade-skills?

@marcpopMSFT
Member Author

> Since this is cross-repo, should it go in https://github.com/dotnet/arcade-skills?

Maybe, but it's pretty specific to the PRs that we track with our build duty. There could potentially be a generic skill for this and then a specific SDK skill sitting on top of it. Note that there are details about the specific repos we care about and specific labels that are SDK.

Member

@lbussell left a comment

This skill is very long. I bet if you tell Copilot to review it according to Claude's skill authoring best practices, it would improve the efficiency of the skill quite a bit.

@@ -0,0 +1,279 @@
---
name: build-duty
description: Generate a build duty PR triage report across dotnet/sdk, dotnet/installer, dotnet/templating, and dotnet/dotnet repositories. Use when asked about "build duty", "triage PRs", "build duty report", "merge queue", "dependency PRs", "what PRs need merging", "build duty status", or "check open PRs for build duty".

Member

You don't want to waste these tokens when making actual code changes -- I assume you only want to run this skill on demand. You can add a couple of other frontmatter fields to make it only invocable via the /build-duty slash command:

Suggested change
- description: Generate a build duty PR triage report across dotnet/sdk, dotnet/installer, dotnet/templating, and dotnet/dotnet repositories. Use when asked about "build duty", "triage PRs", "build duty report", "merge queue", "dependency PRs", "what PRs need merging", "build duty status", or "check open PRs for build duty".
+ description: Triage dependency update and code flow PRs for the .NET SDK team build duty.
+ user-invocable: true
+ disable-model-invocation: true

Comment on lines +14 to +21
## When to Use This Skill

Use this skill when:
- Checking build duty status ("what PRs need merging?", "build duty report")
- Triaging automated PRs across dotnet repos
- Generating a daily build duty triage report
- Checking if dependency update or codeflow PRs are ready to merge
- Asked "what's the merge queue look like?" or "any stuck PRs?"

Member

This section is pointless, since the agent will only read this if the skill has already been triggered.

Comment on lines +149 to +155
```powershell
# For dotnet/sdk PRs
./.claude/skills/ci-analysis/scripts/Get-CIStatus.ps1 -PRNumber <number> -ShowLogs

# For other repos
./.claude/skills/ci-analysis/scripts/Get-CIStatus.ps1 -PRNumber <number> -Repository "dotnet/installer" -ShowLogs
```

Member

If the ci-analysis skill is invoked here, then it will already include this information. This should be removed since it's an implementation detail of the other skill.


> 🚨 **NEVER** use `gh pr merge` or approve PRs automatically. Merging and approval are human-only actions. This skill only generates reports.

**Workflow**: Run the script (Step 1) → Read the output + JSON summary (Step 2) → Investigate failing PRs with ci-analysis (Step 3) → Synthesize the final report (Step 4).

Member

This information is duplicated in the "Analysis Workflow" section below.

Comment on lines +45 to +68
## What the Script Does

1. **Queries 4 repositories** via `gh` CLI for open PRs from monitored authors:
- `dotnet-maestro[bot]` — Dependency updates and codeflow PRs
- `github-actions[bot]` — **Only** inter-branch merge PRs (titles containing "Merge branch"); excludes backport PRs
- `vseanreesermsft` — Release management PRs
- `dotnet-bot` — Automated bot PRs

2. **Applies special VMR filtering** for `dotnet/dotnet`: Only includes PRs from `dotnet-maestro[bot]` whose titles reference SDK-owned repos (`dotnet/sdk`, `dotnet/templating`, `dotnet/deployment-tools`, `dotnet/source-build-reference-packages`).

3. **Fetches detailed status** for each PR via GitHub GraphQL:
- `mergeStateStatus` (CLEAN, BLOCKED, UNSTABLE)
- `statusCheckRollup` (SUCCESS, FAILURE, PENDING)
- `reviewDecision` (APPROVED, CHANGES_REQUESTED, REVIEW_REQUIRED)
- `mergeable` (MERGEABLE, CONFLICTING, UNKNOWN)
- `changedFiles` count (0 = empty PR with no actual changes)
- Individual check run results (name, conclusion, status)
- Labels, age, draft status

4. **Classifies each PR** into categories (see below).

5. **Generates a recommendation** for each PR (see Recommendation Codes below).

6. **Outputs** both human-readable tables and a `[BUILD_DUTY_SUMMARY]` JSON block.

Member

The point of the script is to hide context from the LLM, so it doesn't make sense to me why we would put the script's implementation details into the LLM's context window.

**Action:** Investigate failures. Use the ci-analysis skill for detailed failure information.

### ⏳ Stale (cross-cutting flag)
Any PR older than 7 days (configurable) that does NOT have `Branch Lockdown` label. These may be stuck or forgotten and need attention.

Member

If we aren't telling the agent to configure it, then it doesn't make sense for the skill to mention this or for the script to have the option. The staleness could just as easily be a const at the top of the script.
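
For illustration, here is a minimal Python sketch of the constant-based version, assuming the rule quoted above (older than 7 days and no `Branch Lockdown` label). The names `STALE_AFTER_DAYS`, `LOCKDOWN_LABEL`, and `is_stale` are hypothetical, and measuring age from the PR's creation time is an assumption; the real script is PowerShell and may compute age differently.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STALE_AFTER_DAYS = 7  # fixed threshold instead of a script parameter
LOCKDOWN_LABEL = "Branch Lockdown"


def is_stale(created_at: datetime, labels: list[str],
             now: Optional[datetime] = None) -> bool:
    """True when the PR is older than the threshold and not under lockdown."""
    if now is None:
        now = datetime.now(timezone.utc)
    if LOCKDOWN_LABEL in labels:
        return False  # locked-down branches are expected to wait
    return now - created_at > timedelta(days=STALE_AFTER_DAYS)
```

Passing `now` explicitly keeps the check deterministic in tests while the default still uses the current UTC time.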

Comment on lines +112 to +121
| `MERGE_EMPTY_CODEFLOW` | PR from dotnet-bot with 0 changed files — inter-branch codeflow with merge commits | Recommend merging. Completing these PRs reduces churn in the next codeflow PR. |
| `CLOSE_EMPTY_PR` | PR has 0 changed files — no actual code changes after merge/sync | Recommend closing or merging trivially. Provide `gh pr close` command. |
| `FIX_DARC_CONFLICT` | PR from maestro with 0 changed files and a darc merge conflict comment | Flag for manual resolution using `darc vmr resolve-conflict`. Direct the engineer to check the PR comments for step-by-step instructions. |
| `FIX_MERGE_CONFLICTS` | Merge PR has unresolved conflicts | Flag for manual conflict resolution. Cannot be auto-fixed. |
| `RETRY_SINGLE_LEG` | Only 1 CI leg failed out of many (likely flaky, common in templating) | Comment `/azp run` on the PR to trigger a retry. |
| `MERGE` | PR is ready to merge | List as quick win. Do NOT auto-merge — human action only. |
| `WAIT_FOR_LOCKDOWN` | Branch is locked for servicing | No action needed; queue for when lockdown lifts. |
| `ADDRESS_REVIEW` | Changes were requested by a reviewer | Note the reviewer; requires upstream action. |
| `NEEDS_REVIEW` | CI is passing but PR needs review approval | List as needing review. Common for VMR PRs. |
| `INVESTIGATE_FAILURE` | Multiple legs failing or complex failure | Run ci-analysis skill to diagnose. |

Member

Should the script just output the meaning/recommendation directly? Then we can delete most of this table.


### Step 2: Read the Results

Parse the `[BUILD_DUTY_SUMMARY]` JSON. Key fields:

Member

JSON is pretty token inefficient. Have you considered other formats, like markdown or plain text?

@nagilson
Member

> Note that there are details about the specific repos we care about and specific labels that are SDK.

I think the best way to handle this is a configuration that tells it which repos / URLs to look at, so other repo owners can use this too. This is what the other build-duty skill has proposed doing.

@@ -0,0 +1,782 @@
<#
.SYNOPSIS
Queries and classifies pull requests across .NET SDK repositories for build duty triage.

Member

dotnet/sdk#53592 (release/10.0.2xx) │ <1d │ ✅ Actually passing! 18/19 jobs succeeded, 1 pending. │ ⏳ Wait for last job, then approve

It added this to the failed category instead of a waiting category - we might want to have a more deterministic way of deciding failure or not.
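
One possible deterministic rule, sketched in Python purely for illustration: treat a PR as failing only if some check run has a failing conclusion, as waiting if nothing has failed but something is still running, and as passing otherwise. The conclusion and status strings are GitHub's `CheckConclusionState` and `CheckStatusState` values; the function name `classify_ci`, the category labels, and the choice to treat "no checks reported yet" as waiting are assumptions, not what the script does today.

```python
# Conclusions from GitHub's CheckConclusionState enum that count as a failure here.
FAILED_CONCLUSIONS = {
    "FAILURE", "TIMED_OUT", "CANCELLED", "ACTION_REQUIRED", "STARTUP_FAILURE",
}


def classify_ci(check_runs: list[dict]) -> str:
    """Classify a PR's CI state from its check runs.

    Each run is a dict with a 'status' key (e.g. COMPLETED, IN_PROGRESS, QUEUED)
    and a 'conclusion' key (None until the run completes).
    """
    if any(run.get("conclusion") in FAILED_CONCLUSIONS for run in check_runs):
        return "Failing"
    # Assumption: no checks reported yet is treated the same as pending checks.
    if not check_runs or any(run.get("status") != "COMPLETED" for run in check_runs):
        return "Waiting for CI"
    return "Passing"
```

With this rule, the 18-succeeded, 1-pending case above comes out as "Waiting for CI" rather than failed, because a pending run has no failing conclusion.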

Member

@nagilson left a comment

I like the separation flow you have here, and it was easy to get Copilot to approve the PRs that were passing. I think there's a lot of use for a tool like this. I'm wondering if we can make it generate the CI analysis reports in the background and show me the ones that are ready, so I don't have to wait for it to finish everything.

- Add 'Waiting for CI' category to distinguish pending checks (no failures)
  from actual CI failures (fixes Noah's misclassification issue)
- Slim SKILL.md from 280 to 53 lines: remove implementation details, duplicate
  content, recommendation codes table, CI analysis internals per Logan's feedback
- Add disable-model-invocation: true to prevent auto-triggering during coding
- Move report template to separate REPORT_TEMPLATE.md (progressive disclosure)
- Replace JSON output with human-readable tables only (token efficiency)
- Script now returns human-readable recommendation text directly
- Remove dead parameters (-DaysStale, -OutputJson, -SkipCIDetails)
- Make stale threshold a script constant (7 days)
- Remove ConvertTo-PrSummaryObject (no longer needed without JSON)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Build Duty PR Triage

Monitor and classify pull requests across .NET SDK repositories for build duty engineers. Produces a structured triage report with PR status, age, classification, and failure details.
> **NEVER** use `gh pr merge` or approve PRs automatically. Merging and approval are human-only actions. This skill only generates reports.

Member

Prompts are not a security boundary.
