Add build-duty skill for PR triage across dotnet repos #53311
marcpopMSFT wants to merge 5 commits into main from
Creates a new skill under .claude/skills/build-duty/ that helps build duty engineers triage automated PRs across dotnet/sdk, dotnet/installer, dotnet/templating, and dotnet/dotnet repositories.

The skill includes:
- Get-BuildDutyReport.ps1: PowerShell script that queries GitHub via gh CLI for PRs from monitored authors (dotnet-maestro[bot], github-actions[bot] merge PRs, vseanreesermsft, dotnet-bot), classifies them into categories (Ready to Merge, Branch Lockdown, Changes Requested, Failing/Blocked), and outputs both human-readable tables and structured JSON.
- SKILL.md: Skill definition with workflow instructions for running the script, interpreting results, investigating failures via the ci-analysis skill, and generating a formatted triage report.

Key design decisions:
- Uses gh CLI for deterministic, testable queries (no LLM drift)
- Uses GraphQL for efficient batching of mergeStateStatus and statusCheckRollup in a single call per PR
- Delegates CI failure analysis to the existing ci-analysis skill
- Script handles filtering/classification; agent handles presentation

Supersedes the draft approach in PR #52678, which used a pure prompt-based approach that was unreliable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
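As a rough illustration of the classification step described above, the sketch below maps a PR's GraphQL fields to the four categories. It is written in Python for brevity and is not the script's actual logic: the `PullRequest` type, the `classify` function, and the precedence order of the checks are assumptions made for illustration. Only the category names, the `Branch Lockdown` label, and the field values come from this PR.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical container for the fields the description mentions."""
    merge_state_status: str   # e.g. "CLEAN", "BLOCKED", "UNSTABLE"
    review_decision: str      # e.g. "APPROVED", "CHANGES_REQUESTED", "REVIEW_REQUIRED"
    labels: tuple = ()


def classify(pr: PullRequest) -> str:
    """Return one of the four report categories (precedence is assumed)."""
    if "Branch Lockdown" in pr.labels:
        return "Branch Lockdown"
    if pr.review_decision == "CHANGES_REQUESTED":
        return "Changes Requested"
    # CLEAN means all required checks pass and nothing else blocks the merge.
    if pr.merge_state_status == "CLEAN":
        return "Ready to Merge"
    return "Failing/Blocked"


print(classify(PullRequest("BLOCKED", "REVIEW_REQUIRED")))
```

Keeping this decision in plain code rather than in the prompt is what makes it testable, which is the point of the first design decision above.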
Example report. It is a good first step, but it will need to make better use of the helix/CI skills, as some of the next-step details are not great.

🔧 Build Duty Triage Report
Date: March 6, 2026
✅ Ready to Merge (0)
No PRs are currently ready to merge.
🔒 Branch Lockdown (0)
No branches are currently locked down.
| # | Title | Target | Age | Checks | Issue |
|---|---|---|---|---|---|
| #53250 | Source code updates from dotnet/dotnet | main | 2d | FAILURE | StaticWebAssets tests failing cross-platform (Windows, macOS, Linux). Dependency update likely broke SWA tests. |
| #53284 | Source code updates from dotnet/dotnet | release/10.0.2xx | <1d | FAILURE | 425+ test failures across NetAnalyzers, EndToEnd, Watch, dotnet.Tests. Broad dependency/version issue. Known issue #40006 (watch tests) partially applies. |
| #53267 | Source code updates from dotnet/dotnet | release/11.0.1xx-preview2 | 1d | FAILURE | 7/17 jobs failing — StaticWebAssets, Blazor WASM AoT, FullFramework. Cascading from Version.Details/NuGet.config changes. |
| #53310 | Source code updates from dotnet/dotnet | release/10.0.1xx | <1d | PENDING | CI still running. |
dotnet/sdk — Merge PRs (5)
| # | Title | Target | Age | Checks | Issue |
|---|---|---|---|---|---|
| #53288 | Merge release/10.0.3xx => main | main | <1d | PENDING | CI still running. Blocked on upstream merge chain. |
| #53236 | Merge release/9.0.3xx => release/10.0.1xx | release/10.0.1xx | 2d | PENDING | CI pending. |
| #53223 | Merge release/10.0.2xx => release/10.0.3xx | release/10.0.3xx | 3d | PENDING | CI pending — likely waiting on upstream. |
| #53197 | Merge release/10.0.1xx => release/10.0.2xx | release/10.0.2xx | 6d | PENDING | CI pending — likely waiting on upstream. |
| #53175 | Merge main => release/dnup | release/dnup | 7d | FAILURE | Razor, Blazor WASM, watch tests, EndToEnd, Build, Pack, Publish tests failing. Changes to eng/common infra + MSBuildSdkResolver correlate. Known issue #40006 (watch) partially applies. |
dotnet/templating — Codeflow PRs (2)
| # | Title | Target | Age | Checks | Issue |
|---|---|---|---|---|---|
| #9934 | Source code updates from dotnet/dotnet | release/11.0.1xx-preview2 | <1d | FAILURE | CI rerun triggered — templating tests can be flaky. |
| #9935 | Source code updates from dotnet/dotnet | release/10.0.3xx | <1d | FAILURE | CI rerun triggered — templating tests can be flaky. |
dotnet/dotnet (VMR) — SDK-owned PRs (6)
| # | Title | Target | Age | Checks | Issue |
|---|---|---|---|---|---|
| #5268 | Source code updates from dotnet/sdk | main | <1d | FAILURE | Not yet investigated. |
| #5177 | Source code updates from dotnet/sdk | release/10.0.3xx | 3d | FAILURE | Not yet investigated. |
| #5208 | Source code updates from dotnet/source-build-reference-packages | release/10.0.1xx | 2d | SUCCESS | ✅ Checks pass — needs review approval only. |
| #5207 | Source code updates from dotnet/source-build-reference-packages | release/10.0.2xx | 2d | SUCCESS | ✅ Checks pass — needs review approval only. |
| #5179 | Source code updates from dotnet/sdk | release/10.0.1xx | 3d | SUCCESS | ✅ Checks pass — needs review approval only. |
| #5175 | Source code updates from dotnet/templating | release/10.0.3xx | 3d | SUCCESS | ✅ Checks pass — needs review approval only. |
dotnet/installer (0)
No open PRs from monitored authors.
📊 Summary
| Category | Count |
|---|---|
| Ready to Merge | 0 |
| Branch Lockdown | 0 |
| Changes Requested | 0 |
| Failing/Blocked | 17 |
| Stale (>7d) | 1 |
| Total | 17 |
📋 Recommended Actions
- 🟢 Quick wins - VMR PRs needing review only: #5208, #5207, #5179, #5175 all have passing CI but need a review approval. Approving and merging these 4 PRs would clear a chunk of the queue.
- 🔴 Common root cause - StaticWebAssets test failures: PRs #53250, #53267, and #53284 all show StaticWebAssets test failures correlated with dependency version changes flowing from dotnet/dotnet. Investigate whether a specific package version bump broke the SWA tests. Fixing this at the source will unblock multiple PRs.
- 🔴 Broad test failures on #53284 (release/10.0.2xx): 425+ test failures across NetAnalyzers, EndToEnd, Watch, etc. suggest a more fundamental dependency compatibility issue on the 10.0.2xx branch. This needs priority investigation.
- 🟡 Merge chain is stalled: The merge chain (9.0.3xx → 10.0.1xx → 10.0.2xx → 10.0.3xx → main) has all 5 PRs in PENDING state. These are likely blocked because upstream codeflow PRs haven't merged yet. Unblocking the codeflow PRs (#53250, #53267, #53284) should unstall the chain.
- ⚠️ Stale PR #53175 (7 days): Merge main → release/dnup has widespread failures (Razor, Blazor, watch, EndToEnd). Known issue #40006 (dotnet-watch tests failing) partially applies. Investigate whether this branch is still actively maintained; if not, consider closing this PR.
- 🟡 Templating PRs (#9934, #9935): CI reruns triggered. If they pass on retry, merge them. If they fail again, investigate further.
Pull request overview
Adds a new .claude “build-duty” skill intended to help build duty engineers triage automated PRs across key dotnet repos using deterministic gh CLI queries plus a structured JSON summary for downstream reporting.
Changes:
- Introduces Get-BuildDutyReport.ps1 to query monitored PR authors across dotnet/sdk, dotnet/installer, dotnet/templating, and dotnet/dotnet, then classify and emit tables + a JSON summary block.
- Adds SKILL.md documenting how to run the script and how to turn results into a build duty triage report (including using ci-analysis for failures).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| .claude/skills/build-duty/scripts/Get-BuildDutyReport.ps1 | New PowerShell script that queries PRs via gh, classifies them, prints human-readable tables, and emits a structured JSON summary. |
| .claude/skills/build-duty/SKILL.md | New skill documentation defining intended classification rules and the recommended triage workflow. |
…y recommendations
- Detect PRs with 0 changed files and recommend closing (CLOSE_EMPTY_PR)
- Detect merge PRs with CONFLICTING mergeable state (FIX_MERGE_CONFLICTS)
- Detect templating PRs with single failed CI leg (RETRY_SINGLE_LEG)
- Add per-check-run details to GraphQL query (name, conclusion, status)
- Add Quick Actions section to human-readable output
- Add recommendation column to Failing/Blocked table
- Update SKILL.md with recommendation codes documentation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- dotnet-bot PRs with 0 changes: recommend merging (MERGE_EMPTY_CODEFLOW) as these are inter-branch codeflow with merge commits; completing them reduces churn in the next codeflow PR
- maestro PRs with 0 changes: check last 5 comments for darc conflict indicator (FIX_DARC_CONFLICT) and direct engineer to run 'darc vmr resolve-conflict' per PR comment instructions
- Other empty PRs: keep existing CLOSE_EMPTY_PR behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
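The three empty-PR rules in this commit can be read as a small decision function. The following is a minimal Python sketch of that reading, not the script's code: the function name, its parameters, and especially the substring used to detect the darc conflict comment are assumptions, since the commit message does not say what the indicator looks like.

```python
# Assumed marker: the commit only says the comment points at this command.
DARC_CONFLICT_MARKER = "darc vmr resolve-conflict"


def recommend_for_empty_pr(author: str, comments: list) -> str:
    """Pick a recommendation code for a PR that has 0 changed files."""
    if author == "dotnet-bot":
        # Inter-branch codeflow: merging reduces churn in the next codeflow PR.
        return "MERGE_EMPTY_CODEFLOW"
    if author == "dotnet-maestro[bot]":
        # Only the last 5 comments are inspected, as the commit describes.
        if any(DARC_CONFLICT_MARKER in text for text in comments[-5:]):
            return "FIX_DARC_CONFLICT"
    return "CLOSE_EMPTY_PR"


print(recommend_for_empty_pr("dotnet-bot", []))
```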
- Remove dead $variables code in Get-PrDetails
- Remove unused -SkipCIDetails parameter and its help text
- Bump labels GraphQL fetch from 20 to 100 to avoid missing labels
- Fix Get-Date -AsUTC portability: use [DateTimeOffset]::UtcNow
- Fix Sort-Object to use per-key sort direction (repo asc, age desc)
- Clarify Failing/Blocked doc to match actual mergeStateStatus logic

Declined feedback #2: Get-PrCategory correctly uses mergeStateStatus which already accounts for required checks (CLEAN = all required pass).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
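Two of these fixes, computing age from a UTC clock and sorting with a different direction per key, are easy to get subtly wrong. The sketch below shows the same two ideas in Python purely as an illustration; the script itself is PowerShell, and the `age_in_days` and `sort_for_report` names and the dictionary shape are hypothetical.

```python
from datetime import datetime, timezone


def age_in_days(created_at_iso: str, now: datetime = None) -> int:
    """Whole days since creation, computed in UTC to avoid local-time drift."""
    now = now or datetime.now(timezone.utc)
    created = datetime.fromisoformat(created_at_iso.replace("Z", "+00:00"))
    return (now - created).days


def sort_for_report(prs: list) -> list:
    """Sort by repo ascending, then by age descending within each repo."""
    return sorted(prs, key=lambda pr: (pr["repo"], -pr["age_days"]))


prs = [
    {"repo": "dotnet/sdk", "age_days": 1},
    {"repo": "dotnet/dotnet", "age_days": 2},
    {"repo": "dotnet/sdk", "age_days": 6},
]
for pr in sort_for_report(prs):
    print(pr["repo"], pr["age_days"])
```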
Since this is cross-repo, should it go in https://github.com/dotnet/arcade-skills?
lbussell left a comment
This skill is very long. I bet if you tell Copilot to review it according to Claude's skill authoring best practices, it would improve the efficiency of the skill quite a bit.
| @@ -0,0 +1,279 @@ | |||
| --- | |||
| name: build-duty | |||
| description: Generate a build duty PR triage report across dotnet/sdk, dotnet/installer, dotnet/templating, and dotnet/dotnet repositories. Use when asked about "build duty", "triage PRs", "build duty report", "merge queue", "dependency PRs", "what PRs need merging", "build duty status", or "check open PRs for build duty". | |||
You don't want to waste these tokens when making actual code changes -- I assume you only want to run this skill on demand. You can add a couple of other frontmatter fields to make it only invocable via the /build-duty slash command:
| description: Generate a build duty PR triage report across dotnet/sdk, dotnet/installer, dotnet/templating, and dotnet/dotnet repositories. Use when asked about "build duty", "triage PRs", "build duty report", "merge queue", "dependency PRs", "what PRs need merging", "build duty status", or "check open PRs for build duty". | |
| description: Triage dependency update and code flow PRs for the .NET SDK team build duty. | |
| user-invocable: true | |
| disable-model-invocation: true |
.claude/skills/build-duty/SKILL.md
Outdated
| ## When to Use This Skill | ||
| Use this skill when: | ||
| - Checking build duty status ("what PRs need merging?", "build duty report") | ||
| - Triaging automated PRs across dotnet repos | ||
| - Generating a daily build duty triage report | ||
| - Checking if dependency update or codeflow PRs are ready to merge | ||
| - Asked "what's the merge queue look like?" or "any stuck PRs?" |
This section is pointless, since the agent will only read this if the skill has already been triggered.
| ```powershell | ||
| # For dotnet/sdk PRs | ||
| ./.claude/skills/ci-analysis/scripts/Get-CIStatus.ps1 -PRNumber <number> -ShowLogs | ||
| # For other repos | ||
| ./.claude/skills/ci-analysis/scripts/Get-CIStatus.ps1 -PRNumber <number> -Repository "dotnet/installer" -ShowLogs | ||
| ``` |
If the ci-analysis skill is invoked here, then it will already include this information. This should be removed since it's an implementation detail of the other skill.
.claude/skills/build-duty/SKILL.md
Outdated
| > 🚨 **NEVER** use `gh pr merge` or approve PRs automatically. Merging and approval are human-only actions. This skill only generates reports. | ||
| **Workflow**: Run the script (Step 1) → Read the output + JSON summary (Step 2) → Investigate failing PRs with ci-analysis (Step 3) → Synthesize the final report (Step 4). |
This information is duplicated in the "Analysis Workflow" section below.
.claude/skills/build-duty/SKILL.md
Outdated
| ## What the Script Does | ||
| 1. **Queries 4 repositories** via `gh` CLI for open PRs from monitored authors: | ||
| - `dotnet-maestro[bot]` — Dependency updates and codeflow PRs | ||
| - `github-actions[bot]` — **Only** inter-branch merge PRs (titles containing "Merge branch"); excludes backport PRs | ||
| - `vseanreesermsft` — Release management PRs | ||
| - `dotnet-bot` — Automated bot PRs | ||
| 2. **Applies special VMR filtering** for `dotnet/dotnet`: Only includes PRs from `dotnet-maestro[bot]` whose titles reference SDK-owned repos (`dotnet/sdk`, `dotnet/templating`, `dotnet/deployment-tools`, `dotnet/source-build-reference-packages`). | ||
| 3. **Fetches detailed status** for each PR via GitHub GraphQL: | ||
| - `mergeStateStatus` (CLEAN, BLOCKED, UNSTABLE) | ||
| - `statusCheckRollup` (SUCCESS, FAILURE, PENDING) | ||
| - `reviewDecision` (APPROVED, CHANGES_REQUESTED, REVIEW_REQUIRED) | ||
| - `mergeable` (MERGEABLE, CONFLICTING, UNKNOWN) | ||
| - `changedFiles` count (0 = empty PR with no actual changes) | ||
| - Individual check run results (name, conclusion, status) | ||
| - Labels, age, draft status | ||
| 4. **Classifies each PR** into categories (see below). | ||
| 5. **Generates a recommendation** for each PR (see Recommendation Codes below). | ||
| 6. **Outputs** both human-readable tables and a `[BUILD_DUTY_SUMMARY]` JSON block. |
The point of the script is to hide context from the LLM, so it doesn't make sense to me why we would put the script's implementation details into the LLM's context window.
.claude/skills/build-duty/SKILL.md
Outdated
| **Action:** Investigate failures. Use the ci-analysis skill for detailed failure information. | ||
| ### ⏳ Stale (cross-cutting flag) | ||
| Any PR older than 7 days (configurable) that does NOT have `Branch Lockdown` label. These may be stuck or forgotten and need attention. |
If we aren't telling the agent to configure it, then it doesn't make sense for the skill to mention this or for the script to have the option. The staleness could just as easily be a const at the top of the script.
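A minimal sketch of what the reviewer suggests, with the threshold as a module-level constant instead of a parameter. It is in Python only for illustration; `STALE_AFTER_DAYS` and `is_stale` are hypothetical names, and the fractional-day comparison is an assumption.

```python
# Fixed threshold at the top of the script, not a configurable option.
STALE_AFTER_DAYS = 7


def is_stale(age_days: float, labels: list) -> bool:
    """A PR is stale when older than the threshold and not under Branch Lockdown."""
    return age_days > STALE_AFTER_DAYS and "Branch Lockdown" not in labels


print(is_stale(7.5, []))
```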
.claude/skills/build-duty/SKILL.md
Outdated
| | `MERGE_EMPTY_CODEFLOW` | PR from dotnet-bot with 0 changed files — inter-branch codeflow with merge commits | Recommend merging. Completing these PRs reduces churn in the next codeflow PR. | | ||
| | `CLOSE_EMPTY_PR` | PR has 0 changed files — no actual code changes after merge/sync | Recommend closing or merging trivially. Provide `gh pr close` command. | | ||
| | `FIX_DARC_CONFLICT` | PR from maestro with 0 changed files and a darc merge conflict comment | Flag for manual resolution using `darc vmr resolve-conflict`. Direct the engineer to check the PR comments for step-by-step instructions. | | ||
| | `FIX_MERGE_CONFLICTS` | Merge PR has unresolved conflicts | Flag for manual conflict resolution. Cannot be auto-fixed. | | ||
| | `RETRY_SINGLE_LEG` | Only 1 CI leg failed out of many (likely flaky, common in templating) | Comment `/azp run` on the PR to trigger a retry. | | ||
| | `MERGE` | PR is ready to merge | List as quick win. Do NOT auto-merge — human action only. | | ||
| | `WAIT_FOR_LOCKDOWN` | Branch is locked for servicing | No action needed; queue for when lockdown lifts. | | ||
| | `ADDRESS_REVIEW` | Changes were requested by a reviewer | Note the reviewer; requires upstream action. | | ||
| | `NEEDS_REVIEW` | CI is passing but PR needs review approval | List as needing review. Common for VMR PRs. | | ||
| | `INVESTIGATE_FAILURE` | Multiple legs failing or complex failure | Run ci-analysis skill to diagnose. | |
Should the script just output the meaning/recommendation directly? Then we can delete most of this table.
.claude/skills/build-duty/SKILL.md
Outdated
| ### Step 2: Read the Results | ||
| Parse the `[BUILD_DUTY_SUMMARY]` JSON. Key fields: |
JSON is pretty token inefficient. Have you considered other formats, like markdown or plain text?
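To make the size difference concrete, the sketch below renders the same two rows (taken from the example report above) as indented JSON and as a markdown table, then compares character counts. Character count is only a rough proxy for tokens, and the `to_markdown_table` helper and the field names are assumptions for illustration.

```python
import json

rows = [
    {"number": 53250, "target": "main", "age": "2d", "checks": "FAILURE"},
    {"number": 53310, "target": "release/10.0.1xx", "age": "<1d", "checks": "PENDING"},
]


def to_markdown_table(rows: list) -> str:
    """Render a list of flat dicts as a markdown table; keys become headers."""
    headers = list(rows[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "---|" * len(headers),
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)


as_json = json.dumps(rows, indent=2)
as_table = to_markdown_table(rows)
print(len(as_json), len(as_table))
```

The table states each key once in the header, while JSON repeats every key in every row, so the gap widens as the number of PRs grows.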
I think the best way to handle this is to add a configuration that tells it which repos/URLs to look at, so other repo owners can use this. This is what the other build-duty has proposed doing.
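One possible shape for such a configuration, sketched in Python as an illustration only. The `BUILD_DUTY_CONFIG` structure and the `planned_queries` helper are hypothetical; the repo and author values are the ones listed in this PR's description.

```python
from itertools import product

# Hypothetical config: other teams would swap in their own repos and authors.
BUILD_DUTY_CONFIG = {
    "repos": [
        "dotnet/sdk",
        "dotnet/installer",
        "dotnet/templating",
        "dotnet/dotnet",
    ],
    "authors": [
        "dotnet-maestro[bot]",
        "github-actions[bot]",
        "vseanreesermsft",
        "dotnet-bot",
    ],
}


def planned_queries(config: dict) -> list:
    """Return one (repo, author) pair per query the report would need to run."""
    return list(product(config["repos"], config["authors"]))


print(len(planned_queries(BUILD_DUTY_CONFIG)))
```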
| @@ -0,0 +1,782 @@ | |||
| <# | |||
| .SYNOPSIS | |||
| Queries and classifies pull requests across .NET SDK repositories for build duty triage. | |||
| dotnet/sdk#53592 | release/10.0.2xx | <1d | ✅ Actually passing! 18/19 jobs succeeded, 1 pending. | ⏳ Wait for last job, then approve |

It added this to the failed category instead of a waiting category. We might want a more deterministic way of deciding whether something is a failure.
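One deterministic option is to decide from the individual check runs rather than from a rolled-up status. The sketch below is a Python illustration of that idea, not the script's logic: the `ci_state` name, the choice of which conclusions count as failures, and the handling of a PR with no check runs are all assumptions. The status and conclusion strings follow the GitHub GraphQL check-run enums.

```python
# Assumed set of conclusions that should count as a real failure.
FAILED_CONCLUSIONS = {"FAILURE", "TIMED_OUT", "CANCELLED", "STARTUP_FAILURE"}


def ci_state(check_runs: list) -> str:
    """Classify CI as Failing, Waiting for CI, or Passing from per-run data."""
    if not check_runs:
        return "Waiting for CI"  # assumption: no runs reported yet
    if any(run["conclusion"] in FAILED_CONCLUSIONS for run in check_runs):
        return "Failing"
    if any(run["status"] != "COMPLETED" for run in check_runs):
        return "Waiting for CI"
    return "Passing"


# The case from the comment above: 18 jobs succeeded, 1 still pending.
runs = [{"status": "COMPLETED", "conclusion": "SUCCESS"}] * 18
runs = runs + [{"status": "IN_PROGRESS", "conclusion": None}]
print(ci_state(runs))
```

Checking for failures before checking for pending runs means a PR with one failed leg and one running leg is still reported as failing, which seems the safer default for triage.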
nagilson left a comment
I like the separation flow you have here, and it was easy to get Copilot to approve the PRs that were passing. I think there's a lot of use for a tool like this. I'm wondering if we can make it generate the CI analysis reports in the background and show me the available ones, so I don't have to wait for it to finish doing everything.
- Add 'Waiting for CI' category to distinguish pending checks (no failures) from actual CI failures (fixes Noah's misclassification issue)
- Slim SKILL.md from 280 to 53 lines: remove implementation details, duplicate content, recommendation codes table, CI analysis internals per Logan's feedback
- Add disable-model-invocation: true to prevent auto-triggering during coding
- Move report template to separate REPORT_TEMPLATE.md (progressive disclosure)
- Replace JSON output with human-readable tables only (token efficiency)
- Script now returns human-readable recommendation text directly
- Remove dead parameters (-DaysStale, -OutputJson, -SkipCIDetails)
- Make stale threshold a script constant (7 days)
- Remove ConvertTo-PrSummaryObject (no longer needed without JSON)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| # Build Duty PR Triage | ||
| Monitor and classify pull requests across .NET SDK repositories for build duty engineers. Produces a structured triage report with PR status, age, classification, and failure details. | ||
| > **NEVER** use `gh pr merge` or approve PRs automatically. Merging and approval are human-only actions. This skill only generates reports. |
Prompts are not a security boundary.
New version of #52678 using the new skills that were introduced.