incident-management: tighten IR template structure and pipeline runbook by frameworks-volunteer · Pull Request #424 · security-alliance/frameworks

frameworks-volunteer · 2026-03-21T18:56:19Z

Summary

This PR is a first pass on the recently added Incident Response Template section.

The goal is not to expand the section broadly, but to make it clearer, tighter, and more operationally credible without adding filler or speculative content.

This pass focuses on three things:

clarifying the distinction between framework guidance, templates, runbooks, and playbooks
reducing a few over-absolute statements
upgrading the weakest runbook in the set (build-pipeline-compromise) into something more responder-oriented

What changed

1) Clarified content taxonomy

Added concise framing so readers can understand what each layer is for:

incident-management/overview.mdx
  clarifies that the section now contains both:
    framework guidance
    operational templates
incident-management/playbooks/overview.mdx
  reframes playbooks as reference material, not drop-in internal operating procedures
  points readers to the template/runbook sections for copy-and-adapt operational docs
incident-response-template/overview.mdx
  clarifies that the broader incident-management pages explain concepts/practices
  clarifies that the template section is intended to be copied/customized for internal use
  distinguishes:
    policy / roles / communications / contacts
    templates
    runbooks
incident-response-template/templates/overview.mdx
  clarifies when to use templates vs runbooks vs policy pages
incident-response-template/runbooks/overview.mdx
  clarifies that runbooks are operational procedures, distinct from framework playbooks and blank templates

2) Tightened a few absolute statements

incident-response-template/incident-response-policy.mdx
  changed:
    "Monitor for at least a week"
  to:
    "Monitor based on residual risk, blast radius, and incident type"
incident-response-template/roles-and-staffing.mdx
  changed:
    "These people should be reachable 24/7"
  to:
    "There should be a 24/7 escalation path to these people"

These changes are meant to make the guidance more realistic and less doctrinal.

3) Upgraded the build pipeline compromise runbook

incident-response-template/runbooks/build-pipeline-compromise.mdx was previously a thin stub. This PR upgrades it into a more credible example runbook by adding:

better identification criteria
scope questions
differentiation from adjacent incident classes
immediate actions that reflect actual responder priorities:
  freeze pipeline
  preserve evidence
  rotate credentials by blast radius
  stop trusting recent outputs
investigation questions focused on access path, permissions, credential exposure, and affected outputs
containment / recovery options:
  rebuild from known-good commit using clean pipeline
  rollback to last known-good release
  keep service paused until trust is re-established
a verification gate before normal delivery resumes
a concise hardening checklist after the incident
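
The "rotate credentials by blast radius" action above can be read as an ordering problem: rotate whatever an attacker could do the most damage with first. The following is a minimal sketch of that idea, not content from the runbook. The `Credential` type, the scope labels, and their ranking are all hypothetical and would need to be replaced with a team's own credential inventory.

```python
from dataclasses import dataclass

# Hypothetical ranking: lower number = larger blast radius = rotate first.
# These scope labels are illustrative assumptions, not taken from the runbook.
SCOPE_RANK = {
    "signing": 0,
    "production-deploy": 1,
    "registry-publish": 2,
    "ci-read-only": 3,
}


@dataclass(frozen=True)
class Credential:
    name: str
    scope: str
    exposed_to_pipeline: bool


def rotation_order(credentials):
    """Return exposed credentials sorted so the widest blast radius comes first."""
    exposed = [c for c in credentials if c.exposed_to_pipeline]
    # Unknown scopes get rank -1 and sort first: anything that cannot be
    # classified is treated as high risk.
    return sorted(exposed, key=lambda c: (SCOPE_RANK.get(c.scope, -1), c.name))


# Made-up inventory used only to demonstrate the ordering.
inventory = [
    Credential("npm-publish-token", "registry-publish", True),
    Credential("release-signing-key", "signing", True),
    Credential("lint-cache-token", "ci-read-only", True),
    Credential("prod-deploy-key", "production-deploy", True),
    Credential("staging-db-password", "production-deploy", False),
]

for cred in rotation_order(inventory):
    print(cred.name)
```

Credentials the pipeline never had access to are filtered out rather than ranked, which keeps the first rotation pass short; a real response would likely revisit them once the scope is better understood.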

What this PR does not do

Intentionally out of scope for this first pass:

broad content expansion
adding new Web3-specific runbooks just to fill gaps
renaming sections or restructuring the sidebar deeply
inventing protocol-specific operational steps without high confidence

I would rather leave gaps visible than fill them with weak or speculative guidance.

Why this scope

The Incident Response Template addition is already valuable, but right now it mixes:

framework/reference material
internal templates
runbooks

This first pass tries to make that structure easier to understand, while also strengthening one page that felt materially underdeveloped.

Follow-up ideas (not included here)

Possible future passes, if useful:

strengthen frontend-compromise and dependency-attack
add battle-tested Web3-native scenarios only where confidence is high
revisit naming/IA if the team wants clearer labels than the current playbook/runbook/template split

github-actions · 2026-03-21T18:58:53Z

⚡ Cloudflare Pages Deployment

Name	Status	Preview	Last Commit
frameworks	✅ Ready (View Log)	Visit Preview	`43f0e45`

frameworks-volunteer · 2026-03-21T19:31:53Z

Second pass update

This second pass keeps the same philosophy as the first one:

no broad expansion
no filler
no speculative Web3-specific guidance
only tighten pages where the operational value is clear and high-confidence

What changed in this pass

This pass focuses on the two runbooks that still felt materially underpowered:

incident-response-template/runbooks/frontend-compromise.mdx
incident-response-template/runbooks/dependency-attack.mdx

1) Strengthened frontend-compromise

This page now better reflects how frontend incidents actually behave in practice, especially in Web3 where a frontend compromise often becomes a user-signing or approval-theft incident very quickly.

Changes include:

clearer identification and scope questions
stronger focus on stopping service quickly
explicit emphasis on warning users early and clearly
preserving evidence before cleanup
tighter framing around identifying the real trust-boundary failure:
  DNS
  CDN/hosting
  dependency
  build pipeline
improved recovery conditions before restoring service
more practical affected-user support guidance

The goal here was to make the page more useful during the first minutes of an actual incident, not just more complete on paper.

2) Strengthened dependency-attack

This page was still too close to a stub. It now better distinguishes between a generic vulnerable package and a dependency incident that may have affected real build outputs, releases, or users.

Changes include:

better scope questions:
  production vs build-only exposure
  build-time vs runtime execution
  possible credential / artifact impact
clearer differentiation from:
  frontend compromise
  build pipeline compromise
stronger immediate actions:
  freeze releases
  identify the exact package/version path
  stop trusting recent outputs
  preserve evidence
improved investigation questions
more credible containment / recovery options
a verification gate before resuming normal delivery
tighter prevention guidance focused on dependency discipline and build trust
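
To illustrate what "identify the exact package/version path" can mean in practice, here is a minimal sketch that walks a dependency tree and lists every chain leading to a suspect package version. The tree shape, the package names, and the `find_dependency_paths` helper are hypothetical and are not part of the runbook.

```python
def find_dependency_paths(tree, target_name, target_version, _prefix=()):
    """Yield every chain of packages from a root to target_name@target_version.

    `tree` is a hypothetical nested mapping of the form
    {name: {"version": str, "dependencies": {...same shape...}}}.
    """
    for name, info in tree.items():
        path = _prefix + (f"{name}@{info['version']}",)
        if name == target_name and info["version"] == target_version:
            yield path
        # Keep descending: the same package can appear at several depths.
        yield from find_dependency_paths(
            info.get("dependencies", {}), target_name, target_version, path
        )


# Made-up dependency tree; only left-helper@1.4.2 is treated as suspect.
tree = {
    "web-app-ui": {
        "version": "2.1.0",
        "dependencies": {"left-helper": {"version": "1.4.2"}},
    },
    "build-tools": {
        "version": "0.9.0",
        "dependencies": {
            "left-helper": {"version": "1.4.1"},
            "bundler-x": {
                "version": "3.0.0",
                "dependencies": {"left-helper": {"version": "1.4.2"}},
            },
        },
    },
}

paths = list(find_dependency_paths(tree, "left-helper", "1.4.2"))
for path in paths:
    print(" > ".join(path))
```

In a real project the tree would come from the package manager rather than be hand-written; commands such as `npm ls <package>` or `pnpm why <package>` answer the same question directly. Knowing whether the suspect version is reached through a runtime dependency or only through build tooling maps onto the build-time vs runtime scope question above.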

What I intentionally did not change

Still intentionally out of scope:

adding new runbooks just to close every possible gap
speculative guidance for scenarios that need deeper expertise or stronger repo context
touching pages that did not clearly benefit from high-confidence tightening

For example, I left key-compromise unchanged in this pass rather than make lower-confidence edits.

Why this is the last pass

At this point, the highest-value weak spots in the imported IR template section have been addressed without turning the PR into a broad rewrite.

This keeps the contribution focused on:

clearer information architecture
more realistic wording
stronger responder-oriented runbooks where they were obviously too thin

scode2277 · 2026-03-24T17:49:30Z

Left some comments @mattaereal

mattaereal · 2026-04-14T14:38:00Z

@frameworks-volunteer can you address @scode2277's comments, resolving the conversations and any merge conflicts? Take all her feedback seriously.

frameworks-volunteer · 2026-04-14T14:52:59Z

Model: z-ai/glm-5.1 Reasoning: high Provider: openrouter

Addressed all of @scode2277's feedback and resolved merge conflicts with develop. Here is what was done:

dependency-attack.mdx -- Fixed the pnpm frozen-lockfile note: changed `pnpm install --frozen-lockfile` to `pnpm install` (pnpm already defaults to a frozen lockfile in CI when a lockfile is present), and added `yarn install --immutable` for completeness.
runbooks/overview.mdx -- Added internal links to Playbooks and Templates pages, per suggestion.
templates/overview.mdx -- Added internal links to template, runbook, and policy pages, per suggestion.
incident-response-template/overview.mdx -- Added internal links to Incident Management and Incident Response Templates in the first distinction block, and added links to Policy/roles/communications/contacts/Templates/Runbooks in the second block.
incident-management/overview.mdx -- Changed "This section" to "This framework", and renamed labels from "Framework guidance/Operational templates" to "Incident Management core knowledge/Incident Response templates", per suggestion.
Merge conflicts resolved -- Rebased onto latest develop. All 6 conflicted files (roles-and-staffing, build-pipeline-compromise, dependency-attack, frontend-compromise, runbooks/overview, templates/overview) resolved by keeping the improved PR content while integrating develop changes.
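
The lockfile note above can be made concrete with a small sketch that picks a lockfile-respecting install command based on which lockfile a project contains. This is an illustration under stated assumptions, not something from the PR: the `locked_install_command` helper is hypothetical, `npm ci` is added here for comparison and is not mentioned in the thread, and `yarn install --immutable` applies to Yarn 2 and later (Yarn Classic uses `--frozen-lockfile` instead).

```python
import tempfile
from pathlib import Path

# Lockfile name -> install command that refuses to modify that lockfile.
# The pnpm flag is passed explicitly so the behaviour is the same outside CI,
# where pnpm does not freeze the lockfile by default.
LOCKED_INSTALL = {
    "pnpm-lock.yaml": ["pnpm", "install", "--frozen-lockfile"],
    "yarn.lock": ["yarn", "install", "--immutable"],  # Yarn 2+ only
    "package-lock.json": ["npm", "ci"],  # not mentioned in the PR thread
}


def locked_install_command(project_dir):
    """Return the install command matching the first lockfile found."""
    for lockfile, command in LOCKED_INSTALL.items():
        if (Path(project_dir) / lockfile).is_file():
            return command
    raise FileNotFoundError(f"no known lockfile in {project_dir}")


# Demonstration against a throwaway directory containing only a yarn.lock.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "yarn.lock").write_text("")
    demo_command = locked_install_command(tmp)

print(" ".join(demo_command))
```

The sketch only returns the command; it does not run it. During a suspected dependency incident, running any install is itself a decision that belongs after the freeze step.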

scode2277

gtg for me @mattaereal!

frameworks-volunteer

Model: moonshotai/kimi-k2.6 Reasoning: high Provider: openrouter

Security: clean. No secrets, injection vectors, or unsafe patterns. Content is MDX docs only.

QA: all checks pass.

CI green (build, lint, sidebar-reminder)
All internal link targets exist and resolve correctly
Frontmatter valid and consistent across files
No leftover debug content, TODOs, or placeholder text
"Stub runbook" language properly replaced with "Example runbook"

Substantive notes:

Build pipeline compromise runbook: major upgrade from thin stub to credible operational document. The additions of scope questions, differentiation from adjacent incident classes, evidence preservation, blast-radius-ordered credential rotation, and a verification gate before resuming delivery are all well chosen. The three recovery options (rebuild / rollback / pause), each with explicit "When" and "Impact" framing, are practical.
Dependency attack runbook: similarly upgraded with proper scope questions, differentiation (correctly cross-linking to build-pipeline-compromise and frontend-compromise), and a "Verification Before Resuming" gate. Good that it distinguishes malicious vs vulnerable packages.
Frontend compromise runbook: tightened throughout. The user warning message template is improved ("If you have not signed new transactions, your funds in the protocol remain unaffected" adds important clarity). Evidence preservation moved before cleanup is the right call. The trust boundary failure step is a useful addition.
Taxonomy clarifications across overview pages are concise and well-placed -- they give readers a mental model for framework guidance vs templates vs runbooks vs playbooks without being repetitive.
The softened policy/staffing wording ("Monitor based on residual risk..." and "24/7 escalation path") is more realistic and less doctrinal. Good changes.
One minor note: templates/overview.mdx line "use a template" self-links to the current page. Not broken, but slightly odd for a directory page. Very low priority.

Approving -- this is a solid, well-scoped first pass that delivers exactly what it promises.

hexnickk4997 · 2026-04-23T11:06:47Z

- [ ] CI/CD configuration changed without approval
- [ ] Secrets accessed or exfiltrated
- [ ] Unauthorized workflow runs
+- [ ] Unexpected workflow runs or releases


These should probably be plain list items, as it's unclear what purpose checklists serve here. Will the team need to click through them? Why? I see that they were checklists before, so this probably slipped through a previous review iteration.

hexnickk4997 · 2026-04-23T11:07:24Z

+- [ ] Deployments reference an unexpected commit, artifact, or builder identity

-### Confirm Compromise
+### Likely Scope Questions


Even though these questions are meaningful, it's unclear how they align with the purpose of this document. Is this a step to follow? Who should follow it? Why do they come before Immediate Actions?

hexnickk4997 · 2026-04-23T11:09:25Z

+- Did the pipeline have deploy permissions, signing authority, or production credentials?
+- Were any releases, containers, frontend bundles, or packages published during the exposure window?
+
+### Differentiation


This section feels excessive for a runbook; it belongs in educational material or policy instead.

hexnickk4997 · 2026-04-23T11:11:30Z

-2. [ ] Rotate all secrets and tokens
-3. [ ] Take down potentially compromised deployments
-4. [ ] Audit recent builds and deployments
+### Step 1: Freeze the pipeline


This doesn't say anything about key revocation/rotation, even though some keys may be used to push and approve.

hexnickk4997 · 2026-04-23T11:12:41Z

+- [ ] Revoke or pause auto-deploy jobs
+- [ ] Block manual approvals until scope is understood
+
+### Step 2: Preserve evidence


This is excessive for "Immediate Actions". Most of this evidence can be collected later; gathering it during the incident itself is inefficient when we need to limit the damage as fast as we can.

hexnickk4997 · 2026-04-23T11:33:16Z

+These playbooks are reference material: they help teams think through common incident types, decision points, and
+response patterns. They are not drop-in internal operating procedures.
+
+For copy-and-adapt operational documentation, see


Same comment as above

hexnickk4997 · 2026-04-23T11:33:23Z

 thinking about incident management prior to actually experiencing an incident, you can help increase the likelihood of a
 timely recovery.

+This framework contains two different kinds of content:


Same comment as above

hexnickk4997 · 2026-04-23T12:01:56Z

- [ ] Lockfile changes you didn't make
+- [ ] Malicious code found in installed dependencies or build output
+- [ ] Lockfile changes you did not expect
+- [ ] Frontend bundle or released artifact changed more than the source diff would explain


Dependencies can live in different parts of a system, not only the frontend.

hexnickk4997 · 2026-04-23T12:03:08Z

+- [ ] UI behaves differently than expected
+- [ ] Wallet drainer behavior detected
+- [ ] Injected scripts or unexpected external resources appear in page source
+- [ ] Official domain or subdomain resolves unexpectedly


This isn't a 1:1 migration: the issue could stem from an MX change or something similar, or even an NS change, neither of which necessarily changes the resolved IP right away.
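
The point about NS and MX changes can be illustrated with a minimal sketch that compares two snapshots of a domain's DNS records across several record types, not only the address records. The snapshot format, the record values, and the `dns_changes` helper are hypothetical; how the snapshots are collected (a resolver library, `dig`, or a registrar export) is outside the scope of this sketch.

```python
def dns_changes(baseline, observed, record_types=("A", "AAAA", "CNAME", "NS", "MX")):
    """Return the record types whose values differ between two snapshots.

    Each snapshot is a hypothetical mapping of record type -> list of values.
    """
    changes = {}
    for rtype in record_types:
        before = set(baseline.get(rtype, ()))
        after = set(observed.get(rtype, ()))
        if before != after:
            changes[rtype] = {
                "removed": sorted(before - after),
                "added": sorted(after - before),
            }
    return changes


# Made-up snapshots: the A record still resolves as expected,
# but one nameserver has been swapped.
baseline = {
    "A": ["192.0.2.10"],
    "NS": ["ns1.example.net", "ns2.example.net"],
    "MX": ["10 mail.example.com"],
}
observed = {
    "A": ["192.0.2.10"],
    "NS": ["ns1.example.net", "ns9.example.org"],
    "MX": ["10 mail.example.com"],
}

changes = dns_changes(baseline, observed)
print(changes)
```

A check that only looks at whether the domain "resolves unexpectedly" would report nothing for these snapshots, which is the gap the comment describes.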

hexnickk4997 · 2026-04-23T12:04:27Z


-These people should be reachable 24/7 for critical incidents. Consider:
-
+There should be a 24/7 escalation path to these people for critical incidents. Consider:


Why was it changed to an escalation path? I'm not sure it makes a lot of sense.

mattaereal requested a review from scode2277 March 22, 2026 00:54

mattaereal self-assigned this Mar 22, 2026