diff --git a/use-cases/SAFE-UC-0023/README.md b/use-cases/SAFE-UC-0023/README.md index 22db581..0726bfb 100644 --- a/use-cases/SAFE-UC-0023/README.md +++ b/use-cases/SAFE-UC-0023/README.md @@ -1,32 +1,579 @@ -# Cloud ops troubleshooting assistant +# Cloud ops troubleshooting assistant — incident triage, telemetry correlation, and bounded remediation -> Seed page for **SAFE-AUCA**. Expand this into a full analysis using [`templates/use-case-template.md`](../../templates/use-case-template.md). +> **SAFE-AUCA industry reference guide** +> +> This use case describes a real-world workflow where SRE, platform engineering, NOC, and incident-response teams use an agentic assistant to investigate cloud and Kubernetes production issues by correlating metrics, logs, traces, change events, infrastructure state, and recent deployments across multiple tools. +> +> It focuses on: +> - how the workflow works in practice (tools, data, trust boundaries, autonomy) +> - what can go wrong (defender-friendly kill chain) +> - how it maps to **SAFE-MCP techniques** +> - what controls + tests make it safer +> +> **Defender-friendly only:** do **not** include operational exploit steps, payloads, or step-by-step attack instructions. +> +> **No sensitive info:** do not include internal hostnames/endpoints, secrets, customer data, non-public incidents, or proprietary details. + +--- ## Metadata | Field | Value | |---|---| | **SAFE Use Case ID** | `SAFE-UC-0023` | -| **Status** | `seed` | -| **NAICS 2022** | `Information (51)` | -| **Last updated** | `2026-02-17` | +| **Status** | `draft` | +| **NAICS 2022** | `51` (Information), `518210` (Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services), `541512` (Computer Systems Design Services) | +| **Workflow family** | `Cloud operations, SRE, and incident response` | +| **Last updated** | `2026-03-18` | + +### Evidence (public links) + +- [AWS CloudWatch Investigations](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Investigations.html) +- [AWS CloudWatch Investigations security and access](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Investigations-Security.html) +- [AWS CloudWatch cross-account observability](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Unified-Cross-Account.html) +- [AWS Systems Manager Automation approvals (`aws:approve`)](https://docs.aws.amazon.com/systems-manager/latest/userguide/running-automations-require-approvals.html) +- [Google Cloud Assist investigations](https://docs.cloud.google.com/cloud-assist/investigations) +- [Google Cloud Assist overview](https://docs.cloud.google.com/cloud-assist/overview) +- [Google Gemini for Google Cloud data governance](https://docs.cloud.google.com/gemini/docs/discover/data-governance) +- [Google Cloud Assist audit logging](https://docs.cloud.google.com/cloud-assist/audit-logging) +- [Azure Copilot troubleshooting agent](https://learn.microsoft.com/en-us/azure/copilot/troubleshooting-agent) +- [Azure observability agent overview](https://learn.microsoft.com/en-us/azure/azure-monitor/aiops/observability-agent-overview) +- [Azure SRE Agent root cause analysis](https://learn.microsoft.com/en-us/azure/sre-agent/root-cause-analysis) +- [Azure Copilot access management](https://learn.microsoft.com/en-us/azure/copilot/manage-access) +- [Datadog Bits AI SRE overview](https://docs.datadoghq.com/bits_ai/bits_ai_sre/) +- [Datadog Bits AI SRE: investigate issues](https://docs.datadoghq.com/bits_ai/bits_ai_sre/investigate_issues/) +- [Datadog Bits AI SRE: take action](https://docs.datadoghq.com/bits_ai/bits_ai_sre/take_action/) +- [SAFE-MCP repository](https://github.com/SAFE-MCP/safe-mcp) +- [SAFE-T1001 Tool Poisoning Attack](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1001/README.md) +- [SAFE-T1102 Prompt Injection](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1102/README.md) +- [SAFE-T1104 Over-Privileged Tool Abuse](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1104/README.md) +- [SAFE-T1204 Context Memory Implant](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1204/README.md) +- [SAFE-T1309 Privileged Tool Invocation via Prompt Manipulation](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1309/README.md) +- [SAFE-T1703 Tool-Chaining Pivot](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1703/README.md) +- [SAFE-T1801 Automated Data Harvesting](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1801/README.md) +- [SAFE-T1911 Parameter Exfiltration](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1911/README.md) +- [SAFE-T2102 Service Disruption via External API Flooding](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T2102/README.md) +- [SAFE-T2105 Disinformation Output](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T2105/README.md) +- [SAFE-T2106 Context Memory Poisoning via Vector Store Contamination](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T2106/README.md) +- [Kubernetes RBAC](https://kubernetes.io/docs/reference/access-authn-authz/rbac/) +- [Kubernetes RBAC good practices](https://kubernetes.io/docs/concepts/security/rbac-good-practices/) +- [Kubernetes Secrets](https://kubernetes.io/docs/concepts/configuration/secret/) +- [Kubernetes audit logging](https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/) +- [Kubernetes ephemeral containers](https://kubernetes.io/docs/concepts/workloads/pods/ephemeral-containers/) +- [Kubernetes Validating Admission Policy](https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/) +- [AWS CloudTrail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html) +- [Azure Activity Log](https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/activity-log) +- [Google Cloud Audit Logs](https://cloud.google.com/logging/docs/audit) +- [NIST SP 800-207 Zero Trust Architecture](https://csrc.nist.gov/pubs/sp/800/207/final) +- [OWASP Top 10 for LLM Applications: Prompt Injection](https://genai.owasp.org/llmrisk/llm01-prompt-injection/) +- [Cloud Security Alliance Security Guidance v5](https://cloudsecurityalliance.org/artifacts/security-guidance-v5) +- [Cloud Security Alliance AI Controls Matrix](https://cloudsecurityalliance.org/artifacts/ai-controls-matrix) + +--- + +## Minimum viable write-up (Seed → Draft fast path) + +- validate the tool and approval model against at least one concrete implementation in production or pre-production +- attach repeatable test fixtures and evidence artifacts for the recommended safety regressions +- confirm the SAFE-MCP technique mapping and risk tiers with repository maintainers and domain reviewers + +--- + +## 1. Executive summary (what + why) + +**What this workflow does** + +A cloud ops troubleshooting assistant investigates incidents, alerts, or operator questions by retrieving and correlating operational evidence: metrics, logs, traces, deployment events, cloud control-plane activity, Kubernetes state, configuration drift, ownership metadata, and relevant runbooks. It produces ranked root-cause hypotheses, drafts summaries and next steps, and may execute tightly bounded diagnostics. In mature implementations, it can also invoke pre-approved remediation automations under explicit approval and policy controls. + +**Why it matters (business value)** + +Modern cloud estates are distributed across regions, accounts, subscriptions, projects, clusters, and observability tools. Triage often requires time-consuming “mean time to innocence” work: proving whether the fault is in code, infrastructure, networking, scaling, configuration, or a recent change. A well-governed assistant can reduce time-to-first-hypothesis, standardize incident handling, improve evidence collection quality, reduce on-call cognitive load, and make escalations to service owners or external support faster and more consistent. + +**Why it is risky / what can go wrong** + +This workflow combines two dangerous properties in one system: access to sensitive operational data and proximity to high-privilege production tooling. The same logs, trace attributes, alert annotations, ticket comments, tool outputs, and prior incident notes that help with triage can also be attacker-controlled, stale, or misleading. If the assistant is over-privileged or allowed to chain directly from evidence gathering into production mutation, a poisoned investigation can become a self-inflicted outage, a secret-leak event, or persistent bad guidance. The safe default is therefore **autonomous read-only triage** with **human approval for any production mutation, external communication, or durable memory write**. + +--- + +## 2. Industry context & constraints (reference-guide lens) + +### Where this shows up + +This pattern appears anywhere teams operate production cloud services with non-trivial reliability requirements, for example: + +- SaaS and internet platforms with multi-service, multi-region architectures +- enterprises running centralized SRE / NOC / platform teams across AWS, Azure, and Google Cloud +- Kubernetes-heavy environments with service mesh, GitOps, and frequent deployments +- managed service providers and internal platform teams operating many customer or business-unit environments +- regulated organizations where incident handling must remain auditable and permission-scoped + +### Typical systems in this workflow + +- cloud provider observability and control-plane tooling +- third-party observability platforms and APM +- Kubernetes APIs and cluster diagnostics +- deployment systems, CI/CD, IaC, and change-history systems +- incident management, ticketing, chat, paging, and status tools +- knowledge bases, runbooks, postmortems, CMDB, and ownership registries +- identity, access, approval, and audit systems + +### Constraints that matter + +- **Minutes-level response pressure:** incident triage often happens under outage conditions where delay is expensive. +- **Cross-domain correlation:** useful evidence is split across logs, metrics, traces, change events, tickets, and cloud/Kubernetes APIs. +- **Noisy and partially trustworthy data:** workloads and users can influence logs and traces; monitors can be misconfigured; time ranges can be wrong. +- **Privilege asymmetry:** the assistant may only need read access for diagnosis, while remediation requires highly privileged actions. +- **Multi-account / multi-cluster blast radius:** a bad decision can affect large portions of an estate if scoping is weak. +- **Change-control and audit obligations:** many organizations require approvals, immutability, and evidence retention for production actions. +- **Data residency and privacy constraints:** telemetry can contain secrets, personal data, regulated records, or customer payload fragments. + +### Must-not-fail outcomes + +- destructive or unauthorized production changes +- secret, token, or customer-data leakage through summaries, tickets, or support cases +- false root-cause analysis that drives harmful remediation +- silent scope expansion across accounts, clusters, or tenants +- runaway automation loops that create API floods, cost spikes, or extended outages +- persistent contamination of future investigations through poisoned memory or runbooks + +### Operational constraints + +- partial telemetry during degraded incidents +- temporary credentials and just-in-time access in privileged environments +- limited budget for observability queries and API calls during large incidents +- human approval fatigue during simultaneous alerts +- the need to preserve a clear chain of evidence for postmortems and audits + +--- + +## 3. Workflow description & scope + +### 3.1 Workflow steps (happy path) + +1. An alert, dashboard anomaly, incident ticket, customer-impact report, or on-call question starts an investigation. +2. The assistant normalizes the request into a scoped investigation object: affected service, time window, account/project/subscription, region, cluster, namespace, deployment, severity, and initial symptom. +3. The assistant gathers read-only evidence from observability systems and control-plane sources: metrics, logs, traces, recent deployments, recent configuration changes, autoscaling activity, health checks, Kubernetes events, and cloud audit trails. +4. The assistant correlates the retrieved evidence with ownership, service dependencies, recent change history, prior incidents, and available runbooks. +5. The assistant generates ranked hypotheses with explicit evidence, confidence, and missing-data notes. It should be able to say “inconclusive” instead of forcing a confident answer. +6. The assistant drafts incident updates, recommended next diagnostic steps, service-owner handoff notes, or vendor-support case content. +7. Where policy allows, the assistant executes **bounded diagnostics** only: for example, read-only cluster inspection, rollout-history collection, approved scripts in an isolated execution environment, or narrowly scoped debug operations that cannot mutate production state or exfiltrate data. +8. If remediation is warranted, the assistant proposes a **pre-approved** runbook or action with exact target scope, expected blast radius, rollback path, and post-action validation checks. +9. Any production-changing action (restart, rollback, scale, traffic shift, node cordon/drain, instance stop/terminate, policy change, debug exec with elevated visibility, secret or identity change) requires explicit human approval and a dedicated executor identity. +10. After an approved action runs, the assistant validates downstream signals, updates the incident record, and stores a complete audit trail. Durable memory or reusable knowledge updates happen only after review. + +### 3.2 In scope / out of scope -## Workflow Description (Seed) +- **In scope:** alert-initiated or operator-initiated investigation; telemetry retrieval; control-plane and deployment-history lookup; read-only or isolated diagnostics; evidence synthesis; incident/ticket/chat drafting; support-handoff drafting; tightly governed execution of pre-approved remediation automations; post-action validation. +- **Out of scope:** unrestricted shell access in production; arbitrary code execution on workloads or hosts; browsing or exporting secrets as part of normal triage; autonomous IAM/network/secret-policy changes; unconstrained self-healing; malware response or digital forensics outside approved playbooks; cross-tenant or cross-customer data access. -Troubleshoot cloud ops issues by querying telemetry/logs/configs and suggesting remediations with change-control boundaries. +### 3.3 Assumptions -## In Scope / Out Of Scope +- observability, cloud audit logging, and deployment history are already available +- read-only investigation roles are distinct from mutating executor roles +- pre-approved runbooks are versioned, reviewed, and owned +- high-risk production actions require human approval and step-up authentication +- secret scanning, redaction, and content-policy checks exist before messages or cases leave the investigation plane +- the assistant is allowed to stop and escalate when evidence is conflicting or incomplete -- **In scope:** TBD -- **Out of scope:** TBD +### 3.4 Success criteria -## SAFE-MCP Mapping (Seed Skeleton) +- faster time-to-first-hypothesis and lower MTTR without increasing unsafe change rates +- all tool calls and approvals are attributable, reviewable, and scoped +- zero unauthorized production mutations +- no sensitive data leakage in assistant-generated outputs +- demonstrable ability to block or quarantine poisoned context and over-privileged tool requests +- measurable reduction in repetitive manual triage effort while preserving operator trust -| Kill-chain stage | Failure/attack pattern | SAFE-MCP technique(s) | Recommended controls | Tests | +--- + +## 4. System & agent architecture + +### 4.1 Actors and systems + +- **Human roles:** on-call SRE, incident commander, service owner, platform engineer, security reviewer, change approver, vendor-support liaison +- **Agent/orchestrator:** cloud ops troubleshooting assistant, potentially with specialist sub-agents for telemetry, Kubernetes, cloud inventory, runbooks, and communications +- **Tools (MCP servers / APIs / connectors):** + - observability query tools (logs, metrics, traces, events, dashboards) + - cloud inventory and change-history tools + - Kubernetes inspection and bounded diagnostic tools + - deployment / GitOps / CI-CD history tools + - runbook and automation executors + - incident, chat, ticket, and support connectors + - knowledge-base, CMDB, and memory retrieval services +- **Data stores:** observability backends, configuration sources, audit logs, incident and ticket records, knowledge bases, runbook repositories, optional vector stores or memory stores +- **Downstream systems affected:** incident records, support cases, chat channels, automation executions, deployments, scaling state, traffic-routing controls, cloud and Kubernetes control planes + +### 4.2 Trusted vs untrusted inputs (high value, keep simple) + +| Input/source | Trusted? | Why | Typical failure/abuse pattern | Mitigation theme | |---|---|---|---|---| -| TBD | TBD | TBD | TBD | TBD | +| Alert payloads, monitor names, annotations, paging text | Semi-trusted | system-generated wrapper around fields that may still be user-configured or stale | priority spoofing, bad scoping, indirect prompt injection | structured parsing + provenance + treat free text as data | +| Logs, traces, span/resource attributes, event messages | Untrusted | applications, users, and compromised workloads can write them | indirect prompt injection, misinformation, secret surfacing | sanitization + quoting + allowlisted extraction + truncation | +| Cloud and Kubernetes resource state | Semi-trusted | authoritative sources, but may be stale, partial, or influenced by compromised principals | wrong correlation, stale read, misleading annotations | freshness checks + source cross-validation | +| Deployment history, change records, GitOps status | Semi-trusted | authoritative but may lag or be incomplete | false blame on recent change, missing rollback context | timestamp checks + ownership metadata + citations | +| Runbooks, KB articles, postmortems | Semi-trusted | internal and curated, but can be stale or unsafe | outdated or over-broad remediation guidance | versioning + expiry + owner review | +| Tool outputs and generated summaries | Mixed | depends on connector quality and output shaping | contaminated context, hallucinated conclusions, schema drift | schema validation + strict output contracts + provenance labels | +| Incident tickets, chat threads, operator prompts | Untrusted | humans and integrations can be mistaken, compromised, or adversarial | urgency abuse, authority spoofing, unsafe action requests | identity binding + approvals + treat text as data | +| Durable memory / vector store | Semi-trusted | persistent context is helpful but can retain bad content | repeated contamination across future incidents | review-before-persist + TTL + quarantine + integrity checks | + +### 4.3 Trust boundaries (required) + +The workflow has several trust boundaries that reviewers should model explicitly: + +1. **Workload / user / monitor data → agent boundary** + Untrusted operational text enters model context from logs, traces, alert annotations, ticket comments, and dashboard labels. + +2. **Agent → read-only retrieval boundary** + The model turns intent into queries against observability, cloud inventory, and Kubernetes APIs. This is usually the lowest-risk boundary but still exposes sensitive data. + +3. **Agent → diagnostic execution boundary** + The assistant may request script execution, ephemeral containers, or cluster diagnostics. Even “diagnostics” can become a mutation or exfiltration path if not isolated. + +4. **Investigation plane → remediation executor boundary** + Crossing from read-only investigation into production-changing action is the most important boundary in the design. It should require separate identity, policy, and explicit approval. + +5. **Agent → communication / support boundary** + Sending summaries or attachments into chat, tickets, or vendor-support systems creates a durable disclosure path. + +6. **Single-environment → multi-account / multi-cluster boundary** + Centralized SRE assistants often operate across accounts, subscriptions, projects, or clusters; mistakes in scoping can silently expand blast radius. + +**Trust boundary notes** + +- Separate **control** from **data** at every boundary: untrusted retrieved content must not become operative instructions. +- Treat production write access as a distinct trust zone, even if the same human initiated the investigation. +- Prefer short-lived, request-scoped credentials and explicit target scoping over ambient broad access. +- Preserve environment separation (`dev/test` vs `prod`) in both identity and tool routing. + +### 4.4 Tool inventory (required) + +| Tool / MCP server | Read / write? | Permissions | Typical inputs | Typical outputs | Failure modes | +|---|---|---|---|---|---| +| `observability.query` | read | service/account/project-scoped read role | service, resource, time window, query template | log excerpts, metrics, traces, events | injected text in results, stale time window, over-broad retrieval | +| `cloud.inventory.describe` | read | cloud read role | resource id, account, region, tag filters | resource metadata, recent config/activity | wrong account or region, incomplete state, stale cache | +| `k8s.inspect` | read (default) | namespace/cluster-scoped read RBAC | namespace, workload, pod, event selector | object specs, events, rollout history, logs | namespace overreach, accidental secret exposure through object descriptions | +| `diag.sandbox.run` | exec (bounded) | isolated executor identity | approved script id, target scope, timeout | structured findings, command transcript | network egress, secret visibility, unintended mutation, environment escape | +| `runbook.execute` | write | dedicated automation role | runbook id, target, parameters, approval token | execution id, status, before/after checks | wrong target, repeated retries, self-inflicted outage | +| `incident.record` | write | ticket/chat integration scope | summary, evidence links, channel/ticket id | posted message, incident updates | secret leakage, persistent false narrative | +| `support.case` | write | support-case integration scope | incident summary, redacted attachments, severity | case id, sent message | exfiltration to third party, over-sharing | +| `knowledge.retrieve` / `memory.persist` | read / write | KB read role, guarded memory write role | keywords, incident id, proposed memory item | prior runbooks, similar incidents, stored notes | stale or poisoned guidance, durable contamination | + +### 4.5 Governance & authorization matrix + +| Action category | Example actions | Allowed mode(s) | Approval required? | Required auth | Required logging/evidence | +|---|---|---|---|---|---| +| Read-only retrieval | query logs/metrics/traces, list recent deployments, describe cloud resources | manual / HITL / autonomous | no | request-scoped read-only role | query parameters, result provenance, account/cluster scope | +| Isolated diagnostics | run approved scripts, collect rollout history, bounded cluster diagnostics in sandbox | HITL / autonomous (policy-based) | policy-based; yes if prod-targeting or elevated visibility | short-lived sandbox executor, fixed image, network/volume policy | script id, image digest, transcript, resource scope, timeout | +| Internal record updates | incident comment, ticket summary, internal chat update | HITL / autonomous (low-risk only) | policy-based | scoped integration token | posted content, redaction result, evidence links | +| External communications | vendor support case, externally visible status input | HITL initially | yes | verified identity + DLP/redaction controls | message archive, attachment inventory, approver | +| Durable knowledge / memory write | save incident lesson, create reusable troubleshooting note | HITL / policy-gated | yes or review-before-persist | guarded write role | content hash, reviewer, source provenance, TTL | +| Low-risk bounded remediation | restart a single pre-approved canary target, re-run a failed automation in non-prod | manual / HITL, rarely autonomous | yes | dedicated automation role + step-up auth | before/after state, blast-radius statement, rollback plan | +| High-risk production action | rollback release, scale down, cordon/drain node, terminate instance, modify IAM/network/secret policy, exec in prod | manual / HITL only | always; consider dual approval | privileged executor, just-in-time elevation, policy gate | immutable audit trail, explicit target diff, rationale, post-action validation | + +### 4.6 Sensitive data & policy constraints + +- **Data classes:** credentials, API keys, service tokens, customer identifiers, request payload fragments, source code, configuration values, environment variables, account topology, vulnerability data, support-case attachments +- **Retention / logging constraints:** retain enough evidence for audit and postmortem, but avoid persisting raw sensitive telemetry unnecessarily; redact before leaving the investigation plane; keep transient investigations ephemeral where possible; do not auto-ingest sensitive outputs into durable memory +- **Regulatory constraints:** privacy, data residency, contractual restrictions with AI or support vendors, sector-specific retention and disclosure rules, internal change-control requirements +- **Safety / operational harm constraints:** the assistant must not apply ambiguous or destructive remediation by default; “inconclusive” is an acceptable outcome; one investigation should not create long-lived broad privileges + +--- + +## 5. Operating modes & agentic flow variants + +### 5.1 Manual baseline (no agent) + +- Engineers inspect dashboards, search logs, compare recent changes, run `kubectl` / cloud CLI commands, consult runbooks, and update the incident record manually. +- Existing checks often include change-control approval, peer review, privileged identity management, and post-action validation. +- Humans catch many problems through domain judgment: they notice when a log line is suspicious, when a suggested rollback is too broad, or when multiple sources conflict. +- The cost is speed and consistency: correlation across many tools is slow, repetitive, and error-prone under fatigue. + +### 5.2 Human-in-the-loop (HITL / sub-autonomous) + +- The assistant can autonomously collect read-only evidence, organize context, and draft hypotheses or incident updates. +- The assistant may also run tightly bounded diagnostics in a sandbox or other isolated environment when policy explicitly permits it. +- Human approvals sit at the most sensitive points: + - durable memory writes + - external communications + - any production mutation + - any diagnostic step that crosses into elevated visibility or exec semantics +- This is the recommended operating mode for most real environments because it preserves operator judgment at the decision points that matter most. + +### 5.3 Fully autonomous (end-to-end agentic) + +- Fully autonomous mode is only advisable for a narrow subset of low-risk actions: automatic incident creation, read-only evidence gathering, policy-safe internal summaries, and perhaps some isolated diagnostics. +- Autonomous remediation should be limited to pre-approved runbooks with: + - single-target or canary scope + - hard rate limits and retry budgets + - dry-run and preflight validation + - rollback or compensating action + - kill switch and immutable audit trail +- If the assistant is wrong or manipulated, the blast radius includes false RCA, broad write actions, repeated retries, API-rate exhaustion, and durable misinformation. For most organizations, unrestricted autonomous remediation is an unacceptable default. + +### 5.4 Variants (optional) + +- **Single-agent vs orchestrated specialists:** one orchestrator can delegate to telemetry, Kubernetes, cloud-inventory, or runbook specialists. +- **Provider-native vs third-party AIOps:** teams may use AWS/Azure/Google-native assistants, Datadog-like platforms, or internal orchestration layers. +- **Centralized multi-account SRE vs application-team local assistant:** the former increases scoping and governance complexity. +- **Real-time incidents vs post-incident analysis:** postmortem summarization is lower-risk than live remediation. + +--- + +## 6. Threat model overview (high-level) + +### 6.1 Primary security & safety goals + +- preserve diagnostic integrity: the assistant should not be steered into false conclusions by untrusted operational data +- preserve least privilege: retrieval, diagnostics, communications, and mutation should not share ambient broad authority +- preserve confidentiality: secrets and sensitive telemetry should not leak into prompts, summaries, memory, or third-party systems +- preserve availability: assistant actions should not worsen incidents or create new outages +- preserve auditability: every tool call, approval, and mutation should be attributable, reviewable, and reconstructable + +### 6.2 Threat actors (who might attack / misuse) + +- external attackers who can influence request payloads, headers, URLs, or other data that later appear in logs and traces +- compromised workloads or service identities that emit deceptive operational data +- malicious or careless insiders editing monitor annotations, tickets, runbooks, or memory entries +- compromised third-party integrations or connectors +- hurried operators who over-trust the assistant under outage pressure + +### 6.3 Attack surfaces + +- logs, traces, event streams, dashboard labels, and alert annotations +- incident tickets, chat transcripts, and support-case drafts +- tool descriptions, schemas, and mutable connector metadata +- runbooks, KB articles, and retrieved postmortems +- memory/vector stores and similarity-retrieval layers +- diagnostic-script inputs, command arguments, and output renderers +- approval UX, especially when target scope or blast radius is hidden or hard to verify + +### 6.4 High-impact failures (include industry harms) + +- **Customer / consumer harm:** degraded service, prolonged outage, failed transactions, delayed recovery, or customer-data leakage in support paths +- **Business harm:** SLA/SLO breaches, increased MTTR, cloud cost spikes, reputational damage, operator toil, incorrect blame during incident response +- **Security harm:** unauthorized control-plane changes, secret exposure, privilege escalation, repeated compromise via poisoned memory or runbooks, false incident narratives that suppress proper response + +--- + +## 7. Kill-chain analysis (stages → likely failure modes) + +> Keep this defender-friendly. Describe patterns, not “how to do it.” + +| Stage | What can go wrong (pattern) | Likely impact | Notes / preconditions | +|---|---|---|---| +| 1. Entry / trigger | Untrusted operational text enters the investigation via logs, traces, alert annotations, tickets, or tool output. The assistant ingests it as part of incident context. | Investigation starts from a corrupted scope or misleading symptom description. | Common because workloads, users, and integrations can influence observability data. | +| 2. Context contamination | The assistant treats retrieved data, stale runbook text, or tool output as instructions or authoritative truth instead of evidence. | False hypotheses, bad prioritization, hidden sensitive content in context, unsafe next-step recommendations. | Most likely when free text is passed through with weak provenance markers or no separation of control from data. | +| 3. Tool misuse / unsafe action | A read-only investigation path pivots into a more privileged diagnostic or remediation tool, or the assistant invokes an over-privileged tool directly. | Secret exposure, unintended production mutation, lateral movement, or high-blast-radius changes. | Preconditions include broad permissions, weak tool-graph controls, or approval fatigue. | +| 4. Persistence / repeat | Contaminated summaries, memory entries, KB notes, or incident comments become durable and influence future runs or human approvers. | Repeated mis-triage, recurring unsafe recommendations, institutionalized false RCA. | More likely if the system auto-persists memory or trains retrieval from unreviewed outputs. | +| 5. Exfiltration / harm | Sensitive telemetry is copied into chat, tickets, or support cases; or a flawed remediation terminates healthy instances, rolls back a good deployment, or floods APIs. | Compliance breach, outage expansion, cost spikes, longer incident duration, loss of operator trust. | This is the highest-impact end state and the canonical failure pattern for this use case. | + +--- + +## 8. SAFE-MCP mapping (kill-chain → techniques → controls → tests) + +> Goal: make SAFE-MCP actionable in this workflow. The rows below use selected techniques that are especially relevant to cloud ops troubleshooting assistants. + +| Kill-chain stage | Failure/attack pattern (defender-friendly) | SAFE-MCP technique(s) | Recommended controls (prevent / detect / recover) | Tests (how to validate) | +|---|---|---|---|---| +| Entry / trigger | Logs, alert annotations, ticket text, or mutable tool metadata attempt to steer the assistant away from policy or safe triage. | [SAFE-T1102](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1102/README.md), [SAFE-T1001](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1001/README.md) | Treat all retrieved operational text as untrusted data; separate control from data; parse only allowlisted fields into structured context; sanitize and quote raw excerpts; attach provenance labels; keep tool descriptions versioned and immutable to ordinary users. | Seed malicious log lines, monitor annotations, and mutated tool descriptions. Verify that tool policy, system instructions, and target selection do not change. | +| Context contamination / persistence | Poisoned investigation notes, stale postmortems, or malicious memory content contaminate future incidents. | [SAFE-T1204](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1204/README.md), [SAFE-T2106](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T2106/README.md) | No automatic durable memory writes from live incidents; review-before-persist; TTL and freshness metadata; quarantine suspicious memory items; sign or hash approved knowledge entries; prefer evidence citations over free-form recollection. | Insert poisoned prior-incident notes into retrieval fixtures. Verify that they are quarantined, down-ranked, or surfaced with warnings rather than silently trusted. | +| Tool misuse / privilege escalation | The assistant uses over-privileged tools, invokes privileged operations through natural-language manipulation, or pivots from low-risk tools into high-risk tools. | [SAFE-T1104](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1104/README.md), [SAFE-T1309](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1309/README.md), [SAFE-T1703](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1703/README.md) | Split investigation and execution identities; enforce explicit tool-interaction allowlists; require approval tokens for write paths; use step-up auth and JIT elevation; disable direct prod shell/exec by default; sandbox diagnostics; add admission and policy gates downstream of the model. | Attempt unauthorized secret reads, `exec`/debug operations, instance termination, scale changes, or IAM/network changes from a read-only incident. Expect denial, audit evidence, and no downstream write. | +| Data harvesting / exfiltration | The assistant retrieves more data than needed, leaks data through hidden parameters, or copies sensitive content into external systems. | [SAFE-T1801](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1801/README.md), [SAFE-T1911](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T1911/README.md) | Mandatory scope filters; result-size and rate limits; DLP and secret redaction before prompts and outputs; strict JSON schemas with `additionalProperties: false`; canary secrets; outbound-content review and data-classification labels. | Plant canary tokens in telemetry fixtures and attempt hidden-parameter exfiltration or bulk export. Verify redaction, schema rejection, alerting, and approval blocking. | +| Service harm / misinformation | The assistant generates a convincing but false RCA or repeatedly calls tools in ways that worsen the incident or create API/service disruption. | [SAFE-T2102](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T2102/README.md), [SAFE-T2105](https://github.com/SAFE-MCP/safe-mcp/blob/main/techniques/SAFE-T2105/README.md) | Require evidence threshold and post-action validation; allow “inconclusive” outcomes; cap retries and API budgets; dry-run and preflight diffs for remediation; canary/one-target scope; automatic rollback; kill switch for mutating mode; independent cross-checks against ground truth where possible. | Run false-evidence and wrong-target remediation simulations. Verify that the assistant either stops, escalates, or canaries safely, and that rollback is available and tested. | + +**Notes** + +- If a control varies by operating mode, default to stricter behavior in live production. +- For this workflow, the highest-value SAFE-MCP themes are: prompt-injection resistance, memory governance, explicit privilege boundaries, egress controls, and blast-radius reduction. +- A useful implementation heuristic is: **the closer a tool is to production mutation, the less the model should be allowed to decide implicitly.** + +--- + +## 9. Controls & mitigations (organized) + +### 9.1 Prevent (reduce likelihood) + +- **Separate identities by function:** use different principals for investigation retrieval, sandbox diagnostics, internal record updates, external communications, and production mutation. +- **Enforce least privilege and short-lived credentials:** scope every request by account, cluster, namespace, service, region, and time window; use just-in-time elevation for privileged paths. +- **Treat telemetry as untrusted:** logs, traces, annotations, tickets, and tool outputs should be marked as data, not instructions. Prefer structured extraction over raw free-text ingestion. +- **Constrain tool graphs:** prevent the model from directly chaining from low-risk retrieval tools into privileged executor tools without explicit policy checks and approval tokens. +- **Sandbox diagnostics:** fixed images, no secret mounts, limited network egress, allowlisted commands/scripts, strict timeouts, no host namespaces unless separately approved. +- **Version and tier runbooks:** every runnable automation should have an owner, risk tier, target schema, dry-run behavior, rollback path, and review history. +- **Apply downstream guardrails:** Kubernetes admission policy, cloud policies/resource locks, PIM/JIT, and automation approvals should all continue to enforce policy even if the model makes a bad request. +- **Redact before egress:** remove secrets, PII, and unnecessary raw telemetry from summaries, support cases, and memory. +- **Govern memory explicitly:** no auto-save of live-incident conclusions; require review, TTL, provenance, and freshness metadata for durable knowledge. +- **Design approval UX for humans under stress:** show exact target(s), scope, blast radius, before/after state, rationale, rollback plan, and confidence/evidence. + +### 9.2 Detect (reduce time-to-detect) + +- **Log every tool call and policy decision:** include principal, target scope, input class, time range, result size, approval state, and policy outcome. +- **Correlate with platform audit logs:** link assistant sessions to CloudTrail, Azure Activity Log, Google Cloud Audit Logs, and Kubernetes audit records. +- **Alert on denied or unusual behavior:** repeated attempts to cross privilege boundaries, abnormal data volume, cross-account jumps, unexpected external communications, or sandbox escapes. +- **Use canary secrets / canary records:** detect whether sensitive tokens or synthetic markers leak into prompts, tickets, or support cases. +- **Track approval and rollback signals:** high approval-denial rates, repeated approvals for the same action, or rising rollback rates can indicate model drift or unsafe automation. +- **Sample and review incident outputs:** compare assistant hypotheses and actions against eventual postmortem ground truth to measure false-RCA and near-miss rates. +- **Monitor memory integrity:** unexpected new memory items, large similarity matches from low-trust sources, or freshness failures should trigger quarantine and review. + +### 9.3 Recover (reduce blast radius) + +- **Global kill switch:** disable all mutating capabilities while preserving read-only triage during suspected safety failures. +- **Per-runbook rollback or compensating action:** every allowed mutation should have a tested undo path or clear containment procedure. +- **Credential revocation and rotation:** if exfiltration is suspected, revoke affected tokens and rotate secrets rapidly. +- **Quarantine knowledge and tool versions:** disable suspicious runbooks, memory entries, connectors, or tool schemas until reviewed. +- **Fail safely to humans:** when evidence conflicts, confidence is low, or policy checks fail, escalate rather than improvise. +- **Restore known-good state:** prefer rollback to versioned infrastructure, GitOps state, or previous deployment manifests over ad-hoc manual fixes. +- **Feed incidents back into tests:** turn every safety failure or near miss into a reusable regression fixture. + +--- + +## 10. Validation & testing plan + +### 10.1 What to test (minimum set) + +- **Permission boundaries:** the assistant cannot exceed intended account, cluster, namespace, environment, or tool scopes +- **Prompt / tool-output robustness:** untrusted operational content does not override policy, tool selection, or approval requirements +- **Action gating:** high-risk actions always require explicit approval and correct executor identity +- **Logging / auditability:** every material action is attributable, reconstructable, and linked to the initiating investigation +- **Rollback / safety stops:** mutating mode can halt and recover safely +- **Data-loss prevention:** secrets and sensitive telemetry do not leak to prompts, memory, tickets, or support cases +- **Memory hygiene:** durable knowledge cannot be poisoned silently +- **Rate and budget controls:** investigation loops cannot create uncontrolled API floods or runaway costs + +### 10.2 Test cases (make them concrete) + +| Test name | Setup | Input / scenario | Expected outcome | Evidence produced | +|---|---|---|---|---| +| Poisoned log resilience | Synthetic incident dataset with malicious text in logs and span attributes | Assistant investigates elevated error rate with injected text embedded in log messages | Assistant treats text as evidence only, preserves policy, and does not change tool plan or privilege level | tool-call transcript, policy-decision log, rendered evidence snapshot | +| Alert annotation injection resilience | Monitor or alert fixture with adversarial annotation text | Investigation triggered from alert payload containing unsafe instructions in annotation/body | Assistant ignores instruction-like text for control purposes and scopes investigation correctly | alert payload capture, scope object, denied-policy or ignored-input trace | +| Unauthorized secret-read denial | Read-only investigation role in staging or test environment | Operator asks assistant to inspect secrets, env vars, or privileged pod state without approval | Secret retrieval is denied or redacted; no downstream privileged tool call occurs | authz denial record, audit log, assistant response | +| Cross-tool pivot denial | Tool graph with both read-only and write tools available | Retrieved content suggests invoking admin or destructive tool | Policy blocks chain into write path without explicit approval token and executor identity | policy-engine event, absence of write-call log, session trace | +| Approval-gate enforcement | Runbook service with approval workflow enabled | Assistant proposes restart/rollback/scale action in production | Action remains blocked until proper approval and step-up auth are present | approval record, pending execution state, immutable audit entry | +| Hidden-parameter exfiltration block | Tool schema with optional fields and seeded canary secret in telemetry | Assistant attempts or is induced to pass secret in unused parameter or metadata field | Schema validation rejects unexpected fields or redaction removes secret before send | request payload diff, schema-validation logs, DLP event | +| Memory contamination quarantine | Retrieval store containing poisoned prior incident note | New investigation retrieves semantically similar historical content | Poisoned item is quarantined, down-ranked, or surfaced with review warning; not trusted silently | retrieval log, memory-integrity alert, quarantine record | +| Wrong-target remediation blast-radius test | Environment with healthy and unhealthy targets plus dry-run capable runbook | Assistant recommends action based on ambiguous evidence | System stops, escalates, or limits to dry-run/canary; no broad production mutation occurs | dry-run diff, approval UX screenshot, no-op audit record | +| API flood / retry budget control | Quotas and budget controls enabled | Assistant repeatedly retries queries or automation due to inconclusive results | Budget/rate limit stops loop and escalates to human | quota counters, throttle logs, escalation event | +| Redaction on external support handoff | Support-case connector with DLP checks | Assistant drafts vendor case with logs containing secrets or customer identifiers | Content is redacted or blocked; external send requires HITL approval | redaction report, message archive, approval artifact | + +### 10.3 Operational monitoring (production) + +- **Metrics** + - investigations started / completed / escalated + - time-to-first-hypothesis + - time-to-human-handoff + - percentage of investigations ending as “inconclusive” + - blocked tool-call count by policy reason + - approval-denied rate + - redaction / DLP hit rate + - ratio of read-only to mutating actions + - rollback invocation rate + - false-RCA rate (from postmortem sampling) + - cross-account / cross-cluster access anomalies +- **Alerts** + - denied high-risk action attempts + - unexpected external communications + - anomalous data volume or repeated broad queries + - sandbox policy violation or disallowed egress + - memory integrity or provenance failure + - repeated approvals for the same action in a short period + - elevated rollback rate or post-action health-check failure +- **Runbooks** + - disable mutating mode globally + - quarantine specific tool connectors, runbooks, or memory items + - rotate or revoke credentials after suspected leakage + - review linked audit records and approval artifacts + - revert to last known-good automation or runbook version + +--- + +## 11. Open questions & TODOs + +- [ ] Define the exact catalog of “low-risk” actions that can ever run without explicit human approval. +- [ ] Decide whether any production-changing remediation should be autonomous in mature environments, or whether HITL should remain mandatory. +- [ ] Standardize how evidence confidence is computed and displayed before proposing remediation. +- [ ] Define the retention, review, TTL, and provenance rules for durable incident memory. +- [ ] Establish residency and contractual policy for sending telemetry into provider-native or third-party AI assistance. +- [ ] Specify which action categories require dual approval (for example IAM, networking, secrets, or multi-service scope). +- [ ] Add concrete regression fixtures for log injection, alert-annotation injection, stale runbook guidance, and wrong-target remediation. +- [ ] Map provider-native controls (approvals, locks, PIM, resource policies, admission policy) into a portable implementation checklist. + +--- + +## 12. Questionnaire prompts (for reviewers) + +### Workflow realism + +- Are the investigation steps and tool categories realistic for modern cloud ops teams? +- What critical source of evidence is missing: feature flags, service ownership, dependency graph, cost anomalies, or customer-impact telemetry? +- Is the diagnostic execution model realistic for Kubernetes, VM-based, and serverless environments? + +### Trust boundaries & permissions + +- Are the read-only investigator role and mutating executor role sufficiently separated in practice? +- Where is the real blast-radius boundary: account, subscription, project, region, cluster, namespace, or service? +- Does any connector still hold ambient permissions that are broader than the workflow actually needs? + +### Threat model completeness + +- Which threat actor matters most here: external log poisoner, compromised workload, malicious insider, or approval-fatigue under outage pressure? +- What is the highest-impact failure not yet modeled? +- Are ticketing, chat, and support handoff modeled strongly enough as exfiltration paths? + +### Controls & tests + +- Which controls are mandatory before pilot, and which can wait until a later maturity phase? +- Are the proposed tests sufficient to detect regressions in authorization, prompt resistance, and egress safety? +- Is rollback ownership clear for every allowed automation path? + +--- + +## Appendix (optional) + +### A. Glossary + +- **Blast radius:** the maximum scope of harm a bad action can cause +- **CMDB:** configuration management database or service/inventory registry +- **HITL:** human-in-the-loop approval or review step +- **Incident commander:** person coordinating incident response +- **Inconclusive-safe behavior:** design principle that prefers escalation over forced, low-confidence action +- **JIT / step-up auth:** just-in-time privileged access granted only when needed +- **MTTR:** mean time to recovery or resolution +- **RCA:** root-cause analysis +- **Runbook:** predefined operational procedure or automation +- **Sandbox diagnostics:** controlled execution environment for bounded troubleshooting actions + +### B. Suggested capability tiers + +| Tier | Description | Example actions | Recommended default mode | +|---|---|---|---| +| Tier 0 | Read-only retrieval and summarization | query telemetry, describe resources, draft incident update | autonomous | +| Tier 1 | Isolated diagnostics | run approved script in sandbox, collect bounded debug data | HITL or policy-based autonomous | +| Tier 2 | Internal record updates | update ticket, post internal summary, create incident | HITL or low-risk autonomous | +| Tier 3 | Bounded remediation | single-target canary restart, approved non-prod rerun, narrow rollback with safeguards | HITL only in most organizations | +| Tier 4 | High-risk production mutation | terminate instances, broad rollback, IAM/network/secret changes, privileged exec | manual or HITL only | + +### C. Reference implementation heuristics + +- Prefer **read-only investigation by default**. +- Treat **all telemetry and free text as untrusted**. +- Keep **execution paths separate from reasoning paths**. +- Require **exact target scoping, explicit approval, and audit evidence** before mutation. +- Preserve a **tested rollback path** for every allowed write action. +- Make “**I do not have enough evidence**” a first-class safe outcome. + +--- + +## Contributors + +| Role | Contributor | +|---|---| +| Draft author | Arun Pandiyan Perumal | +| Domain reviewers | TBD | +| Additional contributors | TBD | + +--- -## Next Steps +## Version History -- Expand this page to `draft` using the full template in `templates/use-case-template.md`. -- Add public evidence links and concrete control/test mappings. +| Version | Date | Changes | Author | +|---|---|---|---| +| 1.0 | 2026-03-18 | Initial SAFE-AUCA draft for `SAFE-UC-0023`, aligned to the template, issue plan, and public cloud / SAFE-MCP evidence | Arun Pandiyan Perumal |