
Sentinel


Continuous HAT failure-mode classification of a production agent runtime.

Purpose

sentinel observes a private fleet of scheduled AI agents and classifies their runtime behavior against a Human-Autonomy Teaming (HAT) failure-mode taxonomy. The substrate is a homelab agent runtime producing roughly 150 timeline events and 14 staging artifacts per week. sentinel runs twice per day, samples the most recent window of agent activity, classifies what it sees, and commits the result to this repository.

For a one-page map of how the pieces fit together (substrate, routine, artifacts, CI, site, audit loops), see docs/ARCHITECTURE.md.

Cadence

Twice daily, UTC, aligned to the operator's Vertex agent firing schedule (single cron 0 8,22 * * *). Windows are 10-14 hours.

| Run | UTC   | Catches |
|-----|-------|---------|
| 1   | 08:00 | Full morning Vertex cluster: deadline_awareness (06:00), briefing-morning (06:15), briefing-enrichment (06:25), cross_feed_correlation (06:30), cve_triage (07:00). 1h buffer after the cluster completes. |
| 2   | 22:00 | briefing-evening (20:15), end-of-day slot. 1h45m buffer after the cluster completes. |

A 6-run schedule was tried first; 4 of 6 windows hit zero agent activity (structural empty stretches between morning and evening clusters), so the schedule was tightened to match where signal actually exists.
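
The observation window for a run is the time since the previous scheduled run, which is where the 10-14 hour range comes from. A minimal sketch of that arithmetic (illustrative only; the authoritative sampling logic is the routine prompt in CLAUDE.md):

# Illustrative only: window lengths implied by the 0 8,22 * * * cron.
# The authoritative sampling logic is the routine in CLAUDE.md.
from datetime import datetime, timedelta, timezone

RUN_HOURS = (8, 22)  # UTC run times

def window_for(now: datetime) -> timedelta:
    """Time elapsed since the previous scheduled run."""
    previous = []
    for h in RUN_HOURS:
        t = now.replace(hour=h, minute=0, second=0, microsecond=0)
        if t >= now:
            t -= timedelta(days=1)
        previous.append(t)
    return now - max(previous)

# 08:00 run looks back 10h (to 22:00 the previous day);
# 22:00 run looks back 14h (to 08:00 the same day).
print(window_for(datetime(2026, 5, 2, 8, 0, tzinfo=timezone.utc)))   # 10:00:00
print(window_for(datetime(2026, 5, 2, 22, 0, tzinfo=timezone.utc)))  # 14:00:00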

What it produces

Per-run artifacts in reports/YYYY/MM/DD/HHMM-pulse.{md,json}. Each artifact contains structured classifications of observed agent behavior against the failure modes of the active HAT codebook (see Codebook below), plus an honesty disclosure.
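
As a rough illustration of the kind of structure a pulse JSON carries (field names here are a sketch assembled from this README: schema_version, codebook_hash / routine_hash, confidence levels, none_observed; the authoritative shape is schemas/pulse.schema.json):

# Hypothetical shape, not the real schema; see schemas/pulse.schema.json.
example_pulse = {
    "schema_version": "1.2",
    "codebook_hash": "<hash of codebook/v1.1.md at run time>",
    "routine_hash": "<hash of CLAUDE.md at run time>",
    "classifications": [
        {"mode": "none_observed", "confidence": "medium", "evidence": "..."},
    ],
    "honesty_disclosure": "LLM-derived classification; see Methodology.",
}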

The corpus accumulates over time. Patterns visible in any single artifact are not signal; patterns visible across the corpus are weak signal.

Example artifact

See reports/2026/05/02/0800-pulse.md for the first signal-bearing artifact. It surfaces three operational findings about the Vertex intel pipeline: vendor-marketing-as-evidence in the briefing agent, novelty-over-authority bias in the pre-filter, and speculative-research-to-operational-prescription projection.

What it is not

This is not a peer-reviewed evaluation. Classifications are LLM-derived and human-coder validation is not part of v1.

This is not a controlled experiment. The fleet is N=1, the operator is one person, and no second deployment exists for comparison.

This is not a claim of agent failure or success. Classifications are observations through a specific theoretical lens, not value judgments.

This is not an EU AI Act Article 72 compliance artifact. Article 72 applies to providers of high-risk AI systems on the EU market; sentinel is reference work that may inform such providers, not a binding monitoring system.

Methodology, honestly

Every per-run artifact is generated by Claude executing the routine prompt in CLAUDE.md against the Vertex MCP intelligence substrate. The taxonomy is the operator's synthesis of prior work in Human-Autonomy Teaming literature, multi-agent system failure analysis, and AI evaluation-awareness research. Classifications are best-effort and interpretive, and may shift as the codebook evolves. Codebook revisions are tracked in Git; replays of historical runs under newer codebooks may yield different labels.

Sensitive operational details (specific CTI items, host names, secrets) are sanitized before publication. The contribution is the framework plus corpus structure plus observed patterns, not the underlying operational data.

Codebook

The active HAT failure-mode codebook is codebook/v1.1.md, referenced from CLAUDE.md Phase 3. Eight active modes plus none_observed, with two v1.0 modes deprecated and retained in the schema enum for historical pulse compatibility. Confidence is low, medium, or high.

Codebook changes land as new versioned files under codebook/ (e.g. codebook/v1.2.md) rather than in-place edits, with the schema_version field in schemas/pulse.schema.json bumped accordingly. See docs/SCHEMA-VERSIONS.md for how schema and codebook versions are coupled and how validate_artifacts.py dispatches historical pulses to their original schemas.
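
A rough sketch of what that dispatch can look like (the actual logic lives in .github/scripts/validate_artifacts.py; the version-to-file mapping below is assumed from the schema file names, not copied from the script):

# Assumed sketch of per-version schema dispatch; see validate_artifacts.py for the real check.
import json
from pathlib import Path
from jsonschema import validate

SCHEMAS = {
    "1.0": "schemas/pulse.v1_0.schema.json",  # frozen legacy schema
    "1.1": "schemas/pulse.v1_1.schema.json",  # frozen legacy schema
    "1.2": "schemas/pulse.schema.json",       # active schema
}

def validate_pulse(path: Path) -> None:
    pulse = json.loads(path.read_text())
    # Historical pulses are validated against the schema they were written under.
    schema = json.loads(Path(SCHEMAS[pulse["schema_version"]]).read_text())
    validate(instance=pulse, schema=schema)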

Public site

The corpus is browsable at https://rmednitzer.github.io/sentinel/ (Pages deployment from main). The site is regenerated on every push that changes the corpus, codebook, or routine. To cite the corpus academically, see CITATION.cff; tagged releases (corpus-YYYY-QN) are stable reference points with sealed content-hash manifests under releases/.
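
A sealed manifest is essentially one content hash per corpus file, recorded at tag time so a citation can be checked against later history. A minimal sketch of the idea (the real generator is .github/scripts/build_snapshot_manifest.py and its output format may differ):

# Minimal sketch of a content-hash manifest; the real build_snapshot_manifest.py
# may use a different layout or cover more paths.
import hashlib, json
from pathlib import Path

def manifest(root: str = "reports") -> dict:
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(root).rglob("*"))
        if p.is_file()
    }

print(json.dumps(manifest(), indent=2))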

Repository structure

.
├── CLAUDE.md                    # The routine prompt (Claude reads this twice/day).
├── README.md                    # This file.
├── ROADMAP.md                   # Deferred work + dated "Recently shipped" table.
├── requirements.txt             # Python deps for local + CI (just jsonschema).
├── reports/                     # Primary artifacts.
│   ├── INDEX.md                 # Reverse-chronological run index.
│   ├── heartbeat.log            # One line per skip-empty run.
│   └── YYYY/MM/DD/HHMM-pulse.{md,json}
├── reports-cross/               # Frozen archive (see reports-cross/README.md).
├── codebook/
│   ├── v1.0.md                  # Frozen v1.0 codebook.
│   └── v1.1.md                  # Active HAT failure-mode codebook.
├── audits/                      # Quarterly + codebook-drift audit records.
│   ├── README.md
│   ├── templates/
│   │   └── quarterly.md         # Blind-reclassification protocol template.
│   ├── quarterly/
│   │   └── 2026-Q2.md           # Placeholder; first audit 2026-Q3.
│   └── codebook/
│       ├── audit-2026-05-07.md  # v1.1 fitness review.
│       └── drift-v1.1.md        # v1.0 → v1.1 reclassification table.
├── external/                    # External-benchmark anchor (Wave 3 scaffolding).
│   └── README.md
├── CITATION.cff                 # Academic citation metadata.
├── FALSIFIERS.md                # Falsifier registry (research integrity).
├── PRE-REGISTRATION.md          # Pre-committed methodology decisions (HARKing guard).
├── schemas/
│   ├── pulse.schema.json        # Active artifact schema (v1.2).
│   ├── pulse.v1_1.schema.json   # Frozen v1.1 schema for legacy pulses.
│   └── pulse.v1_0.schema.json   # Frozen v1.0 schema for legacy pulses.
└── .github/
    ├── labeler.yml              # Path-based PR labeling rules.
    ├── dependabot.yml           # Weekly bump for actions + pip groups.
    ├── scripts/                 # Reusable Python; runnable locally.
    │   ├── validate_artifacts.py   # Schema + heartbeat format check.
    │   ├── check_consistency.py    # MD/JSON/INDEX coherence (extract verbatim, etc.).
    │   ├── check_provenance.py     # Codebook_hash / routine_hash vs git history (F-methodology-002).
    │   ├── check_falsifiers.py     # F-codebook-001 mode-overreach + F-codebook-003 unmapped-recurrence.
    │   ├── check_integrity.py      # Re-validate corpus + manage GH issue.
    │   ├── check_heartbeat.py      # Rolling heartbeat-rate alert.
    │   ├── check_stale_routine.py  # Routine-freshness alert (no output for ~18h).
    │   ├── check_links.py          # Internal markdown link integrity.
    │   ├── compute_stats.py        # Generate STATS.md with Wilson 95% CIs (transient; consumed by build_site.py, not tracked).
    │   ├── surface_codebook_drift.py # Codebook-revision PR comment generator.
    │   ├── build_site.py             # Static-site generator for GitHub Pages.
    │   └── build_snapshot_manifest.py # Sealed manifest at release-tag time.
    └── workflows/
        ├── validate.yml         # PR + push: schema check.
        ├── consistency.yml      # PR: MD/JSON/INDEX coherence.
        ├── provenance.yml       # PR + push: codebook_hash / routine_hash vs git (blocking).
        ├── falsifiers.yml       # PR + push + daily: F-codebook-001/-003 candidates (informational).
        ├── integrity.yml        # Weekly + on schema change: re-validate corpus.
        ├── heartbeat-monitor.yml # Daily heartbeat-rate alert.
        ├── stale-routine.yml    # Daily routine-freshness alert.
        ├── markdown-links.yml   # PR + weekly: relative-link integrity.
        ├── labeler.yml          # PR auto-labeling.
        ├── auto-resolve-bot-threads.yml  # Resolve outdated bot review threads.
        ├── codebook-drift.yml   # Codebook-revision PR comment surface.
        ├── pages.yml            # Build + deploy GitHub Pages site.
        └── release-snapshot.yml # Build manifest on corpus-* tags.

All Python checks run locally without secrets:

pip install -r requirements.txt
python3 .github/scripts/validate_artifacts.py
python3 .github/scripts/check_consistency.py
python3 .github/scripts/check_provenance.py  # needs full git history; clone or `git fetch --unshallow`
python3 .github/scripts/check_falsifiers.py  # informational; reports F-codebook-001 and -003 candidates
python3 .github/scripts/check_links.py
python3 .github/scripts/compute_stats.py     # writes STATS.md with Wilson 95% CIs locally (not tracked; Pages regenerates on deploy)
python3 .github/scripts/check_heartbeat.py   # gh-aware; safe to run locally
python3 .github/scripts/check_stale_routine.py
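
For reference, the Wilson 95% interval compute_stats.py reports for a mode seen k times in n runs is the standard score interval (a sketch of the formula, not copied from the script):

# Wilson score interval for a binomial proportion, z = 1.96 for 95%.
# Standard formula, shown for reference; compute_stats.py may differ in detail.
from math import sqrt

def wilson_ci(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    if n == 0:
        return (0.0, 1.0)
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - half, centre + half)

print(wilson_ci(3, 14))  # e.g. a mode observed in 3 of 14 runs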

The two issue-managing checks (check_heartbeat.py, check_stale_routine.py) detect when gh is unavailable and skip the issue side-effects, so a local run only prints its verdict.
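
The detection can be as simple as a PATH check; a hedged sketch of the pattern (the real scripts may also gate on tokens or CI environment variables):

# Sketch of the "gh-aware" pattern: skip issue side-effects when the GitHub CLI
# is not on PATH. Illustrative only; not the actual check_heartbeat.py logic.
import shutil, subprocess

def maybe_file_issue(title: str, body: str) -> None:
    if shutil.which("gh") is None:
        print(f"[local] would file issue: {title}")
        return
    subprocess.run(["gh", "issue", "create", "--title", title, "--body", body], check=True)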

License

Apache License, Version 2.0. See LICENSE.
