
feat: add experiment registry and flight experiment verbs #9

Merged
lewisnsmith merged 1 commit into main from feat/experiment-registry
Apr 18, 2026

@lewisnsmith
Owner

Summary

  • New src/experiments.ts registry module with race-safe file-per-experiment storage at ~/.flight/experiments/<name>.json
  • New flight experiment {new,list,show,diff,export} command group for managing experiments and cross-run analysis
  • Auto-registers unknown experiments on flight run --experiment with a one-line stderr hint

What this PR adds

src/experiments.ts

  • ExperimentEntry type with name, created_at, tags, optional description, baseline_run_id, model_config, notes
  • ensureExperimentRegistered — race-safe via { flag: "wx" } (O_EXCL); exactly one concurrent caller wins created: true
  • createOrUpdateExperiment — merges patch (arrays replace, not append); preserves created_at
  • getExperiment, listExperiments — graceful reads; invalid files skipped with a log warning
  • experimentsDir — configurable via opts.dir for test isolation
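A minimal sketch of the two write paths described above, not the code from this PR. The optional fields on ExperimentEntry, the plain `dir` parameter (the PR mentions `opts.dir`), the `entryPath` helper, and the return shapes are assumptions made for illustration. It shows how `{ flag: "wx" }` lets exactly one caller create the file, and how a spread merge replaces arrays while keeping created_at.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed shape, inferred from the field list above; the real type may differ.
interface ExperimentEntry {
  name: string;
  created_at: string;
  tags: string[];
  description?: string;
  baseline_run_id?: string;
  model_config?: Record<string, unknown>;
  notes?: string;
}

// Default location taken from the PR summary: ~/.flight/experiments/<name>.json
const DEFAULT_DIR = path.join(os.homedir(), ".flight", "experiments");

// Hypothetical helper: one JSON file per experiment.
function entryPath(dir: string, name: string): string {
  return path.join(dir, `${name}.json`);
}

// Create the entry only if no file exists yet.
function ensureExperimentRegistered(
  name: string,
  dir: string = DEFAULT_DIR,
): { created: boolean } {
  fs.mkdirSync(dir, { recursive: true });
  const entry: ExperimentEntry = {
    name,
    created_at: new Date().toISOString(),
    tags: [],
  };
  try {
    // "wx" opens with O_CREAT | O_EXCL, so the open fails with EEXIST
    // when the file already exists and only one creator can succeed.
    fs.writeFileSync(entryPath(dir, name), JSON.stringify(entry, null, 2), {
      flag: "wx",
    });
    return { created: true };
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") {
      return { created: false };
    }
    throw err;
  }
}

// Merge a patch into the stored entry. Spread assignment replaces arrays
// wholesale (no append), and created_at is always taken from the stored entry.
function createOrUpdateExperiment(
  name: string,
  patch: Partial<Omit<ExperimentEntry, "name" | "created_at">>,
  dir: string = DEFAULT_DIR,
): ExperimentEntry {
  ensureExperimentRegistered(name, dir);
  const file = entryPath(dir, name);
  const existing = JSON.parse(fs.readFileSync(file, "utf8")) as ExperimentEntry;
  const merged: ExperimentEntry = {
    ...existing,
    ...patch,
    name,
    created_at: existing.created_at,
  };
  fs.writeFileSync(file, JSON.stringify(merged, null, 2));
  return merged;
}
```

Against a fresh directory, the first ensureExperimentRegistered("smoke", dir) call returns { created: true } and a second call returns { created: false }. Patching tags twice leaves only the tags from the second patch.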

src/cli.ts additions

  • experiment new <name> — registers/updates an experiment
  • experiment list — table with run counts from SQLite
  • experiment show <name> — metadata + recent runs
  • experiment diff <a> <b> — delegates to compareCommand with sessions from both experiments
  • experiment export <name> — streams research JSONL per run to stdout (unbuffered)
  • flight run --experiment now calls ensureExperimentRegistered and prints a hint on first use

Tests

  • test/experiments.test.ts — 16 unit tests covering race safety, idempotency, merge semantics, invalid-file skipping
  • test/cli-rename.test.ts — extended with 3 CLI-spawn tests (self-skip when tsx unavailable in worktrees)

Docs

  • README, ARCHITECTURE, CLAUDE.md: added Experiment Registry section with schema, workflow, and CLI reference
  • CHANGELOG: Added entry under 1.5.0

Note on pre-existing test failures

test/integration.test.ts and test/benchmark.test.ts fail in this worktree with ENOENT: spawn tsx — the worktree has no local node_modules. These failures pre-exist on main (verified). All 23 tests introduced in this PR pass.

Dependencies

This branch bundles the PR 1 changes (feat/cli-logs-rename). It should be merged after the feat/cli-logs-rename PR lands, or rebased onto it.

Test plan

  • npm run typecheck passes (exit 0)
  • npm run lint passes (exit 0)
  • npx vitest run test/experiments.test.ts — 16 tests pass
  • flight experiment new smoke --description "hi" --tags a,b creates ~/.flight/experiments/smoke.json
  • flight experiment list shows registry
  • flight run --agent x --experiment new-one auto-creates + prints hint
  • flight experiment diff a b exits 0 with a "no runs" message for zero-run case

Hard rename of the `log` command group to `logs` (plural) — no deprecation
shim per user decision. Bare `flight logs` now lists sessions.

Adds `flight run` as the happy-path entry point (human-friendly output:
"Started run <id> session <id>") and `flight show <session>` as a one-liner
delegating to viewSession. Both mirror the existing session start machinery.
@lewisnsmith lewisnsmith merged commit 4807161 into main Apr 18, 2026
4 checks passed