Performance snapshot system for comparing benchmarks across commits #59

## Summary

We need a system that captures benchmark timings at specific commits and compares them against the hardcoded baselines or against other saved snapshots.

## Problem

Currently `test_perf.py` only asserts pass/fail against hardcoded baselines with a 2x multiplier. There's no way to:
- See actual timings vs baseline in a table
- Save a snapshot at a commit for later comparison
- Compare two arbitrary snapshots (e.g. before/after a change)

## Features

- `poetry run python bench_snapshot.py` — run benchmarks, print comparison table against baselines
- `poetry run python bench_snapshot.py --save` — save timings as a named snapshot (keyed by git short hash)
- `poetry run python bench_snapshot.py --compare <commit>` — compare a saved snapshot to baselines
- `poetry run python bench_snapshot.py --list` — list saved snapshots
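
The flags above could be wired up roughly as follows. This is a minimal sketch, not the actual `bench_snapshot.py`: the `.bench_snapshots/` directory, the JSON file layout, the benchmark names and baseline values in `BASELINES`, and every function name are assumptions made for illustration. Only the flags, the git short hash key, and the 2x multiplier come from this issue. Running the benchmarks themselves is left out.

```python
from __future__ import annotations

import argparse
import json
import subprocess
from pathlib import Path

# Assumptions for illustration: directory, names, and values are hypothetical.
SNAPSHOT_DIR = Path(".bench_snapshots")
BASELINES = {"test_load_large_file": 0.50, "test_scroll": 0.10}
MULTIPLIER = 2.0  # the 2x tolerance that test_perf.py applies today


def git_short_hash() -> str:
    """Return the short hash of HEAD, used as the snapshot key."""
    result = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def save_snapshot(timings: dict[str, float], key: str,
                  directory: Path = SNAPSHOT_DIR) -> Path:
    """Write timings to <directory>/<key>.json and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{key}.json"
    path.write_text(json.dumps(timings, indent=2, sort_keys=True))
    return path


def load_snapshot(key: str, directory: Path = SNAPSHOT_DIR) -> dict[str, float]:
    """Read back a snapshot saved under the given key."""
    return json.loads((directory / f"{key}.json").read_text())


def list_snapshots(directory: Path = SNAPSHOT_DIR) -> list[str]:
    """Return the keys of all saved snapshots, sorted."""
    return sorted(p.stem for p in directory.glob("*.json"))


def format_table(timings: dict[str, float], baselines: dict[str, float],
                 multiplier: float = MULTIPLIER) -> str:
    """Render one row per baseline with the actual time, ratio, and status."""
    rows = [f"{'benchmark':<30} {'baseline':>10} {'actual':>10} {'ratio':>8} {'status':>6}"]
    for name in sorted(baselines):
        base = baselines[name]
        actual = timings.get(name)
        if actual is None:
            rows.append(f"{name:<30} {base:>10.3f} {'n/a':>10} {'n/a':>8} {'-':>6}")
            continue
        ratio = actual / base
        status = "ok" if ratio <= multiplier else "SLOW"
        rows.append(f"{name:<30} {base:>10.3f} {actual:>10.3f} {ratio:>7.2f}x {status:>6}")
    return "\n".join(rows)


def build_parser() -> argparse.ArgumentParser:
    """Define the three optional modes; no flag means run and print."""
    parser = argparse.ArgumentParser(prog="bench_snapshot.py")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--save", action="store_true",
                      help="save timings under the current git short hash")
    mode.add_argument("--compare", metavar="COMMIT",
                      help="compare a saved snapshot to the baselines")
    mode.add_argument("--list", action="store_true",
                      help="list saved snapshots")
    return parser
```

With these pieces, `--compare <commit>` would reduce to `print(format_table(load_snapshot(args.compare), BASELINES))`, and comparing two arbitrary snapshots would only need a second snapshot passed in place of `BASELINES`.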

## Technical Considerations

- pytest `--durations` includes test setup/teardown overhead (~0.3-0.5s per test for Textual app init), so timings won't match the pure operation baselines exactly
- Ideally instrument the tests to print actual operation timings (the `elapsed` variable in each test) rather than relying on `--durations`
- Consider a pytest plugin or conftest fixture that captures and reports the measured `elapsed` values
- Snapshots should be gitignored (machine-specific)
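
One way to capture the measured `elapsed` values instead of relying on `--durations` is a small recorder that times only the operation under test. The sketch below is an assumption about how that might look: `TimingRecorder`, `measure`, and `dump` are hypothetical names, and the JSON output format is chosen only so that a script such as `bench_snapshot.py` could read it back.

```python
from __future__ import annotations

import json
import time
from contextlib import contextmanager
from pathlib import Path


class TimingRecorder:
    """Collects named elapsed times measured around the operation only."""

    def __init__(self) -> None:
        self.timings: dict[str, float] = {}

    @contextmanager
    def measure(self, name: str):
        """Time the enclosed block, excluding any setup done outside it."""
        start = time.perf_counter()
        try:
            yield
        finally:
            # Recorded even if the block raises, so failures still show a time.
            self.timings[name] = time.perf_counter() - start

    def dump(self, path: Path) -> None:
        """Write the collected timings as JSON for later comparison."""
        path.write_text(json.dumps(self.timings, indent=2, sort_keys=True))
```

In a test, the app setup would stay outside the `with recorder.measure("test_name"):` block so that only the operation is timed. The recorder could be exposed through a session-scoped fixture in `conftest.py` and written out from a hook such as `pytest_sessionfinish`, with the output directory listed in `.gitignore`.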
