fix(benchmark): run typedframes checker against actual corpus #19

Merged
w-martin merged 6 commits into main from worktree-agent-aa90d36e163638947 on May 30, 2026
Conversation

@w-martin (Owner) commented May 10, 2026

Summary

  • Fixes #1 ("fix(benchmark): typedframes binary always targets examples/ not the benchmark corpus"): typedframes now runs through the Python CLI (uv run typedframes check), so the benchmark targets the actual corpus directory rather than a single hardcoded example file
  • Removes the dead find_binary helper and binary_path parameter
  • Removes the invalid "10-100x faster than mypy" claim
  • Re-runs the benchmark against both corpora with the corrected code; the great_expectations column now shows the real figure (310ms ±10ms, not the bogus 930µs)
  • Clones great_expectations to a tempfile.mkdtemp() dir and removes it after copying, so no large directory is left behind in /tmp
  • Omits tools from the README table if they couldn't be found at benchmark time (e.g. pyright not installed locally)
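
To illustrate the first bullet, here is a minimal sketch of timing a checker CLI against a whole corpus directory. It is not the benchmark script from this PR: `time_command` and `checker_command` are hypothetical names, and passing the corpus directory as a positional argument to `typedframes check` is an assumption.

```python
import statistics
import subprocess
import time
from pathlib import Path


def time_command(cmd: list[str], runs: int = 3) -> tuple[float, float]:
    """Run cmd `runs` times; return (mean, stdev) of wall-clock seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Exit codes are ignored on purpose: a checker that reports type
        # errors usually exits non-zero, which is not a benchmark failure.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    stdev = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return statistics.mean(durations), stdev


def checker_command(corpus_dir: Path) -> list[str]:
    # Assumption: the CLI takes the directory to check as a positional argument.
    return ["uv", "run", "typedframes", "check", str(corpus_dir)]
```

Building the command from the corpus path, instead of hardcoding a target inside a binary wrapper, is what keeps the measured workload tied to the corpus being benchmarked.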

Stacked on #18 (cache-clearing path fixes) — target that branch, not main.

Test plan

  • run_codebase_benchmarks no longer special-cases typedframes
  • typedframes uses uv run typedframes check as its command
  • great_expectations clone is cleaned up after the benchmark
  • README updated with corrected benchmark figures
  • Tools unavailable locally are omitted from the table (not shown as "Tool not found")

@w-martin changed the base branch from main to worktree-agent-a8ec2ca0fb3645d90 on May 17, 2026 10:41
Base automatically changed from worktree-agent-a8ec2ca0fb3645d90 to main on May 26, 2026 10:24
@codecov-commenter


Codecov Report

✅ All modified and coverable lines are covered by tests.

w-martin added 4 commits May 30, 2026 09:39
…fter copy

Cloning to a persistent path in BENCH_DIR left a large directory in /tmp
between runs. Switch to tempfile.mkdtemp() and rmtree in a finally block
so the clone is always removed once its contents have been copied out.
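
The clone-then-clean-up pattern this commit message describes might look roughly like the sketch below. `fetch_corpus` and its `clone_into` callback are hypothetical names used for illustration; the real script presumably shells out to git directly.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def fetch_corpus(
    clone_into: Callable[[Path], None],
    dest: Path,
    subdir: str = ".",
) -> None:
    """Clone into a throwaway directory, copy what is needed, always clean up."""
    tmp = Path(tempfile.mkdtemp(prefix="bench-clone-"))
    try:
        # clone_into is a stand-in for e.g. a shallow `git clone` into tmp.
        clone_into(tmp)
        shutil.copytree(tmp / subdir, dest, dirs_exist_ok=True)
    finally:
        # Runs on success and on failure, so the clone never lingers in /tmp.
        shutil.rmtree(tmp, ignore_errors=True)
```

Putting `rmtree` in the `finally` block means a failed clone or copy still removes the temporary directory.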
…corrected corpus

- generate_markdown_table now omits rows where all runs failed (e.g. pyright
  not installed), so the README never shows "Tool not found" cells
- Re-run with the corrected benchmark (typedframes now targets the full corpus
  rather than a single example file); great_expectations column updated from
  the invalid 930µs to the actual 310ms ±10ms
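
The row-omission rule from the first bullet can be sketched as follows. `render_table` is a hypothetical stand-in, not the actual `generate_markdown_table`; the result layout (tool name mapped to per-corpus timing strings, with `None` for a failed run) is assumed for illustration.

```python
from typing import Optional

# tool name -> corpus name -> formatted timing, or None when that run failed
Results = dict[str, dict[str, Optional[str]]]


def render_table(results: Results, corpora: list[str]) -> str:
    header = "| Tool | " + " | ".join(corpora) + " |"
    divider = "|" + " --- |" * (len(corpora) + 1)
    lines = [header, divider]
    for tool, timings in results.items():
        cells = [timings.get(corpus) for corpus in corpora]
        if all(cell is None for cell in cells):
            # The tool failed on every corpus (e.g. not installed): omit the
            # row instead of printing "Tool not found" cells.
            continue
        row = [tool] + [cell if cell is not None else "n/a" for cell in cells]
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)
```

A tool that succeeded on at least one corpus keeps its row, with `n/a` in the cells for the runs that failed.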
npx is not universally available; pyright is already in the dev dependency
group so uv run pyright works everywhere the dev env is installed.
Re-run includes pyright numbers: 822ms (13 files), 3.43s (488 files).
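
One way to picture the npx-to-uv switch is a small command registry in which every tool is launched through `uv run`. The `TOOL_COMMANDS` mapping and the `available_tools` helper are hypothetical, and the check only confirms that the launcher is on PATH, not that each tool is installed in the dev environment.

```python
import shutil
from typing import Callable, Optional

# Hypothetical registry: each tool is launched through uv rather than npx.
TOOL_COMMANDS: dict[str, list[str]] = {
    "typedframes": ["uv", "run", "typedframes", "check"],
    "pyright": ["uv", "run", "pyright"],
}


def available_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> dict[str, list[str]]:
    """Return the tools whose launcher executable can be found on PATH."""
    return {
        name: cmd
        for name, cmd in TOOL_COMMANDS.items()
        if which(cmd[0]) is not None
    }
```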
@w-martin merged commit 1394aeb into main on May 30, 2026
6 checks passed
@w-martin deleted the worktree-agent-aa90d36e163638947 branch on May 30, 2026 17:08
Development

Successfully merging this pull request may close these issues.

fix(benchmark): typedframes binary always targets examples/ not the benchmark corpus
