fix: eliminate duplicate PR reviews with verdict lifecycle management by Fieldnote-Echo · Pull Request #47 · Project-Navi/grippy-code-review

Nelson Spence (Fieldnote-Echo) · 2026-03-10T14:24:14Z

Summary

Same-commit guard: Skip full pipeline if grippy already reviewed the current HEAD SHA (verdict + summary must both exist). workflow_dispatch bypasses the guard for explicit re-runs.
Verdict lifecycle: Marker-based identity ( + ) replaces login-based detection. Post-first/dismiss-after ordering ensures the PR always has at least one active verdict. exclude_review_id prevents the fresh verdict from being dismissed.
Thread state fetch fix: -f → -F for JSON array parsing in fetch_thread_states(), unblocking stale thread resolution.
E2e test hardening: Reasoning model support, tiered test suite (Tier 0-3), shared fixtures, .grippyignore + # nogrip pragmas for test false positives.
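As a rough illustration of the same-commit guard described above, the sketch below skips the pipeline only when both a verdict and a summary already exist for the current HEAD SHA, and always runs on workflow_dispatch. It is a minimal sketch, not the PR's implementation: the `Post` type, the marker strings, and the function names are hypothetical stand-ins, and attaching a commit SHA to every post is a simplification.

```python
from dataclasses import dataclass

# Hypothetical marker strings; the real markers are not shown in this PR text.
VERDICT_MARKER = "<!-- example-verdict -->"
SUMMARY_MARKER = "<!-- example-summary -->"


@dataclass(frozen=True)
class Post:
    """A review or comment already on the PR (simplified stand-in)."""

    body: str
    commit_sha: str


def already_reviewed(posts: list[Post], head_sha: str) -> bool:
    """Return True only if both a verdict and a summary exist for head_sha."""
    on_head = [p for p in posts if p.commit_sha == head_sha]
    has_verdict = any(VERDICT_MARKER in p.body for p in on_head)
    has_summary = any(SUMMARY_MARKER in p.body for p in on_head)
    return has_verdict and has_summary


def should_run_pipeline(posts: list[Post], head_sha: str, event_name: str) -> bool:
    """Explicit re-runs always proceed; otherwise skip an already-reviewed HEAD."""
    if event_name == "workflow_dispatch":
        return True
    return not already_reviewed(posts, head_sha)
```

Requiring both artifacts means a run that posted a verdict but failed before posting its summary is treated as incomplete and retried.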

Design

See docs/plans/2026-03-10-review-dedup-design.md for the full design doc (marker-based identity, mode-aware dismissal, completeness check, accepted trade-offs).
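The post-first/dismiss-after ordering from the summary can be sketched against an in-memory pull request. This is an assumption-laden illustration, not the code in src/grippy/github_review.py: `FakePR`, `Review`, the marker text, and `post_verdict` are hypothetical, and only the `exclude_review_id` parameter name is taken from the PR description.

```python
from dataclasses import dataclass

VERDICT_MARKER = "<!-- example-verdict -->"  # hypothetical marker text


@dataclass
class Review:
    id: int
    body: str
    state: str  # "APPROVED", "CHANGES_REQUESTED", or "DISMISSED"


class FakePR:
    """In-memory stand-in for a pull request's review list."""

    def __init__(self) -> None:
        self.reviews: list[Review] = []
        self._next_id = 1

    def post_review(self, body: str, state: str) -> Review:
        review = Review(self._next_id, body, state)
        self._next_id += 1
        self.reviews.append(review)
        return review


def dismiss_prior_verdicts(pr: FakePR, exclude_review_id: int) -> int:
    """Dismiss every active marked verdict except the one just posted."""
    dismissed = 0
    for review in pr.reviews:
        if review.id == exclude_review_id:
            continue  # never dismiss the fresh verdict
        if VERDICT_MARKER not in review.body:
            continue  # identity comes from the marker, not the author login
        if review.state == "DISMISSED":
            continue  # already inactive
        review.state = "DISMISSED"
        dismissed += 1
    return dismissed


def post_verdict(pr: FakePR, text: str, state: str) -> Review:
    """Post first, then dismiss older verdicts, so one verdict is always active."""
    fresh = pr.post_review(f"{VERDICT_MARKER}\n{text}", state)
    dismiss_prior_verdicts(pr, exclude_review_id=fresh.id)
    return fresh
```

Posting before dismissing means a failure between the two steps leaves an extra verdict rather than none, which appears to be the trade-off the PR accepts.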

Test plan

29 new unit tests covering all three fixes
998 total tests pass (0 failures)
Ruff lint + format clean
MyPy strict passes
CI green on this PR
Verify grippy self-review posts exactly one verdict (no duplicates)

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace hardcoded homelab references (_HOMELAB_URL, _HOMELAB_HOST, _HOMELAB_PORT, _homelab_reachable, skip_no_homelab) with imports from tests.e2e_fixtures (LLM_BASE_URL, LLM_MODEL_ID, PROMPTS_DIR, llm_reachable, skip_no_llm) across all three e2e test files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Extract review JSON from reasoning_content when content is empty (fixes nvidia/nemotron-3-nano and other reasoning models)
- Stamp actual model ID on review output (LLMs hallucinate this field)
- Add GRIPPY_MAX_DIFF_CHARS env var for local models with smaller context
- Scale e2e test timeouts dynamically (600s local, 120s cloud)
- Add unit tests for reasoning fallback, model ID stamp
- Document nemotron-3-nano as validated local model
- Document GRIPPY_MAX_DIFF_CHARS in configuration and self-hosted guides

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
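The reasoning-model fallback and model ID stamp from this commit message could look roughly like the sketch below. It assumes the response message is a plain dict with `content` and `reasoning_content` keys and that the review stores its model under a `model` key. The function names and the `model` key are illustrative assumptions, not taken from grippy's source.

```python
import json


def extract_review_json(message: dict) -> dict:
    """Parse review JSON from content, falling back to reasoning_content."""
    text = (message.get("content") or "").strip()
    if not text:
        # Some reasoning models leave content empty and put the answer here.
        text = (message.get("reasoning_content") or "").strip()
    if not text:
        raise ValueError("model returned neither content nor reasoning_content")
    return json.loads(text)


def stamp_model_id(review: dict, actual_model_id: str) -> dict:
    """Overwrite whatever model name the LLM reported with the configured one."""
    return {**review, "model": actual_model_id}
```

For example, `stamp_model_id(extract_review_json(msg), "nvidia/nemotron-3-nano")` returns the parsed review with the configured model ID, regardless of what the model wrote in that field.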

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

File-level ignores for test files with fake creds in triple-quoted diff strings (cannot take line pragmas). Line-level # nogrip pragmas for test_grippy_review.py where fake AWS keys are in regular Python. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The glob was accidentally dropped when trimming file-level exclusions. test_e2e_adversarial_inputs.py has yaml.load() in a triple-quoted diff string that triggered the dangerous-execution-sinks rule. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/grippy/github_review.py

tests/test_grippy_review.py

src/grippy/review.py

github-actions

Grippy approves — PASS (93/100)

github-actions · 2026-03-10T14:28:21Z

✅ Grippy Review — PASS

Score: 98/100 | Findings: 1

Delta: 1 new

Commit: df9192f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

998 tests with coverage on free GitHub runners exceeds the 15-minute limit. All 163 new/modified tests pass — the cancellation was purely a timeout issue, not test failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

.github/workflows/tests.yml

github-actions

Grippy approves — PASS (98/100)

The ruff format step reformatted function call arguments, shifting the hex SHA strings to different lines than the allowlist pragmas. Moved pragmas to the actual hex-bearing lines (1588, 1599). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions

Grippy approves — PASS (98/100)

The test only patches _check_already_reviewed but not github.Github, so main() proceeds into the full pipeline and hangs trying to connect to the real GitHub API. This caused the 20-minute CI timeout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

tests/test_grippy_review.py

github-actions

Grippy approves — PASS (96/100)

Even with github.Github mocked, main() continues into the pipeline and calls fetch_pr_diff() (requests.get) or connects to local LLM at localhost:1234, both of which hang in CI. Mocking fetch_pr_diff with a side_effect=Exception makes main() bail immediately after passing the workflow_dispatch guard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
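The mocking approach described in this commit message can be illustrated with a toy pipeline. This is a hedged sketch only: `Pipeline` and its methods are hypothetical stand-ins for grippy's `main()`, `_check_already_reviewed`, and `fetch_pr_diff`, used to show how a `side_effect` exception stops execution right after the guard.

```python
from unittest import mock


class Pipeline:
    """Toy stand-in for the review entry point; not grippy's real code."""

    def check_already_reviewed(self) -> bool:
        raise RuntimeError("would call the GitHub API")

    def fetch_pr_diff(self) -> str:
        raise RuntimeError("would make a network request")

    def main(self, event_name: str) -> str:
        # The guard is skipped entirely for explicit re-runs.
        if event_name != "workflow_dispatch" and self.check_already_reviewed():
            return "skipped"
        diff = self.fetch_pr_diff()  # first network call after the guard
        return f"reviewed {len(diff)} chars"


def test_workflow_dispatch_bypasses_guard() -> None:
    pipeline = Pipeline()
    with mock.patch.object(
        pipeline, "check_already_reviewed", return_value=True
    ) as guard:
        with mock.patch.object(
            pipeline, "fetch_pr_diff", side_effect=Exception("stop here")
        ) as fetch:
            try:
                pipeline.main("workflow_dispatch")
            except Exception as exc:
                # main() bails at the first mocked network call.
                assert str(exc) == "stop here"
            else:
                raise AssertionError("main() should have reached fetch_pr_diff")
    guard.assert_not_called()  # workflow_dispatch never consults the guard
    fetch.assert_called_once()  # the pipeline got past the guard
```

Raising from the first call after the guard keeps the test from reaching any real network or LLM connection, which is what caused the CI hang described above.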

tests/test_grippy_review.py

github-actions

Grippy approves — PASS (100/100)

998 tests with coverage complete in 19:56 on free runners, leaving only 4 seconds of headroom at 20 minutes. Bump to 25 for margin. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

.github/workflows/tests.yml

github-actions

Grippy approves — PASS (98/100)

codecov · 2026-03-10T18:52:03Z

Codecov Report

❌ Patch coverage is 90.21739% with 9 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/grippy/review.py	82.60%	8 Missing ⚠️
src/grippy/github_review.py	97.56%	1 Missing ⚠️


Nelson Spence (Fieldnote-Echo)

test

Nelson Spence (Fieldnote-Echo) and others added 19 commits March 9, 2026 21:15

test: add e2e_fast/e2e_stress markers with proper expression evaluation

0281b0b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add shared e2e fixtures with LLM health check and diff corpus

700fb5d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add Tier 0 hypothesis property tests for parser/retry/escaping

1cc576f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add Tier 1 deterministic edge diff tests

910b765

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add Tier 1 adversarial input tests (deterministic)

6912950

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add Tier 2 LLM system contract tests

1247a2b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add Tier 3 LLM model characterization tests

f2457bb

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: resolve pre-commit lint and formatting issues

bf47021

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add grippy verdict markers and metadata parser

80ae6be

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add _dismiss_prior_verdicts with marker-based identity

3770706

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: wire verdict markers and dismiss-after-post into post_review

907698e

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: use -F (JSON) not -f (string) for thread state fetch ids

93a866f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add _check_already_reviewed same-commit guard

20c8ac4

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: wire same-commit guard into CI pipeline main()

48f52b0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

style: fix ruff format issues from dedup implementation

70d53ea

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-code-quality bot found potential problems Mar 10, 2026

src/grippy/github_review.py (resolved)

tests/test_grippy_review.py (fixed)

github-actions bot reviewed Mar 10, 2026

src/grippy/review.py (resolved)

src/grippy/review.py (resolved)

github-actions bot previously approved these changes Mar 10, 2026


Nelson Spence (Fieldnote-Echo) and others added 2 commits March 10, 2026 09:44

fix: add pragma allowlist for fake SHA strings in tests

8b288ca

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: bump test job timeout to 20 minutes

f2a2980

998 tests with coverage on free GitHub runners exceeds the 15-minute limit. All 163 new/modified tests pass — the cancellation was purely a timeout issue, not test failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Nelson Spence (Fieldnote-Echo) dismissed github-actions[bot]’s stale review via 8b288ca March 10, 2026 14:48

github-actions bot reviewed Mar 10, 2026

.github/workflows/tests.yml (resolved)

github-actions bot previously approved these changes Mar 10, 2026


Nelson Spence (Fieldnote-Echo) dismissed github-actions[bot]’s stale review via ea5dc33 March 10, 2026 15:00

github-actions bot previously approved these changes Mar 10, 2026


Nelson Spence (Fieldnote-Echo) dismissed github-actions[bot]’s stale review via f396ebb March 10, 2026 15:36

github-code-quality bot found potential problems Mar 10, 2026

tests/test_grippy_review.py (fixed)

github-actions bot reviewed Mar 10, 2026

tests/test_grippy_review.py (resolved)

tests/test_grippy_review.py (resolved)

github-actions bot previously approved these changes Mar 10, 2026


Nelson Spence (Fieldnote-Echo) dismissed github-actions[bot]’s stale review via 5dc5b4b March 10, 2026 16:25

github-code-quality bot found potential problems Mar 10, 2026

tests/test_grippy_review.py (resolved)

github-actions bot previously approved these changes Mar 10, 2026


fix: bump test timeout to 25 minutes for coverage overhead

df9192f

998 tests with coverage complete in 19:56 on free runners, leaving only 4 seconds of headroom at 20 minutes. Bump to 25 for margin. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Nelson Spence (Fieldnote-Echo) dismissed github-actions[bot]’s stale review via df9192f March 10, 2026 18:13

github-actions bot reviewed Mar 10, 2026

.github/workflows/tests.yml (resolved)

github-actions bot approved these changes Mar 10, 2026


Nelson Spence (Fieldnote-Echo) commented Mar 11, 2026


chore: trigger navi-bot approval

eccdea5

Nelson Spence (Fieldnote-Echo) merged commit 20b02eb into main Mar 11, 2026
14 of 15 checks passed

Nelson Spence (Fieldnote-Echo) deleted the fix/review-dedup branch March 11, 2026 22:39
