Skip to content

fix: eliminate duplicate PR reviews with verdict lifecycle management#47

Merged
Nelson Spence (Fieldnote-Echo) merged 26 commits intomainfrom
fix/review-dedup
Mar 11, 2026
Merged

fix: eliminate duplicate PR reviews with verdict lifecycle management#47
Nelson Spence (Fieldnote-Echo) merged 26 commits intomainfrom
fix/review-dedup

Conversation

@Fieldnote-Echo
Copy link
Copy Markdown
Member

Summary

  • Same-commit guard: Skip full pipeline if grippy already reviewed the current HEAD SHA (verdict + summary must both exist). workflow_dispatch bypasses the guard for explicit re-runs.
  • Verdict lifecycle: Marker-based identity (<!-- grippy-verdict --> + <!-- grippy-meta -->) replaces login-based detection. Post-first/dismiss-after ordering ensures the PR always has at least one active verdict. exclude_review_id prevents the fresh verdict from being dismissed.
  • Thread state fetch fix: -f-F for JSON array parsing in fetch_thread_states(), unblocking stale thread resolution.
  • E2e test hardening: Reasoning model support, tiered test suite (Tier 0-3), shared fixtures, .grippyignore + # nogrip pragmas for test false positives.

Design

See docs/plans/2026-03-10-review-dedup-design.md for the full design doc (marker-based identity, mode-aware dismissal, completeness check, accepted trade-offs).

Test plan

  • 29 new unit tests covering all three fixes
  • 998 total tests pass (0 failures)
  • Ruff lint + format clean
  • MyPy strict passes
  • CI green on this PR
  • Verify grippy self-review posts exactly one verdict (no duplicates)

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hardcoded homelab references (_HOMELAB_URL, _HOMELAB_HOST,
_HOMELAB_PORT, _homelab_reachable, skip_no_homelab) with imports
from tests.e2e_fixtures (LLM_BASE_URL, LLM_MODEL_ID, PROMPTS_DIR,
llm_reachable, skip_no_llm) across all three e2e test files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract review JSON from reasoning_content when content is empty
  (fixes nvidia/nemotron-3-nano and other reasoning models)
- Stamp actual model ID on review output (LLMs hallucinate this field)
- Add GRIPPY_MAX_DIFF_CHARS env var for local models with smaller context
- Scale e2e test timeouts dynamically (600s local, 120s cloud)
- Add unit tests for reasoning fallback, model ID stamp
- Document nemotron-3-nano as validated local model
- Document GRIPPY_MAX_DIFF_CHARS in configuration and self-hosted guides

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
File-level ignores for test files with fake creds in triple-quoted
diff strings (cannot take line pragmas). Line-level # nogrip pragmas
for test_grippy_review.py where fake AWS keys are in regular Python.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The glob was accidentally dropped when trimming file-level exclusions.
test_e2e_adversarial_inputs.py has yaml.load() in a triple-quoted diff
string that triggered the dangerous-execution-sinks rule.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions[bot]
github-actions bot previously approved these changes Mar 10, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy approves — PASS (93/100)

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 10, 2026

✅ Grippy Review — PASS

Score: 98/100 | Findings: 1

Delta: 1 new


Commit: df9192f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
998 tests with coverage on free GitHub runners exceeds the 15-minute
limit. All 163 new/modified tests pass — the cancellation was purely
a timeout issue, not test failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions[bot]
github-actions bot previously approved these changes Mar 10, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy approves — PASS (98/100)

The ruff format step reformatted function call arguments, shifting the
hex SHA strings to different lines than the allowlist pragmas. Moved
pragmas to the actual hex-bearing lines (1588, 1599).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions[bot]
github-actions bot previously approved these changes Mar 10, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy approves — PASS (98/100)

The test only patches _check_already_reviewed but not github.Github,
so main() proceeds into the full pipeline and hangs trying to connect
to the real GitHub API. This caused the 20-minute CI timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions[bot]
github-actions bot previously approved these changes Mar 10, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy approves — PASS (96/100)

Even with github.Github mocked, main() continues into the pipeline
and calls fetch_pr_diff() (requests.get) or connects to local LLM
at localhost:1234, both of which hang in CI. Mocking fetch_pr_diff
with a side_effect=Exception makes main() bail immediately after
passing the workflow_dispatch guard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions[bot]
github-actions bot previously approved these changes Mar 10, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy approves — PASS (100/100)

998 tests with coverage complete in 19:56 on free runners, leaving
only 4 seconds of headroom at 20 minutes. Bump to 25 for margin.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy approves — PASS (98/100)

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 90.21739% with 9 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/grippy/review.py 82.60% 8 Missing ⚠️
src/grippy/github_review.py 97.56% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test

@Fieldnote-Echo Nelson Spence (Fieldnote-Echo) merged commit 20b02eb into main Mar 11, 2026
14 of 15 checks passed
@Fieldnote-Echo Nelson Spence (Fieldnote-Echo) deleted the fix/review-dedup branch March 11, 2026 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant