fix: eliminate duplicate PR reviews with verdict lifecycle management #47
Merged
Nelson Spence (Fieldnote-Echo) merged 26 commits into main on Mar 11, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hardcoded homelab references (_HOMELAB_URL, _HOMELAB_HOST, _HOMELAB_PORT, _homelab_reachable, skip_no_homelab) with imports from tests.e2e_fixtures (LLM_BASE_URL, LLM_MODEL_ID, PROMPTS_DIR, llm_reachable, skip_no_llm) across all three e2e test files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract review JSON from reasoning_content when content is empty (fixes nvidia/nemotron-3-nano and other reasoning models)
- Stamp actual model ID on review output (LLMs hallucinate this field)
- Add GRIPPY_MAX_DIFF_CHARS env var for local models with smaller context
- Scale e2e test timeouts dynamically (600s local, 120s cloud)
- Add unit tests for reasoning fallback, model ID stamp
- Document nemotron-3-nano as validated local model
- Document GRIPPY_MAX_DIFF_CHARS in configuration and self-hosted guides
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
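The first three items in the commit above can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the project's actual code: the function names (`extract_review_json`, `stamp_model_id`, `truncate_diff`), the `model` key in the review payload, and the default character limit are all hypothetical. Only `reasoning_content`, `content`, and `GRIPPY_MAX_DIFF_CHARS` come from the commit message.

```python
import json
import os


def extract_review_json(message: dict) -> dict:
    """Parse review JSON from `content`, falling back to `reasoning_content`.

    Hypothetical helper: some reasoning models leave `content` empty and put
    the whole answer in `reasoning_content`.
    """
    text = (message.get("content") or "").strip()
    if not text:
        text = (message.get("reasoning_content") or "").strip()
    # Take the outermost {...} span so surrounding prose is ignored.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start : end + 1])


def stamp_model_id(review: dict, actual_model_id: str) -> dict:
    """Overwrite the model field (assumed key: "model") with the configured ID."""
    return {**review, "model": actual_model_id}


def truncate_diff(diff: str, default_max: int = 200_000) -> str:
    """Cap the diff length using GRIPPY_MAX_DIFF_CHARS (default is an assumption)."""
    max_chars = int(os.environ.get("GRIPPY_MAX_DIFF_CHARS", default_max))
    return diff[:max_chars]
```

Stamping the model ID after parsing means the value in the output always reflects the configured model, whatever the LLM wrote into that field.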
File-level ignores for test files with fake creds in triple-quoted diff strings (cannot take line pragmas). Line-level # nogrip pragmas for test_grippy_review.py where fake AWS keys are in regular Python. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The glob was accidentally dropped when trimming file-level exclusions. test_e2e_adversarial_inputs.py has yaml.load() in a triple-quoted diff string that triggered the dangerous-execution-sinks rule. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
✅ Grippy Review — PASS
Score: 98/100 | Findings: 1 | Delta: 1 new | Commit: df9192f
998 tests with coverage on free GitHub runners exceeds the 15-minute limit. All 163 new/modified tests pass — the cancellation was purely a timeout issue, not test failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ruff format step reformatted function call arguments, shifting the hex SHA strings to different lines than the allowlist pragmas. Moved pragmas to the actual hex-bearing lines (1588, 1599). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The test only patches _check_already_reviewed but not github.Github, so main() proceeds into the full pipeline and hangs trying to connect to the real GitHub API. This caused the 20-minute CI timeout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Even with github.Github mocked, main() continues into the pipeline and calls fetch_pr_diff() (requests.get) or connects to local LLM at localhost:1234, both of which hang in CI. Mocking fetch_pr_diff with a side_effect=Exception makes main() bail immediately after passing the workflow_dispatch guard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
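The mocking approach described in the two commits above can be sketched as follows. This is a self-contained toy, not the project's test: the `Pipeline` class stands in for the real module, and the shape of `main()` (including how `workflow_dispatch` skips the guard) is an assumption. Only the names `_check_already_reviewed`, `fetch_pr_diff`, and the `side_effect=Exception` technique come from the commit messages.

```python
from unittest import mock


class Pipeline:
    """Hypothetical stand-in for the module under test."""

    @staticmethod
    def _check_already_reviewed() -> bool:
        return False

    @staticmethod
    def fetch_pr_diff() -> str:
        # The real function would perform a network request and could hang in CI.
        raise RuntimeError("would hit the network")

    @classmethod
    def main(cls, event_name: str) -> str:
        # Assumed guard shape: explicit re-runs skip the dedup check entirely.
        if event_name != "workflow_dispatch" and cls._check_already_reviewed():
            return "skipped"
        try:
            diff = cls.fetch_pr_diff()
        except Exception:
            return "bailed"
        return f"reviewed {len(diff)} chars"


def run_dispatch_test() -> str:
    """Patch the guard and make fetch_pr_diff raise so main() stops early."""
    with mock.patch.object(
        Pipeline, "_check_already_reviewed", return_value=True
    ) as guard, mock.patch.object(
        Pipeline, "fetch_pr_diff", side_effect=Exception("stop here")
    ) as fetch:
        result = Pipeline.main("workflow_dispatch")
        guard.assert_not_called()   # the guard was bypassed
        fetch.assert_called_once()  # the pipeline was entered, then bailed
    return result
```

Raising from the first I/O call keeps the test focused on the guard and avoids any dependency on the GitHub API or a local LLM endpoint.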
998 tests with coverage complete in 19:56 on free runners, leaving only 4 seconds of headroom at 20 minutes. Bump to 25 for margin. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report ❌ Patch coverage is
Nelson Spence (Fieldnote-Echo)
left a comment
test
Summary
- workflow_dispatch bypasses the guard for explicit re-runs.
- Marker-based identity (<!-- grippy-verdict --> + <!-- grippy-meta -->) replaces login-based detection. Post-first/dismiss-after ordering ensures the PR always has at least one active verdict. exclude_review_id prevents the fresh verdict from being dismissed.
- -f → -F for JSON array parsing in fetch_thread_states(), unblocking stale thread resolution.
- .grippyignore + # nogrip pragmas for test false positives.
Design
See docs/plans/2026-03-10-review-dedup-design.md for the full design doc (marker-based identity, mode-aware dismissal, completeness check, accepted trade-offs).
Test plan
🤖 Generated with Claude Code
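The post-first/dismiss-after ordering from the Summary can be sketched in memory as below. This is an illustrative model only: the `Review` dataclass and both function names are hypothetical, and a real implementation would call the GitHub API instead of mutating a list. The marker string and the `exclude_review_id` parameter name are taken from the PR description.

```python
from dataclasses import dataclass

VERDICT_MARKER = "<!-- grippy-verdict -->"


@dataclass
class Review:
    """Hypothetical in-memory stand-in for a PR review."""
    id: int
    body: str
    dismissed: bool = False


def dismiss_prior_verdicts(reviews: list[Review], exclude_review_id: int) -> list[int]:
    """Dismiss every marker-bearing review except the one just posted."""
    dismissed_ids = []
    for review in reviews:
        if review.id == exclude_review_id or review.dismissed:
            continue
        # Identity comes from the marker in the body, not from the author login.
        if VERDICT_MARKER in review.body:
            review.dismissed = True
            dismissed_ids.append(review.id)
    return dismissed_ids


def post_then_dismiss(reviews: list[Review], new_body: str) -> Review:
    """Post the fresh verdict first, then dismiss the older ones."""
    next_id = max((r.id for r in reviews), default=0) + 1
    fresh = Review(id=next_id, body=f"{VERDICT_MARKER}\n{new_body}")
    reviews.append(fresh)
    dismiss_prior_verdicts(reviews, exclude_review_id=fresh.id)
    return fresh
```

Because the new verdict is appended before any dismissal runs, a failure partway through leaves extra verdicts on the PR rather than none.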