fix(adapters): drop duplicate webhook deliveries at ingest #1987
truffle-dev wants to merge 2 commits into
…1951) A duplicate delivery of the same comment passed every guard and queued a byte-identical second workflow run behind the first: the lock manager orders per-conversation messages but never dedups them. Dual repo+App webhook subscriptions (different delivery GUIDs for one comment), LB double-forwards, and redeliveries all hit this. Adds a bounded TTL first-seen cache in core and gates GitHub webhook processing on a logical idempotency key (comment id + updated_at, so edited comments still re-trigger), falling back to the X-GitHub-Delivery GUID when the payload lacks comment identity. Fails open when neither is available.
📝 Walkthrough
Adds a `DeliveryDeduplicator` utility, exports it from core, extends the GitHub webhook event type with comment `id`/`updated_at`, integrates deduplication into `GitHubAdapter` (`handleWebhook` now accepts `deliveryId` and drops repeats), passes `x-github-delivery` from the server, and adds tests covering dedup scenarios.
Changes
Webhook Delivery Deduplication
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/adapters/src/forge/github/adapter.test.ts`:
- Around line 470-527: The tests currently swallow errors from unmocked Octokit
calls in deliver(), making dedup tests non-deterministic; update
createDedupAdapter() to attach a mocked octokit client on the GitHubAdapter
instance that stubs the minimal downstream API methods used by handleWebhook
(e.g., the issues/comments and any repo/pull methods your webhook flow calls) so
calls resolve successfully, and then change deliver() to await
adapter.handleWebhook(...) without catching/ignoring errors so the test
deterministically fails on unexpected behavior; reference createDedupAdapter,
GitHubAdapter, and deliver when locating where to add the stubbed octokit
methods and remove the try/catch.
In `@packages/adapters/src/forge/github/adapter.ts`:
- Around line 1005-1010: The dedup key construction (dedupKey) currently treats
a comment identity as present when event.comment?.id exists even if
event.comment.updated_at is missing; change the condition that builds the
`comment:` key to require both `event.comment.id` and `event.comment.updated_at`
(non-null/undefined) before using them, otherwise fall back to the `deliveryId`
branch or undefined; update the ternary/condition around `dedupKey` (the
expression referencing `event.comment?.id` and `event.comment.updated_at`) so
edited-comment deduping only occurs when both values exist.
In `@packages/core/src/utils/delivery-dedup.test.ts`:
- Around line 3-79: Tests rely on real sleep/timers causing flakiness; switch to
Jest fake timers in the tests that use sleep so expiry is deterministic: in the
"key expires after TTL and may run again", "expired entries are pruned on
insert", and "re-seeing a key after expiry refreshes its eviction position"
tests, replace async await sleep(30) calls with jest.useFakeTimers() at test
start, call jest.advanceTimersByTime(30) (or
jest.runOnlyPendingTimers()/jest.runAllTimers() as needed) instead of awaiting
sleep, then call jest.useRealTimers() at the end; keep references to
DeliveryDeduplicator, seen, size, and remove reliance on the sleep helper (or
stub it to call advanceTimersByTime) so TTL behavior is driven by the fake clock
rather than wall-clock waits.
📒 Files selected for processing (8)
packages/adapters/src/forge/github/adapter.test.ts
packages/adapters/src/forge/github/adapter.ts
packages/adapters/src/forge/github/types.ts
packages/core/package.json
packages/core/src/index.ts
packages/core/src/utils/delivery-dedup.test.ts
packages/core/src/utils/delivery-dedup.ts
packages/server/src/index.ts
function createDedupAdapter(): GitHubAdapter {
  const adapter = new GitHubAdapter(
    { kind: 'pat', token: 'fake-token-for-testing' },
    'fake-webhook-secret',
    mockLockManager,
    'archon'
  );
  // @ts-expect-error - accessing private method for testing
  adapter.verifySignature = mock(() => true);
  return adapter;
}

/**
 * Comment payload carrying GitHub's comment identity (id + updated_at),
 * as real issue_comment deliveries do.
 */
function createIdentifiedCommentPayload(
  commentBody: string,
  commentId: number | undefined,
  updatedAt: string | undefined
): string {
  const comment: {
    id?: number;
    body: string;
    user: { login: string };
    updated_at?: string;
  } = { body: commentBody, user: { login: 'user123' } };
  if (commentId !== undefined) comment.id = commentId;
  if (updatedAt !== undefined) comment.updated_at = updatedAt;
  return JSON.stringify({
    action: 'created',
    issue: {
      number: 42,
      title: 'Test Issue',
      body: 'Description',
      user: { login: 'user123' },
      labels: [],
      state: 'open',
    },
    comment,
    repository: {
      owner: { login: 'testuser' },
      name: 'testrepo',
      full_name: 'testuser/testrepo',
      html_url: 'https://github.com/testuser/testrepo',
      default_branch: 'main',
    },
    sender: { login: 'user123' },
  });
}

async function deliver(adapter: GitHubAdapter, payload: string, deliveryId?: string) {
  try {
    await adapter.handleWebhook(payload, 'mock-signature', deliveryId);
  } catch {
    // Expected - Octokit API not mocked for the downstream message path.
  }
}
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Dedup tests depend on unmocked downstream Octokit failures.
Line 521-Line 526 swallows expected errors from an unmocked webhook path, which makes these tests depend on external call behavior instead of only dedup logic. Mock the minimal Octokit methods in createDedupAdapter() and make deliver() await success deterministically.
Suggested fix
 function createDedupAdapter(): GitHubAdapter {
   const adapter = new GitHubAdapter(
     { kind: 'pat', token: 'fake-token-for-testing' },
     'fake-webhook-secret',
     mockLockManager,
     'archon'
   );
   // @ts-expect-error - accessing private method for testing
   adapter.verifySignature = mock(() => true);
+  // @ts-expect-error - accessing private property for testing
+  adapter.octokit = {
+    rest: {
+      repos: {
+        get: mock(async () => ({ data: { default_branch: 'main' } })),
+      },
+      issues: {
+        listComments: mock(async () => ({ data: [] })),
+        createComment: mock(async () => ({ data: { id: 1 } })),
+      },
+    },
+  };
   return adapter;
 }

 async function deliver(adapter: GitHubAdapter, payload: string, deliveryId?: string) {
-  try {
-    await adapter.handleWebhook(payload, 'mock-signature', deliveryId);
-  } catch {
-    // Expected - Octokit API not mocked for the downstream message path.
-  }
+  await adapter.handleWebhook(payload, 'mock-signature', deliveryId);
 }

As per coding guidelines, “Keep tests deterministic — no flaky timing or network dependence without guardrails” and “mock external dependencies (database, AI SDKs, platform APIs).”
Source: Coding guidelines
…stic TTL tests
Keying on comment id alone (`updated_at` absent) would dedup an edit against the original within the TTL window, so require both fields and use the delivery-GUID fallback otherwise. Dedup TTL tests now use an injected clock instead of wall-clock sleeps.
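The key selection described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: `DedupInput`, `CommentIdentity`, and `buildDedupKey` are hypothetical names, and only the key formats (`comment:…` and `delivery:…`) come from the PR description.

```typescript
// Hypothetical shapes for illustration only; the real WebhookEvent type is larger.
interface CommentIdentity {
  id?: number;
  updated_at?: string;
}

interface DedupInput {
  owner: string;
  repo: string;
  issueNumber: number;
  comment?: CommentIdentity;
  deliveryId?: string;
}

/**
 * Returns an idempotency key for a delivery, or undefined when none can be
 * derived. On undefined the caller should process the webhook (fail open).
 */
function buildDedupKey(input: DedupInput): string | undefined {
  const { owner, repo, issueNumber, comment, deliveryId } = input;

  // Require BOTH fields: keying on id alone would collapse an edited comment
  // into the original within the TTL window.
  if (comment?.id != null && comment.updated_at != null) {
    return `comment:${owner}/${repo}#${issueNumber}:${comment.id}:${comment.updated_at}`;
  }

  // No usable comment identity: fall back to the X-GitHub-Delivery GUID.
  if (deliveryId) {
    return `delivery:${deliveryId}`;
  }

  // Neither available: no key, so nothing is ever dropped for want of one.
  return undefined;
}
```

Under these assumptions, two deliveries of the same comment with different GUIDs map to the same `comment:` key, while an edit changes `updated_at` and therefore the key.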
Summary
Fixes #1951. @sbiitmc's diagnosis is exactly right: a duplicate delivery of the same comment passes every guard in `handleWebhook`, reaches `ConversationLockManager.acquireLock`, and gets queued behind the in-flight run as `queued-conversation`. The lock manager orders per-conversation messages but never dedups them, so when run #1 completes, `.finally → processQueue` replays the byte-identical message as a full second workflow run (full token cost, can push/open a second PR).

The incident's duplicate source, dual repo-webhook + App-webhook subscriptions, delivers the same comment under different delivery GUIDs, so deduping on `X-GitHub-Delivery` alone would miss it. The trigger path always carries a comment (close events return early in `parseEvent`), so the fix keys on the comment's identity instead: `id` + `updated_at`. Dual-subscription duplicates and LB double-forwards share that identity and get dropped; an edited comment gets a new `updated_at` and still re-triggers.

Changes
- `DeliveryDeduplicator` in `packages/core/src/utils/delivery-dedup.ts` (exported from core, reusable by the gitea/gitlab adapters later): a bounded first-seen cache with a 10-minute TTL so deliberate manual redeliveries hours later still run, and a 10k-entry cap with oldest-first eviction so memory stays bounded.
- The GitHub adapter checks `comment:{owner}/{repo}#{number}:{comment.id}:{comment.updated_at}` after the @mention check and before the expensive path (user resolution, conversation/codebase creation, clone/sync, comment-history fetch). Drops log as `github.duplicate_delivery_dropped` at info.
- Falls back to `delivery:{deliveryId}` when the payload lacks comment identity, and fails open when neither is available: it never drops a webhook for want of a key. The dedup check sits after signature verification so unauthenticated junk can't poison the cache.
- `X-GitHub-Delivery` was already extracted in the webhook route but only used in error logs; it's now threaded to `handleWebhook` as an optional third parameter (per the issue's scope note). Gitea/gitlab adapters untouched.
- `WebhookEvent.comment` type gains optional `id`/`updated_at` (both present on real GitHub deliveries).

Tests
- `delivery-dedup.test.ts` (new, wired into core's test chain): first-seen/repeat, key independence, edit-forms-new-key, TTL expiry, prune-on-insert, max-size eviction, post-expiry refresh. 8 cases.
- `adapter.test.ts`, new `webhook delivery dedup` block: same-GUID repeat dropped, dual-subscription duplicate (same comment, different GUIDs) dropped (the #1951 incident shape), edited comment re-processed, distinct comments independent, GUID fallback, fail-open with no key. 6 cases asserting on conversation-creation call counts.
- … (`idempoten|dedup|deliveryId|x-github-delivery`), and no test pinned the duplicate-processing behavior.

Validation
`bun run lint --max-warnings 0` and `format:check` pass repo-wide. Full adapters suite green (70-test github adapter lane + all chained lanes). Core chain green through the utils lanes including the new one (the long-tail orchestrator/credentials lanes are untouched by this diff and left to CI). `tsc --noEmit` clean on core/adapters/server apart from pre-existing `packages/providers` copilot-client drift unrelated to this change.

Closes #1951
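For readers who want a concrete picture of the bounded first-seen cache described under Changes, here is a minimal sketch. It is not the PR's `DeliveryDeduplicator`: the class name `FirstSeenCache`, its constructor parameters, and the injected `now` clock are assumptions chosen for illustration. Only the behavior (TTL, entry cap, oldest-first eviction, prune on insert, refresh after expiry) follows the description.

```typescript
type Clock = () => number;

// Hypothetical sketch of a bounded TTL first-seen cache.
class FirstSeenCache {
  // Map iterates in insertion order, so the first key is always the oldest.
  private readonly entries = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: Clock;

  constructor(ttlMs = 10 * 60 * 1000, maxEntries = 10_000, now: Clock = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  /** True if the key was seen within the TTL (a duplicate); otherwise records it. */
  seen(key: string): boolean {
    const t = this.now();
    const firstSeen = this.entries.get(key);
    if (firstSeen !== undefined && t - firstSeen < this.ttlMs) return true;

    this.pruneExpired(t);
    // Delete before set so a re-seen expired key moves to the newest position.
    this.entries.delete(key);
    this.entries.set(key, t);

    // Enforce the cap by evicting from the oldest end.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
    return false;
  }

  get size(): number {
    return this.entries.size;
  }

  private pruneExpired(t: number): void {
    for (const [key, ts] of this.entries) {
      if (t - ts < this.ttlMs) break; // insertion-ordered: the rest are newer
      this.entries.delete(key);
    }
  }
}
```

Passing the clock in as a function lets a test advance time by assigning to a variable rather than sleeping, which is the same idea as the injected clock mentioned in the second commit. The early `break` in pruning assumes the clock never goes backwards.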