Fix processingTime semantics, schema caching, and restarts #83
Note: Reviews paused — it looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be changed in the CodeRabbit configuration.
📝 Walkthrough — Bumps versions to 3.0.5; clarifies that the persisted processingTime covers server-side processing only (excluding responseDelayMs); caches compiled schema validators; and hardens restart/teardown behavior.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant LoggerMiddleware
    participant WebhookProcessor
    participant DuckDB
    participant SyncService
    rect rgba(200,220,255,0.5)
        Client->>LoggerMiddleware: POST /webhook (body, schema)
        LoggerMiddleware->>LoggerMiddleware: transform request & get validator (cache)
        LoggerMiddleware->>LoggerMiddleware: measure processingTime (before responseDelay)
        LoggerMiddleware->>WebhookProcessor: persist log (pushData)
        LoggerMiddleware->>Client: apply responseDelayMs (simulate) -> respond
    end
    rect rgba(200,255,200,0.5)
        SyncService->>DuckDB: scheduled sync: read persisted logs
        SyncService->>DuckDB: batch insert / mark synced
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Pull request overview
This PR refines latency/processing metrics and lifecycle behavior across the webhook ingestion pipeline, while aligning schemas/docs and adding coverage to prevent regressions in repeated start/stop flows.
Changes:
- Record `processingTime` before any `responseDelayMs` and update schemas/docs/tests accordingly.
- Cache compiled Ajv JSON Schema validators to avoid repeated compilation on stable configurations (both middleware changes are sketched just after this list).
- Improve restart/teardown stability by resetting singleton state (webhook manager, sync limiter, DuckDB queue) and hardening test harness cleanup.
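For orientation, here is a compressed sketch of the two middleware-side changes referenced above. All helper names (`getValidator`, `persistLog`, `handleWebhook`) are illustrative stand-ins, not the actual identifiers in src/logger_middleware.js:

```js
import Ajv from "ajv";

const ajv = new Ajv();
const validatorCache = new Map(); // keyed by the JSON-stringified schema ("stable configuration")

function getValidator(jsonSchema) {
    const cacheKey = JSON.stringify(jsonSchema);
    let validate = validatorCache.get(cacheKey);
    if (!validate) {
        validate = ajv.compile(jsonSchema); // compiled once, reused on later requests
        validatorCache.set(cacheKey, validate);
    }
    return validate;
}

// Stand-in for the real persistence path (Actor.pushData + DuckDB sync).
async function persistLog(entry) {
    console.log("persist", entry);
}

async function handleWebhook(req, res, webhookConfig) {
    const startedAt = Date.now();

    const validate = getValidator(webhookConfig.jsonSchema);
    const isValid = validate(req.body);

    // processingTime is captured BEFORE the simulated response delay,
    // so it reflects server-side work only.
    const processingTime = Date.now() - startedAt;
    await persistLog({ webhookId: req.params.id, isValid, processingTime });

    if (webhookConfig.responseDelayMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, webhookConfig.responseDelayMs));
    }

    res.status(isValid ? 200 : 400).json({ ok: isValid, processingTime });
}
```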
Reviewed changes
Copilot reviewed 40 out of 42 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/webhook_manager.test.js | Adds coverage for resetStateForTest() and its environment guard. |
| tests/unit/utils/custom_script_executor_mocked.test.js | Expands worker lifecycle tests (unref, listener cleanup, late events). |
| tests/unit/utils/custom_script_executor.test.js | Increases timeouts for coverage runs and refactors long-running tests. |
| tests/unit/services/sync_service.test.js | Adds restart/idempotency and stop-race coverage for sync scheduling/limiter cleanup. |
| tests/unit/security/logger_middleware_hardening.test.js | Raises Jest timeout for hardening suite stability. |
| tests/unit/routes/info.test.js | Updates expected version string. |
| tests/unit/main.test.js | Adds tests for disabling Crawlee polling, shutdown DB close, Actor.exit in tests, and webhook state reset. |
| tests/unit/logger_middleware.test.js | Adds validator cache tests, processingTime semantics tests, and more robust custom-script mocking. |
| tests/unit/db/duckdb.test.js | Adds shutdown/reset assertions for queue disposal and instance close behavior. |
| tests/unit/actor/input_schema.test.js | Verifies processingTime documentation alignment across Actor/OpenAPI schemas. |
| tests/setup/helpers/shared-mocks.js | Extends shared mocks (duckDb.closeDb, webhookManager.resetStateForTest). |
| tests/setup/helpers/e2e-process-harness.js | Improves child shutdown handling (close wait, stream cleanup, timeout unref). |
| tests/setup/helpers/app-utils.js | Hardens integration harness isolation (unique module import, env/listener cleanup, storage dir). |
| tests/setup/helpers/apify-mock.js | Extends Apify mock typing for config and formats pushData mock. |
| tests/integration/security.sanitation.integration.test.js | Adds log persistence assertion for invalid JSON case with polling. |
| tests/integration/logging.query.integration.test.js | Adds integration checks for persisted processingTime samples and responseDelay exclusion. |
| tests/integration/app.lifecycle.integration.test.js | Adds repeated boot/teardown cycles regression test and increases timeouts. |
| tests/e2e/webhook_lifecycle.test.js | Adds black-box tests for processingTime vs delay and warm-path latency SLOs. |
| tests/e2e/resilience.test.js | Adjusts stress parameters for coverage/CI stability and introduces grace window. |
| src/webhook_manager.js | Adds resetStateForTest() to clear singleton state for tests. |
| src/utils/custom_script_executor.js | Improves worker lifecycle: unref, listener cleanup, terminate de-dupe, late-event ignoring. |
| src/typedefs.js | Documents processingTime semantics as server-side time excluding responseDelay. |
| src/services/SyncService.js | Makes limiter lifecycle restart-safe and adds stop-race checks during sync. |
| src/repositories/LogRepository.js | Fixes processingTime persistence to preserve 0 values via nullish coalescing (see the sketch below this table). |
| src/main.js | Calls closeDb() on shutdown, disables Crawlee polling in tests, uses Actor.exit({exit:false}) in tests, resets webhook state in test reset. |
| src/logger_middleware.js | Caches compiled validators; measures processingTime before response delay; ensures pushData timeout cleanup. |
| src/db/duckdb.js | Adds write-queue reset/disconnect and closes instance on reset/close. |
| src/consts/messages.js | Adds shutdown log message for DB close failure. |
| src/consts/errors.js | Adds error message for disallowed webhook state reset outside tests. |
| package.json | Bumps version to 3.0.5, adjusts coverage scripts, adds cross-env, updates stress suite targets. |
| package-lock.json | Locks new version and cross-env dependency. |
| docs/playbooks/canary-shadow-rollout.md | Updates guidance to use processingTime semantics (server-side only). |
| docs/marketing/apify-complete-masterclass.md | Updates wording to “server processing time” and clarifies processingTime meaning. |
| docs/architecture.md | Documents processingTime measurement point and schema validator caching behavior. |
| docs/api-reference.md | Updates version, clarifies processingTime vs responseDelayMs in API docs and filters. |
| README.md | Adds sponsor section, updates version and processingTime wording, clarifies responseDelay behavior. |
| CHANGELOG.md | Adds 3.0.5 entry describing metric semantics, caching, lifecycle fixes, and tests. |
| .gitignore | Ignores *.tmp* artifacts. |
| .actor/web_server_schema.json | Bumps version and documents processingTime in OpenAPI schema. |
| .actor/output_schema.json | Adds processingTime label/format to output schema view. |
| .actor/dataset_schema.json | Renames “Latency” to “Processing Time” and clarifies delay exclusion. |
| .actor/actor.json | Bumps Actor version to 3.0.5. |
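As called out in the src/repositories/LogRepository.js row above, the persistence fix hinges on `??` versus `||`. A tiny illustration (field name assumed):

```js
const log = { processingTime: 0 }; // a legitimate 0 ms measurement

// Logical OR treats 0 as falsy and silently replaces the real value.
const lossy = log.processingTime || null;      // -> null

// Nullish coalescing only falls back when the value is null/undefined.
const preserved = log.processingTime ?? null;  // -> 0
```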
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
tests/e2e/resilience.test.js (1)
113-178: ⚠️ Potential issue | 🟡 Minor

Measure completion time, not just request dispatch.

`dispatchElapsedMs` is captured before `Promise.all(ingestRequests)`, so the window assertion can pass even if the webhook requests finish much later. Move the elapsed measurement after the await so the test validates the "process within window" claim.

🧪 Proposed test fix

```diff
-    it("should process 1000 webhook requests within the target burst window with CI-safe grace, show rate limiting, and keep logs stable", async () => {
+    it("should process the configured webhook request burst within the target window with CI-safe grace, show rate limiting, and keep logs stable", async () => {
@@
-        const dispatchElapsedMs = Date.now() - startedAt;
         const ingestStatuses = await Promise.all(ingestRequests);
+        const processingElapsedMs = Date.now() - startedAt;
@@
-        expect(dispatchElapsedMs).toBeLessThanOrEqual(
+        expect(processingElapsedMs).toBeLessThanOrEqual(
             STRESS_WINDOW_MS + STRESS_WINDOW_GRACE_MS,
         );
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/resilience.test.js` around lines 113 - 178: The test measures dispatchElapsedMs before awaiting the webhook requests, which only times request creation, not completion; move the timestamp capture (the startedAt/dispatchElapsedMs calculation) so that dispatchElapsedMs is computed after awaiting Promise.all(ingestRequests) (or capture the end time into a new variable after Promise.all) and then assert that dispatchElapsedMs (or the completion elapsed) is <= STRESS_WINDOW_MS + STRESS_WINDOW_GRACE_MS; update references to startedAt/dispatchElapsedMs in the test around ingestRequests, Promise.all(ingestRequests), and the final expect that uses STRESS_WINDOW_MS and STRESS_WINDOW_GRACE_MS.

tests/unit/utils/custom_script_executor_mocked.test.js (1)
74-77: ⚠️ Potential issue | 🟡 Minor

Use the shared mock cleanup helper for module mocks.

`jest.resetModules()` alone does not cover the project's mock isolation convention for tests that call `jest.unstable_mockModule`. As per coding guidelines, `tests/**/*.test.js`: "Use useMockCleanup() for proper mock isolation in tests".
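For reference, the convention reads roughly like this — the import path and the helper's exact behavior are assumptions based on the guideline quoted above, not verified against the repository:

```js
import { useMockCleanup } from "../../setup/helpers/shared-mocks.js"; // assumed path

describe("Custom Script Executor worker failure paths", () => {
    // Project helper (behavior assumed): registers before/after hooks that reset
    // module registries and restore mocks between tests, including modules
    // mocked via jest.unstable_mockModule.
    useMockCleanup();

    // ...individual failure-path tests...
});
```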
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/utils/custom_script_executor_mocked.test.js` around lines 74 - 77: The beforeEach currently calls jest.resetModules(), which doesn't fully isolate mocks created with jest.unstable_mockModule; replace that call with the shared mock cleanup helper useMockCleanup() so the test suite follows the project's mock isolation convention — update the beforeEach in "Custom Script Executor worker failure paths" to call useMockCleanup() (import/use the helper) instead of jest.resetModules() and ensure any unstable_mockModule usage is covered by the helper.

tests/integration/security.sanitation.integration.test.js (1)
6-19: ⚠️ Potential issue | 🟡 Minor

Load `setupCommonMocks` before importing source modules.

This test imports `../../src/...` modules before the common mock setup. Add the common setup side-effect import first so source modules initialize under the expected test mocks. As per coding guidelines, `tests/**/*.test.js`: "Import setupCommonMocks BEFORE source code in tests".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/security.sanitation.integration.test.js` around lines 6 - 19: Move the test-side-effect import for setupCommonMocks to the very top of the file so it runs before any src module imports; specifically, add an import of setupCommonMocks (the common test-mock initializer used across tests) before the existing imports of APP_ROUTES, HTTP_HEADERS/HTTP_METHODS/HTTP_STATUS/MIME_TYPES, startIntegrationApp, createMalformedPayloadFixtures/createWebhookPayload, waitForCondition, and AUTH_CONSTS so source modules initialize under the expected mocks.

src/logger_middleware.js (1)
917-933: ⚠️ Potential issue | 🟡 Minor

Guard `timeoutHandle` before calling `.unref()` to match the pattern used in `pushDataTimeout`.

Line 933 uses a property check without optional chaining, while line 1318 correctly uses optional chaining for the same operation. This inconsistency can cause a runtime error if `timeoutHandle` is `undefined`.

Proposed fix

```diff
-        if (timeoutHandle.unref) timeoutHandle.unref();
+        if (timeoutHandle?.unref) timeoutHandle.unref();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/logger_middleware.js` around lines 917 - 933, The call to timeoutHandle.unref() in the timeout cleanup block should guard against timeoutHandle being undefined to match the pattern used in pushDataTimeout; update the code around the controller.abort/LOG_MESSAGES.BACKGROUND_TIMEOUT block to use a safe check (e.g., timeoutHandle?.unref() or an explicit if (timeoutHandle) guard) so calling .unref() cannot throw when timeoutHandle is undefined, keeping the behavior consistent with existing pushDataTimeout handling.
🧹 Nitpick comments (7)
src/services/SyncService.js (1)
163-177: `#isRunning` gates across async boundaries are a meaningful hardening.

Checking `#isRunning` after each `await` (offset fetch, `Actor.openDataset`, `getInfo`, `getData`, `batchInsertLogs`) plus in the `catch` block prevents stale work from mutating metrics or looping via the recursive `#triggerSync()` once `stop()` has been called. The `catch` early-return also suppresses the re-throw during shutdown, which is what prevents the scheduler's stopped-job error from being logged as a real `SYNC_ERROR_GENERAL`.

One tiny nit: the comment at line 297 says "propagate to `#triggerSync()`'s catch handler (line 156)" — after the edits the catch handler is around lines 169-176. Worth updating the reference so future readers aren't misled.

Also applies to: 209-299
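A condensed sketch of the guard pattern being praised — the method names and guarded steps follow the comment, while the dataset handling and the `batchInsertLogs` stub are assumptions:

```js
import { Actor } from "apify";

// Stand-in for the real repository call.
async function batchInsertLogs(items) {
    return items.length;
}

class SyncService {
    #isRunning = false;
    #offset = 0;

    start() {
        this.#isRunning = true;
    }

    stop() {
        this.#isRunning = false;
    }

    async runOnce() {
        await this.#syncLogs();
    }

    async #syncLogs() {
        try {
            if (!this.#isRunning) return;

            const dataset = await Actor.openDataset();
            if (!this.#isRunning) return; // stop() may have been called while awaiting

            const { items } = await dataset.getData({ offset: this.#offset });
            if (!this.#isRunning) return;

            await batchInsertLogs(items);
            this.#offset += items.length;
        } catch (err) {
            // Shutdown-induced failures (e.g. the scheduler's stopped-job error)
            // are swallowed here instead of surfacing as SYNC_ERROR_GENERAL.
            if (!this.#isRunning) return;
            throw err;
        }
    }
}
```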
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/SyncService.js` around lines 163 - 177: The sync routine needs explicit `#isRunning` guards across async boundaries to avoid doing work after stop(): inside `#syncLogs` add checks for this.#isRunning (early-return) after each awaitable step (offset fetch, Actor.openDataset, getInfo, getData, batchInsertLogs) and in the catch block ensure you return early to suppress rethrows during shutdown; keep the limiter catch that ignores ERROR_MESSAGES.BOTTLENECK_STOPPED and logs LOG_MESSAGES.SYNC_SCHEDULE_ERROR for other errors. Also update the inline comment that points to the catch handler so it references the `#triggerSync()` catch handler (the catch inside `#triggerSync`) rather than the outdated line numbers. Ensure all references mention the methods `#syncLogs`, `#triggerSync`, and stop().

src/db/duckdb.js (1)
378-393: Consider stopping the write queue before closing pooled/in-use connections.

`closeDb()` currently (a) closes every pooled and in-use connection, (b) nulls the instance, and only then (c) awaits `resetWriteQueue()`. `stop({ dropWaitingJobs: true })` drops waiting jobs but lets running ones finish — any write/transaction already executing via the write queue will now run against a `closeSync()`'d connection and reject with a driver error, which the transaction handler then surfaces as `TRANSACTION_ROLLBACK_FAILED` warnings in shutdown logs.

Reversing the order stops new work from starting, lets in-flight work finish cleanly, and then frees connections:

♻️ Proposed reordering

```diff
 export async function closeDb() {
-    // Drain connection pool
+    // 1. Stop the write queue first so in-flight writes/transactions complete
+    //    against live connections before we tear the pool down.
+    await resetWriteQueue();
+
+    // 2. Drain connection pool
     while (connectionPool.length > 0) {
         closeConnectionQuietly(connectionPool.pop());
     }
     while (inUseConnections.length > 0) {
         closeConnectionQuietly(inUseConnections.pop());
     }
+
+    // 3. Close the cached instance.
     const instanceToClose = dbInstance;
     dbInstance = null;
     initPromise = null;
     closeInstanceQuietly(instanceToClose);
-
-    await resetWriteQueue();
 }
```

Note: read-path callers (`executeQuery`) are unaffected by this ordering since they don't go through the queue; they should be quiesced by the caller (e.g. HTTP server shutdown) before `closeDb()` is invoked.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/db/duckdb.js` around lines 378 - 393: The shutdown order in closeDb() is wrong: call and await the write-queue shutdown before closing pooled/in-use connections and the db instance to let in-flight writes finish; specifically, invoke the write-queue stop/reset (e.g., await resetWriteQueue() or writeQueue.stop({ dropWaitingJobs: true })) at the start of closeDb(), then proceed to drain connectionPool using closeConnectionQuietly, drain inUseConnections, null dbInstance/initPromise, and finally call closeInstanceQuietly(instanceToClose).

src/utils/custom_script_executor.js (1)
1-4: Add the missing `@module` tag to the file header.

Proposed header fix

```diff
 /**
  * @file src/utils/custom_script_executor.js
  * @description Validates and executes custom webhook scripts inside disposable worker isolates.
+ * @module utils/custom_script_executor
  */
```

As per coding guidelines, `src/**/*.js`: "Each file should have a header comment block with `@file`, `@description`, and `@module` tags".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/custom_script_executor.js` around lines 1 - 4: The header block in src/utils/custom_script_executor.js is missing the required `@module` tag; add a `@module` entry (for example "@module customScriptExecutor" or a name matching the file's exported functionality) to the top comment block alongside the existing `@file` and `@description` tags so the file header meets the project's header guideline.

tests/unit/utils/custom_script_executor_mocked.test.js (1)
365-398: Make this assertion prove termination is actually awaited.

`terminateResolved` is set as soon as `terminate()` is called, so this still passes if `executeCustomScript()` resolves before the termination promise settles. Use a deferred promise and assert the result remains pending until you resolve it.
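The deferred-promise shape the comment asks for, sketched in isolation — the worker harness, the `executeCustomScript` wiring, and the `failedResult` fixture are taken from the comment and assumed to exist:

```js
// Deferred promise: the test decides exactly when terminate() settles.
let resolveTerminate;
const terminatePromise = new Promise((resolve) => {
    resolveTerminate = resolve;
});
const terminateMock = jest.fn(() => terminatePromise);

// Kick off the call under test; the mock worker then emits its "failed" message.
const pendingResult = executeCustomScript(script, payload);
// ...harness emits the failure message here (omitted)...

// While terminate() has not settled, the executor must not have resolved yet.
const PENDING = Symbol("pending");
await expect(
    Promise.race([pendingResult, Promise.resolve(PENDING)]),
).resolves.toBe(PENDING);

// Only after the deferred resolves may the call under test finish.
resolveTerminate();
await expect(pendingResult).resolves.toEqual(failedResult);
expect(terminateMock).toHaveBeenCalledTimes(1);
```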
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/utils/custom_script_executor_mocked.test.js` around lines 365 - 398: The test incorrectly treats terminate as awaited because terminateResolved flips immediately; change the terminateMock implementation in this test to return a deferred promise (e.g., create a deferred with promise, resolve, reject functions), have terminateMock.mockImplementation(() => deferred.promise), then after emitting the failed message assert that pendingResult is still pending (for example attempt Promise.race with a short timeout or check that awaiting a microtask doesn't resolve), then resolve deferred.resolve() and finally await pendingResult and assert the resolved value equals failedResult and terminateMock was called once; reference terminateMock, terminateResolved (remove or replace), executeCustomScript, pendingResult, createMockWorkerHarness, and MockWorker to locate the test elements to change.

tests/integration/logging.query.integration.test.js (1)
351-390: Minor — `body` property in `.query()` is a no-op filter.

Supertest serializes `body: { id: payload.id }` as `body[id]=...` in the query string, but the `/logs` endpoint does not support filtering by nested request-body fields — only the `webhookId` filter actually narrows results. The assertion still works because `items.find((item) => item.webhookId === webhookId ...)` matches on webhookId alone, but the `body` argument is misleading for readers and can be dropped (or replaced with a real supported filter).

Proposed cleanup

```diff
-            .query({
-                webhookId,
-                limit: LOGS_QUERY_LIMIT,
-                body: { id: payload.id },
-            });
+            .query({ webhookId, limit: LOGS_QUERY_LIMIT });
```

Applies similarly to the second `.query(...)` block at Line 386–390.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/logging.query.integration.test.js` around lines 351 - 390: The .query(...) calls on activeContext.appClient when hitting APP_ROUTES.LOGS include a no-op nested body filter (body: { id: payload.id }) which only serializes to the query string and isn't supported by the /logs endpoint; remove the body property from both .query(...) invocations (the waitForCondition block and the finalLogsResponse call) or replace it with a supported top-level filter (e.g., use webhookId or another documented query param) so the query only contains valid filters and the intent is clear.

tests/unit/logger_middleware.test.js (2)
2646-2667: Restore the Ajv prototype spy in `finally`.

If an assertion fails before Line 2667, the `compile` spy can leak into later schema tests. Wrap the middleware calls/assertions in `try`/`finally` to preserve test isolation.

Suggested cleanup hardening

```diff
 const ajvSpy = jest
     // @ts-expect-error - access prototype for mocking
     .spyOn(AjvClass.prototype || AjvClass, "compile")
     .mockImplementationOnce(() => {
         throw new Error("Ajv fail");
     });
-await middleware(req, res, next);
-expect(res.status).toHaveBeenCalledWith(HTTP_STATUS.BAD_REQUEST);
-expect(res.json).toHaveBeenCalledWith(
-    expect.objectContaining({ error: "No errors", id: webhookId }),
-);
+try {
+    await middleware(req, res, next);
+    expect(res.status).toHaveBeenCalledWith(HTTP_STATUS.BAD_REQUEST);
+    expect(res.json).toHaveBeenCalledWith(
+        expect.objectContaining({ error: "No errors", id: webhookId }),
+    );
-jest.mocked(res.status).mockClear();
-jest.mocked(res.json).mockClear();
-jest.mocked(res.send).mockClear();
+    jest.mocked(res.status).mockClear();
+    jest.mocked(res.json).mockClear();
+    jest.mocked(res.send).mockClear();
-await middleware(req, res, next);
+    await middleware(req, res, next);
-expect(res.status).toHaveBeenCalledWith(HTTP_STATUS.OK);
-ajvSpy.mockRestore();
+    expect(res.status).toHaveBeenCalledWith(HTTP_STATUS.OK);
+} finally {
+    ajvSpy.mockRestore();
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/logger_middleware.test.js` around lines 2646 - 2667, The Ajv prototype spy (ajvSpy) can leak if assertions fail—wrap the sections that call middleware and assert responses in a try/finally so ajvSpy.mockRestore() always runs; specifically, around the two await middleware(req, res, next) + expect(...) blocks in this test, enclose them in try { ... } finally { ajvSpy.mockRestore(); } (referencing AjvClass, ajvSpy, middleware, res, next) to guarantee restoration of the compile spy and preserve test isolation.
366-430: Make these cache tests prove value-based reuse, not same-reference reuse.

Both tests return the exact same `schemaObj` reference from `getWebhookData`, so a reference-keyed cache could pass while failing the intended "stable configuration" behavior. Return a fresh equivalent schema per request to guard the JSON-stringified cache key contract.

Suggested adjustment

```diff
+const createWebhookData = () => ({
+    jsonSchema: JSON.parse(JSON.stringify(schemaObj)),
+    expiresAt: new Date(Date.now() + APP_CONSTS.MS_PER_HOUR).toISOString(),
+});
 const { middleware, req, res, next, webhookManager } =
     await createMiddlewareTestContext();
-jest.mocked(webhookManager.getWebhookData).mockReturnValue({
-    jsonSchema: schemaObj,
-    expiresAt: new Date(Date.now() + APP_CONSTS.MS_PER_HOUR).toISOString(),
-});
+jest.mocked(webhookManager.getWebhookData).mockImplementation(createWebhookData);
```

Apply the same pattern in the concurrent test so both requests receive equivalent but distinct schema objects.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/logger_middleware.test.js` around lines 366 - 430, The tests currently return the same schemaObj reference from webhookManager.getWebhookData so the cache might be passing by reference instead of value; change the mocks to return a fresh deep-cloned schema per call (e.g., replace mockReturnValue(...) with mockImplementation(() => ({ jsonSchema: JSON.parse(JSON.stringify(schemaObj)), expiresAt: new Date(...).toISOString() })) in both tests that call createMiddlewareTestContext() and mock webhookManager.getWebhookData so the cache must use the JSON/stringified value rather than same-object identity; keep references to schemaObj, webhookManager.getWebhookData, createMiddlewareTestContext, and middleware when making the change.
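In miniature, the value-vs-reference distinction this comment is guarding (purely illustrative):

```js
const cache = new Map();
const keyFor = (schema) => JSON.stringify(schema); // value-based key, as in the middleware cache

const schemaA = { type: "object", required: ["id"] };
const schemaB = { type: "object", required: ["id"] }; // equivalent value, different reference

cache.set(keyFor(schemaA), "compiled-validator");

console.log(schemaA === schemaB);        // false — a reference-keyed cache would miss here
console.log(cache.has(keyFor(schemaB))); // true  — the JSON-stringified key still hits
```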
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/marketing/apify-complete-masterclass.md`:
- Line 355: The markdown list item for processingTime is mis-indented and
appears as a top-level bullet; update the indentation so the line "*
processingTime (ms, server-side processing time only)" is nested to the same
level as the other event fields (match their leading spaces or tabs) so it
renders as part of the same bullet group/section and not a separate top-level
requirement.
In `@README.md`:
- Around line 29-33: The sponsor badge links for Patreon, Liberapay and
thanks.dev cause hard 403s in the Lychee link checker; either replace those
badge URLs with checker-friendly targets (e.g., a canonical public landing URL
or an /u/ or /r/ redirect that returns 200) for the Patreon, Liberapay and
thanks.dev badges (look for the lines containing "[ or add an explicit Lychee allow/exclude
rule for those domains in the link-check configuration so Lychee ignores or
accepts these domains; choose one approach and apply the same fix to the
duplicated support section where the same badges appear.
In `@src/main.js`:
- Around line 154-170: The test guard isTestSystemInfoPollingDisabled must be
reset during test lifecycle teardown so new TestSystemInfoEventManager instances
get stubbed; update resetShutdownForTest() to set
isTestSystemInfoPollingDisabled = false (the same guard used by
disableActorSystemInfoPollingForTests) so that when
Actor.config.getEventManager() is replaced between tests the next initialize()
will re-stub emitSystemInfoEvent; ensure you also reset any related flags
alongside the existing teardown logic in resetShutdownForTest().
In `@tests/e2e/webhook_lifecycle.test.js`:
- Around line 41-46: The test enforces fixed wall-clock SLOs via the
CLIENT_VISIBLE_LATENCY_SLO_TARGETS_MS constant which can cause flaky CI
failures; change the test to skip or relax these assertions by default and only
run strict checks when an opt-in perf flag or environment override is set (e.g.,
read a PERF_RUN or LATENCY_SLO_OVERRIDE env var) or move the assertions out of
the default E2E path into a dedicated performance job; update the code paths
that reference CLIENT_VISIBLE_LATENCY_SLO_TARGETS_MS so they use the
env-provided thresholds or return early when the perf flag is not enabled.
In `@tests/setup/helpers/app-utils.js`:
- Around line 43-79: The initialize() call can fail and leave mutated state
(process.env[DISABLE_HOT_RELOAD, NODE_ENV, APIFY_LOCAL_STORAGE_DIR]), created
apifyLocalStorageDir, added actorListeners and new process SIGINT/SIGTERM
listeners; extract a cleanup function that restores process.env entries, removes
any SIGINT/SIGTERM listeners added during setup (compare against
initialSigintListeners/initialSigtermListeners), restores Actor.on and clears
actorListeners, and deletes the apifyLocalStorageDir; call this cleanup both if
mainModule.initialize(options) throws and from teardownApp after
shutdown()/resetShutdownForTest() so state is always restored even on init
failure (refer to symbols: mainModule.initialize, Actor.on, actorListeners,
apifyLocalStorageDir, initialSigintListeners, initialSigtermListeners,
teardownApp, shutdown, resetShutdownForTest).
In `@tests/setup/helpers/e2e-process-harness.js`:
- Around line 59-70: waitForChildClose currently always awaits once(child,
STREAM_EVENTS.CLOSE) and can hang if the child already emitted close; update
waitForChildClose (the function) to first detect already-closed processes by
checking child.exitCode !== null OR verifying stdio streams are already closed
(child.stdio && child.stdio.every(s => s && s.closed)) and return immediately in
that case, otherwise register await once(child, STREAM_EVENTS.CLOSE) as before;
this removes the race where once() misses an already-emitted close event.
In `@tests/unit/webhook_manager.test.js`:
- Around line 125-132: The test is vulnerable because generateWebhooks() may
already call openKeyValueStore so the assertion after resetStateForTest()
doesn't prove reset reopened the KVS; clear or reset the
apifyMock.openKeyValueStore call history (e.g., mockClear/mockReset) after
generateWebhooks() or before calling manager.init(), then call manager.init()
and assert apifyMock.openKeyValueStore was called (or called once) to prove
resetStateForTest() cleared the persistence handle; reference
manager.generateWebhooks, manager.resetStateForTest, manager.init and
apifyMock.openKeyValueStore when making this change.
---
Outside diff comments:
In `@src/logger_middleware.js`:
- Around line 917-933: The call to timeoutHandle.unref() in the timeout cleanup
block should guard against timeoutHandle being undefined to match the pattern
used in pushDataTimeout; update the code around the
controller.abort/LOG_MESSAGES.BACKGROUND_TIMEOUT block to use a safe check
(e.g., timeoutHandle?.unref() or an explicit if (timeoutHandle) guard) so
calling .unref() cannot throw when timeoutHandle is undefined, keeping the
behavior consistent with existing pushDataTimeout handling.
In `@tests/e2e/resilience.test.js`:
- Around line 113-178: The test measures dispatchElapsedMs before awaiting the
webhook requests which only times request creation, not completion; move the
timestamp capture (the startedAt/dispatchElapsedMs calculation) so that
dispatchElapsedMs is computed after awaiting Promise.all(ingestRequests) (or
capture end time into a new variable after Promise.all) and then assert that
dispatchElapsedMs (or the completion elapsed) is <= STRESS_WINDOW_MS +
STRESS_WINDOW_GRACE_MS; update references to startedAt/dispatchElapsedMs in the
test around ingestRequests, Promise.all(ingestRequests), and the final expect
that uses STRESS_WINDOW_MS and STRESS_WINDOW_GRACE_MS.
In `@tests/integration/security.sanitation.integration.test.js`:
- Around line 6-19: Move the test-side-effect import for setupCommonMocks to the
very top of the file so it runs before any src module imports; specifically add
an import of setupCommonMocks (the common test-mock initializer used across
tests) before the existing imports of APP_ROUTES,
HTTP_HEADERS/HTTP_METHODS/HTTP_STATUS/MIME_TYPES, startIntegrationApp,
createMalformedPayloadFixtures/createWebhookPayload, waitForCondition, and
AUTH_CONSTS so source modules initialize under the expected mocks.
In `@tests/unit/utils/custom_script_executor_mocked.test.js`:
- Around line 74-77: The beforeEach currently calls jest.resetModules(), which
doesn't fully isolate mocks created with jest.unstable_mockModule; replace that
call with the shared mock cleanup helper useMockCleanup() so the test suite
follows the project's mock isolation convention—update the beforeEach in "Custom
Script Executor worker failure paths" to call useMockCleanup() (import/use the
helper) instead of jest.resetModules() and ensure any unstable_mockModule usage
is covered by the helper.
---
Nitpick comments:
In `@src/db/duckdb.js`:
- Around line 378-393: The shutdown order in closeDb() is wrong: call and await
the write-queue shutdown before closing pooled/in-use connections and the db
instance to let in-flight writes finish; specifically, invoke the write-queue
stop/reset (e.g., await resetWriteQueue() or writeQueue.stop({ dropWaitingJobs:
true })) at the start of closeDb(), then proceed to drain connectionPool using
closeConnectionQuietly, drain inUseConnections, null dbInstance/initPromise, and
finally call closeInstanceQuietly(instanceToClose).
In `@src/services/SyncService.js`:
- Around line 163-177: The sync routine needs explicit `#isRunning` guards across
async boundaries to avoid doing work after stop(): inside `#syncLogs` add checks
for this.#isRunning (early-return) after each awaitable step (offset fetch,
Actor.openDataset, getInfo, getData, batchInsertLogs) and in the catch block
ensure you return early to suppress rethrows during shutdown; keep the limiter
catch that ignores ERROR_MESSAGES.BOTTLENECK_STOPPED and logs
LOG_MESSAGES.SYNC_SCHEDULE_ERROR for other errors. Also update the inline
comment that points to the catch handler so it references the `#triggerSync`()
catch handler (the catch inside `#triggerSync`) rather than the outdated line
numbers. Ensure all references mention the methods `#syncLogs`, `#triggerSync`, and
stop().
In `@src/utils/custom_script_executor.js`:
- Around line 1-4: The header block in src/utils/custom_script_executor.js is
missing the required `@module` tag; add a `@module` entry (for example "@module
customScriptExecutor" or a name matching the file's exported functionality) to
the top comment block alongside the existing `@file` and `@description` tags so the
file header meets the project's header guideline.
In `@tests/integration/logging.query.integration.test.js`:
- Around line 351-390: The .query(...) calls on activeContext.appClient when
hitting APP_ROUTES.LOGS include a no-op nested body filter (body: { id:
payload.id }) which only serializes to query string and isn't supported by the
/logs endpoint; remove the body property from both .query(...) invocations (the
waitForCondition block and the finalLogsResponse call) or replace it with a
supported top-level filter (e.g., use webhookId or another documented query
param) so the query only contains valid filters and the intent is clear.
In `@tests/unit/logger_middleware.test.js`:
- Around line 2646-2667: The Ajv prototype spy (ajvSpy) can leak if assertions
fail—wrap the sections that call middleware and assert responses in a
try/finally so ajvSpy.mockRestore() always runs; specifically, around the two
await middleware(req, res, next) + expect(...) blocks in this test, enclose them
in try { ... } finally { ajvSpy.mockRestore(); } (referencing AjvClass, ajvSpy,
middleware, res, next) to guarantee restoration of the compile spy and preserve
test isolation.
- Around line 366-430: The tests currently return the same schemaObj reference
from webhookManager.getWebhookData so the cache might be passing by reference
instead of value; change the mocks to return a fresh deep-cloned schema per call
(e.g., replace mockReturnValue(...) with mockImplementation(() => ({ jsonSchema:
JSON.parse(JSON.stringify(schemaObj)), expiresAt: new Date(...).toISOString()
})) in both tests that call createMiddlewareTestContext() and mock
webhookManager.getWebhookData so the cache must use the JSON/stringified value
rather than same-object identity; keep references to schemaObj,
webhookManager.getWebhookData, createMiddlewareTestContext, and middleware when
making the change.
In `@tests/unit/utils/custom_script_executor_mocked.test.js`:
- Around line 365-398: The test incorrectly treats terminate as awaited because
terminateResolved flips immediately; change terminateMock implementation in this
test to return a deferred promise (e.g., create a deferred with promise,
resolve, reject functions), have terminateMock.mockImplementation(() =>
deferred.promise), then after emitting the failed message assert that
pendingResult is still pending (for example attempt Promise.race with a short
timeout or check that awaiting a microtask doesn't resolve), then resolve
deferred.resolve() and finally await pendingResult and assert the resolved value
equals failedResult and terminateMock was called once; reference terminateMock,
terminateResolved (remove or replace), executeCustomScript, pendingResult,
createMockWorkerHarness, and MockWorker to locate the test elements to change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8e9ec3f0-5c9e-4169-91dc-96dc61daf959
⛔ Files ignored due to path filters (1)
`package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (41)
.actor/actor.json, .actor/dataset_schema.json, .actor/output_schema.json, .actor/web_server_schema.json, .gitignore, CHANGELOG.md, README.md, docs/api-reference.md, docs/architecture.md, docs/marketing/apify-complete-masterclass.md, docs/playbooks/canary-shadow-rollout.md, package.json, src/consts/errors.js, src/consts/messages.js, src/db/duckdb.js, src/logger_middleware.js, src/main.js, src/repositories/LogRepository.js, src/services/SyncService.js, src/typedefs.js, src/utils/custom_script_executor.js, src/webhook_manager.js, tests/e2e/resilience.test.js, tests/e2e/webhook_lifecycle.test.js, tests/integration/app.lifecycle.integration.test.js, tests/integration/logging.query.integration.test.js, tests/integration/security.sanitation.integration.test.js, tests/setup/helpers/apify-mock.js, tests/setup/helpers/app-utils.js, tests/setup/helpers/e2e-process-harness.js, tests/setup/helpers/shared-mocks.js, tests/unit/actor/input_schema.test.js, tests/unit/db/duckdb.test.js, tests/unit/logger_middleware.test.js, tests/unit/main.test.js, tests/unit/routes/info.test.js, tests/unit/security/logger_middleware_hardening.test.js, tests/unit/services/sync_service.test.js, tests/unit/utils/custom_script_executor.test.js, tests/unit/utils/custom_script_executor_mocked.test.js, tests/unit/webhook_manager.test.js
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/link-check.yml (1)
138-156: ⚠️ Potential issue | 🟡 Minor

Avoid re-hard-failing soft-handled 408 timeouts.

Line 141 treats 408 as soft-handled for known soft URLs, but line 156 still hard-fails whenever `results.timeouts` is non-zero. If lychee increments that counter for all timeouts (including the soft-handled ones), the workflow will fail despite the soft-handling policy.

🔧 Proposed fix

```diff
-            if ((results.timeouts ?? 0) > 0 || (results.unknown ?? 0) > 0 || (results.unsupported ?? 0) > 0) {
+            const softTimeouts = softFailures.filter((failure) => failure.code === 408).length;
+            const unhandledTimeouts = Math.max((results.timeouts ?? 0) - softTimeouts, 0);
+
+            if (unhandledTimeouts > 0 || (results.unknown ?? 0) > 0 || (results.unsupported ?? 0) > 0) {
               hardFailures.push({
                 source: 'lychee-summary',
                 url: null,
                 code: null,
-                text: `Unhandled summary counters: timeouts=${results.timeouts ?? 0}, unknown=${results.unknown ?? 0}, unsupported=${results.unsupported ?? 0}`,
+                text: `Unhandled summary counters: timeouts=${unhandledTimeouts}, unknown=${results.unknown ?? 0}, unsupported=${results.unsupported ?? 0}`,
               });
             }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/link-check.yml around lines 138 - 156, The workflow is still hard-failing when results.timeouts > 0 even though some 408 timeouts were already soft-handled via softHandledUrlKeys and pushed into softFailures; update the final failure-check to exclude timeouts that were soft-handled before deciding to hard-fail. Concretely, compute the number of soft-handled timeouts (e.g., count entries in softFailures where code === 408 or timeout-related) and subtract that from results.timeouts (or check results.timeouts - softHandledTimeouts > 0) when evaluating the condition that currently references results.timeouts, so only remaining/hard timeouts trigger the hard-fail logic involving results.unknown/results.unsupported.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In @.github/workflows/link-check.yml:
- Around line 138-156: The workflow is still hard-failing when results.timeouts
> 0 even though some 408 timeouts were already soft-handled via
softHandledUrlKeys and pushed into softFailures; update the final failure-check
to exclude timeouts that were soft-handled before deciding to hard-fail.
Concretely, compute the number of soft-handled timeouts (e.g., count entries in
softFailures where code === 408 or timeout-related) and subtract that from
results.timeouts (or check results.timeouts - softHandledTimeouts > 0) when
evaluating the condition that currently references results.timeouts, so only
remaining/hard timeouts trigger the hard-fail logic involving
results.unknown/results.unsupported.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b8d0592e-aed8-44c0-b485-b59f687e62b4
📒 Files selected for processing (2)
.github/workflows/link-check.yml, .lycheeignore
✅ Files skipped from review due to trivial changes (1)
- .lycheeignore
Actionable comments posted: 6
🧹 Nitpick comments (2)
src/services/SyncService.js (1)
291-303: Silent error swallowing during shutdown may hide real failures.

When `#isRunning` flips to `false` mid-run, the catch block returns without logging. That's reasonable for expected shutdown-induced errors (cancelled dataset fetches, aborted batch inserts), but it also suppresses any genuine bug that happens to surface during the shutdown window — metrics won't be updated and operators won't see a trace. Consider logging at `debug` level (or tagging the log with a `duringShutdown: true` field) so the signal isn't lost entirely:

Proposed change

```diff
         } catch (err) {
             if (!this.#isRunning) {
+                log.debug(
+                    { err: serializeError(err) },
+                    LOG_MESSAGES.SYNC_ERROR_GENERAL,
+                );
                 return;
             }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/SyncService.js` around lines 291 - 303: The catch block in SyncService.js currently returns early when this.#isRunning is false and thus silently swallows errors; update that branch to still emit a log at debug (or include a field like duringShutdown: true) using the same logging facility (e.g., call log.debug or log.error with { err: serializeError(err), duringShutdown: true }) before returning so shutdown-related errors are recorded, and keep the existing behavior of not incrementing `#errorCount`/`#lastErrorTime` in that early-return path; ensure the overall catch still re-throws for non-shutdown paths so `#triggerSync()`'s catch handler still receives real errors.

src/db/duckdb.js (1)
76-84: Remove or use the unused `resetWriteQueue()` helper.

Static analysis flags this new helper as unused. Either delete it or refactor `resetDbInstance()`/`closeDb()` to call it where the ordering still matches the teardown requirements.

Proposed cleanup

```diff
-/**
- * Replaces the current write queue with a fresh instance after shutting the prior
- * queue down.
- * @returns {Promise<void>}
- */
-async function resetWriteQueue() {
-    await stopWriteQueue();
-    writeQueue = createWriteQueue();
-}
-
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/db/duckdb.js` around lines 76 - 84, The helper resetWriteQueue() is declared but unused; either remove it or call it from the existing teardown flows so ordering remains correct. Locate resetWriteQueue() and either delete the function entirely (and ensure stopWriteQueue() + createWriteQueue() are not needed elsewhere) or replace direct stopWriteQueue()/createWriteQueue() calls inside resetDbInstance() and/or closeDb() with a single await resetWriteQueue() so the write-queue shutdown and recreation logic is centralized and the teardown ordering is preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/db/duckdb.js`:
- Around line 177-188: resetDbInstance currently clears connectionPool and
inUseConnections without closing their handles; iterate over both arrays and
call closeInstanceQuietly(...) for each connection handle (e.g., for each item
in connectionPool and inUseConnections) to close them before setting
connectionPool.length = 0 and inUseConnections.length = 0, then proceed to null
out dbInstance/initPromise and finally call
closeInstanceQuietly(instanceToClose) and recreate writeQueue; reference the
resetDbInstance, connectionPool, inUseConnections, and closeInstanceQuietly
symbols when making this change.
In `@tests/integration/logging.query.integration.test.js`:
- Around line 321-350: The test expects the wrong HTTP status:
startIntegrationApp was configured with defaultResponseCode:
HTTP_STATUS.ACCEPTED, but the assertion checks HTTP_STATUS.CREATED; update the
assertion in the test (the expect on ingestResponse.status /
expect(...toBe(...)) after activeContext.appClient.post to assert
HTTP_STATUS.ACCEPTED (202) instead of HTTP_STATUS.CREATED (201) so the expected
status aligns with the configured defaultResponseCode used when creating the
webhook via resolveActiveWebhookId/createWebhookPayload.
In `@tests/setup/helpers/app-utils.js`:
- Around line 112-124: The dynamic import of main.js can throw and currently
isn't wrapped, leaking the temp dir, env changes, and patched Actor.on; wrap the
import in a try/catch (or add a surrounding try/finally) so that if await
import(mainModuleUrl.href) throws you call cleanup() before rethrowing, ensuring
actorListeners/Actor.on patches and testInstance cleanup always run; locate the
import of mainModuleUrl.href and the existing cleanup() call near initialize()
and ensure cleanup() is invoked on import errors as well.
In `@tests/setup/helpers/e2e-process-harness.js`:
- Around line 232-234: The eagerly-created childClosePromise (const
childClosePromise = once(child, STREAM_EVENTS.CLOSE)...) can reject if the child
emits 'error' before 'close', causing unhandled rejections when stop() takes the
early-return path (exitCode !== null); fix by either deferring creation of that
promise until it's actually awaited (move the once(...) call into the branch
that needs it) or attach a no-op catch to swallow rejections (e.g.,
childClosePromise.catch(() => {})) so spawn/runtime errors do not surface as
unhandled promise rejections; update references to childClosePromise and ensure
stop() and any code checking exitCode still await or safely ignore the promise.
- Around line 59-74: The JSDoc for function waitForChildClose is missing a
`@param` for closePromise which causes TS7006; update the comment block above
waitForChildClose to add a second `@param` describing closePromise (e.g., `@param`
{Promise<void>} closePromise) so TypeScript can type-check the parameter
correctly, leaving the existing `@param` for child unchanged and matching the
function signature.
In `@tests/unit/utils/custom_script_executor_mocked.test.js`:
- Around line 370-374: The Promise resolver has a broader signature than the
declared resolveTerminate (() => void); update the assignment to cast the
resolver to the exact no-arg type used in tests by using assertType<() =>
void>(resolve) when assigning to resolveTerminate (i.e., keep resolveTerminate
declared as let resolveTerminate: () => void; and change resolveTerminate =
assertType<() => void>(resolve);) so the types align with the project's pattern
used in similar test files.
---
Nitpick comments:
In `@src/db/duckdb.js`:
- Around line 76-84: The helper resetWriteQueue() is declared but unused; either
remove it or call it from the existing teardown flows so ordering remains
correct. Locate resetWriteQueue() and either delete the function entirely (and
ensure stopWriteQueue() + createWriteQueue() are not needed elsewhere) or
replace direct stopWriteQueue()/createWriteQueue() calls inside
resetDbInstance() and/or closeDb() with a single await resetWriteQueue() so the
write-queue shutdown and recreation logic is centralized and the teardown
ordering is preserved.
In `@src/services/SyncService.js`:
- Around line 291-303: The catch block in SyncService.js currently returns early
when this.#isRunning is false and thus silently swallows errors; update that
branch to still emit a log at debug (or include a field like duringShutdown:
true) using the same logging facility (e.g., call log.debug or log.error with {
err: serializeError(err), duringShutdown: true }) before returning so
shutdown-related errors are recorded, and keep the existing behavior of not
incrementing `#errorCount/`#lastErrorTime in that early-return path; ensure the
overall catch still re-throws for non-shutdown paths so `#triggerSync`()'s catch
handler still receives real errors.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3f7f0732-94e1-417d-88ed-5e300810b029
📒 Files selected for processing (16)
docs/marketing/apify-complete-masterclass.md, src/db/duckdb.js, src/logger_middleware.js, src/main.js, src/services/SyncService.js, src/utils/custom_script_executor.js, tests/e2e/resilience.test.js, tests/e2e/webhook_lifecycle.test.js, tests/integration/logging.query.integration.test.js, tests/integration/security.sanitation.integration.test.js, tests/setup/helpers/app-utils.js, tests/setup/helpers/e2e-process-harness.js, tests/unit/db/duckdb.test.js, tests/unit/logger_middleware.test.js, tests/unit/utils/custom_script_executor_mocked.test.js, tests/unit/webhook_manager.test.js
✅ Files skipped from review due to trivial changes (1)
- docs/marketing/apify-complete-masterclass.md
🚧 Files skipped from review as they are similar to previous changes (5)
- tests/unit/webhook_manager.test.js
- tests/e2e/webhook_lifecycle.test.js
- src/utils/custom_script_executor.js
- tests/unit/db/duckdb.test.js
- tests/integration/security.sanitation.integration.test.js
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/db/duckdb.js (1)
180-192: Refactor LGTM — addresses the prior pool-leak concern.
`resetDbInstance()` now stops the write queue (awaiting running jobs), drains both pool arrays while closing each handle, then tears down the singleton. This correctly resolves the earlier feedback about losing pooled/in-use connection handles.

One edge case to be aware of: read operations via `executeQuery()` bypass `writeQueue` and acquire connections directly (used by SyncService and LogRepository). An in-flight read during `resetDbInstance()` will have its connection in `inUseConnections` and can have its handle closed mid-query by `drainConnections()`. This surfaces as close-time errors during concurrent shutdown/reset scenarios. If visible in shutdown logs, consider tracking in-flight reads (e.g., a pending counter or AbortController) so reset can await them before draining.
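One shape the suggested in-flight-read tracking could take — a minimal sketch only; the pool helpers and the driver call shape are assumptions, not the module's actual API:

```js
// Minimal stand-ins for the real pool helpers (assumed names).
const idleConnections = [];
async function acquireConnection() {
    return idleConnections.pop() ?? { run: async () => [] };
}
function releaseConnection(connection) {
    idleConnections.push(connection);
}

let pendingReads = 0;

export async function executeQuery(sql, params = []) {
    const connection = await acquireConnection();
    pendingReads += 1;
    try {
        return await connection.run(sql, params); // assumed driver call shape
    } finally {
        pendingReads -= 1;
        releaseConnection(connection);
    }
}

async function waitForPendingReads(timeoutMs = 1000) {
    const deadline = Date.now() + timeoutMs;
    while (pendingReads > 0 && Date.now() < deadline) {
        await new Promise((resolve) => setTimeout(resolve, 10));
    }
}

export async function resetDbInstance() {
    // Let in-flight executeQuery() calls finish before their handles are closed.
    await waitForPendingReads();
    // ...then stop the write queue and drain/close the connection pool as before.
}
```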
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/db/duckdb.js` around lines 180 - 192: resetDbInstance() can close connection handles that executeQuery() readers are actively using because executeQuery() bypasses writeQueue and borrows from inUseConnections; update resetDbInstance() (and/or drainConnections()) to wait for or track in-flight read operations before closing handles by adding a pending-read counter or AbortController mechanism that executeQuery() increments/decrements (e.g., increment a pendingReads counter when executeQuery() acquires a connection and decrement when finished) and have resetDbInstance() await pendingReads === 0 (or signal/abort and await completion) before calling closeInstanceQuietly() / draining inUseConnections; ensure the new tracking is referenced where executeQuery() acquires/releases connections and observed by resetDbInstance()/drainConnections().

tests/unit/utils/custom_script_executor_mocked.test.js (1)
186-241: Consider splitting this test into focused cases.

The test asserts three distinct behaviors in one body: `unref()` is called once, listeners are cleared after resolution, and late direct invocations of captured `.listener` handlers are no-ops (terminate count stays at 1). When any one of these regresses, the failure message will be harder to localize. Splitting into three `it(...)` blocks (or at least grouping late-event replay into its own case) would make failures more diagnostic without losing coverage.

Also note that lines 203-208 rely on Node's `_onceWrap` internal exposing `.listener` on `rawListeners()[0]`. That's documented/stable, but worth being aware of since a change in Node's wrapper would silently make `lateMessageListener`/`lateErrorListener` undefined and the optional-chain `?.call` would no-op — the remaining assertion (`terminateMock` called once) would still pass, masking the regression. A `toBeInstanceOf(Function)` on the captured listeners (already partially covered by `toBeDefined`) is sufficient.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/utils/custom_script_executor_mocked.test.js` around lines 186 - 241: Split the single long spec into three focused tests: one that verifies executeCustomScript calls unref() once (reference executeCustomScript and unrefMock/MockWorker), one that verifies listeners are removed after resolution (use workerInstances[0].listenerCount for STREAM_EVENTS.MESSAGE/ERROR/EXIT and the original executeCustomScript flow), and one that verifies late captured listeners are no-ops (capture lateMessageListener/lateErrorListener from workerInstances[0].rawListeners(...)[0].listener, assert each is a Function with toBeInstanceOf(Function), then call them and assert terminateMock remains called only once). Ensure each new it() isolates setup/teardown (createMockWorkerHarness, jest.unstable_mockModule, import executeCustomScript) so failures point to the specific behavior.

tests/unit/logger_middleware.test.js (1)
3314-3350: Consider adding local restoration of the `Date.now` spy for test isolation.

The harness already restores all mock implementations via `jest.restoreAllMocks()` in `useMockCleanup()`'s afterEach, but storing the spy and calling `mockRestore()` in a finally block keeps this test's cleanup explicit and self-contained, reducing coupling to harness behavior.

🧪 Proposed test-isolation improvement

```diff
-jest.spyOn(Date, "now").mockImplementation(() => {
+const dateNowSpy = jest.spyOn(Date, "now").mockImplementation(() => {
     const nextValue = mockedNowValues.shift();
     return nextValue ?? fallbackNowValue;
 });
-for (let index = 0; index < EXACT_LATENCY_SAMPLE_COUNT; index += 1) {
-    const req = createMockRequest({
-        method: HTTP_METHODS.POST,
-        params: { id: "wh_exact_latency_percentiles" },
-        body: createWebhookPayload({
-            id: `evt_exact_latency_${index}`,
-            source: "unit-latency-exact",
-            data: { sequence: index },
-        }),
-    });
-    const res = createMockResponse();
-    const next = createMockNextFunction();
-
-    await middleware(req, res, next);
-
-    expect(res.status).toHaveBeenCalledWith(HTTP_STATUS.OK);
-}
-
-const emittedSamples = onEvent.mock.calls.map(
-    ([event]) => assertType(event).processingTime,
-);
-
-expect(emittedSamples).toEqual(EXACT_LATENCY_SAMPLES_MS);
-expect(
-    getPercentile(emittedSamples, EXACT_LATENCY_PERCENTILES.p50),
-).toBe(EXACT_LATENCY_EXPECTED_MS.p50);
-expect(
-    getPercentile(emittedSamples, EXACT_LATENCY_PERCENTILES.p95),
-).toBe(EXACT_LATENCY_EXPECTED_MS.p95);
-expect(
-    getPercentile(emittedSamples, EXACT_LATENCY_PERCENTILES.p99),
-).toBe(EXACT_LATENCY_EXPECTED_MS.p99);
+try {
+    for (let index = 0; index < EXACT_LATENCY_SAMPLE_COUNT; index += 1) {
+        const req = createMockRequest({
+            method: HTTP_METHODS.POST,
+            params: { id: "wh_exact_latency_percentiles" },
+            body: createWebhookPayload({
+                id: `evt_exact_latency_${index}`,
+                source: "unit-latency-exact",
+                data: { sequence: index },
+            }),
+        });
+        const res = createMockResponse();
+        const next = createMockNextFunction();
+
+        await middleware(req, res, next);
+
+        expect(res.status).toHaveBeenCalledWith(HTTP_STATUS.OK);
+    }
+
+    const emittedSamples = onEvent.mock.calls.map(
+        ([event]) => assertType(event).processingTime,
+    );
+
+    expect(emittedSamples).toEqual(EXACT_LATENCY_SAMPLES_MS);
+    expect(
+        getPercentile(emittedSamples, EXACT_LATENCY_PERCENTILES.p50),
+    ).toBe(EXACT_LATENCY_EXPECTED_MS.p50);
+    expect(
+        getPercentile(emittedSamples, EXACT_LATENCY_PERCENTILES.p95),
+    ).toBe(EXACT_LATENCY_EXPECTED_MS.p95);
+    expect(
+        getPercentile(emittedSamples, EXACT_LATENCY_PERCENTILES.p99),
+    ).toBe(EXACT_LATENCY_EXPECTED_MS.p99);
+} finally {
+    dateNowSpy.mockRestore();
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/logger_middleware.test.js` around lines 3314 - 3350, The Date.now spy used in this test should be restored locally to ensure test isolation: store the spy returned by jest.spyOn(Date, "now") in a variable (e.g., dateNowSpy) before the loop and call dateNowSpy.mockRestore() in a finally block after the middleware loop so the mock is always cleaned up even if assertions fail; reference the existing mockedNowValues/fallbackNowValue setup and the middleware invocation so you restore Date.now for this test rather than relying only on global jest.restoreAllMocks().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/logger_middleware.test.js`:
- Around line 503-513: The test patches Map.prototype.keys with keysSpy but
restores it only after the assertion, which can leak if the assertion throws;
wrap the call to middleware.updateOptions (and the expect) in a try/finally and
call keysSpy.mockRestore() in the finally block so Map.prototype.keys is always
restored; reference the keysSpy spy on Map.prototype.keys and the
middleware.updateOptions call when making the change.
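A minimal sketch of the try/finally pattern described above; `middleware` and `newOptions` stand in for the real test fixtures rather than quoting the actual spec:

```js
// Spy on the global prototype, then guarantee restoration even if the
// assertion throws; `middleware` and `newOptions` are placeholders.
const keysSpy = jest.spyOn(Map.prototype, "keys");

try {
    middleware.updateOptions(newOptions);
    expect(keysSpy).toHaveBeenCalled();
} finally {
    keysSpy.mockRestore();
}
```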
---
Nitpick comments:
In `@src/db/duckdb.js`:
- Around line 180-192: resetDbInstance() can close connection handles that
executeQuery() readers are actively using because executeQuery() bypasses
writeQueue and borrows from inUseConnections; update resetDbInstance() (and/or
drainConnections()) to wait for or track in-flight read operations before
closing handles by adding a pending-read counter or AbortController mechanism
that executeQuery() increments/decrements (e.g., increment a pendingReads
counter when executeQuery() acquires a connection and decrement when finished)
and have resetDbInstance() await pendingReads === 0 (or signal/abort and await
completion) before calling closeInstanceQuietly() / draining inUseConnections;
ensure the new tracking is referenced where executeQuery() acquires/releases
connections and observed by resetDbInstance()/drainConnections().
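A rough sketch of the counter idea described above; the names (`pendingReads`, `trackRead`, `waitForPendingReads`) and the module-level state are illustrative and do not mirror the actual `duckdb.js` internals:

```js
// Illustrative module-level state, not the real duckdb.js implementation.
let pendingReads = 0;
const drainWaiters = [];

function trackRead(promise) {
    pendingReads += 1;
    return promise.finally(() => {
        pendingReads -= 1;
        if (pendingReads === 0) {
            while (drainWaiters.length > 0) drainWaiters.shift()();
        }
    });
}

function waitForPendingReads() {
    return pendingReads === 0
        ? Promise.resolve()
        : new Promise((resolve) => drainWaiters.push(resolve));
}

// executeQuery() would wrap its borrowed-connection work:
//   return trackRead(runQueryOnBorrowedConnection(sql, params));
// and resetDbInstance() would `await waitForPendingReads()` before draining
// inUseConnections and closing the instance.
```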
In `@tests/unit/logger_middleware.test.js`:
- Around line 3314-3350: The Date.now spy used in this test should be restored
locally to ensure test isolation: store the spy returned by jest.spyOn(Date,
"now") in a variable (e.g., dateNowSpy) before the loop and call
dateNowSpy.mockRestore() in a finally block after the middleware loop so the
mock is always cleaned up even if assertions fail; reference the existing
mockedNowValues/fallbackNowValue setup and the middleware invocation so you
restore Date.now for this test rather than relying only on global
jest.restoreAllMocks().
In `@tests/unit/utils/custom_script_executor_mocked.test.js`:
- Around line 186-241: Split the single long spec into three focused tests: one
that verifies executeCustomScript calls unref() once (reference
executeCustomScript and unrefMock/MockWorker), one that verifies listeners are
removed after resolution (use workerInstances[0].listenerCount for
STREAM_EVENTS.MESSAGE/ERROR/EXIT and the original executeCustomScript flow), and
one that verifies late captured listeners are no-ops (capture
lateMessageListener/lateErrorListener from
workerInstances[0].rawListeners(...)[0].listener, assert each is a Function with
toBeInstanceOf(Function), then call them and assert terminateMock remains called
only once). Ensure each new it() isolates setup/teardown
(createMockWorkerHarness, jest.unstable_mockModule, import executeCustomScript)
so failures point to the specific behavior.
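As a shape for that split, a hedged sketch; the harness surface used here (a `createMockWorkerHarness()` that exposes `executeCustomScript`, `workerInstances`, `unrefMock`, `terminateMock`, and captured listeners) is assumed for illustration, not the file's actual helper API:

```js
describe("executeCustomScript worker lifecycle", () => {
    let harness;

    beforeEach(async () => {
        harness = await createMockWorkerHarness(); // assumed to mock worker_threads
    });

    it("calls unref() on the worker exactly once", async () => {
        await harness.executeCustomScript("return 1;", {});
        expect(harness.unrefMock).toHaveBeenCalledTimes(1);
    });

    it("removes message/error/exit listeners once the script resolves", async () => {
        await harness.executeCustomScript("return 1;", {});
        const worker = harness.workerInstances[0];
        for (const event of ["message", "error", "exit"]) {
            expect(worker.listenerCount(event)).toBe(0);
        }
    });

    it("treats listeners captured before removal as no-ops", async () => {
        await harness.executeCustomScript("return 1;", {});
        const lateMessageListener = harness.capturedListeners.message;
        expect(lateMessageListener).toBeInstanceOf(Function);
        lateMessageListener({ type: "late" }); // should not trigger another terminate
        expect(harness.terminateMock).toHaveBeenCalledTimes(1);
    });
});
```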
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0090eeb9-5632-4f47-a1d5-723ac80650e0
📒 Files selected for processing (11)
- CHANGELOG.md
- src/db/duckdb.js
- tests/README.md
- tests/e2e/README.md
- tests/integration/README.md
- tests/integration/logging.query.integration.test.js
- tests/setup/helpers/app-utils.js
- tests/setup/helpers/e2e-process-harness.js
- tests/unit/db/duckdb.test.js
- tests/unit/logger_middleware.test.js
- tests/unit/utils/custom_script_executor_mocked.test.js
✅ Files skipped from review due to trivial changes (4)
- tests/e2e/README.md
- tests/integration/README.md
- tests/README.md
- CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/setup/helpers/e2e-process-harness.js
- tests/unit/db/duckdb.test.js
- tests/integration/logging.query.integration.test.js
🧹 Nitpick comments (2)
.claude/rules/testing.md (1)
63-63: Nit: compound adjective hyphenation. Consider "temp-dir-based harness setup" for the compound modifier.
✏️ Proposed tweak
```diff
-- `fsPromisesMock` - Shared `node:fs/promises` mock surface, including `mkdtemp()` for temp-dir based harness setup
+- `fsPromisesMock` - Shared `node:fs/promises` mock surface, including `mkdtemp()` for temp-dir-based harness setup
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/rules/testing.md at line 63, The phrase "temp-dir based harness setup" should use hyphenation as a compound modifier; update the documentation line describing `fsPromisesMock` (which references `mkdtemp()` and the shared `node:fs/promises` mock surface) to read "temp-dir-based harness setup" so the compound adjective is grammatically correct.
src/logger_middleware.js (1)
447-475: Optional: simplify `#refreshCompilations` by leaning on the new cache. Now that
`#getValidator` is memoized via `#schemaCacheKeys` + `#validatorCache`, the explicit `oldSchemaStr`/`newSchemaStr` guard on Lines 464-472 duplicates a `JSON.stringify` pass per `updateOptions` call and no longer guards against anything expensive; a cache hit returns the existing compiled validator for free. You can drop the guard and call `#getValidator` unconditionally, which also produces the correct `null` when `jsonSchema` is cleared.
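For context, a standalone sketch of the memoization being relied on, assuming Ajv and a `Map` keyed by the serialized schema (the middleware's actual private fields and wiring differ):

```js
import Ajv from "ajv";

const ajv = new Ajv({ allErrors: true });
const validatorCache = new Map(); // serialized schema -> compiled validate()

function getValidator(jsonSchema) {
    if (!jsonSchema) return null; // clearing the schema disables validation
    const cacheKey = JSON.stringify(jsonSchema);
    if (!validatorCache.has(cacheKey)) {
        validatorCache.set(cacheKey, ajv.compile(jsonSchema));
    }
    return validatorCache.get(cacheKey);
}

// Repeated calls with a deep-equal schema hit the cache instead of recompiling.
const validate = getValidator({ type: "object", required: ["id"] });
```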
♻️ Proposed simplification
```diff
-        // 2. Smart schema re-compilation
-        const oldSchemaStr =
-            currentOptions && currentOptions.jsonSchema
-                ? JSON.stringify(currentOptions.jsonSchema)
-                : undefined;
-        const newSchemaStr = newOptions.jsonSchema
-            ? JSON.stringify(newOptions.jsonSchema)
-            : undefined;
-
-        if (!currentOptions || newSchemaStr !== oldSchemaStr) {
-            this.#validate = this.#getValidator(newOptions.jsonSchema);
-        }
+        // 2. Schema validator (cached by `#getValidator`)
+        this.#validate = this.#getValidator(newOptions.jsonSchema);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/logger_middleware.js` around lines 447 - 475, The current `#refreshCompilations` method does an extra JSON.stringify compare (oldSchemaStr/newSchemaStr) before calling this.#getValidator, which is redundant now that `#getValidator` is memoized via `#schemaCacheKeys` and `#validatorCache`; remove the oldSchemaStr/newSchemaStr computation and the conditional check and instead call this.#validate = this.#getValidator(newOptions.jsonSchema) unconditionally so cache hits avoid re-compilation and clearing jsonSchema still yields the correct null validator.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.claude/rules/testing.md:
- Line 63: The phrase "temp-dir based harness setup" should use hyphenation as a
compound modifier; update the documentation line describing `fsPromisesMock`
(which references `mkdtemp()` and the shared `node:fs/promises` mock surface) to
read "temp-dir-based harness setup" so the compound adjective is grammatically
correct.
In `@src/logger_middleware.js`:
- Around line 447-475: The current `#refreshCompilations` method does an extra
JSON.stringify compare (oldSchemaStr/newSchemaStr) before calling
this.#getValidator, which is redundant now that `#getValidator` is memoized via
`#schemaCacheKeys` and `#validatorCache`; remove the oldSchemaStr/newSchemaStr
computation and the conditional check and instead call this.#validate =
this.#getValidator(newOptions.jsonSchema) unconditionally so cache hits avoid
re-compilation and clearing jsonSchema still yields the correct null validator.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2ce284dd-0d16-49a4-9d8f-d027b54f6dcc
📒 Files selected for processing (15)
- .claude/rules/testing.md
- CHANGELOG.md
- docs/architecture.md
- src/db/duckdb.js
- src/logger_middleware.js
- src/services/SyncService.js
- tests/README.md
- tests/integration/README.md
- tests/integration/security.sanitation.integration.test.js
- tests/setup/helpers/apify-mock.js
- tests/setup/helpers/shared-mocks.js
- tests/unit/db/duckdb.test.js
- tests/unit/logger_middleware.test.js
- tests/unit/services/sync_service.test.js
- tests/unit/utils/custom_script_executor_mocked.test.js
✅ Files skipped from review due to trivial changes (4)
- tests/integration/README.md
- tests/README.md
- CHANGELOG.md
- docs/architecture.md
🚧 Files skipped from review as they are similar to previous changes (6)
- tests/setup/helpers/shared-mocks.js
- src/db/duckdb.js
- tests/integration/security.sanitation.integration.test.js
- tests/unit/db/duckdb.test.js
- tests/unit/services/sync_service.test.js
- src/services/SyncService.js
Pull request overview
Copilot reviewed 46 out of 48 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/db/duckdb.js:306
- resetDbInstance() waits for activeConnectionOperations to drain, but acquireConnection() can still be doing awaited work (e.g., isConnectionUsable() validation query or instance.connect()) before executeQuery() increments activeConnectionOperations. If resetDbInstance runs concurrently (notably during shutdown), it can close the DuckDB instance while a connection acquisition/validation is in-flight, causing intermittent failures. Consider tracking activeConnectionOperations for the entire acquireConnection/isConnectionUsable phase (or otherwise preventing reset from closing the instance until acquisitions finish).
```js
async function acquireConnection() {
    await waitForResetCompletion();
    const instance = await getDbInstance();
    while (connectionPool.length > 0) {
        const pooledConn = connectionPool.pop();
        if (!pooledConn) {
            break;
        }
        const isUsable = await isConnectionUsable(pooledConn);
        if (!isUsable) {
```
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…on to include updated reset behavior Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Summary
- `processingTime` is recorded before any configured `responseDelayMs`, so persisted metrics represent server-side processing work only
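A minimal Express-style sketch of that ordering; the `persistLog` stub and hard-coded options here are illustrative stand-ins for the Actor's real input handling and persistence pipeline:

```js
import express from "express";

// Illustrative stand-ins: the real middleware reads responseDelayMs from the
// Actor input and persists via the dataset/DuckDB pipeline, not this stub.
const options = { responseDelayMs: 250 };
const persistLog = async (entry) => entry;

const app = express();
app.use(express.json());

app.post("/webhook/:id", async (req, res) => {
    const startedAt = Date.now();

    // All server-side work (parse, validate, persist) happens first...
    const entry = await persistLog({ id: req.params.id, body: req.body });
    // ...so processingTime reflects only that work.
    entry.processingTime = Date.now() - startedAt;

    if (options.responseDelayMs > 0) {
        // The simulated delay slows the response but is not part of processingTime.
        await new Promise((resolve) => setTimeout(resolve, options.responseDelayMs));
    }

    res.status(200).json({ ok: true, processingTime: entry.processingTime });
});

app.listen(3000);
```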
Validation

- `npm run test:coverage:matrix && npm run coverage:check:matrix`
- `npm test`
- `npm run test:jest -- tests/unit/routes/info.test.js`

Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Chores