fix(discovery): tune RPC timeout and failure threshold from production data#239

Open
uibeka wants to merge 1 commit into Conway-Research:main from uibeka:fix/discovery-rpc-timeout-tuning

Conversation


@uibeka uibeka commented Feb 27, 2026

Problem

PER_CHUNK_TIMEOUT_MS (3s) and MAX_CONSECUTIVE_FAILURES (2) in getRegisteredAgentsByEvents() were set pre-production in PR #228. Production operation on Base's public RPC revealed both values are too aggressive for real-world latency:

  • eth_getLogs on recent block ranges regularly exceeds 3 seconds (3-6s observed). These are not failures — the RPC is responding, just slower on recent blocks that aren't fully indexed yet. The 3-second Promise.race timeout treats them as failures.

  • Two consecutive timeouts on the newest chunks abort the entire scan. The scanner processes newest blocks first. If the first two chunks are slow (common for recent blocks), the scanner gives up before ever reaching the older blocks where agent mint events actually live. Result: discover_agents reports 0 agents even though 20+ exist on-chain.
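The failure mode above can be reduced to a small sketch. The names `withTimeout`, `fetchChunk`, and `scan` are illustrative assumptions, not the actual erc8004.ts code, and latencies are scaled from seconds to milliseconds so the demo runs quickly:

```typescript
// Sketch of a newest-first chunk scan with a Promise.race timeout and a
// consecutive-failure cap. All names and latencies are assumptions.

function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    p,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error("chunk timeout")), ms)
    ),
  ]);
}

// Simulated eth_getLogs chunk fetch: the two newest chunks are slow
// (real world: 3-6s), older chunks are fast and hold the mint events.
function fetchChunk(i: number): Promise<string[]> {
  const latencyMs = i < 2 ? 50 : 1; // scaled down from seconds
  return new Promise((resolve) =>
    setTimeout(() => resolve(i < 2 ? [] : [`mint-${i}`]), latencyMs)
  );
}

async function scan(timeoutMs: number, maxFailures: number): Promise<string[]> {
  const events: string[] = [];
  let consecutiveFailures = 0;
  for (let i = 0; i < 5; i++) { // newest chunk first
    try {
      events.push(...(await withTimeout(fetchChunk(i), timeoutMs)));
      consecutiveFailures = 0;
    } catch {
      // Slow-but-healthy chunks count as failures here; two in a row
      // abort the scan before the older, event-bearing chunks are read.
      if (++consecutiveFailures >= maxFailures) return events;
    }
  }
  return events;
}

(async () => {
  console.log("before:", (await scan(30, 2)).length, "events"); // 0: aborts on slow newest chunks
  console.log("after: ", (await scan(80, 5)).length, "events"); // 3: tolerates the slowness
})();
```

With the tight settings the scan returns empty even though events exist in the older chunks; with the looser settings the same simulated RPC yields all of them.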

Production Evidence

Observed across two agents running on Base mainnet (Feb 26):

  • Scanner hits 2 consecutive chunk timeouts on recent blocks → "Too many consecutive chunk failures, stopping scan" → discover_agents returns 0 agents
  • Agents loop on empty discovery results or escalate to expensive fallback operations
  • After patching to 8s/5-failures locally: 10+ successful discovery calls, scanner finds all 20+ agents consistently

Fix

Two constant value changes in getRegisteredAgentsByEvents():

| Constant | Before | After | Rationale |
| --- | --- | --- | --- |
| `PER_CHUNK_TIMEOUT_MS` | `3_000` (3s) | `8_000` (8s) | Accommodates observed Base RPC latency (3-6s) with margin |
| `MAX_CONSECUTIVE_FAILURES` | `2` | `5` | Tolerates transient slow chunks on recent blocks without abandoning the scan |
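In code form the change is confined to two constant initializers. A sketch only; the surrounding declarations in src/registry/erc8004.ts are assumed, not quoted:

```typescript
// src/registry/erc8004.ts (sketch, not the verbatim file)
const PER_CHUNK_TIMEOUT_MS = 8_000;     // was 3_000: Promise.race budget per eth_getLogs chunk
const MAX_CONSECUTIVE_FAILURES = 5;     // was 2: consecutive timeouts tolerated before aborting
```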

Scope

  • 1 file changed (src/registry/erc8004.ts) — 2 constant values updated
  • 1 test file (src/__tests__/loop.test.ts) — timeout bump from 30s default to 180s to accommodate corrected scanning duration. The discover_agents loop test makes real RPC calls to Base mainnet; with correct timeout values, worst-case scan path is ~40s (5×8s), exceeding the default 30s test timeout. The test was only fast before because the scanner was aborting prematurely — the exact bug being fixed.
  • Zero logic changes — the scanning loop, timeout mechanism, failure tracking, and log format are all untouched
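The test-timeout bump follows directly from the new worst case: five consecutive 8s timeouts is 40s of wall clock before the scan gives up, which overshoots the 30s default. A hedged sketch of the arithmetic and the adjustment, assuming a vitest/jest-style API (the actual loop.test.ts is not shown here):

```typescript
// Worst-case scan duration before the failure cap aborts:
const worstCaseMs = 5 * 8_000; // MAX_CONSECUTIVE_FAILURES x PER_CHUNK_TIMEOUT_MS
console.log(worstCaseMs); // 40000 > the 30000ms default test timeout

// Hypothetical shape of the bump, assuming a vitest/jest-style
// third-argument timeout (the real test body is not reproduced here):
// it("discover_agents finds registered agents", async () => { /* real RPC */ }, 180_000);
```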

Testing

  • pnpm build — zero errors
  • pnpm test — all existing tests pass
  • Production-validated: 10+ successful discovery calls across two agents after local patch

Related

…n data

PER_CHUNK_TIMEOUT_MS (3s → 8s) and MAX_CONSECUTIVE_FAILURES (2 → 5) in
getRegisteredAgentsByEvents were set pre-production in PR Conway-Research#228. Production
operation on Base revealed both are too tight: eth_getLogs on recent block
ranges regularly exceeds 3s, and two consecutive timeouts on the newest
chunks (scanned first) abort the entire scan before reaching older blocks
where agent mint events live.

Updated to production-validated values tested across 10+ successful
discovery calls on two agents. 8s accommodates observed Base RPC latency
with margin. 5 consecutive failures tolerates transient slow chunks
without abandoning the scan.

Also bumped timeout on the discover_agents loop test (180s) since it makes
real RPC calls and the increased per-chunk timeout extends worst-case
execution time.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

uibeka commented Feb 27, 2026

@unifiedh PR #241 is merged and complementary to this PR. Please review when you get a chance.

