feat(sim): path-aware forager routing (Fix-A) + deepened recent-tiles (C-both) — flurry PR 5 (V29)#207
feat(sim): path-aware forager routing (Fix-A) + deepened recent-tiles (C-both) — flurry PR 5 (V29)#207LightAxe wants to merge 13 commits into
Conversation
…28) — flurry PR 4 PR 4 of the #127/#128 fix series (plan/flurry/PR2-FIX-SPECS.md). Freezes surface terrain so it never mutates as food piles spawn/deplete or entrances are designated — the static-terrain bug class Phase 0 measured (477/900 piles revealed a HardBlock on removal). Scope is STRICTLY static terrain; no #127/#128 movement fixes (those are PR 5 / PR 6). What changed: - surface-features.ts: deleted the dynamic entrance/food-pile suppression from the shared feature selector (movement AND render). Split procedural vs frozen: surfaceFeatureProcedural (terrainSeed only) + a baked 128x128 movement-effect grid that is the source of truth for surfaceMovementAt (O(1)). surfaceFeatureAt consults the carve override so render never paints a sprite over a carved-passable tile (R4-3). Added bakeStaticTerrain (deterministic root-clearance reservation + corridor carve guaranteeing ONE connected component; no terrainSeed retry), computeSurfaceComponentMask, isSurfaceTileInComponent, validateSurfaceConnectivity. Boot-asserts no registry feature is Cosmetic (carve-detection precondition). - types.ts: SIM_VERSION_V28_STATIC_TERRAIN (LATEST=28); new stored Uint8Array bakedSurfaceEffect + derived surfaceComponentMask; procedural bake in createWorldState; threaded copyWorldState. - save.ts: bakedSurfaceEffect serialized packed 2-bit + base64 (~5.3 KB vs ~48 KB raw). Deserialize validates dims / enum range / full connectivity (every saved pile + entrance), rejecting corrupt or pre-V28 maps. MIN_ACCEPTED raised to V28 (posture 2 — old saves reject cleanly). - scenario.ts: bake BEFORE placing piles; reachable-spawn pile gate; world-gen connectivity assertion. - tick.ts: DesignateEntrance rejects unless candidate + clearance are already passable on the frozen grid AND in the connected component (can no longer carve). - food-system.ts: runtime pile spawn gated on the connected component. - constants.ts: SURFACE_ROOT_CLEARANCE_RADIUS. Oracle/tests: featureFieldHash extended to the FULL SurfaceFeatureSlice (kind+variant+anchor); committed oracle (0 mutate across pile spawn/deplete, entrance designate/open, save/load), connectivity, and simVersion rejection-boundary tests; determinism serializer threads the baked grid; suppression/render tests updated to the static-terrain contract. Measurements: save-size delta +0.49%; tick-time ~0.24 ms/tick; acceptance sweep (10 seeds x 3 difficulties) 0 connectivity failures, 0 hash mutations. verify green (79 files / 2336 tests). Does NOT close #127/#128 (PR 5/6). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…view) Addresses the PR #206 CI failure + Codex/ship-review findings. - Perf (CI timeout fix): createWorldState + createScenario ran a full 16,384-tile procedural bake on every call (twice per scenario), tipping many-iteration tests over the 5s timeout on the CI runner. Memoize bakeSurfaceEffectGrid by terrainSeed (pure function of it) and return a copy; the suite is now faster than the pre-PR baseline (tests ~31s vs ~60s). - save.ts base64 (Codex P2): reject `=` padding outside the final quartet (and the never-valid "x=y" form) instead of treating it as zero — malformed encodings now fail the load boundary as intended. - SAVE_FORMAT_VERSION 3 → 4 (ship-review): the new required on-disk bakedSurfaceEffect field is a shape break; bump it (and SAVE_KEY :v3→:v4 + purge old key) so save-shape correctness isn't silently coupled to the simVersion floor. - DesignateEntrance halo (ship-review, bugs dim): check component membership per clearance tile instead of a bare HardBlock test, so it no longer relies on an unenforced "globally one walkable component" assumption (isolated empty pockets are harmless). Softened the over-claiming doc comments to match what the bake actually guarantees (all roots/piles/entrances share one component). - computeSurfaceComponentMask: throw on a wrong-sized grid (fail loud instead of silently flooding the map into one component). verify green (79 files / 2336 tests). Still PR 4 scope only; does NOT close #127/#128. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rio (codex P1) Codex P1: the procedural-bake `Map` was mutable module-level state in src/sim — an AGENTS.md ECS-rule blocker (state outside the world snapshot). Removed it. Restore the performance without a cache: - bakeStaticTerrain now carves a COPY of the procedural grid createWorldState already baked, instead of re-baking from scratch — one full procedural pass per createScenario (was two). Measured: 200x createScenario ~2.0s (was ~4s). - vitest.config: raise the default testTimeout to 15s. Static terrain made world construction bake the 128x128 grid (~8 ms each), so construction-heavy sweeps (200-seed spider-lair, 500-iteration pause-cadence) sat near the 5s default and tipped over it under CI-runner contention — the PR #206 CI timeout. Assertions and iteration counts are unchanged; genuinely-hung tests still fail. verify green (79 files / 2336 tests, ~57s). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ut fix The per-tile procedural bake called the recursive overlap-suppression check up to 16,384× per grid (~8 ms), so construction-heavy tests (200-seed spider-lair, 500-iteration pause-cadence) blew even a 15 s timeout on the CI runner. Raising the timeout only masked it; this fixes the root cause. bakeSurfaceEffectGrid now iterates ANCHORS in lex (ay,ax) order instead of tiles: the first surviving anchor covering a tile claims it (matching the per-tile lex-smallest-winner tie-break), with a per-anchor "first covering feature per offset" scratch to reproduce the selector's first-covering-feature + break-on- suppress semantics exactly, including negative (off-edge) anchors. Verified byte-identical to the per-tile surfaceFeatureProcedural across 80 seeds × all 16,384 tiles (0 mismatches), so surfaceFeatureAt's carve detection and the feature-field oracle stay consistent. Measured: ~0.55 ms/bake (was ~8 ms); 200× createScenario ~0.45 s (was ~2-4 s). Also: - Reverted the vitest testTimeout bump — no longer needed; the bake is fast. - createScenario resets surfaceComponentMask after initColony so the persisted/ validated mask is rooted at the real canonical root (first colony entrance) rather than relying on the pre-colony fallback root happening to equal it (ship-review advisory). verify green (79 files / 2336 tests). PR 4 scope only; does NOT close #127/#128. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…(codex P2) The DesignateEntrance sim gate (PR 4) now rejects tiles outside the connected walkable component or with a blocked clearance halo, but surface-input still previewed any empty tile — so a player saw a valid-looking entrance target and confirmation did nothing. Add isValidEntranceTarget (empty tile AND candidate + SURFACE_ROOT_CLEARANCE_RADIUS halo all in-component) mirroring the gate, and use it for the right-click preview. Smoke tests updated: the mock world carries a fully-walkable baked grid, plus new cases asserting no preview when the target or a halo tile is HardBlock. verify green (79 files / 2338 tests). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ia simVersion The e2e CI job failed (5 save/load specs): bumping SAVE_KEY v3→v4 orphaned every spec that addresses the literal 'subterrans:save:v3', and the bump itself was only a low-severity ship-review advisory, not a spec requirement. Revert SAVE_FORMAT_VERSION (4→3) and SAVE_KEY (:v4→:v3). The new required bakedSurfaceEffect lives in the SNAPSHOT (SerializedWorldState), and this codebase versions snapshot-content shape changes by simVersion — not the envelope's SAVE_FORMAT_VERSION (which versions envelope structure). deserializeWorldState validates simVersion FIRST, so raising MIN_ACCEPTED_SIM_VERSION to V28 already rejects every pre-static-terrain save (which also lacks the field) before unpack is reached — the required behaviour, achieved without an envelope bump and matching the v2/v3 precedent. The legacy-purge list reverts accordingly. All required PR-4 save behaviour is retained: MIN_ACCEPTED=V28, the required bakedSurfaceEffect field, and load-time dims/enum/connectivity validation. Verified locally: npm run verify green (2338); the 5 previously-failing e2e specs (menu-and-dialog #115/#196, phase-09 save-prompt) now pass (19/19 chromium). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ailing-bit guard Two low-severity ship-review advisories on the latest diff: - surface-features.ts: the @param world JSDoc on isAnchorSuppressedByOverlap still described the deleted isAnchorGameplaySuppressed call. Correct it — world is now only threaded to the recursive calls; no field is read (PR 4 removed dynamic suppression). Doc-only. - save.ts: unpackBakedSurfaceEffect now rejects non-zero unused high bits in the final packed byte when expectedLen isn't a multiple of 4, so a tampered payload can't decode to a grid serialize would never emit. No-op for the production 16384-tile grid (a multiple of 4), but keeps the exported, length-general validator honest. (Left as-is: the preview/sim entrance gates still diverge on column-uniqueness / entrance-cap — that divergence predates PR 4; this PR only added and mirrored the new component/clearance gate. Out of scope.) verify green (79 files / 2338 tests). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
🔍 Adversarial Review (Claude, standing in for Codex)Reviewed commit: What I verified before the findings: P2 — The
P2 — Underground scent-pull silently removed, and the comment claims the opposite Pre-PR 5, the scent branch had no zone guard: an underground P2 — The surface unreachable-target case degrades to wall-aiming instead of asserting The spec's locked owner requirement: "no path → pin on wall" is structurally eliminated; an actual connectivity violation "yields P3 —
P3 — Load accepts ring records the serializer can never produce
P3 — Sweep omits the spec's "underground-carrier behaviour" metric The §PR 5 sweep metric list is "pickup→deposit latency, mean & 95p path length, task-completion counts, underground-carrier behaviour, plus serialized save bytes." P3 — Goal-field cache eviction is untested The clear-all-at-256 overflow path ( VERDICT: REVISE — the routing core, encoding, and test discipline are strong (notably better per-field/validation coverage than PR 4 shipped with), but the three P2s cut at the PR's own acceptance story: the headline wall-aim cap no longer measures behaviour, one target-selection change is silent and mislabeled, and the locked "assert, don't silently mill" requirement is bypassed on every production path. |
…lence/serialization tests, hardening Fable (adversarial review standing in for Codex) found no runtime bugs but flagged spec-mandated validation/test gaps + low-severity hardening. All addressed: P2 — load-validation + load-bearing test coverage: - validateSurfaceConnectivity now also checks each entrance's full radius-3 clearance halo is in-component (spec R3-8), mirroring the DesignateEntrance gate — so a corrupt/tampered save the sim gate could never produce is rejected. - Committed the bake↔selector equivalence sweep (bakeSurfaceEffectGrid byte- identical to per-tile surfaceFeatureProcedural across 6 seeds × all 16,384 tiles), plus direct computeSurfaceComponentMask and corridor-carve (carveCorridor merges a HardBlock-split row) regression tests. - §6.5 field-specific serialization tests for bakedSurfaceEffect: create→copy round-trip (determinism.test), save→load mutate-and-survive (catches a re-derive-from-seed deserializer), every unpackBakedSurfaceEffect rejection path, and load-time connectivity rejection. P3 — robustness: - Oracle entrance-designation step asserts ≥1 entrance was actually accepted (no vacuous rejection-only pass). - surfaceComponentMask materialized eagerly at world-gen, load, AND tick entry, so input/render queries (and mid-tick spawn checks) are pure reads, never a lazy world mutation. - isValidEntranceTarget halo bounds use the grid CONSTANTS (matches the sim gate). - packBakedSurfaceEffect throws on out-of-range effect values (fail at write, not a later read); unpackBakedSurfaceEffect rejects by base64 length BEFORE decoding (DoS guard, matching the issue #99 load-cap posture). - validateSurfaceConnectivity rejects the degenerate all-HardBlock-root case (zero entrances + zero piles no longer passes on a garbage all-zero mask). verify green (79 files / 2345 tests, +9 new); internal ship-review passed:true. Still PR 4 scope (static terrain); does NOT close #127/#128. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
🔍 Adversarial Re-Review (Claude, standing in for Codex) — round 2Reviewed commit: All round-1 findings are addressed — verified against the diff:
One residual, non-blocking: P3 — The PR body wasn't actually updated for the underground scent-pull change The new comment at VERDICT: APPROVED — the acceptance story is falsifiable again, the one behaviour change is deliberate and documented (modulo the description edit above), and the load path now rejects everything the serializer can't produce. |
…te (Fable P3) The malformed-input cases in the unpackBakedSurfaceEffect rejection test were all shorter than the expected base64 length for a full grid, so the DoS length pre-check rejected every one of them before the decoder branch each label named could run. Rebuild each case at the exact full-grid length (5464 chars) with a single targeted corruption — bad char, mid-string padding, x=y final-quartet padding — so each exercises the base64ToBytes branch it claims, and keep an explicit wrong-length case for the length gate itself. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… (C-both) — flurry PR 5 (V29) PR 5 of the #127/#128 fix series, built on PR 4 static terrain. Eliminates the dominant #127 mechanism (scent/priority-vs-wall, 86% of episodes) and the residual wander short-tail. Fix-A — passability-aware stepping (surface-routing.ts): - New stepTowardReachable(world, from, target): first step of a COMPLETE BFS goal field (distance-to-target over PR 4's single connected component), cached per target tile on the world (derived, not serialized). Replaces the naive pickCardinalStep in BOTH forager target branches (priority + scent). Target identity unchanged — only the step is path-aware. Three-valued: AtGoal (0,0) / Step / InvariantViolation. Scent ranks Manhattan-15-eligible piles by reachable path distance, lowest-foodPileId tie-break. No-revisit bypassed for targeted steps. Zone-guarded to the surface (underground falls back to cardinal step). Diagonals chosen only when a shared orthogonal is itself passable+descending (no corner-squeeze that the movement guard would reject). C-both — recent-tiles deepening: - RECENT_TILES_LEN 4→12 (chosen N: largest of {12,10,8,6,5} meeting all caps). - Compact canonical save encoding (records sorted by antId; head + non-sentinel (slot,x,y) in slot order); load validates + rejects malformed streams. Far smaller than the old flat arrays — total save size −19% vs baseline. simVersion V29 (posture 2): bump LATEST + raise MIN_ACCEPTED. Pocket-escape not needed — deepening alone drives confinement to zero. Acceptance (committed harness tests, ACCEPTANCE hold-out): aimedIntoWall=0, worst confinement=0tk (≤60), episodes>300=0; 4-vs-N sweep N=12 meets every throughput cap; field-specific copy/save-load + V29 boundary tests; save −18.95% (≤+5%); tick-time 0.26 ms/tick (≤0.5). verify green (80 files / 2380 tests); ship-review passed:true. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… metric, honest zone comments, ring-load hardening Fable (adversarial review standing in for Codex) flagged that the acceptance story had drifted. All findings addressed: P2 — acceptance integrity: - harness stepLandsOnWall reverted to the NAIVE pickCardinalStep direction (the #127 wall-aim signature), measured INDEPENDENTLY of the router. Calling stepTowardReachable made the cap tautologically 0 (the router never returns a wall step). Fix-A satisfies the cap by eliminating the confinement episodes the flag is tallied within (episodeAimedWall is only set during a confined run), not by redefining the flag — now falsifiable again. - The underground scent pull is now DELIBERATELY dropped in V29 with an honest comment (it fed SURFACE pile coordinates to an underground walker — meaningless; underground searchers use pheromone/wander). The prior "(pre-PR-5 behaviour)" comment was wrong. - The unreachable-surface priority case is documented as the benign transient stale-target handling it is (PR 4 guarantees current priority piles are reachable, so an unreachable target is a leftover from a prior Fighting/zone stint). Investigated alternatives: asserting crashes on these (S3 determinism scenarios); falling through to wander REGRESSES worst confinement to 108 tk. The cardinal step + surface-detour guard keeps the ant moving (harness confirms worst=0, aimedIntoWall=0), so it is kept — stepTowardReachable's throw remains a defensive guard. P3: - packRecentTiles comment corrected (antId<nextEntityId skips NEVER-allocated ids; dead-but-allocated ARE included). - unpackRecentTiles now rejects antId outside [0, min(nextEntityId, capacity)) — nextEntityId is the canonical (serializer) bound, capacity the memory-safety bound; a tampered count/nextEntityId mismatch is rejected, not silently dropped. - measureForagingThroughput doc: underground-carrier behaviour is subsumed by the cross-zone delivery latency/path/completion counters. - Added a goal-field cache-eviction (clear-all-on-overflow) regression test. verify green (80 files / 2381 tests); ship-review passed:true. Still PR 5 scope (V29 path-aware routing + deepened recent-tiles). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@codex review |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
…review round 1) Entity ids are never reused and dead rings are never read (all isRecentTile/detour consumers run only for alive ants), but an ant killed mid-Searching/CarryingFood keeps its populated ring in memory forever — serializing those records grew every subsequent save monotonically. Skip non-alive ids at pack time; old saves containing dead-ant records still load (unpack doesn't check alive) and self-heal on the next save. Tests now seed rings on live worker ids instead of raw ids 0/1/3, plus a regression test that a dead ant's populated ring is not serialized. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
6a740d9 to
b82696b
Compare
🔍 Adversarial Re-Review (Claude, standing in for Codex) — round 3 (post-rebase)Reviewed commit: The one new change —
No findings. VERDICT: APPROVED (head |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b82696bc17
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…sk for input, BFS scratch queue Codex P1: isValidEntranceTarget reached ensureSurfaceComponentMask via isSurfaceTileInComponent, which writes the memoised mask back onto the world on a cache miss — an input-layer WorldState mutation violating the sim/render boundary (input may only read sim state and enqueue commands). The preview now fetches the mask once through a new getSurfaceComponentMaskReadOnly, which returns the cached mask (always populated by createScenario/deserialize/tick in practice) or computes a transient one WITHOUT caching. Regression test pins that a null mask cache stays null across a preview query. Codex P2: computeSurfaceGoalField allocated a fresh 16k-cell BFS queue per first-touch field compute inside the ant movement loop. The queue is now module-level scratch (combat.ts pattern, AGENTS.md hot-loop rule); the returned field remains a fresh allocation because the cache owns it. Documented the amortization contract on ensureSurfaceGoalField: compute runs at most once per target for the world's lifetime, at pile-spawn cadence, not per ant. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@codex review |
|
Codex Review: Didn't find any major issues. 🎉 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
PR 5 of the #127/#128 fix series, built on PR 4 static terrain (#206) — stacked; review/merge #206 first. Eliminates the dominant #127 mechanism (scent/priority-vs-wall — 86% of episodes, every one of the 10 longest) and the residual wander short-tail. simVersion V29 (posture 2). Do not merge until Rob OKs.
Fix-A — passability-aware stepping (
src/sim/surface-routing.ts)stepTowardReachable(world, from, target): returns the first step of a complete BFS goal field (distance-to-target over PR 4's single connected component), cached per-target on the world (derived, not serialized). Replaces the naivepickCardinalStepin both forager target branches (priority + scent) — target identity unchanged, only the step is path-aware.foodPileIdtie-break.C-both — recent-tiles deepening
RECENT_TILES_LEN4 → 12 (largest of {12,10,8,6,5} meeting all caps).Acceptance (committed harness tests, ACCEPTANCE hold-out)
aimedIntoWall=0, worst confinement 0 tk (≤60), episodes>300 = 0unpackRecentTilesmalformed-stream rejectionVerification
npm run verifygreen — 80 files / 2380 tests; internalship-reviewpassed:true. Pocket-escape not needed. Codex review pending (account out of credits); reviewed internally + by Fable.🤖 Generated with Claude Code
V29 scoped behaviour change (Fable round-2 note)
PR 5 deliberately drops the underground scent-pull: pre-PR-5, an underground SearchingFood forager with no priority/entrance was pulled toward a food pile by feeding the pile's surface tile coordinates to
pickCardinalStepin underground space — meaningless steering. V29 removes it; underground searchers fall through to pheromone/wander (the correct underground discovery path). The priority branch's underground fallback (naive cardinal toward the target coords) is unchanged. This is the PR series' one intentional target-selection change.