feat(sim): path-aware forager routing (Fix-A) + deepened recent-tiles (C-both) — flurry PR 5 (V29)#207

Open
LightAxe wants to merge 13 commits into main from flurry-pr5-routing

Conversation

@LightAxe

@LightAxe LightAxe commented Jun 9, 2026

Owner

PR 5 of the #127/#128 fix series, built on PR 4 static terrain (#206) — stacked; review/merge #206 first. Eliminates the dominant #127 mechanism (scent/priority-vs-wall — 86% of episodes, every one of the 10 longest) and the residual wander short-tail. simVersion V29 (posture 2). Do not merge until Rob OKs.

Note: this PR's diff currently shows PR 4's changes too (it branches off flurry-pr4-static-terrain, not yet merged). Once #206 merges, this collapses to just the PR-5 diff.

Fix-A — passability-aware stepping (src/sim/surface-routing.ts)

  • New stepTowardReachable(world, from, target): returns the first step of a complete BFS goal field (distance-to-target over PR 4's single connected component), cached per-target on the world (derived, not serialized). Replaces the naive pickCardinalStep in both forager target branches (priority + scent) — target identity unchanged, only the step is path-aware.
  • Three-valued: AtGoal (0,0) / Step / InvariantViolation. Zone-guarded to the surface: underground, the priority branch falls back to the naive cardinal step and the scent branch is dropped (see the V29 note below). Diagonals are chosen only when a shared orthogonal is itself passable + descending (no corner-squeeze the movement guard would reject).
  • Scent ranks Manhattan-15-eligible piles by reachable path distance, lowest-foodPileId tie-break.
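
The goal-field idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in surface-routing.ts: the row-major grid layout, the `-1` unreached marker, and the names `buildGoalField` and `firstStepToward` are all hypothetical, and the diagonal rule is left out so only cardinal steps are shown.

```typescript
// Hypothetical sketch: names, grid layout, and the -1 marker are assumptions.
const UNREACHED = -1;

type StepResult =
  | { kind: "AtGoal" }
  | { kind: "Step"; dx: number; dy: number }
  | { kind: "InvariantViolation" };

// Cardinal neighbour offsets (4-connected): up, right, down, left.
const DIRS: ReadonlyArray<readonly [number, number]> = [[0, -1], [1, 0], [0, 1], [-1, 0]];

/** BFS distance-to-target over passable tiles (passable[i] === 1), row-major. */
function buildGoalField(passable: Uint8Array, w: number, h: number, tx: number, ty: number): Int32Array {
  const dist = new Int32Array(w * h).fill(UNREACHED);
  const start = ty * w + tx;
  if (passable[start] !== 1) return dist; // target itself is blocked
  const queue = new Int32Array(w * h); // each tile is enqueued at most once
  let head = 0;
  let tail = 0;
  dist[start] = 0;
  queue[tail++] = start;
  while (head < tail) {
    const cur = queue[head++];
    const cx = cur % w;
    const cy = (cur - cx) / w;
    for (const [dx, dy] of DIRS) {
      const nx = cx + dx;
      const ny = cy + dy;
      if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
      const ni = ny * w + nx;
      if (passable[ni] !== 1 || dist[ni] !== UNREACHED) continue;
      dist[ni] = dist[cur] + 1;
      queue[tail++] = ni;
    }
  }
  return dist;
}

/** First step of a shortest path: any in-bounds neighbour with a smaller distance. */
function firstStepToward(dist: Int32Array, w: number, h: number, fx: number, fy: number): StepResult {
  const here = dist[fy * w + fx];
  if (here === 0) return { kind: "AtGoal" };
  if (here === UNREACHED) return { kind: "InvariantViolation" };
  for (const [dx, dy] of DIRS) {
    const nx = fx + dx;
    const ny = fy + dy;
    if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
    const d = dist[ny * w + nx];
    if (d !== UNREACHED && d < here) return { kind: "Step", dx, dy };
  }
  return { kind: "InvariantViolation" };
}
```

Because walls keep the unreached marker, a step chosen this way can never land on a blocked tile, which is the property the naive sign-based step lacks.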

C-both — recent-tiles deepening

  • RECENT_TILES_LEN 4 → 12 (largest of {12,10,8,6,5} meeting all caps).
  • Compact canonical save encoding (records sorted by antId; head + non-sentinel (slot,x,y) in slot order; load validates + rejects malformed streams) — far smaller than the old flat arrays.

Acceptance (committed harness tests, ACCEPTANCE hold-out)

  • aimedIntoWall=0, worst confinement 0 tk (≤60), episodes>300 = 0
  • 4-vs-N sweep: N=12 — completions 242≥240, latency −1.6%, meanPath −7.4%, p95 −11.7% (all caps met)
  • field-specific copy + save/load round-trip + V29 rejection-boundary; unpackRecentTiles malformed-stream rejection
  • save size −18.95% (≤+5%); tick-time 0.26 ms/tick (≤0.5)

Verification

npm run verify green — 80 files / 2380 tests; internal ship-review passed:true. Pocket-escape not needed. Codex review pending (account out of credits); reviewed internally + by Fable.

🤖 Generated with Claude Code

V29 scoped behaviour change (Fable round-2 note)

PR 5 deliberately drops the underground scent-pull: pre-PR-5, an underground SearchingFood forager with no priority/entrance was pulled toward a food pile by feeding the pile's surface tile coordinates to pickCardinalStep in underground space — meaningless steering. V29 removes it; underground searchers fall through to pheromone/wander (the correct underground discovery path). The priority branch's underground fallback (naive cardinal toward the target coords) is unchanged. This is the PR series' one intentional target-selection change.

LightAxe and others added 7 commits June 7, 2026 21:21
…28) — flurry PR 4

PR 4 of the #127/#128 fix series (plan/flurry/PR2-FIX-SPECS.md). Freezes surface
terrain so it never mutates as food piles spawn/deplete or entrances are
designated — the static-terrain bug class Phase 0 measured (477/900 piles revealed
a HardBlock on removal). Scope is STRICTLY static terrain; no #127/#128 movement
fixes (those are PR 5 / PR 6).

What changed:
- surface-features.ts: deleted the dynamic entrance/food-pile suppression from the
  shared feature selector (movement AND render). Split procedural vs frozen:
  surfaceFeatureProcedural (terrainSeed only) + a baked 128x128 movement-effect
  grid that is the source of truth for surfaceMovementAt (O(1)). surfaceFeatureAt
  consults the carve override so render never paints a sprite over a carved-passable
  tile (R4-3). Added bakeStaticTerrain (deterministic root-clearance reservation +
  corridor carve guaranteeing ONE connected component; no terrainSeed retry),
  computeSurfaceComponentMask, isSurfaceTileInComponent, validateSurfaceConnectivity.
  Boot-asserts no registry feature is Cosmetic (carve-detection precondition).
- types.ts: SIM_VERSION_V28_STATIC_TERRAIN (LATEST=28); new stored Uint8Array
  bakedSurfaceEffect + derived surfaceComponentMask; procedural bake in
  createWorldState; threaded copyWorldState.
- save.ts: bakedSurfaceEffect serialized packed 2-bit + base64 (~5.3 KB vs ~48 KB
  raw). Deserialize validates dims / enum range / full connectivity (every saved
  pile + entrance), rejecting corrupt or pre-V28 maps. MIN_ACCEPTED raised to V28
  (posture 2 — old saves reject cleanly).
- scenario.ts: bake BEFORE placing piles; reachable-spawn pile gate; world-gen
  connectivity assertion.
- tick.ts: DesignateEntrance rejects unless candidate + clearance are already
  passable on the frozen grid AND in the connected component (can no longer carve).
- food-system.ts: runtime pile spawn gated on the connected component.
- constants.ts: SURFACE_ROOT_CLEARANCE_RADIUS.

Oracle/tests: featureFieldHash extended to the FULL SurfaceFeatureSlice
(kind+variant+anchor); committed oracle (0 mutate across pile spawn/deplete,
entrance designate/open, save/load), connectivity, and simVersion rejection-boundary
tests; determinism serializer threads the baked grid; suppression/render tests
updated to the static-terrain contract.

Measurements: save-size delta +0.49%; tick-time ~0.24 ms/tick; acceptance sweep
(10 seeds x 3 difficulties) 0 connectivity failures, 0 hash mutations.

verify green (79 files / 2336 tests). Does NOT close #127/#128 (PR 5/6).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…view)

Addresses the PR #206 CI failure + Codex/ship-review findings.

- Perf (CI timeout fix): createWorldState + createScenario ran a full
  16,384-tile procedural bake on every call (twice per scenario), tipping
  many-iteration tests over the 5s timeout on the CI runner. Memoize
  bakeSurfaceEffectGrid by terrainSeed (pure function of it) and return a copy;
  the suite is now faster than the pre-PR baseline (tests ~31s vs ~60s).
- save.ts base64 (Codex P2): reject `=` padding outside the final quartet (and
  the never-valid "x=y" form) instead of treating it as zero — malformed
  encodings now fail the load boundary as intended.
- SAVE_FORMAT_VERSION 3 → 4 (ship-review): the new required on-disk
  bakedSurfaceEffect field is a shape break; bump it (and SAVE_KEY :v3→:v4 +
  purge old key) so save-shape correctness isn't silently coupled to the
  simVersion floor.
- DesignateEntrance halo (ship-review, bugs dim): check component membership per
  clearance tile instead of a bare HardBlock test, so it no longer relies on an
  unenforced "globally one walkable component" assumption (isolated empty pockets
  are harmless). Softened the over-claiming doc comments to match what the bake
  actually guarantees (all roots/piles/entrances share one component).
- computeSurfaceComponentMask: throw on a wrong-sized grid (fail loud instead of
  silently flooding the map into one component).
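
The base64 padding rule in the list above can be illustrated with a small placement check. This is a sketch only, not the PR's base64ToBytes: the helper name `hasCanonicalPadding` is hypothetical, and it checks where `=` appears, not whether the other characters belong to the base64 alphabet.

```typescript
// Hypothetical helper: validates '=' placement only, not the alphabet.
function hasCanonicalPadding(b64: string): boolean {
  if (b64.length % 4 !== 0) return false;
  const firstPad = b64.indexOf("=");
  if (firstPad === -1) return true; // no padding at all
  // Padding may only start in the last two positions of the final quartet.
  if (firstPad < b64.length - 2) return false;
  // Everything from the first '=' onward must also be '=' (rejects "x=y").
  for (let i = firstPad; i < b64.length; i++) {
    if (b64[i] !== "=") return false;
  }
  return true;
}
```

Under this rule "QQ==" and "QUI=" pass, while "QQ=Q" (the "x=y" form) and "QQ==QUJD" (padding before the final quartet) are rejected.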

verify green (79 files / 2336 tests). Still PR 4 scope only; does NOT close #127/#128.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rio (codex P1)

Codex P1: the procedural-bake `Map` was mutable module-level state in src/sim —
an AGENTS.md ECS-rule blocker (state outside the world snapshot). Removed it.

Restore the performance without a cache:
- bakeStaticTerrain now carves a COPY of the procedural grid createWorldState
  already baked, instead of re-baking from scratch — one full procedural pass per
  createScenario (was two). Measured: 200x createScenario ~2.0s (was ~4s).
- vitest.config: raise the default testTimeout to 15s. Static terrain made world
  construction bake the 128x128 grid (~8 ms each), so construction-heavy sweeps
  (200-seed spider-lair, 500-iteration pause-cadence) sat near the 5s default and
  tipped over it under CI-runner contention — the PR #206 CI timeout. Assertions
  and iteration counts are unchanged; genuinely-hung tests still fail.

verify green (79 files / 2336 tests, ~57s).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ut fix

The per-tile procedural bake called the recursive overlap-suppression check up to
16,384× per grid (~8 ms), so construction-heavy tests (200-seed spider-lair,
500-iteration pause-cadence) blew even a 15 s timeout on the CI runner. Raising
the timeout only masked it; this fixes the root cause.

bakeSurfaceEffectGrid now iterates ANCHORS in lex (ay,ax) order instead of tiles:
the first surviving anchor covering a tile claims it (matching the per-tile
lex-smallest-winner tie-break), with a per-anchor "first covering feature per
offset" scratch to reproduce the selector's first-covering-feature + break-on-
suppress semantics exactly, including negative (off-edge) anchors. Verified
byte-identical to the per-tile surfaceFeatureProcedural across 80 seeds × all
16,384 tiles (0 mismatches), so surfaceFeatureAt's carve detection and the
feature-field oracle stay consistent. Measured: ~0.55 ms/bake (was ~8 ms);
200× createScenario ~0.45 s (was ~2-4 s).

Also:
- Reverted the vitest testTimeout bump — no longer needed; the bake is fast.
- createScenario resets surfaceComponentMask after initColony so the persisted/
  validated mask is rooted at the real canonical root (first colony entrance)
  rather than relying on the pre-colony fallback root happening to equal it
  (ship-review advisory).

verify green (79 files / 2336 tests). PR 4 scope only; does NOT close #127/#128.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…(codex P2)

The DesignateEntrance sim gate (PR 4) now rejects tiles outside the connected
walkable component or with a blocked clearance halo, but surface-input still
previewed any empty tile — so a player saw a valid-looking entrance target and
confirmation did nothing. Add isValidEntranceTarget (empty tile AND candidate +
SURFACE_ROOT_CLEARANCE_RADIUS halo all in-component) mirroring the gate, and use
it for the right-click preview. Smoke tests updated: the mock world carries a
fully-walkable baked grid, plus new cases asserting no preview when the target or
a halo tile is HardBlock.
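
A rough sketch of the candidate-plus-halo check described above, under assumptions the commit message does not confirm: a square halo, a mask where 1 means "in the connected component", and off-grid halo tiles counting as failures. The name `isHaloInComponent` is hypothetical, and the empty-tile condition is omitted.

```typescript
// Hypothetical sketch: halo shape, mask encoding, and off-grid handling are assumptions.
function isHaloInComponent(
  mask: Uint8Array,
  w: number,
  h: number,
  cx: number,
  cy: number,
  radius: number,
): boolean {
  for (let dy = -radius; dy <= radius; dy++) {
    for (let dx = -radius; dx <= radius; dx++) {
      const x = cx + dx;
      const y = cy + dy;
      if (x < 0 || y < 0 || x >= w || y >= h) return false; // assumed: off-grid fails
      if (mask[y * w + x] !== 1) return false; // includes the candidate tile at (0,0)
    }
  }
  return true;
}
```

Sharing one predicate like this between the preview and the sim gate is what keeps the two from disagreeing about a target.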

verify green (79 files / 2338 tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ia simVersion

The e2e CI job failed (5 save/load specs): bumping SAVE_KEY v3→v4 orphaned every
spec that addresses the literal 'subterrans:save:v3', and the bump itself was only
a low-severity ship-review advisory, not a spec requirement.

Revert SAVE_FORMAT_VERSION (4→3) and SAVE_KEY (:v4→:v3). The new required
bakedSurfaceEffect lives in the SNAPSHOT (SerializedWorldState), and this codebase
versions snapshot-content shape changes by simVersion — not the envelope's
SAVE_FORMAT_VERSION (which versions envelope structure). deserializeWorldState
validates simVersion FIRST, so raising MIN_ACCEPTED_SIM_VERSION to V28 already
rejects every pre-static-terrain save (which also lacks the field) before unpack
is reached — the required behaviour, achieved without an envelope bump and
matching the v2/v3 precedent. The legacy-purge list reverts accordingly.

All required PR-4 save behaviour is retained: MIN_ACCEPTED=V28, the required
bakedSurfaceEffect field, and load-time dims/enum/connectivity validation.

Verified locally: npm run verify green (2338); the 5 previously-failing e2e specs
(menu-and-dialog #115/#196, phase-09 save-prompt) now pass (19/19 chromium).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ailing-bit guard

Two low-severity ship-review advisories on the latest diff:
- surface-features.ts: the @param world JSDoc on isAnchorSuppressedByOverlap still
  described the deleted isAnchorGameplaySuppressed call. Correct it — world is now
  only threaded to the recursive calls; no field is read (PR 4 removed dynamic
  suppression). Doc-only.
- save.ts: unpackBakedSurfaceEffect now rejects non-zero unused high bits in the
  final packed byte when expectedLen isn't a multiple of 4, so a tampered payload
  can't decode to a grid serialize would never emit. No-op for the production
  16384-tile grid (a multiple of 4), but keeps the exported, length-general
  validator honest.
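
The trailing-bit guard can be sketched like this. It is an illustration under assumptions, not the code in save.ts: the function names and the low-bits-first packing order are hypothetical.

```typescript
// Hypothetical sketch: names and bit order (lowest bits first) are assumptions.
function pack2bit(values: Uint8Array): Uint8Array {
  const out = new Uint8Array(Math.ceil(values.length / 4));
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v > 3) throw new RangeError(`value ${v} at index ${i} does not fit in 2 bits`);
    out[i >> 2] |= v << ((i & 3) * 2);
  }
  return out;
}

function unpack2bit(packed: Uint8Array, expectedLen: number): Uint8Array {
  if (packed.length !== Math.ceil(expectedLen / 4)) throw new Error("wrong packed length");
  const used = expectedLen & 3; // values held by the final byte (0 means it is full)
  if (used !== 0) {
    const last = packed[packed.length - 1];
    // Bits above the used values must be zero, or the payload is not canonical.
    if (last >> (used * 2) !== 0) throw new Error("non-zero unused bits in final byte");
  }
  const out = new Uint8Array(expectedLen);
  for (let i = 0; i < expectedLen; i++) {
    out[i] = (packed[i >> 2] >> ((i & 3) * 2)) & 3;
  }
  return out;
}
```

For the production grid, 16,384 tiles pack into 16384 / 4 = 4096 bytes, and standard padded base64 turns 4096 bytes into 4 × ceil(4096 / 3) = 5464 characters, so the guard never fires there.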

(Left as-is: the preview/sim entrance gates still diverge on column-uniqueness /
entrance-cap — that divergence predates PR 4; this PR only added and mirrored the
new component/clearance gate. Out of scope.)

verify green (79 files / 2338 tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@LightAxe

LightAxe commented Jun 9, 2026

Owner Author

🔍 Adversarial Review (Claude, standing in for Codex)

Reviewed commit: 58e2426 — the PR 5 delta over e18e604 (PR 4 head; PR 4's portion of this stacked diff was reviewed separately in #206). Spec: plan/flurry/PR2-FIX-SPECS.md → PR 5 (Fix-A + C-both).

What I verified before the findings:

  • surface-routing.ts is sim-discipline-clean (integer-only, no RNG/clock, 4-connected BFS mirroring computeSurfaceComponentMask).
  • The corner-squeeze diagonal guard is sound (a descending orthogonal implies the movement guard's per-axis revert always has a legal cardinal) and is unit-tested, including the wrap-around case.
  • The no-revisit reconciliation takes the spec's "bypass entirely" option (R4-5) consistently at both the substitution and the corner-revert sites.
  • Scent ranking keeps Manhattan-15 eligibility first, path-distance rank second, lowest-foodPileId final tie-break (R2-3).
  • The goal-field cache is correctly derived state: reset in copyWorldState, excluded from both the save and determinism serializers, and unable to affect behaviour since entries are pure functions of frozen terrain.
  • RECENT_TILES_LEN is symbolic at every ring site.
  • The compact recent-tiles encoding is canonical and its validation paths are thoroughly tested (the PR 4 R1-12 lesson clearly landed: 14 rejection tests plus copy and mutate-roundtrip field tests).
  • V29 posture-2 with the boundary test is correct.
  • CI is green.


P2 — The aimedIntoWall=0 acceptance cap is now tautological

stepLandsOnWall (src/platform/investigation/harness.ts:393) was rewritten to call stepTowardReachable — the router under test. The router cannot return a step onto a HardBlock by construction (it only selects neighbours with finite goal-field distance; HardBlocks are SURFACE_GOAL_UNREACHED), and the doc comment admits "this returns false by construction." So the headline acceptance metric — "scent/priority-vs-wall episodes: 0 (from 127/136)" — is satisfied by definition, not by observation. Worse, the detector models the router's happy path rather than the ant's executed step, so it is blind to exactly the paths where a wall-aim could still occur: the naive pickCardinalStep fallback (zone guard / unreachable degrade in ant-system.ts), no-revisit or occupancy interactions, or any future divergence between router and movement code. Suggest deriving the wall-aim signal from observed movement intent (e.g. record the pre-guard (dx,dy) the sim actually chose for the targeted branch, or compare the executed step against HardBlock) so the cap stays falsifiable. The confinement caps (worst ≤ 60 tk, >300 tk = 0) remain genuine observations and still bite — this finding is specifically about the wall-aim oracle.
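
One way to keep the wall-aim signal independent of the router is to compute the naive intent directly and test it against the terrain. The sketch below is illustrative only: the HardBlock encoding, the grid layout, and the names `naiveIntent` and `naiveIntentHitsWall` are assumptions, not the harness code.

```typescript
// Hypothetical sketch: grid encoding and names are assumptions, not the harness code.
const HARD_BLOCK = 1;

/** Naive intent: move one tile along the sign of each axis delta, ignoring terrain. */
function naiveIntent(fx: number, fy: number, tx: number, ty: number): { dx: number; dy: number } {
  return { dx: Math.sign(tx - fx), dy: Math.sign(ty - fy) };
}

/**
 * True when the naive intent would land on a HardBlock tile.
 * Assumes (fx, fy) and (tx, ty) are both inside the grid, so the stepped
 * tile is inside it as well.
 */
function naiveIntentHitsWall(
  grid: Uint8Array,
  w: number,
  fx: number,
  fy: number,
  tx: number,
  ty: number,
): boolean {
  const { dx, dy } = naiveIntent(fx, fy, tx, ty);
  if (dx === 0 && dy === 0) return false; // already at the target
  return grid[(fy + dy) * w + (fx + dx)] === HARD_BLOCK;
}
```

Because this never calls the router, it can still return true after the fix, which is what makes a cap of zero an observation instead of a definition.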

P2 — Underground scent-pull silently removed, and the comment claims the opposite

Pre-PR 5, the scent branch had no zone guard: an underground SearchingFood forager (no open entrance, no priority target) within Manhattan-15 of a pile cardinal-stepped toward the pile's coordinates. The new code (ant-system.ts:4598) returns null for underground ants, falling through to pheromone/wander — while the priority branch in the same situation explicitly preserves its naive-cardinal fallback. The scent branch's comment justifies the fall-through as "(pre-PR-5 behaviour)", which is factually wrong — pre-PR-5 behaviour was the scent pull. This is a silent target-selection change in exactly the area the spec scoped as untouchable ("the primitive replaces the STEP, not the target identity"; "without changing target selection semantics that aren't broken"). The old underground pull (steering by surface-pile coords in underground space) may well be the right thing to delete — but then delete it deliberately: state it in the comment and the PR body, and treat it as a scoped V29 behaviour change, not an accident the next reader inherits. Otherwise restore the cardinal scent pull for underground ants, mirroring the priority branch.

P2 — The surface unreachable-target case degrades to wall-aiming instead of asserting

The spec's locked owner requirement has two parts: "no path → pin on wall" is structurally eliminated, and an actual connectivity violation "yields InvariantViolation (assert, don't silently mill)." The implementation pre-checks reachability in the priority branch (ant-system.ts:4570) and, for a surface ant whose target is unreachable, silently falls back to the naive cardinal step, which is precisely the #127 wall-pinning behaviour this PR exists to eliminate; findReachableScentPile likewise silently skips unreachable piles. As a result, the assert inside stepTowardReachable is dead code on every production path, and a PR 4 invariant breach in production would manifest as a quiet #127 regression instead of a loud failure. The stated motivation (contrived hand-built test worlds) doesn't require this, because those tests fill the grid with walkable tiles. Keep the silent fallback only for the legitimate underground zone guard; for zone === Surface + SURFACE_GOAL_UNREACHED, throw (or at minimum count/log loudly).

P3 — packRecentTiles comment mischaracterizes its own filter

src/platform/save.ts:578 says the antId < nextEntityId bound "(skips dead-but-allocated entities' stale ring state)". It doesn't — it skips never-allocated ids; dead-but-allocated ants are included, which is correct and exactly what the spec's canonical encoding requires. Fix the comment before it misleads a future change into "optimizing away" dead ants and breaking byte-exact round-trip.

P3 — Load accepts ring records the serializer can never produce

unpackRecentTiles validates antId < capacity but not antId < nextEntityId (available as rawNext in deserializeWorldState before ants are deserialized). A tampered save can carry ring state for never-allocated ids — harmless today because initAnt re-clears the ring on allocation, but it's a cheap validation that would make load reject everything serialize can't emit, consistent with the encoding's otherwise-strict canonicality.
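
A sketch of the suggested bound, assuming the record ids are available as a plain array before the rings are rebuilt. The function name is hypothetical, and the strictly-ascending check is my assumption about what "sorted by antId" implies for a canonical stream.

```typescript
// Hypothetical validator: the name, error text, and ascending check are assumptions.
function validateRingRecordIds(
  antIds: readonly number[],
  nextEntityId: number,
  capacity: number,
): void {
  // nextEntityId bounds what the serializer can emit; capacity bounds memory.
  const bound = Math.min(nextEntityId, capacity);
  let prev = -1;
  for (const id of antIds) {
    if (!Number.isInteger(id) || id < 0 || id >= bound) {
      throw new RangeError(`ring record antId ${id} outside [0, ${bound})`);
    }
    if (id <= prev) {
      throw new RangeError(`ring record antId ${id} is not strictly ascending`);
    }
    prev = id;
  }
}
```

Taking the minimum of both bounds matters when the two fields are tampered with independently: a forged large nextEntityId must not widen the range past the allocated capacity.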

P3 — Sweep omits the spec's "underground-carrier behaviour" metric

The §PR 5 sweep metric list is "pickup→deposit latency, mean & 95p path length, task-completion counts, underground-carrier behaviour, plus serialized save bytes." ThroughputSample covers the first three (and the carry leg implicitly crosses zones) and the size cap is tested separately, but nothing distinguishes underground-carrier behaviour specifically. Worth either a one-line justification for why the carry-leg latency subsumes it, or a small underground-specific counter in the sweep.

P3 — Goal-field cache eviction is untested

The clear-all-at-256 overflow path (surface-routing.ts:143) has no test. Eviction is correctness-neutral (entries recompute identically), but one test — 257 distinct targets, assert the cache cleared and a re-requested field recomputes equal — would pin the bound and the rebuild behaviour.
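
The clear-all-on-overflow policy and the test shape suggested above could look roughly like this. It is a standalone sketch: `createGoalFieldCache` and its interface are hypothetical, only the cap of 256 comes from the review text, and the PR keeps its cache on the world state instead of in a closure.

```typescript
// Hypothetical sketch: the factory, its interface, and the key type are assumptions.
interface GoalFieldCache {
  get(key: number): Int32Array;
  size(): number;
}

function createGoalFieldCache(cap: number, compute: (key: number) => Int32Array): GoalFieldCache {
  const fields = new Map<number, Int32Array>();
  return {
    get(key: number): Int32Array {
      const hit = fields.get(key);
      if (hit !== undefined) return hit;
      // On overflow, drop everything; entries are pure functions of the key,
      // so a later request recomputes an identical field.
      if (fields.size >= cap) fields.clear();
      const field = compute(key);
      fields.set(key, field);
      return field;
    },
    size(): number {
      return fields.size;
    },
  };
}
```

A regression test along the lines the finding describes would request 257 distinct keys, assert the size dropped back to 1, and then check that a re-requested key yields equal contents.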


VERDICT: REVISE — the routing core, encoding, and test discipline are strong (notably better per-field/validation coverage than PR 4 shipped with), but the three P2s cut at the PR's own acceptance story: the headline wall-aim cap no longer measures behaviour, one target-selection change is silent and mislabeled, and the locked "assert, don't silently mill" requirement is bypassed on every production path.

…lence/serialization tests, hardening

Fable (adversarial review standing in for Codex) found no runtime bugs but flagged
spec-mandated validation/test gaps + low-severity hardening. All addressed:

P2 — load-validation + load-bearing test coverage:
- validateSurfaceConnectivity now also checks each entrance's full radius-3
  clearance halo is in-component (spec R3-8), mirroring the DesignateEntrance
  gate — so a corrupt/tampered save the sim gate could never produce is rejected.
- Committed the bake↔selector equivalence sweep (bakeSurfaceEffectGrid byte-
  identical to per-tile surfaceFeatureProcedural across 6 seeds × all 16,384
  tiles), plus direct computeSurfaceComponentMask and corridor-carve
  (carveCorridor merges a HardBlock-split row) regression tests.
- §6.5 field-specific serialization tests for bakedSurfaceEffect: create→copy
  round-trip (determinism.test), save→load mutate-and-survive (catches a
  re-derive-from-seed deserializer), every unpackBakedSurfaceEffect rejection
  path, and load-time connectivity rejection.

P3 — robustness:
- Oracle entrance-designation step asserts ≥1 entrance was actually accepted (no
  vacuous rejection-only pass).
- surfaceComponentMask materialized eagerly at world-gen, load, AND tick entry, so
  input/render queries (and mid-tick spawn checks) are pure reads, never a lazy
  world mutation.
- isValidEntranceTarget halo bounds use the grid CONSTANTS (matches the sim gate).
- packBakedSurfaceEffect throws on out-of-range effect values (fail at write, not
  a later read); unpackBakedSurfaceEffect rejects by base64 length BEFORE decoding
  (DoS guard, matching the issue #99 load-cap posture).
- validateSurfaceConnectivity rejects the degenerate all-HardBlock-root case (zero
  entrances + zero piles no longer passes on a garbage all-zero mask).

verify green (79 files / 2345 tests, +9 new); internal ship-review passed:true.
Still PR 4 scope (static terrain); does NOT close #127/#128.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@LightAxe

Owner Author

🔍 Adversarial Re-Review (Claude, standing in for Codex) — round 2

Reviewed commit: 6a740d9 (delta over 58e2426, the previously reviewed PR 5 head). CI green.

All round-1 findings are addressed — verified against the diff:

  • P2 tautological wall-aim metric ✅ — stepLandsOnWall is reverted to the independent naive per-axis-sign signature, measured without calling the router under test, and the new doc comment captures the correct falsifiability story: Fix-A satisfies the cap by eliminating the confinement episodes the flag is tallied within (episodeAimedWall only sets during a confined run), not by redefining the flag. If Fix-A regresses — an ant confined while its naive intent points at a wall — this trips again. Exactly right.
  • P2 underground scent-pull ✅ (code) — the removal is kept and now documented honestly as a deliberate, scoped V29 behaviour change with the real rationale (the old pull fed surface pile coordinates to an underground walker — meaningless steering). One residual below.
  • P2 unreachable-surface fallback ✅ — finding withdrawn on the evidence. The investigation reframes this case correctly: PR 4 guarantees the current priority pile is reachable, so a surface ant with an unreachable target necessarily holds a stale leftover (e.g. post-Fighting), which is a benign transient, not an invariant breach — and it demonstrably occurs in the S3 determinism scenarios, so asserting would crash legitimate games. The data-backed choice (cardinal step keeps the ant moving, worst confinement 0 tk; wander fallback regresses to 108 tk) is the right call, and the new comment documents both the reasoning and the rejected alternatives. This is what I wanted instead of a silent degrade.
  • P3s ✅ — packRecentTiles comment corrected (never-allocated vs dead-but-allocated); unpackRecentTiles now bounds antId by both nextEntityId (canonical serializer bound) and capacity (memory-safety bound), with a correct analysis of why both are needed under independent tampering; the underground-carrier subsumption rationale is documented on measureForagingThroughput; the cache-eviction clear-all path has a regression test (bounded size + post-clear recompute equality).

One residual, non-blocking:

P3 — The PR body wasn't actually updated for the underground scent-pull change

The new comment at ant-system.ts says the V29 underground scent-pull removal is "called out in the PR body" — it isn't. The body still describes the zone guard as "underground falls back to the cardinal step" (true for the priority branch only) and doesn't mention that underground scent-pull was deliberately dropped. Since this is the PR series' one intentional target-selection change, the description should say so — it's the document a future archaeologist reads first. Description edit only, no code change.


VERDICT: APPROVED — the acceptance story is falsifiable again, the one behaviour change is deliberate and documented (modulo the description edit above), and the load path now rejects everything the serializer can't produce.

LightAxe and others added 3 commits June 9, 2026 20:36
…te (Fable P3)

The malformed-input cases in the unpackBakedSurfaceEffect rejection test
were all shorter than the expected base64 length for a full grid, so the
DoS length pre-check rejected every one of them before the decoder branch
each label named could run. Rebuild each case at the exact full-grid
length (5464 chars) with a single targeted corruption — bad char,
mid-string padding, x=y final-quartet padding — so each exercises the
base64ToBytes branch it claims, and keep an explicit wrong-length case
for the length gate itself.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… (C-both) — flurry PR 5 (V29)

PR 5 of the #127/#128 fix series, built on PR 4 static terrain. Eliminates the
dominant #127 mechanism (scent/priority-vs-wall, 86% of episodes) and the
residual wander short-tail.

Fix-A — passability-aware stepping (surface-routing.ts):
- New stepTowardReachable(world, from, target): first step of a COMPLETE BFS goal
  field (distance-to-target over PR 4's single connected component), cached per
  target tile on the world (derived, not serialized). Replaces the naive
  pickCardinalStep in BOTH forager target branches (priority + scent). Target
  identity unchanged — only the step is path-aware. Three-valued: AtGoal (0,0) /
  Step / InvariantViolation. Scent ranks Manhattan-15-eligible piles by reachable
  path distance, lowest-foodPileId tie-break. No-revisit bypassed for targeted
  steps. Zone-guarded to the surface (underground falls back to cardinal step).
  Diagonals chosen only when a shared orthogonal is itself passable+descending
  (no corner-squeeze that the movement guard would reject).

C-both — recent-tiles deepening:
- RECENT_TILES_LEN 4→12 (chosen N: largest of {12,10,8,6,5} meeting all caps).
- Compact canonical save encoding (records sorted by antId; head + non-sentinel
  (slot,x,y) in slot order); load validates + rejects malformed streams. Far
  smaller than the old flat arrays — total save size −19% vs baseline.

simVersion V29 (posture 2): bump LATEST + raise MIN_ACCEPTED. Pocket-escape not
needed — deepening alone drives confinement to zero.

Acceptance (committed harness tests, ACCEPTANCE hold-out): aimedIntoWall=0, worst
confinement=0tk (≤60), episodes>300=0; 4-vs-N sweep N=12 meets every throughput
cap; field-specific copy/save-load + V29 boundary tests; save −18.95% (≤+5%);
tick-time 0.26 ms/tick (≤0.5). verify green (80 files / 2380 tests); ship-review
passed:true.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… metric, honest zone comments, ring-load hardening

Fable (adversarial review standing in for Codex) flagged that the acceptance
story had drifted. All findings addressed:

P2 — acceptance integrity:
- harness stepLandsOnWall reverted to the NAIVE pickCardinalStep direction (the
  #127 wall-aim signature), measured INDEPENDENTLY of the router. Calling
  stepTowardReachable made the cap tautologically 0 (the router never returns a
  wall step). Fix-A satisfies the cap by eliminating the confinement episodes the
  flag is tallied within (episodeAimedWall is only set during a confined run), not
  by redefining the flag — now falsifiable again.
- The underground scent pull is now DELIBERATELY dropped in V29 with an honest
  comment (it fed SURFACE pile coordinates to an underground walker — meaningless;
  underground searchers use pheromone/wander). The prior "(pre-PR-5 behaviour)"
  comment was wrong.
- The unreachable-surface priority case is documented as the benign transient
  stale-target handling it is (PR 4 guarantees current priority piles are
  reachable, so an unreachable target is a leftover from a prior Fighting/zone
  stint). Investigated alternatives: asserting crashes on these (S3 determinism
  scenarios); falling through to wander REGRESSES worst confinement to 108 tk. The
  cardinal step + surface-detour guard keeps the ant moving (harness confirms
  worst=0, aimedIntoWall=0), so it is kept — stepTowardReachable's throw remains a
  defensive guard.

P3:
- packRecentTiles comment corrected (antId<nextEntityId skips NEVER-allocated ids;
  dead-but-allocated ARE included).
- unpackRecentTiles now rejects antId outside [0, min(nextEntityId, capacity)) —
  nextEntityId is the canonical (serializer) bound, capacity the memory-safety
  bound; a tampered count/nextEntityId mismatch is rejected, not silently dropped.
- measureForagingThroughput doc: underground-carrier behaviour is subsumed by the
  cross-zone delivery latency/path/completion counters.
- Added a goal-field cache-eviction (clear-all-on-overflow) regression test.

verify green (80 files / 2381 tests); ship-review passed:true. Still PR 5 scope
(V29 path-aware routing + deepened recent-tiles).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@LightAxe

Owner Author

@codex review

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

…review round 1)

Entity ids are never reused and dead rings are never read (all
isRecentTile/detour consumers run only for alive ants), but an ant
killed mid-Searching/CarryingFood keeps its populated ring in memory
forever — serializing those records grew every subsequent save
monotonically. Skip non-alive ids at pack time; old saves containing
dead-ant records still load (unpack doesn't check alive) and self-heal
on the next save. Tests now seed rings on live worker ids instead of
raw ids 0/1/3, plus a regression test that a dead ant's populated ring
is not serialized.
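
The pack-time filter described above might be sketched as below. The data layout is entirely assumed for illustration (one fixed-length ring of x coordinates per entity id, a sentinel for unused slots, an `alive` byte array); none of these names come from the PR.

```typescript
// Hypothetical layout: one fixed-length ring per entity id, SENTINEL marks an unused slot.
const SENTINEL = -1;

interface RingWorld {
  nextEntityId: number;
  alive: Uint8Array; // 1 = alive, indexed by entity id
  ringX: Int16Array; // ringLen slots per entity id
  ringLen: number;
}

/** Ids whose ring should be written: allocated, alive, and with at least one used slot. */
function ringIdsToSerialize(world: RingWorld): number[] {
  const ids: number[] = [];
  for (let id = 0; id < world.nextEntityId; id++) {
    if (world.alive[id] !== 1) continue; // dead ants' rings are never read again
    const base = id * world.ringLen;
    for (let s = 0; s < world.ringLen; s++) {
      if (world.ringX[base + s] !== SENTINEL) {
        ids.push(id);
        break;
      }
    }
  }
  return ids; // ascending by construction, so the output order stays deterministic
}
```

Since the filter depends only on world state, saving, loading, and saving again produces the same set of records.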

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@LightAxe LightAxe force-pushed the flurry-pr5-routing branch from 6a740d9 to b82696b on June 10, 2026 07:22
@LightAxe

Owner Author

🔍 Adversarial Re-Review (Claude, standing in for Codex) — round 3 (post-rebase)

Reviewed commit: b82696b (delta over the approved-equivalent a2bd329; rebase fidelity onto the new PR 4 head verified — the cross-branch diff contains exactly the known underlying PR 4 commits, nothing smuggled). CI green.

The one new change — packRecentTiles now skips dead-but-allocated ants — holds up under scrutiny:

  • The leak is real and the fix is correct. Entity ids are never reused, so every ant killed mid-Searching/CarryingFood left a permanently-serialized ring record; saves grew monotonically with cumulative deaths. I verified the correctness preconditions independently: every ring consumer (isRecentTile, the detour fallback, the no-revisit sweep) runs only for alive ants, and ids are never recycled, so dead-ring bytes are genuinely unreachable state. Save→load→save remains byte-stable (both saves skip), the encoding stays canonical (the filter is deterministic), old saves with dead records still load and self-heal, and the round-trip tests were correctly migrated to alive worker ids with a regression test pinning the skip.
  • On the record: this is a deliberate deviation from the spec text. PR2-FIX-SPECS.md §PR 5 resolved the dead-but-allocated ambiguity the other way ("one per ant with ≥1 non-sentinel slot AND antId < nextEntityId … reconstructs the identical ring state"). The spec's underlying goals — canonical, unambiguous, exact for everything the sim can ever read — are all still met, and the spec's choice had a real flaw (the monotonic leak), so I accept the deviation. Worth a one-line amendment to the spec doc so the next reader doesn't flag code-vs-spec drift as a bug.

No findings.

VERDICT: APPROVED (head b82696b)

@LightAxe

Owner Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b82696bc17

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/input/surface-input.ts Outdated
Comment thread src/sim/surface-routing.ts
…sk for input, BFS scratch queue

Codex P1: isValidEntranceTarget reached ensureSurfaceComponentMask via
isSurfaceTileInComponent, which writes the memoised mask back onto the
world on a cache miss — an input-layer WorldState mutation violating the
sim/render boundary (input may only read sim state and enqueue
commands). The preview now fetches the mask once through a new
getSurfaceComponentMaskReadOnly, which returns the cached mask (always
populated by createScenario/deserialize/tick in practice) or computes a
transient one WITHOUT caching. Regression test pins that a null mask
cache stays null across a preview query.
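
The split between a memoising getter and a read-only getter can be sketched as follows. The `MaskWorld` shape, the field name `surfaceComponentMask`, and the stand-in `computeMaskSketch` are assumptions for illustration; only the behaviour (return the cached mask, or compute a transient one without writing it back) comes from the description above.

```typescript
// Hypothetical world shape; the mask is a derived cache, null until first computed.
interface MaskWorld {
  passable: boolean[];
  surfaceComponentMask: Uint8Array | null;
}

// Stand-in for the real connected-component computation: it only marks
// passable cells, which is enough to show the caching contract.
function computeMaskSketch(world: Readonly<MaskWorld>): Uint8Array {
  const mask = new Uint8Array(world.passable.length);
  for (let i = 0; i < mask.length; i++) mask[i] = world.passable[i] ? 1 : 0;
  return mask;
}

// Sim-side getter: memoises the mask on the world on a cache miss.
function ensureMaskSketch(world: MaskWorld): Uint8Array {
  if (world.surfaceComponentMask === null) {
    world.surfaceComponentMask = computeMaskSketch(world);
  }
  return world.surfaceComponentMask;
}

// Input-side getter: returns the cached mask when present, otherwise a
// transient mask that is NOT written back to the world.
function getMaskReadOnlySketch(world: Readonly<MaskWorld>): Uint8Array {
  return world.surfaceComponentMask ?? computeMaskSketch(world);
}
```

Typing the parameter as `Readonly<MaskWorld>` makes an accidental write-back a compile-time error in this sketch, which mirrors the rule that input may only read sim state.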

Codex P2: computeSurfaceGoalField allocated a fresh 16k-cell BFS queue
per first-touch field compute inside the ant movement loop. The queue is
now module-level scratch (combat.ts pattern, AGENTS.md hot-loop rule);
the returned field remains a fresh allocation because the cache owns it.
Documented the amortization contract on ensureSurfaceGoalField: compute
runs at most once per target for the world's lifetime, at pile-spawn
cadence, not per ant.
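
The scratch-queue pattern and the once-per-target amortization can be sketched like this. All names here are hypothetical, the grid is passed in as a flat `Uint8Array`, and the BFS is 4-neighbour for brevity; the real `computeSurfaceGoalField` also handles diagonals and is not reproduced in this thread.

```typescript
// Module-level scratch queue, reused across calls so the hot loop does not
// allocate a new queue per field compute. Grown only if the grid is larger.
let scratchQueue = new Int32Array(0);

const DX = [1, -1, 0, 0];
const DY = [0, 0, 1, -1];

// Returns distance-to-target per cell (-1 = unreachable). The field itself
// is a fresh allocation because the cache keeps it.
function computeGoalFieldSketch(
  passable: Uint8Array, width: number, height: number, target: number,
): Int32Array {
  const cells = width * height;
  if (scratchQueue.length < cells) scratchQueue = new Int32Array(cells);
  const field = new Int32Array(cells).fill(-1);
  if (passable[target] === 0) return field;
  let head = 0;
  let tail = 0;
  field[target] = 0;
  scratchQueue[tail++] = target;
  while (head < tail) {
    const cur = scratchQueue[head++];
    const x = cur % width;
    const y = (cur - x) / width;
    for (let d = 0; d < 4; d++) {
      const nx = x + DX[d];
      const ny = y + DY[d];
      if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
      const n = ny * width + nx;
      if (passable[n] === 0 || field[n] !== -1) continue;
      field[n] = field[cur] + 1;
      scratchQueue[tail++] = n; // each cell is enqueued at most once
    }
  }
  return field;
}

// Per-target cache: the BFS runs at most once per target for the cache's lifetime.
function ensureGoalFieldSketch(
  cache: Map<number, Int32Array>,
  passable: Uint8Array, width: number, height: number, target: number,
): Int32Array {
  let field = cache.get(target);
  if (field === undefined) {
    field = computeGoalFieldSketch(passable, width, height, target);
    cache.set(target, field);
  }
  return field;
}
```

Since every cell enters the queue at most once, a queue sized to the cell count never overflows, so the scratch buffer needs no bounds handling beyond the initial sizing.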

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@LightAxe

Owner Author

@codex review

@chatgpt-codex-connector


Codex Review: Didn't find any major issues. 🎉

