bug: stagnation escape hatch breaks before processing the last ANVIL_PLAN_UPDATE, discarding the checked retire (#323)

## Summary

The Phase2 retest after #321 is still red for run B1 of `autonomy_v2_p2_fix_slice_v1_20260410_000938`, but the blocker is now more specific than "already-satisfied item retire is missing".

The last closure-hint response contains a **correct `ANVIL_PLAN_UPDATE` that checks the final remaining file-backed item** (`src/app/api/worktrees/[id]/current-output/route.ts`), but the agentic loop hits the stagnation escape hatch and `break`s **before parsing/applying that response**.

As a result, the checked retire for the last item is dropped, `remaining=1` persists, and the run ends as `partial` + `runner_nonzero_exit` even though the model emitted the intended no-change closure.

## Reproduction / observed behavior

- HEAD under test: `4c9acf7` (PR #322 / Issue #321)
- result dir: `commandindextest/results/autonomy_v2_p2_fix_slice_v1_20260410_000938/`
- run: `B1`
- provider/model: Ollama `qwen3.5:122b` + sidecar `qwen3.5:9b`
- context/max output: `65536`

### Result artifact

`B1_result.json` shows the external failure shape:

- `valid_run=true`
- `exit_class=runner_nonzero_exit`
- `command_return_code=2`
- `changed_files=0`
- `anvil_error_count=1`
- `session_completed_count=1`

Source: `commandindextest/results/autonomy_v2_p2_fix_slice_v1_20260410_000938/B1_result.json:17-35`

### What the log shows

1. The first `ANVIL_PLAN_UPDATE` checks **five** items but does **not** include `current-output/route.ts`.

- `B1.log:458-464`

2. Runtime retires those five items and immediately reports `remaining=1 next=src/app/api/worktrees/[id]/current-output/route.ts ... no change needed`.

- `B1.log:498-504`
  - five `plan item retired via checked marker`
  - `plan update pipeline: all items checked → retired`
  - `plan-aware final gate: suppressing ANVIL_FINAL (premature) remaining=1 ... current-output/route.ts ... no change needed`

3. After the closure hint, the model emits a **second** `ANVIL_PLAN_UPDATE` that explicitly checks the last remaining item:

```text
- [x] src/app/api/worktrees/[id]/current-output/route.ts: uses detectSessionStatus() which uses buildDetectPromptOptions() - no change needed
```

- `B1.log:516-522`

4. That same response also emits `ANVIL_FINAL` saying the audit found no remaining gaps.

- `B1.log:525-559`

5. But immediately after that response, the loop logs:

- `stagnation detected; forced_mode_active=true`
- `pre-exit repair turn injected before escape hatch`
- `stagnation escape hatch: terminating loop`
- `completion_kind=partial plan_items=6 plan_finished=5`
- telemetry `plan_update_count=1`

- `B1.log:561-568`

Crucially:

- there is **no** retire log for `current-output/route.ts`
- telemetry still says `plan_update_count=1` even though the log visibly contains **two** `ANVIL_PLAN_UPDATE` blocks

This strongly indicates that the second, closure-hint-driven update was never applied.
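
The mismatch above can be checked mechanically. The following is a minimal sketch, not code from the repository: `count_plan_update_blocks` and `dropped_plan_updates` are hypothetical helpers, and the sketch assumes each block starts on a line whose trimmed text begins with `ANVIL_PLAN_UPDATE`, which may not match the real log layout.

```rust
/// Counts lines that open an `ANVIL_PLAN_UPDATE` block.
/// Assumption: the marker appears at the start of a (trimmed) line.
fn count_plan_update_blocks(log: &str) -> usize {
    log.lines()
        .filter(|line| line.trim_start().starts_with("ANVIL_PLAN_UPDATE"))
        .count()
}

/// Number of updates visible in the log but not reflected in telemetry.
/// Saturates at zero so an over-reporting telemetry value never underflows.
fn dropped_plan_updates(log: &str, telemetry_plan_update_count: usize) -> usize {
    count_plan_update_blocks(log).saturating_sub(telemetry_plan_update_count)
}
```

For the B1 shape (two blocks in the log, `plan_update_count=1`), this kind of check would report one dropped update.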

## Source-backed root cause

### 1. Escape hatch runs before late-response plan parsing

In the main loop, stagnation accounting and escape-hatch termination happen **before** the code that parses `next_token_buffer` for `ANVIL_PLAN` / `ANVIL_PLAN_UPDATE`.

Ordering in `src/app/agentic.rs`:

1. end-turn stagnation / `should_allow_escape_hatch()`
2. optional `pre-exit repair` injection
3. `break`
4. only **after that**, `try_register_plan(&next_token_buffer)` / `try_update_plan(&next_token_buffer)`

Code:

- `src/app/agentic.rs:987-1045`
- `src/app/agentic.rs:1068-1126`

So when the latest zero-tool-call response contains a corrective `ANVIL_PLAN_UPDATE` and escape hatch fires on that same turn, the loop terminates before the update pipeline ever sees the response.
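
To make the ordering problem concrete, here is a simplified model of one end-of-turn step. It is a sketch under stated assumptions, not the code in `src/app/agentic.rs`: `Plan`, `Outcome`, `end_turn_escape_first`, and `end_turn_update_first` are hypothetical names, and the toy `try_update_plan` only recognizes `- [x] <path>: <reason>` lines.

```rust
/// Hypothetical, simplified stand-in for the runtime's plan state.
struct Plan {
    remaining: Vec<String>,
    plan_update_count: u32,
}

impl Plan {
    /// Toy version of `try_update_plan`: retires every item that the
    /// response marks as `- [x] <path>: <reason>`.
    fn try_update_plan(&mut self, response: &str) {
        if !response.contains("ANVIL_PLAN_UPDATE") {
            return;
        }
        self.plan_update_count += 1;
        for line in response.lines() {
            if let Some(rest) = line.trim().strip_prefix("- [x] ") {
                let path = rest.split(':').next().unwrap_or("").trim();
                self.remaining.retain(|item| item.as_str() != path);
            }
        }
    }
}

/// Hypothetical result of one end-of-turn evaluation.
#[derive(Debug, PartialEq)]
enum Outcome {
    Complete,
    Partial { remaining: usize },
    Continue,
}

/// Ordering described in this issue: the escape hatch returns before
/// the response is parsed, so a late checked update is never applied.
fn end_turn_escape_first(plan: &mut Plan, response: &str, escape_hatch: bool) -> Outcome {
    if escape_hatch {
        return Outcome::Partial { remaining: plan.remaining.len() };
    }
    plan.try_update_plan(response);
    if plan.remaining.is_empty() { Outcome::Complete } else { Outcome::Continue }
}

/// Proposed ordering: parse the response first, then decide whether
/// the escape hatch is still needed.
fn end_turn_update_first(plan: &mut Plan, response: &str, escape_hatch: bool) -> Outcome {
    plan.try_update_plan(response);
    if plan.remaining.is_empty() {
        Outcome::Complete
    } else if escape_hatch {
        Outcome::Partial { remaining: plan.remaining.len() }
    } else {
        Outcome::Continue
    }
}
```

With one remaining item, a response that checks it, and the escape hatch firing on the same turn, the first function leaves `remaining=1` and the update count unchanged, while the second retires the item and completes.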

### 2. Runtime explicitly instructs the model to do the thing that gets dropped

The structured pre-exit repair message tells the model:

1. mutate the remaining file, or
2. if no change is needed, emit `ANVIL_PLAN_UPDATE` with `[x]`, then
3. emit `ANVIL_FINAL`

Code:

- `src/app/mod.rs:846-879`

So the runtime's own recovery protocol expects checked retirement on the last turn, but the loop ordering can discard exactly that response.

### 3. Checked retire machinery itself is not the primary failure here

The checked-first pipeline does exist and works when it actually runs:

- `src/app/execution_plan.rs:121-168`
- `tests/plan_item_retire.rs:16-48`

This B1 failure is therefore distinct from the original #301 retirement gap. The issue is not merely “cannot retire already-satisfied item”; it is “the final checked update never reaches the retirement pipeline because the loop breaks too early.”

## Why this is distinct from existing issues

### vs #301 / #315

Those were about retirement semantics once the update path is actually processed.

Here, the last `ANVIL_PLAN_UPDATE` is present in the model output but is dropped before `try_update_plan()` runs.

### vs #321

#321 focused on reactive reachability from repeated single-file edit failure into `agent.fix_slice`.

This B1 failure is a **read-only / no-change closure** path after closure hint, with `fixslice_escalation_count=0` and no worker-path evidence. It is a separate late-stage loop-ordering bug.

### vs #287

#287 was a broader `remaining=1` hang / timeout family.

This issue is narrower and source-backed: the agentic loop has a concrete control-flow bug where escape hatch termination precedes late-response plan update parsing.

## Impact

- A run can still fail on the last no-change item even when the model emits the correct checked retirement.
- Late-stage closure mode becomes self-contradictory: runtime asks for `ANVIL_PLAN_UPDATE [x]`, then drops the answer.
- Phase2 worker-on short smoke can remain red after #321 even if the model reaches the correct “no code change needed” conclusion.

## Fix direction

1. **Process `next_token_buffer` before any escape-hatch break**
   - Apply `try_register_plan()` / `try_update_plan()` and ANVIL_FINAL tracking first.
   - Re-evaluate completion / final gate after those updates.

2. **Do not inject a pre-exit repair message and `break` in the same branch**
   - If a repair message is injected, the loop should continue at least one more iteration so the message can actually be sent and its response processed.
   - Otherwise the injected repair message is dead-on-arrival.

3. **Add regression coverage for the exact ordering bug**
   - Setup: one unfinished plan item remains.
   - Model response has zero tool calls, emits `ANVIL_PLAN_UPDATE` with `[x]` for the last item plus `ANVIL_FINAL`.
   - `should_allow_escape_hatch()` is true on that same turn.
   - Expected: the update is applied, the item becomes `AlreadySatisfied`, final gate allows termination, and completion is not `partial`.

4. **Add telemetry/assertion coverage**
   - When two `ANVIL_PLAN_UPDATE` blocks are emitted across the run, `plan_update_count` should reflect both once both responses are processed.
   - For this repro shape, `current-output/route.ts` should produce a checked-retire log before termination.
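
Fix directions 1 to 3 can be sketched together as a toy loop driven by scripted responses. This is only an illustration under assumptions: `run_loop`, `RunSummary`, `Completion`, `apply_plan_update`, and `stagnation_limit` are hypothetical, the final gate is reduced to "no items remain", and the real loop in `src/app/agentic.rs` has more state than is modeled here.

```rust
use std::collections::VecDeque;

/// Hypothetical completion states, loosely mirroring `completion_kind`.
#[derive(Debug, PartialEq)]
enum Completion {
    Complete,
    Partial,
}

/// Hypothetical run summary used only by this sketch.
#[derive(Debug)]
struct RunSummary {
    completion: Completion,
    plan_update_count: u32,
    repair_turns: u32,
}

/// Toy update pipeline: retires items marked `- [x] <path>: <reason>`.
/// Returns true if the response carried an `ANVIL_PLAN_UPDATE` block.
fn apply_plan_update(remaining: &mut Vec<String>, response: &str) -> bool {
    if !response.contains("ANVIL_PLAN_UPDATE") {
        return false;
    }
    for line in response.lines() {
        if let Some(rest) = line.trim().strip_prefix("- [x] ") {
            let path = rest.split(':').next().unwrap_or("").trim();
            remaining.retain(|item| item.as_str() != path);
        }
    }
    true
}

/// Drives scripted zero-tool-call responses through a simplified loop.
/// `stagnation_limit` is an assumed threshold for the escape hatch.
fn run_loop(
    mut remaining: Vec<String>,
    mut responses: VecDeque<String>,
    stagnation_limit: u32,
) -> RunSummary {
    let mut plan_update_count = 0;
    let mut repair_turns = 0;
    let mut stagnant_turns = 0;
    let mut repair_injected = false;

    while let Some(response) = responses.pop_front() {
        // 1. Always process the latest response before any exit decision.
        let before = remaining.len();
        if apply_plan_update(&mut remaining, &response) {
            plan_update_count += 1;
        }
        // 2. Re-evaluate the (simplified) final gate after the update.
        if remaining.is_empty() {
            break;
        }
        // 3. Only then do stagnation accounting.
        if remaining.len() == before {
            stagnant_turns += 1;
        } else {
            stagnant_turns = 0;
        }
        if stagnant_turns >= stagnation_limit {
            if !repair_injected {
                // Inject the repair message and keep looping so that
                // the reply to it is actually read on the next turn.
                repair_injected = true;
                repair_turns += 1;
                continue;
            }
            break; // repair was already tried once; give up
        }
    }

    let completion = if remaining.is_empty() {
        Completion::Complete
    } else {
        Completion::Partial
    };
    RunSummary { completion, plan_update_count, repair_turns }
}
```

The design choice illustrated is that a repair injection and a `break` never share a branch: the first stagnation hit spends one extra iteration, and only a second hit terminates the loop as `Partial`.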

## Acceptance criteria

- [ ] A late zero-tool-call response containing `ANVIL_PLAN_UPDATE` is still parsed/applied even if escape hatch would otherwise fire that turn
- [ ] The final checked item can reach `AlreadySatisfied` before loop termination
- [ ] `completion_kind` is not `partial` for the reproduced B1 no-change closure path
- [ ] `plan_update_count` reflects the late corrective update instead of staying at `1`
- [ ] A regression test covers the "escape hatch fires on the same turn as the final checked update" path

