63 changes: 54 additions & 9 deletions commands/gen-plan.md
@@ -266,6 +266,10 @@ After Claude candidate plan v1 is ready, run iterative challenge/refine rounds w
"${CLAUDE_PLUGIN_ROOT}/scripts/ask-codex.sh" "<review current candidate plan>"
```
- Prompt MUST include current candidate plan, prior disagreements, and unresolved items
- Prompt MUST include the RLCR plan contract: `AC-*` items are current RLCR completion gates; deferred, future, out-of-scope, post-work, or successor-loop goals must be represented as `FUT-*` under `## Future Work / Out of Scope`, optionally with a current-loop handoff AC.
- Prompt MUST require Codex to inspect each AC for deferral semantics. If any AC claims the real work happens outside this RLCR loop, Codex MUST put it under `REQUIRED_CHANGES`, not `OPTIONAL_IMPROVEMENTS`.
- Prompt MUST require a hard keyword scan within each AC body. Treat these strings as blocking unless the AC follows the Handoff AC Pattern and the current-loop verification is complete without performing future work: `TODO`, `TBD`, `deferred`, `future`, `follow-up`, `subsequent`, `next phase`, `next iteration`, `next milestone`, `next loop`, `v2`, `v.next`, `Phase II`, `left for`, `to be implemented in`, `see FUT-`.
- Prompt MUST require AC/Task bidirectional coverage: every `AC-*` is targeted by at least one Task Breakdown row; every Task Breakdown row targets at least one current-scope `AC-*`; no task target may be empty, `-`, `FUT-*`, or `DEC-*`.
- Require output format:
- `AGREE:` points accepted as reasonable
- `DISAGREE:` points considered unreasonable and why
@@ -384,6 +388,7 @@ Deeply think and generate the plan.md following these rules:
## Acceptance Criteria

Following TDD philosophy, each criterion includes positive and negative tests for deterministic verification.
`AC-*` items are current RLCR completion gates: they must describe work that this implementation loop must complete and verify. Do not encode deferred, future, out-of-scope, post-work, or successor-loop goals as `AC-*`.

- AC-1: <First criterion>
- Positive Tests (expected to PASS):
@@ -400,6 +405,21 @@ Following TDD philosophy, each criterion includes positive and negative tests fo
- Negative Tests: <...>
...

### Handoff AC Pattern

Use this pattern only when the draft contains a legitimate future goal that must be preserved without making it part of the current RLCR completion gate.

- AC-X: Handoff for <future goal> is complete without performing the future work.
- Future Work Reference: FUT-Y
- Positive Tests (expected to PASS):
- <Current-loop artifact/state/documentation exists>
- <Handoff documentation explains resume commands, prerequisites, and success criteria>
- <The implementation remains in the explicitly chosen current-loop state, e.g. disabled/scaffold/report-only>
- Negative Tests (expected to FAIL):
- <The implementation claims the future goal is complete>
- <The implementation enables or performs out-of-scope future work>
- <The handoff documentation omits resume steps>

## Path Boundaries

Path boundaries define the acceptable range of implementation quality and choices.
@@ -450,11 +470,22 @@ Each task must include exactly one routing tag:
- `coding`: implemented by Claude
- `analyze`: executed via Codex (`/humanize:ask-codex`)

Every `AC-*` must be covered by at least one task. Every task must target at least one `AC-*`. Do not target `FUT-*`, `DEC-*`, or `-` in the Target AC column.

| Task ID | Description | Target AC | Tag (`coding`/`analyze`) | Depends On |
|---------|-------------|-----------|----------------------------|------------|
| task1 | <...> | AC-1 | coding | - |
| task2 | <...> | AC-2 | analyze | task1 |
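
The bidirectional coverage rule above can be checked mechanically. The following is a minimal sketch, assuming the Task Breakdown table uses the five-column pipe layout shown here and that multiple targets in one cell are comma-separated; `parse_task_rows` and `check_coverage` are hypothetical helper names, not part of the plugin.

```python
def parse_task_rows(table: str) -> list[dict[str, str]]:
    """Parse a pipe-delimited Task Breakdown table into row dicts."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip malformed lines, the header row, and the |---| separator row.
        if len(cells) < 5 or cells[0] == "Task ID" or set(cells[0]) <= {"-"}:
            continue
        rows.append({"task": cells[0], "targets": cells[2]})
    return rows


def check_coverage(ac_ids: list[str], rows: list[dict[str, str]]) -> list[str]:
    """Return coverage problems; an empty list means both directions hold."""
    known = set(ac_ids)
    covered = set()
    problems = []
    for row in rows:
        # Assumption: several targets in one cell are separated by commas.
        targets = [t.strip() for t in row["targets"].split(",") if t.strip()]
        valid = [t for t in targets if t in known]
        if not valid:
            problems.append(f"{row['task']}: no current-scope AC target")
        for t in targets:
            if t not in known:  # catches "-", FUT-*, DEC-*, and unknown IDs
                problems.append(f"{row['task']}: target {t!r} is not a current-scope AC")
        covered.update(valid)
    for ac in ac_ids:
        if ac not in covered:
            problems.append(f"{ac}: not covered by any task")
    return problems
```

Passing the list of `AC-*` IDs from `## Acceptance Criteria` together with the parsed rows yields one message per violation, so a task that targets `FUT-1` and an AC with no task are both reported.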

## Future Work / Out of Scope

Future, deferred, post-work, successor-loop, and out-of-scope items belong here, not under `## Acceptance Criteria`.

- FUT-1: <Future item that is not required for this RLCR loop>
- Source DEC: DEC-1
- Current-loop handoff: AC-X
- Promotion trigger: <Condition or follow-up loop that should promote this to a current-scope AC>

## Claude-Codex Deliberation

### Agreements
@@ -510,23 +541,33 @@ When `alternative_plan_language` is empty, absent, set to `"English"`, or set to

5. **AC Format**: All acceptance criteria must use AC-X or AC-X.Y format.

6. **Clear Dependencies**: Show what depends on what, not when things happen.
6. **Current-Scope AC Contract**: `AC-*` items are the current RLCR completion gate. Do NOT create deferred ACs. Any deferred, future, out-of-scope, post-work, successor-task, or successor-loop goal must be written as `FUT-*` under `## Future Work / Out of Scope`, optionally linked to a current-loop Handoff AC.

7. **Deferred AC Keyword Guard**: Before finalizing, scan each AC body for deferral markers: `TODO`, `TBD`, `deferred`, `future`, `follow-up`, `subsequent`, `next phase`, `next iteration`, `next milestone`, `next loop`, `v2`, `v.next`, `Phase II`, `left for`, `to be implemented in`, and `see FUT-`. If any marker means the AC's real work is outside this loop, rewrite the item as a current-loop handoff AC plus a `FUT-*` item, or move it entirely to future work.

8. **Handoff AC Pattern**: When preserving a future goal, write a current-loop AC only for the handoff state/artifact/documentation. The handoff AC may reference `FUT-*`, but its positive and negative tests must be fully verifiable in this loop and must not require completing the future work.

9. **AC/Task Bidirectional Coverage**: Every `AC-*` must be covered by at least one Task Breakdown row. Every Task Breakdown row must target at least one current-scope `AC-*`. No row may use an empty target, `-`, `FUT-*`, or `DEC-*` as its Target AC.

10. **DEC/FUT Linkage**: If a resolved decision defers work, the decision resolution must explicitly reference a `FUT-*` item. Each `FUT-*` item caused by a decision must include `Source DEC: DEC-N`. If there is a current-loop handoff, both the DEC and FUT entry should reference the handoff AC.

11. **Clear Dependencies**: Show what depends on what, not when things happen.

7. **TDD-Style Tests**: Each acceptance criterion MUST include both positive tests (expected to pass) and negative tests (expected to fail). This follows Test-Driven Development philosophy and enables deterministic verification.
12. **TDD-Style Tests**: Each acceptance criterion MUST include both positive tests (expected to pass) and negative tests (expected to fail). This follows Test-Driven Development philosophy and enables deterministic verification.

8. **Affirmative Path Boundaries**: Describe upper and lower bounds using affirmative language (what IS acceptable) rather than negative language (what is NOT acceptable).
13. **Affirmative Path Boundaries**: Describe upper and lower bounds using affirmative language (what IS acceptable) rather than negative language (what is NOT acceptable).

9. **Respect Deterministic Designs**: If the draft specifies a fixed approach with no choices, reflect this in the plan by narrowing the path boundaries to match the user's specification.
14. **Respect Deterministic Designs**: If the draft specifies a fixed approach with no choices, reflect this in the plan by narrowing the path boundaries to match the user's specification.

10. **Code Style Constraint**: The generated plan MUST include a section or note instructing that implementation code and comments should NOT contain plan-specific progress terminology such as "AC-", "Milestone", "Step", "Phase", or similar workflow markers. These terms belong in the plan document, not in the resulting codebase.
15. **Code Style Constraint**: The generated plan MUST include a section or note instructing that implementation code and comments should NOT contain plan-specific progress terminology such as "AC-", "Milestone", "Step", "Phase", or similar workflow markers. These terms belong in the plan document, not in the resulting codebase.

11. **Draft Completeness Requirement**: The generated plan MUST incorporate ALL information from the input draft document without omission. The draft represents the most valuable human input and must be fully preserved. Any clarifications obtained through Phase 6 should be added incrementally to the draft's original content, never replacing or losing any original requirements. The final plan must be a superset of the draft information plus all clarified details.
16. **Draft Completeness Requirement**: The generated plan MUST incorporate ALL information from the input draft document without omission. The draft represents the most valuable human input and must be fully preserved. Any clarifications obtained through Phase 6 should be added incrementally to the draft's original content, never replacing or losing any original requirements. The final plan must be a superset of the draft information plus all clarified details.

12. **Debate Traceability**: The plan MUST include Codex-first findings, Claude/Codex agreements, resolved disagreements, and unresolved decisions. Unresolved opposite opinions MUST be recorded in `## Pending User Decisions` for explicit user decision.
17. **Debate Traceability**: The plan MUST include Codex-first findings, Claude/Codex agreements, resolved disagreements, and unresolved decisions. Unresolved opposite opinions MUST be recorded in `## Pending User Decisions` for explicit user decision.

13. **Convergence Requirement**: The plan MUST record Claude/Codex agreements, resolved disagreements, and final convergence status in `## Claude-Codex Deliberation`. Stop only when convergence conditions are met or max rounds reached with explicit carry-over decisions.
18. **Convergence Requirement**: The plan MUST record Claude/Codex agreements, resolved disagreements, and final convergence status in `## Claude-Codex Deliberation`. Stop only when convergence conditions are met or max rounds reached with explicit carry-over decisions.

14. **Task Tag Requirement**: The plan MUST include `## Task Breakdown`, and every task MUST be tagged as either `coding` or `analyze` (no untagged tasks, no other tag values).
19. **Task Tag Requirement**: The plan MUST include `## Task Breakdown`, and every task MUST be tagged as either `coding` or `analyze` (no untagged tasks, no other tag values).
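
The Deferred AC Keyword Guard in the rules above amounts to a string scan over each AC body. Below is a minimal sketch under stated assumptions: matching is case-insensitive and bounded so that `v2` does not fire inside a word such as `env2` (the rule text does not specify either choice), and `scan_ac_bodies` is a hypothetical helper name. A hit is only a candidate for review; deciding whether the AC is a valid Handoff AC still needs judgment.

```python
import re

# Marker list copied from the Deferred AC Keyword Guard rule.
DEFERRAL_MARKERS = [
    "TODO", "TBD", "deferred", "future", "follow-up", "subsequent",
    "next phase", "next iteration", "next milestone", "next loop",
    "v2", "v.next", "Phase II", "left for", "to be implemented in", "see FUT-",
]


def _compile(marker: str) -> re.Pattern:
    # Assumption: require a non-alphanumeric boundary on each side that ends
    # in a letter or digit, so "v2" is not matched inside "env2".
    head = r"(?<![A-Za-z0-9])" if marker[0].isalnum() else ""
    tail = r"(?![A-Za-z0-9])" if marker[-1].isalnum() else ""
    return re.compile(head + re.escape(marker) + tail, re.IGNORECASE)


PATTERNS = [(marker, _compile(marker)) for marker in DEFERRAL_MARKERS]


def scan_ac_bodies(ac_bodies: dict[str, str]) -> dict[str, list[str]]:
    """Map each AC ID to the deferral markers found in its body text."""
    hits = {}
    for ac_id, body in ac_bodies.items():
        found = [marker for marker, pattern in PATTERNS if pattern.search(body)]
        if found:
            hits[ac_id] = found
    return hits
```

Any AC that appears in the result should either be rewritten as a Handoff AC plus a `FUT-*` item or moved to future work, as the rule describes.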

---

@@ -549,6 +590,10 @@ After updating, **read the complete plan file** and verify:
- The structured plan aligns with the original draft content
- Claude/Codex disagreement handling is explicit and correctly reflected
- No contradictions exist between different parts of the document
- No `AC-*` contains deferred, future, out-of-scope, post-work, or successor-loop semantics except as a valid Handoff AC whose current-loop verification is complete without performing future work
- Every `AC-*` is covered by at least one Task Breakdown row, and every Task Breakdown row targets at least one current-scope `AC-*`
- Every decision that defers work links to a `FUT-*` entry, and every such `FUT-*` entry links back with `Source DEC: DEC-N`
- Items under `## Future Work / Out of Scope` use `FUT-*`, not `AC-*`, and are not listed as current-scope Task Breakdown work

If inconsistencies are found, fix them using the Edit tool.
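
The DEC/FUT linkage item in the checklist above can be illustrated with a small check. This is a sketch only: the plan text does not fix how decision resolutions are stored, so the input shape (a dict of deferring `DEC-*` IDs to resolution text, and a dict of `FUT-*` IDs to entry text) and the name `check_dec_fut_links` are assumptions made for the example.

```python
import re

FUT_REF = re.compile(r"\bFUT-\d+\b")
SOURCE_DEC = re.compile(r"Source DEC:\s*(DEC-\d+)")


def check_dec_fut_links(
    deferring_decs: dict[str, str], fut_entries: dict[str, str]
) -> list[str]:
    """Check that each deferring decision and its FUT-* entry reference each other.

    deferring_decs: DEC ID -> resolution text (only decisions that defer work).
    fut_entries:    FUT ID -> full text of the entry, including sub-bullets.
    """
    problems = []
    for dec_id, resolution in deferring_decs.items():
        refs = FUT_REF.findall(resolution)
        if not refs:
            problems.append(f"{dec_id}: defers work but references no FUT-* item")
        for fut_id in refs:
            entry = fut_entries.get(fut_id)
            if entry is None:
                problems.append(f"{dec_id}: references missing {fut_id}")
            elif dec_id not in SOURCE_DEC.findall(entry):
                problems.append(f"{fut_id}: missing 'Source DEC: {dec_id}'")
    return problems
```

Identifying which decisions actually defer work is left to the caller, since that is a semantic judgment the checklist assigns to the reviewer.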

11 changes: 7 additions & 4 deletions prompt-template/codex/full-alignment-review.md
@@ -8,6 +8,9 @@ This is a **mandatory checkpoint** (at configurable intervals). You must conduct
@{{PLAN_FILE}}

You MUST read this plan file first to understand the full scope of work before conducting your review.
Only items under `## Acceptance Criteria` and current-scope Task Breakdown rows are completion gates.
Items under `## Future Work` / `## Out of Scope`, including `FUT-*` items, are informational and MUST NOT block the COMPLETE verdict.
If a current-scope AC or current-scope task is deferred, treat it as incomplete.

---
## Claude's Work Summary
@@ -103,12 +106,12 @@ The project's `.humanize/rlcr/{{LOOP_TIMESTAMP}}/` directory contains the histor

## Part 6: Output Requirements

- If issues found OR any AC is NOT MET (including deferred ACs), write your findings to @{{REVIEW_RESULT_FILE}}
- If issues found OR any current-scope AC is NOT MET (including deferred current-scope ACs), write your findings to @{{REVIEW_RESULT_FILE}}
- Include specific action items for Claude to address, classified into:
- Mainline Gaps
- Blocking Side Issues
- Queued Side Issues
- **If development is stagnating** (see Part 4), write "STOP" as the last line
- **CRITICAL**: Only write "COMPLETE" as the last line if ALL ACs from the original plan are FULLY MET with no deferrals
- DEFERRED items are considered INCOMPLETE - do NOT output COMPLETE if any AC is deferred
- The ONLY condition for COMPLETE is: all original plan tasks are done, all ACs are met, no deferrals allowed
- **CRITICAL**: Only write "COMPLETE" as the last line if ALL current-scope ACs from the original plan are FULLY MET with no deferrals
- DEFERRED current-scope items are considered INCOMPLETE - do NOT output COMPLETE if any current-scope AC is deferred
- The ONLY condition for COMPLETE is: all current-scope original plan tasks are done, all current-scope ACs are met, no current-scope deferrals allowed
9 changes: 6 additions & 3 deletions prompt-template/codex/regular-review.md
@@ -7,6 +7,9 @@

You MUST read this plan file first to understand the full scope of work before conducting your review.
This plan contains the complete requirements and implementation details that Claude should be following.
Only items under `## Acceptance Criteria` and current-scope Task Breakdown rows are completion gates.
Items under `## Future Work` / `## Out of Scope`, including `FUT-*` items, are informational and MUST NOT block the COMPLETE verdict.
If a current-scope AC or current-scope task is deferred, treat it as incomplete.

Based on the original plan and @{{PROMPT_FILE}}, Claude claims to have completed the work. Please conduct a thorough critical review to verify this.

@@ -23,7 +26,7 @@ Below is Claude's summary of the work completed:

- Your task is to conduct a deep critical review, focusing on finding implementation issues and identifying gaps between "plan-design" and actual implementation.
- Relevant top-level guidance documents, phased implementation plans, and other important documentation and implementation references are located under @{{DOCS_PATH}}.
- If Claude planned to defer any tasks to future phases in its summary, DO NOT follow its lead. Instead, you should force Claude to complete ALL tasks as planned.
- If Claude planned to defer any current-scope tasks to future phases in its summary, DO NOT follow its lead. Instead, you should force Claude to complete ALL current-scope tasks as planned.
- Such deferred tasks are considered incomplete work and should be flagged in your review comments, requiring Claude to address them.
- If Claude planned to defer any tasks, please explore the codebase in-depth and draft a detailed implementation plan. This plan should be included in your review comments for Claude to follow.
- Your review should be meticulous and skeptical. Look for any discrepancies, missing features, incomplete implementations.
@@ -69,8 +72,8 @@ If Claude mostly worked on queued side issues and failed to advance the mainline
- In short, your review comments can include: problems/findings/blockers; claims that don't match reality; implementation plans for deferred work (to be implemented now); implementation plans for unfinished work; goal alignment issues.
- Your output should be structured so Claude can tell which items are mainline gaps, blocking side issues, and queued side issues.
- If after your investigation the actual situation does not match what Claude claims to have completed, or there is pending work to be done, output your review comments to @{{REVIEW_RESULT_FILE}}.
- **CRITICAL**: Only output "COMPLETE" as the last line if ALL tasks from the original plan are FULLY completed with no deferrals
- **CRITICAL**: Only output "COMPLETE" as the last line if ALL current-scope tasks from the original plan are FULLY completed with no deferrals
- DEFERRED items are considered INCOMPLETE - do NOT output COMPLETE if any task is deferred
- UNFINISHED items are considered INCOMPLETE - do NOT output COMPLETE if any task is pending
- The ONLY condition for COMPLETE is: all original plan tasks are done, all ACs are met, no deferrals or pending work allowed
- The ONLY condition for COMPLETE is: all current-scope original plan tasks are done, all current-scope ACs are met, no current-scope deferrals or pending work allowed
- The word COMPLETE on the last line will stop Claude.
27 changes: 27 additions & 0 deletions prompt-template/plan/gen-plan-template.md
@@ -6,6 +6,7 @@
## Acceptance Criteria

Following TDD philosophy, each criterion includes positive and negative tests for deterministic verification.
`AC-*` items are current RLCR completion gates: they must describe work that this implementation loop must complete and verify. Do not encode deferred, future, out-of-scope, post-work, or successor-loop goals as `AC-*`.

- AC-1: <First criterion>
- Positive Tests (expected to PASS):
@@ -22,6 +23,21 @@ Following TDD philosophy, each criterion includes positive and negative tests fo
- Negative Tests: <...>
...

### Handoff AC Pattern

Use this pattern only when the draft contains a legitimate future goal that must be preserved without making it part of the current RLCR completion gate.

- AC-X: Handoff for <future goal> is complete without performing the future work.
Comment on lines +26 to +30

P2: Keep the handoff pattern outside acceptance criteria

When the generated plan retains this instructional block, it lives under `## Acceptance Criteria`, so RLCR setup/review will treat it as part of the current completion gate: Phase 2 copies this template into the output plan, and the new reviewer prompts define everything under Acceptance Criteria as in scope. The block also contains the new deferral keywords (`future`, `FUT-*`, `without performing the future work`), so plans with no intended future work can be flagged as having deferred/incomplete AC content unless Claude happens to delete or relocate this template text.


- Future Work Reference: FUT-Y
- Positive Tests (expected to PASS):
- <Current-loop artifact/state/documentation exists>
- <Handoff documentation explains resume commands, prerequisites, and success criteria>
- <The implementation remains in the explicitly chosen current-loop state, e.g. disabled/scaffold/report-only>
- Negative Tests (expected to FAIL):
- <The implementation claims the future goal is complete>
- <The implementation enables or performs out-of-scope future work>
- <The handoff documentation omits resume steps>

## Path Boundaries

Path boundaries define the acceptable range of implementation quality and choices.
@@ -72,11 +88,22 @@ Each task must include exactly one routing tag:
- `coding`: implemented by Claude
- `analyze`: executed via Codex (`/humanize:ask-codex`)

Every `AC-*` must be covered by at least one task. Every task must target at least one `AC-*`. Do not target `FUT-*`, `DEC-*`, or `-` in the Target AC column.

| Task ID | Description | Target AC | Tag (`coding`/`analyze`) | Depends On |
|---------|-------------|-----------|----------------------------|------------|
| task1 | <...> | AC-1 | coding | - |
| task2 | <...> | AC-2 | analyze | task1 |

## Future Work / Out of Scope

Future, deferred, post-work, successor-loop, and out-of-scope items belong here, not under `## Acceptance Criteria`.

- FUT-1: <Future item that is not required for this RLCR loop>
- Source DEC: DEC-1
- Current-loop handoff: AC-X
- Promotion trigger: <Condition or follow-up loop that should promote this to a current-scope AC>

## Claude-Codex Deliberation

### Agreements