feat: simplify lever classification to primary/secondary/remove #372

Closed
neoneye wants to merge 2 commits into main from simplify-lever-classification

neoneye commented Mar 20, 2026

Summary

  • Replaces the 4-way taxonomy from PR #365 (primary/secondary/absorb/remove) with a simpler 3-way classification: primary, secondary, remove
  • Absorbed levers are now classified as remove with the absorbing lever_id stated in the justification — no separate absorb category
  • Hypothesis: fewer categories means each class gets exercised more, making it easier to validate whether the model is using them correctly
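
A minimal sketch of the 3-way taxonomy, assuming a Python `Enum` plus a small decision record. `LeverDecision` and `absorb_into` are hypothetical names used only for illustration; the PR's actual models may look different. The point is that an absorbed lever becomes a `remove` whose justification names the absorbing lever_id.

```python
from dataclasses import dataclass
from enum import Enum


class LeverClassification(str, Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"
    REMOVE = "remove"  # also covers what used to be "absorb"


@dataclass
class LeverDecision:  # hypothetical record, not the PR's actual model
    lever_id: str
    classification: LeverClassification
    justification: str


def absorb_into(lever_id: str, absorbing_lever_id: str, reason: str) -> LeverDecision:
    """Express an absorb as a remove that states the absorbing lever_id."""
    return LeverDecision(
        lever_id=lever_id,
        classification=LeverClassification.REMOVE,
        justification=f"Absorbed by {absorbing_lever_id}: {reason}",
    )
```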

Changes from main (keep/absorb/remove → primary/secondary/remove)

  • LeverClassification enum: keep → primary + secondary, absorb merged into remove
  • OutputLever now includes a classification field (primary or secondary) for downstream use
  • System prompt rewritten with primary/secondary definitions, concrete secondary examples, calibration hint (expect 4–10 removals), and "do not stop early" instruction
  • Safety valve narrowed: "use primary only as a last resort" (replaces blanket "use keep if unsure")
  • B3 fix: conditional ellipsis ("...") in both _build_compact_history and all_levers_summary
  • OPTIMIZE_INSTRUCTIONS block documents 5 known failure modes for self-improve analysis
  • enrich_potential_levers.py: accepts optional classification field (backward-compatible)
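
As an illustration of the B3 item, here is a sketch of a conditional ellipsis, assuming the fix means "append `...` only when text was actually cut". The helper name `truncate_with_ellipsis` and the `max_chars` parameter are hypothetical; the bodies of `_build_compact_history` and `all_levers_summary` are not shown in this PR page.

```python
def truncate_with_ellipsis(text: str, max_chars: int) -> str:
    """Return text unchanged if it fits; otherwise cut it and append '...'."""
    if len(text) <= max_chars:
        # Nothing was cut, so no ellipsis is added.
        return text
    return text[:max_chars] + "..."
```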

Bug fixes from iter 45 code review

  • B1 fix: user_prompt field now stores project_context instead of serialized levers JSON — consistent with all other pipeline steps

Test plan

  • Run deduplicate_levers step via self-improve runner against snapshot input
  • Verify all 7 models produce valid primary/secondary/remove classifications
  • Compare removal counts against iter 48 (main baseline) — expect similar or better consolidation
  • Check that no model produces zero removals (blanket-primary failure mode)
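
The validity and zero-removal checks in the test plan could be automated along these lines. This is a sketch under the assumption that each run yields a flat list of classification strings; `check_run` is a hypothetical helper, not part of the self-improve runner.

```python
from collections import Counter

VALID_CLASSIFICATIONS = {"primary", "secondary", "remove"}


def check_run(classifications: list[str]) -> dict[str, int]:
    """Validate one run's classifications and return per-class counts."""
    invalid = [c for c in classifications if c not in VALID_CLASSIFICATIONS]
    if invalid:
        raise ValueError(f"invalid classifications: {invalid}")
    counts = Counter(classifications)
    if counts["remove"] == 0:
        # Blanket-primary failure mode: the model never removed anything.
        raise ValueError("zero removals in this run")
    return {label: counts[label] for label in sorted(VALID_CLASSIFICATIONS)}
```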

🤖 Generated with Claude Code

neoneye and others added 2 commits March 20, 2026 19:49
Replace 4-way taxonomy (keep/absorb + primary/secondary from PR #365)
with 3-way: primary, secondary, remove. Absorbed levers are now
classified as "remove" with the absorbing lever_id in justification.

Hypothesis: fewer categories = more consistent exercise of each class,
easier to validate results.

Also includes best improvements from PR #365:
- Safety valve narrowed ("primary only as a last resort")
- Calibration hint (expect 4-10 removals, do not stop early)
- B3 fix: conditional ellipsis in compact history and lever summary
- OPTIMIZE_INSTRUCTIONS with 5 known failure modes
- classification field preserved in OutputLever for downstream use
- enrich_potential_levers accepts optional classification field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

deduplicate_levers was storing the serialized input levers in user_prompt
instead of the project context. This made the saved raw output misleading
— other pipeline steps store the actual user prompt in this field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
neoneye commented Mar 20, 2026

Iteration 49 Results & Next Steps

Verdict: KEEP — 35/35 runs succeeded, all 3 categories exercised (58% primary, 27% secondary, 15% remove).

Key wins over main (iter 48)

  • Degenerate llama3.1 collapse eliminated (main merged 7 levers into "Risk Framing"; this no longer happens)
  • Haiku remove rate improved 28% → 39%
  • Primary/secondary triage adds downstream signal that keep-only lacks
  • All remove justifications cite absorbing lever UUID (77% full UUID, 87% excl. qwen3)
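
One way to measure how often remove justifications cite a full UUID is a regex scan. This is only a sketch: it assumes lever ids are canonical 8-4-4-4-12 hexadecimal UUIDs, and `full_uuid_rate` is a hypothetical helper, not necessarily how the figures above were computed.

```python
import re

# Canonical UUID text form: 8-4-4-4-12 hexadecimal digits.
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def full_uuid_rate(justifications: list[str]) -> float:
    """Fraction of justifications that contain at least one full UUID."""
    if not justifications:
        return 0.0
    hits = sum(1 for text in justifications if UUID_RE.search(text))
    return hits / len(justifications)
```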

Cross-iteration ranking (iter 45 vs 48 vs 49)

Iter 49 (this PR) ranked best. The 4-way taxonomy in PR #365 has a dead category — remove is never used when absorb exists. With 3 categories, all three get exercised. Absorb-info isn't consumed downstream, so a separate absorb category adds complexity without benefit.

Remaining issues (pre-existing, not introduced by this PR)

  • B3: Template-lock in secondary definition — llama3.1 copies it verbatim, producing 0 removes on sovereign_identity
  • B2: Contradictory primary fallback instruction
  • B1: partial_recovery threshold fires on normal 2-call runs in runner.py

Architectural direction

The bigger improvement is moving from 18 sequential LLM calls per plan to 1 batch call. The per-lever approach causes position bias, prevents global consistency, and is 18× more expensive. Plan to implement this as a follow-up PR on top of this one — same 3-way taxonomy, same output schema, just restructured as a single call.
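
A rough sketch of what the single-call structure might look like, with the LLM call itself left out. The function names, the `name` field, and the JSON response shape are all assumptions made for illustration; the follow-up PR may be structured differently. The idea shown is that one prompt lists every lever, and the parser rejects a response that skips or repeats a lever.

```python
import json

VALID_CLASSIFICATIONS = {"primary", "secondary", "remove"}


def build_batch_prompt(levers: list[dict]) -> str:
    """Build one prompt covering all levers instead of one prompt per lever."""
    lines = [
        "Classify every lever as primary, secondary, or remove.",
        "Return a JSON list of objects with lever_id, classification, justification.",
    ]
    for lever in levers:
        lines.append(f"- {lever['lever_id']}: {lever['name']}")
    return "\n".join(lines)


def parse_batch_response(raw: str, levers: list[dict]) -> dict[str, dict]:
    """Parse the single response and check it covers each lever exactly once."""
    items = json.loads(raw)
    expected_ids = sorted(lever["lever_id"] for lever in levers)
    if sorted(item["lever_id"] for item in items) != expected_ids:
        raise ValueError("response must cover every lever exactly once")
    for item in items:
        if item["classification"] not in VALID_CLASSIFICATIONS:
            raise ValueError(f"invalid classification: {item['classification']}")
    return {item["lever_id"]: item for item in items}
```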

Full analysis: PlanExe-prompt-lab/analysis/49_deduplicate_levers/

neoneye commented Mar 21, 2026

Superseded by PR #375 (merged). PR #375 combines the batch architecture with categorical taxonomy and prompt fixes.

neoneye closed this Mar 21, 2026
neoneye deleted the simplify-lever-classification branch March 21, 2026 02:05