Implementation of the POETRY algorithm (NeurIPS 2024) for Dafny.
POETRY proves theorems recursively using:
- Verifiable proof sketches with
Admit(...)placeholders - Recursive best-first search exploring multiple proof strategies
- Greedy sketch exploration - pause and recurse when sketch found
- Hybrid reasoning - symbolic (induction) + neural (LLM)
dafny-admitter– creates verifiable sketches withAdmit("…", φ)obligationsdafny-sketcher-cli– symbolic reasoning (induction search, errors)dafny-annotator- oracle training (assertions, helper lemma calls)
pip install -e .# POETRY with LLM (requires LLM setup - see below)
dafny-poetry --file examples/example.dfy --use-llm -v
# Symbolic only (no LLM required)
dafny-poetry --file examples/example.dfy -v
# Compare with legacy greedy algorithm
dafny-poetry --file examples/example.dfy --legacy -vFor LLM-based refinement, set environment variables as needed. See llm.py.
Without LLM setup, POETRY uses only symbolic reasoning (induction search).
dafny-poetry --file FILE.dfy [OPTIONS]Key options:
--max-depth N- Maximum recursion depth (default: 10)--max-branches N- LLM samples per expansion (default: 2)--global-timeout SEC- Overall time limit (default: 600)--local-timeout SEC- Per-level timeout for depth > 1 (default: 120)--use-llm- Enable LLM refinement--llm-tries N- LLM retries per sample (default: 2)-v, --verbose- Detailed logging--legacy- Use old greedy algorithm (for comparison)
Example:
dafny-poetry --file proof.dfy \
--max-depth 15 \
--max-branches 4 \
--use-llm \
-vfrom pathlib import Path
from dafny_poetry.api import verify_dafny_file
result = verify_dafny_file(
Path("proof.dfy"),
max_depth=10,
use_llm=True,
verbose=True
)
print(f"Success: {result.success}")
print(f"Admits: {result.final_admits}/{result.initial_admits}")See API.md for complete API documentation.
- Create initial sketch -
dafny-admitterconverts failing proofs toAdmit(...)calls - Recursive BFS - Search for proof at each level:
- Select best OPEN node (highest score)
- Expand via induction search + LLM samples
- Create child nodes for each candidate
- Greedy recursion - When sketch found (HALF_PROVED):
- Pause current level
- Recurse on sub-goal (Admit to prove)
- If sub-goal succeeds → check if proof complete
- If sub-goal fails → backpropagate, try alternative sketch
- Anytime verifiability - All intermediate states verify with
Admit(...)placeholders
See POETRY_IMPLEMENTATION.md for detailed algorithm description.
✅ Recursive BFS - Best-first search at each recursion level
✅ Greedy sketch exploration - Immediate recursion on first sketch found
✅ Status tracking - OPEN, PROVED, HALF_PROVED, FAILED
✅ Branching - Multiple LLM samples per expansion
✅ Backpropagation - Invalidate failed sketches, try alternatives
✅ Hybrid reasoning - Symbolic induction + neural LLM
POETRY_IMPLEMENTATION.md- Complete algorithm guideAPI.md- Python API referenceIMPLEMENTATION_SUMMARY.md- Implementation overview
This is a faithful implementation of POETRY adapted for Dafny:
- Uses
Admit(...)instead of Isabelle'ssorry - Batch verification via
dafny-admitter(vs. interactive Isabelle) - Scoring via admit reduction (paper uses LLM log probability)
- Pluggable LLM (paper uses trained model)
See implementation notes in POETRY_IMPLEMENTATION.md for details.
- Paper: "Proving Theorems Recursively" (NeurIPS 2024)
- Algorithm: Section 3.2 - Recursive Best-First Search
- Details: Appendix A.1 - Status updates and termination