refactor: replace `simp_alive_split` implementation with just simp. #1220

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Draft

alexkeizer wants to merge 16 commits into main from dont-split-ifs-rebased

Collaborator

alexkeizer commented May 14, 2025 •

edited

Loading

This PR changes the implementation of simp_alive_split---which used to call split and split_ifs and thus introduced new subgoals---with a single simp call (and some intros to preserve that behaviour). This is possible as the if statements used in the semantics follow a strict pattern, where one branch of the if is always poison. Such ifs with poison admit straightforward simplification lemmas, which this PR adds, thus removing the need to split on the condition.

NOTE: I've confirmed the new implementation of simp_alive_split successfully closes all the expected goals inAliveStatements, but we ought to wait for the full evaluation to run before merging this PR, as there could be breakage there.

Keep in mind that we do expect the numbers of solved theorems reported by the evaluation to go down, given that we are actively trying to reduce the number of subgoals generated by our automation (hence reducing the number of distinct calls to the SAT solvers).

alexkeizer added 5 commits

May 14, 2025 19:42


          refactor: replace simp_alive_split implementation with just simp.

This PR changes the implementation of `simp_alive_split`---which used to call `split` and `split_ifs` and thus introduced new subgoals---with a single simp call (and some `intros` to preserve that behaviour). This is possible as the `if` statements used in the semantics follow a strict pattern, where one branch of the if is always `poison`. Suchs ifs with poison admit straightforward simplification lemmas, which this PR adds, thus removing the need to split on the condition.


          Merge branch 'main' into dont-split-ifs-rebased

2d4ccff


          Merge remote-tracking branch 'origin/main' into dont-split-ifs-rebased

b3c2522


          refactor: replace constant 0 with 0#w in LLVM semantics

82e641c

The latter is the simp normal form, so it's better we use that to define semantics


          fix: tweak simp_llvm simpset and fix handwritten examples

b46abb5

github-actions bot commented May 15, 2025

Alive Statistics: 90 / 93 (3 failed)


          fix: add fallback simp lemmas for arbitrary if-then-else

dc95094

github-actions bot commented May 15, 2025

Alive Statistics: 90 / 93 (3 failed)


          fix type

8bf2070

github-actions bot commented May 15, 2025

Alive Statistics: 90 / 93 (3 failed)


          feat: work out simproc in TODO note

849b174

github-actions bot commented May 15, 2025

Alive Statistics: 90 / 93 (3 failed)


          feat: implement if-then-else and poison normalization pass

bfa93d3

github-actions bot commented May 16, 2025

Alive Statistics: 90 / 93 (3 failed)

github-actions bot commented May 16, 2025

bitwuzla proved and leanSAT failed theorem 2 in file gexact_proof.lean
bitwuzla proved and leanSAT failed theorem 1 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 2 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 25 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 26 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file gsremhcanonicalize_proof.lean
bitwuzla proved and leanSAT failed theorem 3 in file gdistribute_proof.lean
bitwuzla proved and leanSAT failed theorem 4 in file gdistribute_proof.lean
bitwuzla proved and leanSAT failed theorem 8 in file gapinthsub_proof.lean
bitwuzla provided counterexample and leanSAT failed theorem 1 in file gapinthrem2_proof.lean
bitwuzla proved and leanSAT failed theorem 12 in file grem_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file gadd4_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file g2008h02h23hMulSub_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file g2010h11h23hDistributed_proof.lean
bitwuzla provided counterexample and leanSAT failed theorem 1 in file gapinthdiv2_proof.lean
bitwuzla proved and leanSAT failed theorem 7 in file gshlhfactor_proof.lean
bitwuzla proved and leanSAT failed theorem 13 in file greassociatehnuw_proof.lean
bitwuzla proved and leanSAT failed theorem 14 in file greassociatehnuw_proof.lean
bitwuzla proved and leanSAT failed theorem 15 in file greassociatehnuw_proof.lean
leanSAT and Bitwuzla solved: 5020
leanSAT and Bitwuzla provided 25 counterexamples
leanSAT and Bitwuzla both failed on 34 theorems
leanSAT failed and Bitwuzla succeeded on 17 theorems
leanSAT succeeded and Bitwuzla failed on 0 theorems
There were 0 inconsistencies
Errors raised: 112
ran rg 'Bitwuzla failed' | wc -l, expected 34, found 0, FAIL
ran rg 'LeanSAT failed' | wc -l, expected 51, found 0, FAIL
ran rg 'LeanSAT provided a counter' | wc -l, expected 25, found 0, FAIL
ran rg 'Bitwuzla provided a counter' | wc -l, expected 25, found 0, FAIL
ran rg 'LeanSAT proved' | wc -l, expected 5020, found 0, FAIL
ran rg 'Bitwuzla proved' | wc -l, expected 5037, found 0, FAIL
error (kernel) deep recursion detected was raised 25 times
error None of the hypotheses are in the supported BitVec fragment after applying preprocessing. was raised 8 times
error The SAT solver timed out while solving the problem. was raised 49 times
error The SMT solver timed out while solving the problem. was raised 30 times

alexkeizer added 2 commits

May 16, 2025 17:10


          fix: add no_index to if-then-else lemma

e950ff0


          Merge remote-tracking branch 'origin/main' into dont-split-ifs-rebased

1b99ae3

github-actions bot commented May 16, 2025

Alive Statistics: 90 / 93 (3 failed)

1 similar comment

github-actions bot commented May 16, 2025

Alive Statistics: 90 / 93 (3 failed)

github-actions bot commented May 16, 2025

bitwuzla proved and leanSAT failed theorem 2 in file gexact_proof.lean
bitwuzla proved and leanSAT failed theorem 1 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 2 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 25 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 26 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file gsremhcanonicalize_proof.lean
bitwuzla proved and leanSAT failed theorem 3 in file gdistribute_proof.lean
bitwuzla proved and leanSAT failed theorem 4 in file gdistribute_proof.lean
bitwuzla proved and leanSAT failed theorem 8 in file gapinthsub_proof.lean
bitwuzla provided counterexample and leanSAT failed theorem 1 in file gapinthrem2_proof.lean
bitwuzla proved and leanSAT failed theorem 12 in file grem_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file gadd4_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file g2008h02h23hMulSub_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file g2010h11h23hDistributed_proof.lean
bitwuzla provided counterexample and leanSAT failed theorem 1 in file gapinthdiv2_proof.lean
bitwuzla proved and leanSAT failed theorem 7 in file gshlhfactor_proof.lean
bitwuzla proved and leanSAT failed theorem 13 in file greassociatehnuw_proof.lean
bitwuzla proved and leanSAT failed theorem 14 in file greassociatehnuw_proof.lean
bitwuzla proved and leanSAT failed theorem 15 in file greassociatehnuw_proof.lean
leanSAT and Bitwuzla solved: 5054
leanSAT and Bitwuzla provided 5 counterexamples
leanSAT and Bitwuzla both failed on 30 theorems
leanSAT failed and Bitwuzla succeeded on 17 theorems
leanSAT succeeded and Bitwuzla failed on 0 theorems
There were 0 inconsistencies
Errors raised: 104
ran rg 'Bitwuzla failed' | wc -l, expected 30, found 0, FAIL
ran rg 'LeanSAT failed' | wc -l, expected 47, found 0, FAIL
ran rg 'LeanSAT provided a counter' | wc -l, expected 5, found 0, FAIL
ran rg 'Bitwuzla provided a counter' | wc -l, expected 5, found 0, FAIL
ran rg 'LeanSAT proved' | wc -l, expected 5054, found 0, FAIL
ran rg 'Bitwuzla proved' | wc -l, expected 5071, found 0, FAIL
error (kernel) deep recursion detected was raised 25 times
error The SAT solver timed out while solving the problem. was raised 49 times
error The SMT solver timed out while solving the problem. was raised 30 times


          feat: tweak simp_llvm simpset, also use seval in alive_split, and u…

ea4606a

…se `simp_all` instead of `simp`

github-actions bot commented May 16, 2025

Alive Statistics: 90 / 93 (3 failed)

github-actions bot commented May 16, 2025

bitwuzla proved and leanSAT failed theorem 2 in file gexact_proof.lean
bitwuzla proved and leanSAT failed theorem 1 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 2 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 25 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 26 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file gsremhcanonicalize_proof.lean
bitwuzla proved and leanSAT failed theorem 3 in file gdistribute_proof.lean
bitwuzla proved and leanSAT failed theorem 4 in file gdistribute_proof.lean
bitwuzla proved and leanSAT failed theorem 8 in file gapinthsub_proof.lean
bitwuzla proved and leanSAT failed theorem 12 in file grem_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file gadd4_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file g2008h02h23hMulSub_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file g2010h11h23hDistributed_proof.lean
bitwuzla proved and leanSAT failed theorem 7 in file gshlhfactor_proof.lean
bitwuzla proved and leanSAT failed theorem 13 in file greassociatehnuw_proof.lean
bitwuzla proved and leanSAT failed theorem 14 in file greassociatehnuw_proof.lean
bitwuzla proved and leanSAT failed theorem 15 in file greassociatehnuw_proof.lean
leanSAT and Bitwuzla solved: 4941
leanSAT and Bitwuzla provided 2 counterexamples
leanSAT and Bitwuzla both failed on 30 theorems
leanSAT failed and Bitwuzla succeeded on 17 theorems
leanSAT succeeded and Bitwuzla failed on 0 theorems
There were 0 inconsistencies
Errors raised: 102
ran rg 'Bitwuzla failed' | wc -l, expected 30, found 0, FAIL
ran rg 'LeanSAT failed' | wc -l, expected 47, found 0, FAIL
ran rg 'LeanSAT provided a counter' | wc -l, expected 2, found 0, FAIL
ran rg 'Bitwuzla provided a counter' | wc -l, expected 2, found 0, FAIL
ran rg 'LeanSAT proved' | wc -l, expected 4941, found 0, FAIL
ran rg 'Bitwuzla proved' | wc -l, expected 4958, found 0, FAIL
error (kernel) deep recursion detected was raised 25 times
error The SAT solver timed out while solving the problem. was raised 47 times
error The SMT solver timed out while solving the problem. was raised 30 times


          feat: admit defeat and add a split to the end of simp_alive_split

bb66be8

github-actions bot commented May 16, 2025

bitwuzla proved and leanSAT failed theorem 2 in file gexact_proof.lean
bitwuzla proved and leanSAT failed theorem 1 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 2 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 27 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 28 in file gdivhshift_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file gsremhcanonicalize_proof.lean
bitwuzla proved and leanSAT failed theorem 3 in file gdistribute_proof.lean
bitwuzla proved and leanSAT failed theorem 4 in file gdistribute_proof.lean
bitwuzla proved and leanSAT failed theorem 8 in file gapinthsub_proof.lean
bitwuzla proved and leanSAT failed theorem 13 in file grem_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file gadd4_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file g2008h02h23hMulSub_proof.lean
bitwuzla proved and leanSAT failed theorem 0 in file g2010h11h23hDistributed_proof.lean
bitwuzla proved and leanSAT failed theorem 7 in file gshlhfactor_proof.lean
bitwuzla proved and leanSAT failed theorem 13 in file greassociatehnuw_proof.lean
bitwuzla proved and leanSAT failed theorem 14 in file greassociatehnuw_proof.lean
bitwuzla proved and leanSAT failed theorem 15 in file greassociatehnuw_proof.lean
leanSAT and Bitwuzla solved: 6084
leanSAT and Bitwuzla provided 3 counterexamples
leanSAT and Bitwuzla both failed on 16 theorems
leanSAT failed and Bitwuzla succeeded on 17 theorems
leanSAT succeeded and Bitwuzla failed on 0 theorems
There were 0 inconsistencies
Errors raised: 74
ran rg 'Bitwuzla failed' | wc -l, expected 16, found 0, FAIL
ran rg 'LeanSAT failed' | wc -l, expected 33, found 0, FAIL
ran rg 'LeanSAT provided a counter' | wc -l, expected 3, found 0, FAIL
ran rg 'Bitwuzla provided a counter' | wc -l, expected 3, found 0, FAIL
ran rg 'LeanSAT proved' | wc -l, expected 6084, found 0, FAIL
ran rg 'Bitwuzla proved' | wc -l, expected 6101, found 0, FAIL
error (kernel) deep recursion detected was raised 25 times
error The SAT solver timed out while solving the problem. was raised 33 times
error The SMT solver timed out while solving the problem. was raised 16 times

alexkeizer added 3 commits

May 19, 2025 17:28


          feat: ofBool_eq_one_iff

29b5381


          fix: handwrittenExamples file

62c8327


          WIP: fix HandwrittenLargeExamples

93d2c94

alexkeizer marked this pull request as draft

May 21, 2025 10:46

Collaborator Author

alexkeizer commented May 21, 2025

This seems to work well enough, but to avoid causing problems with the evaluation I'll not replace the implementation of the existing tactic, but rather make this into a new tactic, which will be specialized for the new SLLVM dialect (which models proper immediate UB).

alexkeizer mentioned this pull request

feat: prove (a copy of) Alive testcases with proper UB semantics #1260

Merged

github-merge-queue bot pushed a commit that referenced this pull request


          feat: prove (a copy of) Alive testcases with proper UB semantics (#1260)

47a4105

This PR adds a parser and proof automation for the new SLLVM dialect,
based on the existing LLVM parser&proof automation, and incorporating
some new stuff I proposed in #1220.

I copied AliveStatements so I wouldn't interfere with the existing
evaluation, and then semi-manually changed it to
* use the SLLVM dialect (i.e., proper UB semantics)
* instantiate rewrites to 64 bits (as I haven't bothered implementing
MetaSLLVM yet), except for one testcase where bitblasting timed-out for
any width > 7.

All test cases that were proven before are now proven with proper UB
semantics. Two rewrites that we sorried before turned out to just be
plain false (even without UB), so I noted down the counter-example and
ignored those test cases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet