Conversation
05bedb8 to
fc717f0
Compare
2562bfa to
af8eb53
Compare
cameel
left a comment
There was a problem hiding this comment.
For now I only checked the overall structure. I will need another pass to actually check the implementation for correctness.
Generally the tool looks like a good idea. Would be even better if this was a proper diff that does not stop at the first difference, but that's much more work. For what we need this is already good enough.
TBH I don't like how long the Python script is, but it still looks useful. I'd just prefer it to be less fuzzy with that it does. I bet it has tons of false positives and false negatives in general usage outside of the AST ID PR.
I left some initial comments, but that's not a full review yet.
| ) | ||
| target_link_libraries(libYulASTComparator PUBLIC solidity) | ||
|
|
||
| add_executable(yulASTComparator main.cpp) |
There was a problem hiding this comment.
| add_executable(yulASTComparator main.cpp) | |
| add_executable(yul-ast-comparator main.cpp) |
Or maybe we should give the executable a shorter name. E.g. yulcmp oryuldiff?
| return object; | ||
| } | ||
|
|
||
| int main(int argc, char* argv[]) |
There was a problem hiding this comment.
All our executables should have standard top-level error handlers to catch failed asserts and unhandled exceptions from lower layers. See for example solc/main.cpp.
| print(f" Skipped: {len(skipped)}") | ||
| print("=" * 50) | ||
|
|
||
| if mismatches: |
There was a problem hiding this comment.
Truthy comparisons are evil.
| if mismatches: | |
| if len(mismatches) != 0: |
| equiv, msg = run_comparator(comparator, yul_a, yul_b) | ||
| if equiv: | ||
| continue | ||
| if "PARSE_ERROR" in msg or "TIMEOUT" in msg or "ERROR" in msg: |
There was a problem hiding this comment.
I don't like how fuzzy these comparisons are. Note how "PARSE_ERROR" in msg overlaps with "ERROR" in msg. If not for the fact that you handle both the same way, they'd not be even usable, because these two errors are not distinguishable this way. Why isn't the kind of error just a separate return variable?
| if "cmdlineTests" in path: | ||
| if path.endswith("output.json"): | ||
| return "json" | ||
| if path.endswith("/output") or path.endswith("/err"): | ||
| return "text" | ||
| return None |
There was a problem hiding this comment.
So e.g. input.json will not be classified as 'json'? Is that intentional?
There was a problem hiding this comment.
The checks you have here are a bit too fuzzy for my taste.
You should at least use pathlib. For example your check will match somethingoutput.json while with pathlib a simple path.name == 'output.json' by default does the right thing. Same with matching the directory names - I'd not use in for those.
The fact that things that are not classified are just skipped is not great either.
Adds a Yul AST comparator tool
yulASTComparatorthat structurally compares two Yul ASTs, treating variable and function names as equivalent if they correspond 1:1 via a scoped bidirectional map. This is useful for verifying that optimizer changes or internal renaming (e.g., switching name-generation schemes) preserve semantic equivalence. The tool reports the first point of divergence with a path and a reason.