-
Notifications
You must be signed in to change notification settings - Fork 13
Node-ID-Based Error Deduplication fix #115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Node-ID-Based Error Deduplication fix #115
Conversation
|
@0xGeorgii you can review the pr now |
|
@0xGeorgii can you kindly review the pr now |
|
@0xGeorgii Can you kindly review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors error deduplication in the type checker from string-based keys to node-ID-based tuples for improved performance and memory efficiency. Instead of deduplicating errors by their content (e.g., "UnknownType:Foo"), errors are now deduplicated per AST node, allowing the same error type to be reported once for each unique location in the source code.
Changes:
- Introduced
ErrorKindenum to categorize the 5 error types requiring deduplication - Updated deduplication logic to use
(ErrorKind, u32)tuples instead of formatted strings - Modified test expectations to reflect that errors are now reported once per AST node rather than once per error content
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| core/type-checker/src/errors.rs | Added ErrorKind enum and kind() method to support node-based deduplication |
| core/type-checker/src/type_checker.rs | Updated deduplication mechanism from string keys to (ErrorKind, u32) tuples and modified all call sites |
| tests/src/type_checker/error_recovery.rs | Updated test assertions to expect multiple errors per unique AST node location |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Please if you like my implementation, i would appreciate complements |
0xGeorgii
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical Issues
- Behavioral regression in error deduplication (HIGH)
The old system reported each unique unknown symbol once. The new system reports it once per usage site. For example, UnknownType used 4 times now produces 4 identical error messages. Given the
compiler's output format (errors joined by "; "), this creates noisy, unhelpful output:
1:12: unknown type UnknownType; 1:29: unknown type UnknownType; 1:45: unknown type UnknownType; 1:61: unknown type UnknownType
The root cause is singular (the type doesn't exist), but the user sees 4 messages. Per-location reporting benefits IDE squigglies, but the Inference compiler currently outputs a flat semicolon-joined
string with no IDE diagnostic wiring.
- push_error_dedup becomes effectively dead code (HIGH)
With node-ID-based keys, deduplication only triggers if the same AST node produces the same error kind twice. Since each node is visited exactly once during type checking, this never happens. The
dedup mechanism no longer actually deduplicates anything — it's equivalent to self.errors.push(error).
- Two existing tests will likely break (HIGH)
The PR updates 3 test expectations but misses 2 others:
- test_error_deduplication_same_undefined_struct_multiple_uses — asserts type_count <= 1, but MissingStruct as a type now produces 3 errors (one per node)
- test_error_deduplication_same_undefined_enum_multiple_uses — asserts type_count <= 1, but MissingEnum as a type now produces 2 errors
These tests were not updated and will fail under the new behavior.
Design Issues
- Redundant node_id parameter (MEDIUM)
Every call site passes both the error (containing location) and the node_id from the same AST node. This is error-prone — a caller could mismatch the error's location with the wrong node_id. The
node_id could be derived from the error's Location::offset_start instead.
- ErrorKind is a maintenance burden (MEDIUM)
The new ErrorKind enum mirrors 5 of 30+ TypeCheckError variants with a manual kind() mapping. Adding a new deduplicated error requires updating 3 places (variant, enum, mapping) with no compiler
warning on omission. Using std::mem::discriminant(&error) would eliminate this enum entirely.
- Misleading naming (LOW)
ErrorKind implies comprehensive categorization but only covers 5 variants. DedupKind or DeduplicatedErrorKind would be more accurate.
Style / Process Issues
-
PR references non-existent file implemenration.md (typo in name) in the "Files Modified" section, but this file is not in the diff.
-
Test names are misleading — tests named test_error_deduplication_* now verify the absence of deduplication (per-node reporting), contradicting their stated purpose.
Summary
Migrates error deduplication from string-based keys to node-ID-based tuples
(ErrorKind, u32)for improved performance and memory efficiency.Fixes #71
Fixes #108
AI Assistance Disclosure
This implementation was developed with the assistance of Claude (A). All code was manually reviewed and tested before submission.
Changes
Added
ErrorKindenum incore/type-checker/src/errors.rsUnknownType,UndefinedFunction,UnknownIdentifier,UndefinedStruct,UndefinedEnumkind()method onTypeCheckErrorOption<ErrorKind>for errors that need deduplicationNonefor other error typesModified
TypeCheckerstruct (core/type-checker/src/type_checker.rs)reported_error_keys: FxHashSet<String>→reported_errors: FxHashSet<(ErrorKind, u32)>push_error_dedupsignaturenode_id: u32parameter(ErrorKind, u32)tuple as deduplication key7 call sites updated to pass node IDs:
3 test expectations updated for new behavior:
test_error_deduplication_same_unknown_type_multiple_uses: 1 → 4 errorstest_error_deduplication_same_undefined_function_multiple_calls: 1 → 2 errorstest_error_deduplication_same_unknown_identifier_multiple_uses: 1 → 3 errorsFiles Modified
cargo build -p inference-type-checker
cargo test -p inference-type-checker
Testing