Fix duplicate sites from incorrect Wyckoff position and add builder validation#69
Merged
Fix duplicate sites from incorrect Wyckoff position and add builder validation#69
Conversation
The Mg coordinate [0.23, 0.92, 0.09] sat on the 96i general position in F-43m, generating pairs of atoms 0.14 A apart inside the same S tetrahedron. This produced 384 duplicate type 2 sites with identical vertex/reference atoms. Replace with [0.77, 0.585, 0.585] which sits on the 48h special position, giving the correct 384 unique type 2 sites with no duplicates. Closes #68
Pre-build: check reference structure for same-species atom pairs closer than min_atom_distance (default 0.5 A). Raises ValueError with instructions to disable via .with_min_atom_distance(0). Post-build: check for duplicate PolyhedralSite (by vertex_indices) or DynamicVoronoiSite (by reference_indices). Raises ValueError. Does not apply to VoronoiSite or SphericalSite. Part of #68
Use indices_for_species from tools.py instead of inline list comprehension. Remove assert in favour of caller guard. Reject negative min_atom_distance values. Add explicit VoronoiSite and SphericalSite handling in _validate_unique_sites. Update build() docstring with new ValueError cases. Remove unit assumption from distance threshold (uses lattice units, not necessarily angstroms).
There was a problem hiding this comment.
Pull request overview
Fixes a known argyrodite reference-structure issue that produced duplicate sites, and adds TrajectoryBuilder validation to detect (and optionally prevent) similar problems early.
Changes:
- Correct Mg (type 2) reference coordinate to a 48h Wyckoff position across tests/docs/benchmarks to eliminate duplicate type-2 sites.
- Add TrajectoryBuilder validation: a pre-build same-species minimum-distance check (configurable via
with_min_atom_distance) and a post-build duplicate-site check for reference-index-defined site types. - Add new test coverage and validation/benchmark scripts for the decoupling work and the duplicate-site regression.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
site_analysis/builders.py |
Adds builder-level validation knobs and checks (pre-build atom-distance; post-build duplicate-site detection). |
tests/test_builders.py |
Updates builder reset test to mock validation calls and adds a dedicated validation test suite. |
tests/test_numba_validation.py |
Updates argyrodite reference Mg coordinate used in validation runs. |
tests/benchmark_dynamic_voronoi.py |
Updates argyrodite reference Mg coordinate used in benchmarks. |
tests/benchmark_containment.py |
Updates argyrodite reference Mg coordinate used in benchmarks. |
tests/validate_decoupling.py |
New script to run/compare full argyrodite analyses across site types for correctness + timing. |
tests/benchmark_before_after.py |
New before/after benchmark script for decoupling performance checks on real MD data. |
tests/benchmark_all_site_types.py |
New benchmark script to time per-frame assignment across all four site types. |
docs/source/tutorials/residence_times_and_transitions.md |
Updates tutorial reference coordinate for Mg (type 2) and annotates Wyckoff position. |
docs/source/tutorials/argyrodite_site_definitions.ipynb |
Updates notebook to corrected Mg coordinate and clears execution outputs. |
docs/scripts/generate_notebooks.py |
Keeps generated notebook/tutorial content in sync with corrected Mg coordinate. |
Comments suppressed due to low confidence (1)
tests/test_builders.py:1194
- The “state reset after build” and “reset()” tests now claim to verify the builder resets its entire state, but they don’t set/assert the new
_min_atom_distanceattribute. Consider setting it to a non-default value before build/reset and asserting it returns to the default (0.5) so these tests continue to cover all builder state as the class evolves.
# Set state directly to non-default values
builder._structure = Mock(spec=Structure)
builder._reference_structure = Mock(spec=Structure)
builder._mobile_species = "Li"
builder._atoms = [Mock(spec=Atom)]
builder._align = False
builder._align_species = ["Na"]
builder._align_metric = "max_dist"
builder._mapping_species = ["Cl"]
builder._site_generators = [lambda: []]
# Mock the methods called within build (including validation)
with patch('site_analysis.builders.atoms_from_structure'), \
patch('site_analysis.builders.Trajectory'), \
patch('site_analysis.builders.Site.reset_index'), \
patch.object(builder, '_validate_reference_atom_distances'), \
patch.object(builder, '_validate_unique_sites'):
# Call build
builder.build()
# Verify entire state is reset
self.assertIsNone(builder._structure)
self.assertIsNone(builder._reference_structure)
self.assertIsNone(builder._mobile_species)
self.assertIsNone(builder._atoms)
self.assertTrue(builder._align) # Default is True
self.assertIsNone(builder._align_species)
self.assertEqual(builder._align_metric, 'rmsd')
self.assertIsNone(builder._mapping_species)
self.assertEqual(builder._site_generators, [])
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Fix with_min_atom_distance docstring to accurately describe when the check runs (whenever a reference structure is set, not only for polyhedral/dynamic Voronoi workflows) - Rename `label` to `index_attr` in _validate_unique_sites to avoid confusion with Site.label - Add with_min_atom_distance to the builder guide method reference - Add Reference Structure Validation section to the builder guide covering close-atom and duplicate-site checks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the type 2 (Mg) site coordinate in the argyrodite reference structure and adds validation to the builder to prevent similar issues silently producing duplicate sites.
Wyckoff fix
The Mg coordinate
[0.23, 0.92, 0.09]was on the 96i general position in F-43m, generating pairs of atoms 0.14 A apart inside the same S tetrahedron. This produced 384 duplicate type 2 sites with identical vertex/reference atoms.Replaced with
[0.77, 0.585, 0.585]on the 48h special position, giving the correct 384 unique type 2 sites.Builder validation
Pre-build check: Before generating sites, validates that no same-species atom pairs in the reference structure are closer than
min_atom_distance(default 0.5 A). RaisesValueErrorwith an explanation and instructions to disable via.with_min_atom_distance(0).Post-build check: After generating sites, checks for duplicate
PolyhedralSite(byvertex_indices) orDynamicVoronoiSite(byreference_indices). Does not apply to coordinate-based site types.Closes #68