Enable NanoPET for atomic-basis spherical targets #527

jwa7 · 2025-03-21T12:59:28Z

Extends NanoPET to enable predictions of spherical targets expressed on an atomic basis. Examples are the electron density decomposed on an auxiliary basis, and the Hamiltonian on the coupled atomic orbital basis.

Overview

- Introduces a new target type "atomic_basis_spherical" to allow learning of spherical targets on an atomic basis set.
- Only NanoPET is supported at present.
- Targets with one component axis are supported. This covers, for example, the electron density on a basis and the Hamiltonian/density matrix on a coupled basis.
- Per-pair targets can be either permutationally symmetrized or unsymmetrized. In both cases, only unique atom pairs are predicted.

NanoPET architecture details

Re-uses the existing infrastructure for mapping last layer features to the spherical component heads
Infrastructural changes were required as follows:
- For per-atom targets:
  - Last-layer PET features are sliced just before being passed through the output layer. Slicing is needed to extract the samples of the correct type based on the "center_type" the output block corresponds to.
- For per-pair targets:
  - Again, the last-layer features are sliced according to the "first_atom_type" and "second_atom_type" the output block corresponds to. In the case of permutationally-symmetrized targets, the "s2_pi" key is also used to do this slicing.
  - Pre-last layer transformations are a little more complicated than for per-atom targets. Both the PET node and edge features are used. These are passed through separate heads before being combined in a single tensor that represents the last-layer features for the whole per-pair quantity.
  - Targets can be either permutationally symmetrized or not. In the former case, the PET edge features are symmetrized, so the last-layer features (if outputted) would carry the relevant metadata ("s2_pi" etc.), but in the samples labels rather than in the keys, as slicing into blocks only occurs at the output layer.
  - Only unique samples are predicted for per-pair targets. Practically, this means that triangular off-site blocks in atom type are predicted where "first_atom_type" < "second_atom_type". For blocks where "first_atom_type" == "second_atom_type", samples are triangular in atom index such that "first_atom" <= "second_atom".
A new file "nanopet/modules/samples.py" has been created to handle the construction of samples for the features and outputs (which are in general different for these targets, per-pair ones in particular). These can stay here for now, and perhaps later be moved to metatomic.
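
The "unique samples" rule for per-pair targets can be illustrated with a short, self-contained sketch. This is not the code in "nanopet/modules/samples.py"; the function name `unique_pair_samples` and the plain-list representation of atom types are assumptions made purely for illustration.

```python
from itertools import combinations_with_replacement
from typing import List, Tuple


def unique_pair_samples(
    atom_types: List[int], first_type: int, second_type: int
) -> List[Tuple[int, int]]:
    """List the (first_atom, second_atom) samples of one per-pair block.

    Hypothetical helper: `atom_types[i]` is the type of atom `i`.
    """
    if first_type > second_type:
        # Blocks are triangular in atom type, so this block is never predicted.
        return []
    first = [i for i, t in enumerate(atom_types) if t == first_type]
    second = [j for j, t in enumerate(atom_types) if t == second_type]
    if first_type == second_type:
        # Same-type block: triangular in atom index, first_atom <= second_atom.
        return list(combinations_with_replacement(first, 2))
    # Different types: every (i, j) combination appears exactly once.
    return [(i, j) for i in first for j in second]


# Water-like system with atom types O, H, H
types = [8, 1, 1]
print(unique_pair_samples(types, 1, 1))  # H-H block
print(unique_pair_samples(types, 1, 8))  # H-O block
print(unique_pair_samples(types, 8, 1))  # O-H block, not predicted
```

For the three-atom example, the H-H block holds the pairs (1, 1), (1, 2) and (2, 2), the H-O block holds (1, 0) and (2, 0), and the O-H block is empty because it mirrors the H-O block.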

Other metatrain infrastructure details

The dataloader has been modified to use join_kwargs={"different_keys": "union"} as this is required for targets on an atomic basis where different systems have different atom types (and therefore keys)
"per_atom.py" has been modified to not perform a sum over samples, also for per-pair targets
"augmentation.py" has been cleaned up a bit, specifically with regard to how the blocks are split along samples
TargetInfo has been modified to include attributes such as target_type: str and sample_kind: str. The former replaces all of the is_scalar, is_cartesian, etc. bool flags
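
As a rough illustration of the TargetInfo change, the sketch below shows how a single string attribute can stand in for several boolean flags. The class `TargetInfoSketch` is a hypothetical stand-in, not metatrain's TargetInfo. The values "spherical", "atomic_basis_spherical" and "per_pair_sym" appear in this PR, while "scalar", "cartesian", "per_atom" and "per_pair" are assumed here.

```python
from dataclasses import dataclass


@dataclass
class TargetInfoSketch:
    """Hypothetical stand-in for TargetInfo, for illustration only."""

    target_type: str  # e.g. "scalar", "cartesian", "spherical", "atomic_basis_spherical"
    sample_kind: str  # e.g. "per_atom", "per_pair", "per_pair_sym"

    @property
    def is_scalar(self) -> bool:
        # The old-style flag can be derived from the string if still needed.
        return self.target_type == "scalar"

    @property
    def is_cartesian(self) -> bool:
        return self.target_type == "cartesian"


def needs_type_slicing(info: TargetInfoSketch) -> bool:
    """Assumed rule: only atomic-basis targets are sliced by atom type."""
    return info.target_type == "atomic_basis_spherical"


info = TargetInfoSketch(target_type="atomic_basis_spherical", sample_kind="per_pair_sym")
print(info.is_scalar, needs_type_slicing(info))
```

One string makes mutually exclusive states explicit, whereas independent bool flags allow inconsistent combinations such as two flags being true at once.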

Contributor (creator of pull-request) checklist

Tests updated (for new features and bugfixes)?
Documentation updated (for new features)?
~~[ ] Issue referenced (for PRs that solve an issue)?~~

Reviewer checklist

CHANGELOG updated with public API or any other important changes?

📚 Documentation preview 📚: https://metatrain--527.org.readthedocs.build/en/527/


frostedoyster

Just checked the model part for now, I will check the data augmentation and infrastructure parts later

docs/src/advanced-concepts/fitting-atomic-basis-spherical-targets.rst

src/metatrain/experimental/nanopet/model.py

src/metatrain/experimental/nanopet/modules/augmentation.py

src/metatrain/utils/data/target_info.py

Co-authored-by: Filippo Bigi <[email protected]> Co-authored-by: Paolo Pegolo <paolo.pegolo.epfl.ch>

Co-authored-by: Paolo Pegolo <[email protected]>

jwa7 · 2025-04-16T14:35:01Z

@Luthaf @frostedoyster @ppegolo here's an update - ready for review when you're ready!

On my side, still to do (but shouldn't affect review in the meantime):

Write some tests
Update the changelog
Fix the model export torchscript error - help appreciated, I can't seem to figure it out!

src/metatrain/utils/augmentation.py

src/metatrain/utils/data/get_dataset.py

Luthaf

I'm not sure how much I understand the changes to PET, so I'll let someone else check this part.

One question is how much work would it be to port this to NativePET?

src/metatrain/experimental/nanopet/model.py

Luthaf · 2025-04-17T14:35:46Z

src/metatrain/experimental/nanopet/model.py

+                # symmetrize the PET edge features and pass through its head
+                if (
+                    self.atomic_basis_target_info[output_name]["sample_kind"]
+                    == "per_pair_sym"


why is this a different kind of sample?

Because in this case the PET features are still one whole block (i.e. we haven't sliced by s2_pi/first_atom_type/second_atom_type yet), and because it is symmetrized the normal samples aren't complete in info: we have "duplicated" samples that carry the index s2_pi=+/-1
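
To make the "duplicated" samples concrete, here is a minimal sketch of permutational symmetrization of edge features. It assumes the common plus/minus combination of the (i, j) and (j, i) features, without any normalization. The PR's actual implementation may differ, and the name `symmetrize_edges` is hypothetical. On-site pairs (i == j) are left out of this sketch.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]
SymSample = Tuple[int, int, int]  # (first_atom, second_atom, s2_pi)


def symmetrize_edges(edges: Dict[Pair, List[float]]) -> Dict[SymSample, List[float]]:
    """Combine f_ij and f_ji into plus/minus features for each unique pair i < j."""
    out: Dict[SymSample, List[float]] = {}
    for (i, j), f_ij in sorted(edges.items()):
        if i >= j:
            continue  # keep each off-site pair once; on-site pairs are skipped here
        f_ji = edges[(j, i)]
        # Each unique pair yields two samples, distinguished by s2_pi = +1 / -1.
        out[(i, j, 1)] = [a + b for a, b in zip(f_ij, f_ji)]
        out[(i, j, -1)] = [a - b for a, b in zip(f_ij, f_ji)]
    return out


edges = {(0, 1): [1.0, 2.0], (1, 0): [3.0, 5.0]}
print(symmetrize_edges(edges))
```

The single pair (0, 1) shows up twice in the output, once with s2_pi = +1 and once with s2_pi = -1, which is why the plain (first_atom, second_atom) samples alone are not enough to label the unsliced features.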

src/metatrain/experimental/nanopet/modules/samples.py

src/metatrain/experimental/nanopet/trainer.py

Luthaf · 2025-04-17T14:42:00Z

src/metatrain/utils/augmentation.py

+        # First, build the indices that split the block samples by system
+        split_indices: List[int] = []
+
+        if target_type == "spherical":


what's the difference between target_type == "spherical" and target_type == "atomic_basis_spherical"?

An atomic-basis spherical target doesn't have all the atomic samples in a given block, only those corresponding to the block's atom types. The splitting along the samples axis therefore needs more care than for pure spherical targets
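
A minimal sketch of why the splitting needs care, assuming the block's samples are sorted by system index. The helper `split_indices_by_system` is hypothetical and is not the function from this PR. A system with no atoms of the block's type contributes zero rows, so the split points cannot be derived from the number of atoms per system alone.

```python
from typing import List


def split_indices_by_system(system_column: List[int], n_systems: int) -> List[int]:
    """Return the cumulative end offset of each system's rows in a block.

    `system_column` holds the system index of every sample (row) in the
    block and is assumed to be sorted.
    """
    counts = [0] * n_systems
    for system in system_column:
        counts[system] += 1
    splits: List[int] = []
    total = 0
    for count in counts:
        total += count
        splits.append(total)
    return splits


# Block for one atom type: system 1 contains no atoms of this type.
system_column = [0, 0, 2, 2, 2]
ends = split_indices_by_system(system_column, n_systems=3)
starts = [0] + ends[:-1]
print(list(zip(starts, ends)))
```

Here the row ranges are (0, 2), (2, 2) and (2, 5): the middle system maps to an empty slice rather than being dropped, so per-system operations such as rotational augmentation stay aligned with the system list.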

Luthaf · 2025-04-17T14:43:00Z

src/metatrain/utils/augmentation.py

+    s2_pi: int,
+) -> List[int]:
+    """
+    Finds the indices that splits a TensorBlock along the samples axis by system index.


Is this an implementation of metatensor/metatensor#627?

src/metatrain/utils/data/target_info.py

frostedoyster

Thanks @jwa7, it looks good. I think it would be nice to have some tests. If merging is not urgent, I can also write a test when I have some time. I still haven't looked at the changes in the model

docs/src/advanced-concepts/fitting-atomic-basis-spherical-targets.rst

src/metatrain/experimental/nanopet/trainer.py

src/metatrain/utils/augmentation.py

frostedoyster · 2025-04-24T09:57:41Z

src/metatrain/utils/data/dataset.py

+                tensor_map = join(
+                    [
+                        _empty_tensor_map_like(self[tensor_i][target_key])
+                        for tensor_i in range(len(self))
+                    ],
+                    "samples",
+                    remove_tensor_name=True,
+                    different_keys="union",
+                )


This will be terribly slow (and run out of memory) with a large dataset

Not sure what a good solution is though...

This has now been moved so that it only affects spherical_atomic_basis targets, but is still unresolved. Needs some thought...
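
One way to avoid joining full tensors is to collect only the union of keys across the dataset, which is cheap because no values are copied. The sketch below illustrates that idea in plain Python. It is an assumption about the general approach, not the code in this PR, and the key layout (o3_lambda, center_type) and the name `union_of_keys` are hypothetical.

```python
from typing import Iterable, List, Tuple

Key = Tuple[int, int]  # hypothetical key layout: (o3_lambda, center_type)


def union_of_keys(per_system_keys: Iterable[List[Key]]) -> List[Key]:
    """Collect the sorted union of block keys without touching block values."""
    seen = set()
    for keys in per_system_keys:
        seen.update(keys)
    return sorted(seen)


water_keys = [(0, 1), (0, 8), (1, 8)]
methane_keys = [(0, 1), (0, 6), (1, 6)]
layout_keys = union_of_keys([water_keys, methane_keys])
print(layout_keys)
```

The memory cost scales with the number of distinct keys rather than with the number of samples in the dataset.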

src/metatrain/utils/loss.py


jwa7 · 2025-05-06T10:05:08Z

An update on the current progress:

From the perspective of the original goals of this PR, this is almost ready. The only thing left to do is to figure out an issue with samples slicing in the case of per-pair targets.

However, in light of recent discussions, it seems appropriate to put this work on pause temporarily until other issues are figured out. These are:

Refactor CompositionModel #555 (and also Scaler)
Change per_atom to sample_kind #570 (also in metatomic)

Then, on top of these changes, this PR can be steered in the following way:

unify with the target type spherical (i.e. get rid of spherical_atomic_basis)
support reading metadata both from a layout_{target_name}.mts and specification of irreps in options.yaml. The latter will be auto-generated by utility functions, i.e. in metatomic
Potentially implemented also (or instead) in NativePET

PicoCentauri · 2025-08-08T09:15:43Z

nanoPET will not be further developed

jwa7 added 7 commits March 21, 2025 11:24

Allow reading of traget type 'atomic_basis_spherical'

66b214d

Missing TargetInfo attribute

580036c

Allow rotational augmentation of targets on atomic spherical basis (s…

e635b48

…ingle component)

use {}_to_device and {}_to_dtype functions in validation batching

826263b

Modify loss to allow for empty blocks

c9d14cb

Allow multi-output training for atomic basis targets

cf17a19

Stubs for documentation

7a1179b

frostedoyster reviewed Mar 27, 2025

frostedoyster reviewed Mar 28, 2025

src/metatrain/experimental/nanopet/modules/augmentation.py (outdated)

src/metatrain/utils/data/target_info.py (outdated)

jwa7 and others added 2 commits April 3, 2025 16:48

Use DiskDataset.

4c2942b

Co-authored-by: Filippo Bigi <[email protected]> Co-authored-by: Paolo Pegolo <paolo.pegolo.epfl.ch>

Linter on docs.

130b110

Co-authored-by: Paolo Pegolo <[email protected]>

jwa7 force-pushed the pet_atomic_basis branch from 04d0e2f to 130b110 Compare April 4, 2025 14:07

jwa7 added 7 commits April 8, 2025 09:04

Merge branch 'main' into pet_atomic_basis

014cdf1

Predict whole matrix, part 1

716bde1

Treat matrices as a single target

3779229

Update docs

4c52ce8

Format, lint

06bf314

Remove uncoupled basis targets for now

4f3d879

Modify per_atom.py

fe6f26d

jwa7 requested review from Luthaf, frostedoyster and ppegolo April 16, 2025 14:24

Merge branch 'main' into pet_atomic_basis

696a62d

jwa7 marked this pull request as ready for review April 16, 2025 14:35

ppegolo reviewed Apr 17, 2025

src/metatrain/utils/augmentation.py

ppegolo reviewed Apr 17, 2025

src/metatrain/utils/data/get_dataset.py (outdated)

Paolo review comments

a1e06d5

jwa7 self-assigned this Apr 17, 2025

Luthaf reviewed Apr 17, 2025

jwa7 added 3 commits April 23, 2025 18:23

Address Guillaume review comments

a857b87

Merge branch 'main' into pet_atomic_basis

73c0919

Lint

25822a5

frostedoyster reviewed Apr 24, 2025

Use a str as TargetInfo.target_type instead of is_scalar, `is_car…

3a183d4

…tesian`... etc bools

jwa7 requested review from DavideTisi and abmazitov as code owners April 25, 2025 15:57

jwa7 requested a review from frostedoyster April 25, 2025 16:04

Use a layout TensorMap, not the incredibly slow join

9471b85

jwa7 assigned ppegolo May 6, 2025

Luthaf mentioned this pull request May 21, 2025

Enable electron density learning #491

Closed

2 tasks

jwa7 and others added 7 commits June 17, 2025 18:35

incomplete fixing of the merge

aaff647

Fix bug

e7dc511

Remove prints

647b1bd

Remove other prints

f793a21

Fix device mismatch

eb13745

Linting

ff005fc

Merge branch 'main' into pet_atomic_basis

e8ac068

PicoCentauri closed this Aug 8, 2025

PicoCentauri deleted the pet_atomic_basis branch August 18, 2025 09:33
