
Commit a8aefe5

sofiia-chorna authored and ceriottm committed
Small text improvements
1 parent d22d837 commit a8aefe5

File tree

3 files changed: +27 -20 lines


examples/pet-finetuning/c_ft_res.png

28.3 KB → 5.15 KB (binary image replaced)

examples/pet-finetuning/pet-ft.py

Lines changed: 27 additions & 20 deletions
@@ -8,16 +8,18 @@
 Sofiia Chorna `@sofiia-chorna <https://github.com/sofiia-chorna>`_

 This example demonstrates fine-tuning the PET-MAD model with `metatrain
-<https://github.com/metatensor/metatrain>`_ on a small dataset of ethanol structures.
+<https://github.com/metatensor/metatrain>`_ and a two-stage training strategy from
+scratch.

 We cover two examples: 1) simple fine-tuning the pretrained PET-MAD model to adapt it to
 specialized task by retraining it on a domain-specific dataset, and 2) two-stage
 training strategy, first on non-conservative forces for efficiency, followed by
 fine-tuning on conservative forces to ensure physical consistency.

-PET-MAD is a universal machine-learning forcefield trained on `the MAD dataset
-<https://arxiv.org/abs/2506.19674>`_ that aims to incorporate a very high degree of
-structural diversity. It uses `the Point-Edge Transformer (PET)
+`PET-MAD <https://arxiv.org/abs/2503.14118>`_ is a universal machine-learning forcefield
+trained on `the MAD dataset <https://arxiv.org/abs/2506.19674>`_ that aims to
+incorporate a very high degree of structural diversity. The model itself is `the
+Point-Edge Transformer (PET)
 <https://proceedings.neurips.cc/paper_files/paper/2023/file/fb4a7e3522363907b26a86cc5be627ac-Paper-Conference.pdf>`_,
 an unconstrained architecture that achieves symmetry compliance through data
 augmentation during training.
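
For context on the last sentence of this hunk: "symmetry compliance through data augmentation" means the training structures (and their forces) are presented to the model under random rotations. A minimal sketch of such an augmentation step, a toy illustration rather than PET's actual implementation, could look like this:

import numpy as np
from scipy.spatial.transform import Rotation

# Toy structure: random positions and forces for a 9-atom molecule.
rng = np.random.default_rng(0)
positions = rng.normal(size=(9, 3))
forces = rng.normal(size=(9, 3))

# Draw a random rotation and apply it to both positions and forces, so the
# model sees the same structure in many orientations during training.
R = Rotation.random().as_matrix()
positions_aug = positions @ R.T
forces_aug = forces @ R.T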
@@ -72,9 +74,13 @@
 #
 # DFT-calculated energies often contain systematic shifts due to the choice of
 # functional, basis set, or pseudopotentials. If left uncorrected, such shifts can
-# mislead the fine-tuning process. We apply a linear correction based on atomic
-# compositions to align our fine-tuning dataset with PET-MAD energy reference. First, we
-# define a helper function to load reference energies from PET-MAD.
+# mislead the fine-tuning process.
+#
+# In this example we use a sampled subset of ethanol structures from the `rMD17 dataset
+# <https://doi.org/10.48550/arXiv.2007.09593>`_, computed at the PBE/def2-SVP level of
+# theory, which differs from MAD, where PBEsol is used. We apply a linear correction
+# based on atomic compositions to align our fine-tuning dataset with the PET-MAD energy
+# reference. First, we define a helper function to load reference energies from PET-MAD.


 def load_reference_energies(checkpoint_path):
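
To make the hunk above more concrete, here is a minimal sketch of a composition-based energy alignment. It is an illustration under stated assumptions rather than the code in pet-ft.py: it assumes the per-element reference energies are already available as a plain dictionary (in the example they come from the PET-MAD checkpoint via load_reference_energies) and that the DFT energies are stored in the extended-XYZ file.

import ase.io
import numpy as np

# Hypothetical per-element reference energies in eV (placeholder numbers); in the
# example these would come from the PET-MAD checkpoint.
reference_energies = {"H": -13.6, "C": -1027.0, "O": -2040.0}

frames = ase.io.read("data/ethanol.xyz", index=":", format="extxyz")
elements = sorted({s for frame in frames for s in frame.get_chemical_symbols()})

# Composition matrix X (n_structures x n_elements) and vector y of DFT energies;
# fit one energy shift per element by linear least squares.
X = np.array(
    [[frame.get_chemical_symbols().count(e) for e in elements] for frame in frames]
)
y = np.array([frame.get_potential_energy() for frame in frames])
shifts, *_ = np.linalg.lstsq(X, y, rcond=None)

# Replace the fitted per-element shifts by the reference atomic energies, which
# aligns the dataset energies with the model's energy reference.
refs = np.array([reference_energies[e] for e in elements])
for frame, counts in zip(frames, X):
    frame.info["energy_corrected"] = frame.get_potential_energy() + counts @ (refs - shifts)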
@@ -95,8 +101,8 @@ def load_reference_energies(checkpoint_path):
 
 # %%
 #
-# The dataset is composed of 100 structures of ethanol. We fit a linear model based on
-# atomic compositions to apply energy correction.
+# For demonstration, the dataset is composed of only 100 structures of ethanol. We fit a
+# linear model based on atomic compositions to apply the energy correction.
 
 dataset = ase.io.read("data/ethanol.xyz", index=":", format="extxyz")
 
@@ -237,18 +243,15 @@ def display_loss(csv_file):
 
 # %%
 #
-# The result of running the fine-tuning for 1000 epoches in visualised with
-# ``chemiscope`` below:
+# The result of running the fine-tuning for 1000 epochs on 1000 structures is
+# visualised with ``chemiscope`` below:
 import chemiscope # noqa: E402
 
 
 chemiscope.show_input("full_finetune_example.chemiscope.json")
 
 # %%
 #
-# The fine-tuning learning curves for 1000 epoches displayed on the figure below. For
-# comparison, we also train a model from scratch on the same dataset:
-
 # The figure below shows the training and validation losses over 1000 epochs of
 # fine-tuning. For comparison, we also train the model from scratch on the same dataset.
 # Expectedly, the pretrained model achieves better accuracy.
@@ -261,11 +264,14 @@ def display_loss(csv_file):
 # Two stage training strategy
 # ---------------------------
 #
-# This approach accelerates training by first using non-conservative forces, which
-# avoids costly backpropagation, then fine-tuning on conservative forces to ensure
-# physical consistency. Non-conservative forces can lead to pathological behavior (see
-# `preprint <https://arxiv.org/abs/2412.11569>`_), but this strategy mitigates such
-# issues while maintaining efficiency.
+# As discussed in `this paper <https://arxiv.org/abs/2412.11569>`_, while conservative
+# MLIPs are generally better suited for physically accurate simulations, hybrid models
+# that support direct non-conservative force predictions can accelerate both training
+# and inference. We demonstrate this practical compromise through a two-stage approach:
+# first train a model to predict non-conservative forces directly (which avoids the cost
+# of backpropagation), and then fine-tune its energy head to produce conservative
+# forces. Although non-conservative forces can lead to unphysical behavior, this
+# two-step strategy balances efficiency and physical reliability.
 #
 # Non-conservative force training
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
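
To make the trade-off above concrete, the two force definitions can be contrasted with a toy PyTorch model; this is a generic illustration of direct versus gradient-based forces, not the PET-MAD architecture. Direct prediction needs only a forward pass, while conservative forces require an extra backward pass through the energy model.

import torch

n_atoms = 9
positions = torch.randn(3 * n_atoms, requires_grad=True)

# Toy energy model: flattened coordinates -> scalar energy.
energy_model = torch.nn.Sequential(
    torch.nn.Linear(3 * n_atoms, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)
# Separate head predicting forces directly (non-conservative).
force_head = torch.nn.Sequential(
    torch.nn.Linear(3 * n_atoms, 64), torch.nn.Tanh(), torch.nn.Linear(64, 3 * n_atoms)
)

# Stage 1: direct forces from a single forward pass.
forces_direct = force_head(positions).reshape(n_atoms, 3)

# Stage 2: conservative forces as the negative gradient of the energy, which
# requires backpropagation through the energy model.
energy = energy_model(positions).sum()
forces_conservative = -torch.autograd.grad(energy, positions)[0].reshape(n_atoms, 3)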
@@ -357,7 +363,8 @@ def display_loss(csv_file):
 # %%
 #
 # After fine-tuning for 50 epochs, the updated parity plots show improved force
-# predictions (left) with conservative forces.
+# predictions (left) with conservative forces. The grayscale points in the background
+# correspond to the predicted forces from the previous step.
 #
 # .. image:: c_ft_res.png
 #    :align: center
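
As a side note on the parity plots referenced above, such a figure can be reproduced with a few lines of matplotlib; the arrays below are placeholders standing in for reference and predicted force components.

import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: reference forces and predictions from the previous
# (non-conservative) step and from the conservatively fine-tuned model.
rng = np.random.default_rng(0)
f_ref = rng.normal(size=500)
f_prev = f_ref + rng.normal(scale=0.3, size=500)
f_new = f_ref + rng.normal(scale=0.1, size=500)

fig, ax = plt.subplots()
ax.scatter(f_ref, f_prev, s=5, color="0.7", label="previous step")  # grayscale background
ax.scatter(f_ref, f_new, s=5, color="tab:blue", label="fine-tuned")
lims = [f_ref.min(), f_ref.max()]
ax.plot(lims, lims, "k--", linewidth=1)  # perfect agreement
ax.set_xlabel("reference force component")
ax.set_ylabel("predicted force component")
ax.legend()
plt.show()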
