
A framework for training energy-based diffusion models with stable, self-consistent scores near the data distribution.


noegroup/ScoreMD


Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models


Animation showing the two modes of our model: independent sampling by diffusion denoising (left) and molecular dynamics simulation (right).

Overview

This repository contains the complete codebase for training and evaluating energy-based diffusion models for molecular dynamics simulations. Our approach enables a single model to perform both independent sampling via diffusion denoising and continuous molecular dynamics simulations through a Fokker-Planck-based regularization scheme.

Method

Visualization of the main idea of this paper.

We introduce a Fokker-Planck-based regularization to train an energy-based diffusion model with stable, self-consistent scores near the data distribution. This regularization ensures that the learned score function corresponds to a consistent energy function, so the same model can be used for both generative sampling and accurate energy estimation.
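
As a rough sketch of what "Fokker-Planck-based" means here, the identities below are the standard ones for a diffusion process. The notation (drift f, diffusion coefficient g, model density p^θ, residual R_θ) is assumed for illustration and is not taken from this repository; the exact loss, weighting, and treatment of the normalization constant used in the paper may differ.

```latex
% Assumed forward SDE (standard diffusion-model notation):
%   dx = f(x, t) dt + g(t) dW_t
\begin{align*}
% Fokker-Planck equation for the marginal density p_t
\partial_t p_t(x) &= -\nabla_x \cdot \big( f(x,t)\, p_t(x) \big)
                    + \tfrac{1}{2} g(t)^2\, \Delta_x p_t(x) \\
% Same equation written for the log-density (divide by p_t)
\partial_t \log p_t(x) &= -\nabla_x \cdot f(x,t)
                         - f(x,t) \cdot \nabla_x \log p_t(x)
                         + \tfrac{1}{2} g(t)^2 \big( \Delta_x \log p_t(x)
                         + \lVert \nabla_x \log p_t(x) \rVert^2 \big) \\
% Hypothetical residual for a model log-density; zero when the model is self-consistent
R_\theta(x,t) &= \partial_t \log p^\theta_t(x)
                + \nabla_x \cdot f(x,t)
                + f(x,t) \cdot \nabla_x \log p^\theta_t(x)
                - \tfrac{1}{2} g(t)^2 \big( \Delta_x \log p^\theta_t(x)
                + \lVert \nabla_x \log p^\theta_t(x) \rVert^2 \big)
\end{align*}
```

In an energy-based parametrization the score is the negative gradient of a scalar energy, s_θ(x, t) = ∇ log p^θ_t(x) = -∇U_θ(x, t), so penalizing a residual of this kind ties the time evolution of the energy to the diffusion process instead of constraining each noise level independently.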

Tutorial

We provide minimal working implementations as two Jupyter notebooks, one in JAX and one in PyTorch.

Both notebooks show how to reproduce figures similar to the one above and how to run molecular dynamics simulations with a diffusion model on the Müller-Brown potential.
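
For readers unfamiliar with the test system, the following is a minimal, dependency-free sketch of the Müller-Brown potential and an overdamped Langevin (Euler-Maruyama) simulation on it. It is not code from the notebooks: the function names (`energy`, `gradient`, `simulate`), the step size, and the temperature are illustrative assumptions, and the analytic gradient stands in for the force that a trained model would supply through the gradient of its learned energy.

```python
import math
import random

# Standard Müller-Brown parameters (four Gaussian-like terms).
A = (-200.0, -100.0, -170.0, 15.0)
a = (-1.0, -1.0, -6.5, 0.7)
b = (0.0, 0.0, 11.0, 0.6)
c = (-10.0, -10.0, -6.5, 0.7)
X0 = (1.0, 0.0, -0.5, -1.0)
Y0 = (0.0, 0.5, 1.5, 1.0)


def _term(k, x, y):
    """Value of the k-th term and its displacement from the term's center."""
    dx, dy = x - X0[k], y - Y0[k]
    value = A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
    return value, dx, dy


def energy(x, y):
    """Müller-Brown potential energy U(x, y)."""
    return sum(_term(k, x, y)[0] for k in range(4))


def gradient(x, y):
    """Analytic gradient of U; the force is its negative."""
    gx = gy = 0.0
    for k in range(4):
        value, dx, dy = _term(k, x, y)
        gx += value * (2.0 * a[k] * dx + b[k] * dy)
        gy += value * (b[k] * dx + 2.0 * c[k] * dy)
    return gx, gy


def simulate(x, y, n_steps=1000, dt=1e-4, kT=10.0, seed=0):
    """Overdamped Langevin dynamics: x <- x - dt * grad U + sqrt(2 kT dt) * noise.

    dt and kT are illustrative choices, not values from the paper.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * kT * dt)
    trajectory = [(x, y)]
    for _ in range(n_steps):
        gx, gy = gradient(x, y)
        x += -dt * gx + noise_scale * rng.gauss(0.0, 1.0)
        y += -dt * gy + noise_scale * rng.gauss(0.0, 1.0)
        trajectory.append((x, y))
    return trajectory


if __name__ == "__main__":
    # Start near the deepest minimum and report where the walker ends up.
    traj = simulate(-0.558, 1.442)
    print("final position:", traj[-1], "energy:", energy(*traj[-1]))
```

With a learned model, `gradient` would be replaced by the gradient of the model's energy at the smallest noise level (scaled by kT if the model is trained on a unit-temperature log-density), while the integration loop stays the same.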

🚀 Getting Started

Installation

pixi

All dependencies are managed with pixi, which ensures fully reproducible environments across different systems.

To set up the environment, run:

pixi install --frozen

To activate the environment, run:

pixi shell

Docker

If you are on an amd64 system (e.g., a Linux machine), you can run the code with the provided Docker image. To build the image, run:

docker build -t scoremd .

To start a container from the image, run:

docker run -it --rm -v $(pwd)/outputs:/workspace/outputs -v $(pwd)/storage:/workspace/storage -v $(pwd)/multirun:/workspace/multirun scoremd python train.py ...

Alternative Installation Methods

If you prefer using your own dependency manager (e.g., conda, pip), you can install the dependencies listed in pyproject.toml with your preferred tool.

Quick Start: Toy Systems

We use Hydra for configuration management. You can override any configuration via command-line arguments or configuration files.

Train on example toy systems using the provided configurations:

python train.py dataset=double_well +architecture=mlp/small_potential
python train.py dataset=double_well_2d +architecture=mlp/small_potential

Outputs will be saved to the outputs/ directory.

🧬 Working with Molecules

Important

This repository does not contain all datasets directly. Training data for the toy systems and alanine dipeptide will be downloaded automatically. For the dipeptides, you can download the dataset from this release and place it into the ./storage directory (one subfolder for each dataset, e.g. ./storage/minipeptides/, ./storage/deshaw/). Data for the fast-folder systems can be requested from D. E. Shaw Research, as described in the original paper. If you do not have access to the fast-folder data, this release also provides dummy data generated by our models, which is sufficient for inference.
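
Before launching a run, it can help to confirm that the manually downloaded datasets sit where the layout above expects them. The helper below is a hypothetical convenience, not part of this repository: the function name `missing_datasets` is made up for illustration, and the default subfolder names are simply the two examples mentioned above.

```python
from pathlib import Path


def missing_datasets(storage_root="./storage", names=("minipeptides", "deshaw")):
    """Return the dataset subfolders that are absent or empty under storage_root.

    The default names are the example subfolders from the note above;
    pass your own tuple if you use other datasets.
    """
    root = Path(storage_root)
    missing = []
    for name in names:
        folder = root / name
        # A dataset counts as present only if its folder exists and holds at least one entry.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_datasets()
    if absent:
        print("Missing or empty dataset folders:", ", ".join(absent))
    else:
        print("All expected dataset folders are present.")
```

This only checks for non-empty folders; it does not validate the files inside them.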

Running Inference with Pre-Trained Models

We provide pre-trained model weights for all models presented in the paper. For detailed instructions on downloading and using these models, please refer to INFERENCE.md.

Training Models from the Paper

To reproduce the results from our paper, see TRAIN.md for the exact training commands used for each model and dataset.

Evaluation and Plotting Scripts

For implementation details and benchmarking against your own methods, we provide evaluation scripts in the evaluation directory.

Contributing

Feel free to open an issue if you encounter any problems or have questions.

Citation

If you find our work useful, please cite:

@article{plainer2025consistent,
  author = {Plainer, Michael and Wu, Hao and Klein, Leon and G{\"u}nnemann, Stephan and No{\'e}, Frank},
  title = {Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models},
  eprint = {arXiv:2506.17139},
  year = {2025},
}
