Ordinal Solvation Framework (OSF)

This repository implements the Ordinal Solvation Framework (OSF) for predicting polymer–solvent solvation behaviour. OSF models solvation at four resolutions: binary p⁽²⁾, three-state p⁽³⁾, four-state p⁽⁴⁾, and six-state p⁽⁶⁾. Ordinal aggregation combines these four outputs into a continuous interaction coordinate s, which can also be projected back onto the six-state solvation axis.
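This README does not give the aggregation formula, so the following is only a toy illustration of the general idea, not the OSF implementation. It swaps in a simple stand-in technique (averaging the normalised expected rank of each head) and assumes that class index 0 and the highest index are the two ends of the solvation axis. All function names are hypothetical.

```python
# Toy stand-in for "ordinal aggregation": NOT the method implemented in src/osf/.
# Assumption: each head outputs class probabilities ordered along the same axis.

def normalised_expected_rank(probs):
    """Expected class index of one head, rescaled to the interval [0, 1]."""
    k = len(probs)
    total = sum(probs)
    return sum(i * p for i, p in enumerate(probs)) / (total * (k - 1))


def aggregate(heads):
    """Average the normalised expected ranks of all heads into one coordinate s."""
    ranks = [normalised_expected_rank(p) for p in heads]
    return sum(ranks) / len(ranks)


def project_to_states(s, n_states=6):
    """Map s in [0, 1] to the nearest of n_states ordinal classes.

    Python's round() resolves exact .5 ties to the even integer.
    """
    return min(n_states - 1, max(0, round(s * (n_states - 1))))


# Synthetic probabilities for the 2-, 3-, 4- and 6-state heads (made-up values).
heads = [
    [0.2, 0.8],
    [0.1, 0.3, 0.6],
    [0.1, 0.1, 0.3, 0.5],
    [0.0, 0.1, 0.1, 0.2, 0.3, 0.3],
]
s = aggregate(heads)
print(s, project_to_states(s))
```

The point of the sketch is only that heads with different numbers of classes can share one axis once each is rescaled to [0, 1]; the actual weighting used by OSF lives in the model code.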

The repository is structured for:

inference on new polymer–solvent systems
evaluation using the included reproducibility bundle
reproduction of key model outputs

Installation

Clone the repository and create the environment:

git clone https://github.com/zjlucam/ordinal-solvation-framework.git
cd ordinal-solvation-framework
conda env create -f environment.yml
conda activate osf
python -m pip install -e .

If RDKit is not installed correctly through the environment file on your platform, install it separately:

conda install -c conda-forge rdkit
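
A quick way to confirm that the environment resolved correctly is to check that the key modules can be found. The sketch below uses only the standard library. The module names `osf` (inferred from the `src/osf/` layout) and `torch` (inferred from the `.pt` files) are assumptions; only `rdkit` is named explicitly in this README.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "osf" and "torch" are assumed module names; adjust them to match environment.yml.
required = ["rdkit", "torch", "osf"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

If `rdkit` is reported as missing, the separate conda-forge install above is the fallback.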

Quick start

Inference

Run OSF on the example input:

python scripts/predict_osf.py --checkpoint checkpoints/osf_pretrained.pt --input examples/public_inference_example.csv --output predictions.csv --device auto

This generates predictions.csv, including:

p2_pred
p3_pred
p4_pred
p6_pred
s_osf
osf_pred_6
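
To check that an output file carries the columns listed above, a minimal sketch along these lines may help. It assumes only that the output is a comma-separated file with a header row; the helper name `missing_columns` is hypothetical, and the demonstration uses synthetic text rather than real model output.

```python
import csv
import io

# Column names taken from the list in this README.
EXPECTED_COLUMNS = ["p2_pred", "p3_pred", "p4_pred", "p6_pred", "s_osf", "osf_pred_6"]


def missing_columns(stream):
    """Return the expected column names that are absent from a CSV header."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    return [name for name in EXPECTED_COLUMNS if name not in header]


# Synthetic example: a header that lacks two of the expected columns.
sample = io.StringIO("p2_pred,p3_pred,p4_pred,p6_pred\n1,2,3,5\n")
print(missing_columns(sample))
```

For a real run, pass an open file handle instead, for example `open("predictions.csv", newline="")`.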

Evaluation

Evaluate the pretrained checkpoint on the included processed bundle:

python scripts/evaluate_osf.py --checkpoint checkpoints/osf_pretrained.pt --bundle data/processed/osf_train_bundle.pt --metrics-out metrics.json --test-predictions-out test_predictions.csv

This generates:

metrics.json
test_predictions.csv

Evaluation reports a compact set of metrics, matching those used in the paper, for the following subsets:

full
homopolymer
copolymer
binary_solvent

Each subset includes:

n
accuracy_p2
qwk_p2
adjacent_acc_p6
qwk_p6
adjacent_acc_osf
qwk_osf
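
The metric names are not defined in this README. Assuming they follow the usual conventions, `qwk_*` would be the quadratic weighted kappa and `adjacent_acc_*` the fraction of predictions that fall within one ordinal class of the label. Under that assumption, the two quantities can be sketched in plain Python as follows; the repository's own implementation may differ in detail.

```python
def adjacent_accuracy(y_true, y_pred):
    """Fraction of predictions at most one ordinal class away from the label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= 1)
    return hits / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic disagreement weights (i - j)^2 / (K - 1)^2."""
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of labels and predictions.
            weighted_expected += weight * hist_true[i] * hist_pred[j] / n
    if weighted_expected == 0.0:
        return float("nan")  # undefined when all labels and predictions share one class
    return 1.0 - weighted_observed / weighted_expected


# Synthetic six-state labels and predictions (illustrative values only).
labels = [0, 1, 2, 3, 4, 5]
preds = [0, 2, 2, 5, 4, 4]
print(adjacent_accuracy(labels, preds))
print(quadratic_weighted_kappa(labels, preds, n_classes=6))
```

Both metrics reward predictions that land close to the true class, which is why they suit an ordinal target better than plain accuracy.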

Data and reproducibility

The repository includes a public reproducibility bundle containing the dataset used in the study:

data/processed/osf_train_bundle.pt

This bundle contains:

SMILES strings required to regenerate Morgan fingerprints
precomputed non-fingerprint features
labels
split indices
metadata and descriptors required for evaluation
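
Because the bundle ships its own split indices, a basic sanity check is that the splits do not overlap and stay within the dataset size. The sketch below works on plain Python lists. The split names `train`, `val`, and `test` and the helper `check_splits` are hypothetical, since the bundle's internal key names are not documented in this README, and loading the `.pt` file itself is left out.

```python
def check_splits(splits, n_rows):
    """Validate split indices and return how many rows belong to no split.

    Raises ValueError on duplicate, overlapping, or out-of-range indices.
    """
    seen = set()
    for name, indices in splits.items():
        unique = set(indices)
        if len(unique) != len(indices):
            raise ValueError(f"split '{name}' contains duplicate indices")
        if unique & seen:
            raise ValueError(f"split '{name}' overlaps an earlier split")
        if any(i < 0 or i >= n_rows for i in unique):
            raise ValueError(f"split '{name}' has indices outside 0..{n_rows - 1}")
        seen |= unique
    return n_rows - len(seen)


# Synthetic indices for a ten-row dataset (hypothetical split names).
example = {"train": [0, 1, 2, 3, 4, 5], "val": [6, 7], "test": [8, 9]}
print(check_splits(example, n_rows=10))
```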

The bundle is sufficient to reproduce the public inference and evaluation workflows included in this repository.

A small example workbook illustrating the structure of the original data format is also included at:

examples/data_schema_example.xlsx

This file is provided for orientation only and is not required for inference or evaluation.

Licensing

Code in this repository is released under the MIT License.
Non-code research artifacts, including the processed reproducibility bundle and pretrained checkpoint, are provided under the terms in LICENSE_DATA.md.

Included files

The repository includes:

source code under src/osf/
command-line scripts under scripts/
configuration files under configs/
example inputs under examples/
public reproducibility bundle at data/processed/osf_train_bundle.pt
pretrained checkpoint at checkpoints/osf_pretrained.pt

Example inputs

The examples/ directory contains small test files for:

single-solvent inference
copolymer inference
binary-solvent inference
unseen external inference

Example:

python scripts/predict_osf.py --checkpoint checkpoints/osf_pretrained.pt --input examples/binary_solvent_example.csv --output binary_predictions.csv --device auto
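
When running several example files in a row, the command above can be assembled programmatically. The following sketch uses only the script path and flags shown in this README; the helper name `build_predict_command` is hypothetical, and the command is only constructed here, not executed.

```python
import sys


def build_predict_command(input_csv, output_csv,
                          checkpoint="checkpoints/osf_pretrained.pt",
                          device="auto"):
    """Build the argument list for scripts/predict_osf.py using the documented flags."""
    return [
        sys.executable, "scripts/predict_osf.py",
        "--checkpoint", checkpoint,
        "--input", input_csv,
        "--output", output_csv,
        "--device", device,
    ]


# The two example inputs named in this README, with their output file names.
jobs = [
    ("examples/public_inference_example.csv", "predictions.csv"),
    ("examples/binary_solvent_example.csv", "binary_predictions.csv"),
]
for input_csv, output_csv in jobs:
    print(" ".join(build_predict_command(input_csv, output_csv)))
```

To actually run a command from the repository root, pass the list to `subprocess.run(cmd, check=True)`.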

Optional smoke test

Windows (Anaconda Prompt)

run_scripts\smoke_test_public.bat

PowerShell

.\run_scripts\smoke_test_public.ps1

Expected outputs:

predictions.csv
metrics.json
test_predictions.csv
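
A platform-independent way to confirm that the smoke test produced its outputs is sketched below. It assumes the files are written to the directory being checked, which this README does not state explicitly; the helper name `missing_outputs` is hypothetical.

```python
from pathlib import Path

# File names taken from the expected outputs listed in this README.
EXPECTED_OUTPUTS = ("predictions.csv", "metrics.json", "test_predictions.csv")


def missing_outputs(directory="."):
    """Return the expected output files that do not exist in the given directory."""
    root = Path(directory)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]


missing = missing_outputs(".")
if missing:
    print("Missing outputs:", ", ".join(missing))
else:
    print("All smoke-test outputs are present.")
```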

Repository structure

📦 ordinal-solvation-framework
 ┣ 📂assets             README figures and visual assets
 ┣ 📂checkpoints        Pretrained checkpoint and checkpoint notes
 ┣ 📂configs            Training and inference configuration files
 ┣ 📂data               Public reproducibility bundle and data notes
 ┣ 📂examples           Example CSV and workbook inputs for inference and schema reference
 ┣ 📂notebooks          Exploratory, figure-generation, and SI notebooks
 ┣ 📂run_scripts        Smoke tests and convenience run scripts
 ┣ 📂scripts            CLI entry points for training, inference, evaluation, and data export
 ┣ 📂src/osf
 ┃ ┣ 📂analysis         Ablation, runtime, SHAP, and error analysis
 ┃ ┣ 📂data             Dataset loading, labels, splits, and processed-bundle handling
 ┃ ┣ 📂features         Fingerprints, descriptors, scaling, and feature builders
 ┃ ┣ 📂inference        Prediction and external inference utilities
 ┃ ┣ 📂model            OSF architecture, ordinal aggregation, and losses
 ┃ ┣ 📂plotting         Plotting utilities for figures, dataset visualisation, and SI
 ┃ ┣ 📂training         Training, checkpointing, evaluation, and metrics
 ┃ ┗ 📜core utilities   Config, constants, paths, and I/O helpers
 ┣ 📂tests              Unit and pipeline tests
 ┣ 📜.gitignore
 ┣ 📜LICENSE
 ┣ 📜LICENSE_DATA.md
 ┣ 📜README.md
 ┣ 📜environment.yml
 ┣ 📜pyproject.toml
 ┗ 📜requirements.txt

Citation

If you use this code or model, please cite the associated manuscript.
