This repository implements the Ordinal Solvation Framework (OSF) for predicting polymer–solvent solvation behaviour. OSF models solvation behaviour at four resolutions: binary p(2), three-state p(3), four-state p(4), and six-state p(6). Ordinal aggregation integrates these into a continuous interaction coordinate s, which can also be projected back onto the six-state solvation axis.
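To make the idea of combining several ordinal resolutions into one coordinate concrete, here is a minimal sketch. It is not the aggregation implemented in `src/osf/model`. It assumes, purely for illustration, that each head outputs class probabilities, that each head is reduced to a normalised expected class index, that the heads are averaged with equal weight, and that projection onto six states is done by rounding. The function names are hypothetical.

```python
import math


def expected_position(probs):
    """Normalised expected class index in [0, 1] for one ordinal head.

    Illustrative assumption: classes are ordered and equally spaced.
    """
    total = sum(probs)
    if len(probs) < 2 or total <= 0:
        raise ValueError("need at least two classes with positive total mass")
    mean_index = sum(i * p for i, p in enumerate(probs)) / total
    return mean_index / (len(probs) - 1)


def aggregate(heads):
    """Equal-weight average of the per-head positions (an assumed rule)."""
    positions = [expected_position(p) for p in heads]
    return sum(positions) / len(positions)


def project_to_states(s, n_states=6):
    """Map a coordinate in [0, 1] onto the nearest of n_states ordered classes."""
    s = min(1.0, max(0.0, s))
    return math.floor(s * (n_states - 1) + 0.5)
```

With one probability vector per resolution, `aggregate([p2, p3, p4, p6])` gives a value in [0, 1] and `project_to_states` returns a class index from 0 to 5. The actual OSF aggregation may weight or transform the heads differently.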
The repository is structured for:
- inference on new polymer–solvent systems
- evaluation using the included reproducibility bundle
- reproduction of key model outputs
Clone the repository and create the environment:
git clone https://github.com/zjlucam/ordinal-solvation-framework.git
cd ordinal-solvation-framework
conda env create -f environment.yml
conda activate osf
python -m pip install -e .
If RDKit is not installed correctly through the environment file on your platform, install it separately:
conda install -c conda-forge rdkit
Run OSF on the example input:
python scripts/predict_osf.py --checkpoint checkpoints/osf_pretrained.pt --input examples/public_inference_example.csv --output predictions.csv --device auto
This generates predictions.csv, including:
- p2_pred
- p3_pred
- p4_pred
- p6_pred
- s_osf
- osf_pred_6
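For a quick look at the prediction file, a short script along these lines may help. It is a sketch that assumes the names listed above are literal CSV column headers, that `s_osf` holds a number, and that `osf_pred_6` holds a class label. The function name `summarise_predictions` is hypothetical and not part of the repository.

```python
import csv
from collections import Counter


def summarise_predictions(path):
    """Count six-state predictions and average the continuous coordinate."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    # Labels are kept as the strings found in the file.
    counts = Counter(row["osf_pred_6"] for row in rows)
    s_values = [float(row["s_osf"]) for row in rows]
    mean_s = sum(s_values) / len(s_values) if s_values else None
    return {
        "n_rows": len(rows),
        "osf_pred_6_counts": dict(counts),
        "mean_s_osf": mean_s,
    }

# Example call, once predictions.csv exists:
# print(summarise_predictions("predictions.csv"))
```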
Evaluate the pretrained checkpoint on the included processed bundle:
python scripts/evaluate_osf.py --checkpoint checkpoints/osf_pretrained.pt --bundle data/processed/osf_train_bundle.pt --metrics-out metrics.json --test-predictions-out test_predictions.csv
This generates:
- metrics.json
- test_predictions.csv
Evaluation reports compact, paper-facing metrics for four subsets:
- full
- homopolymer
- copolymer
- binary_solvent
Each subset includes:
- n
- accuracy_p2
- qwk_p2
- adjacent_acc_p6
- qwk_p6
- adjacent_acc_osf
- qwk_osf
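The metric names suggest two standard ordinal measures. The sketch below assumes that `qwk` stands for quadratic weighted kappa and that `adjacent_acc` is the fraction of predictions within one class of the true label. It is a reference implementation for orientation only and is not taken from the repository's evaluation code, which may differ in detail.

```python
def adjacent_accuracy(y_true, y_pred):
    """Fraction of predictions that are at most one class away from the truth."""
    hits = sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic disagreement weights.

    Labels are assumed to be integers in range(n_classes).
    """
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            num += weight * observed[i][j]
            den += weight * true_hist[i] * pred_hist[j] / n
    if den == 0:
        # Degenerate case: everything sits in a single class, with no disagreement.
        return 1.0
    return 1.0 - num / den
```

For the six-state outputs, `n_classes` would be 6. Perfect agreement gives 1.0 and chance-level agreement gives a value near 0.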
The repository includes a public reproducibility bundle containing the dataset used in the study:
data/processed/osf_train_bundle.pt
This bundle contains:
- SMILES strings required to regenerate Morgan fingerprints
- precomputed non-fingerprint features
- labels
- split indices
- metadata and descriptors required for evaluation
The bundle reproduces the public inference and evaluation workflows included in this repository.
A small example workbook illustrating the structure of the original data format is also included at:
examples/data_schema_example.xlsx
This file is provided for orientation only and is not required for inference or evaluation.
- Code in this repository is released under the MIT License.
- Non-code research artifacts, including the processed reproducibility bundle and pretrained checkpoint, are provided under the terms in LICENSE_DATA.md.
The repository includes:
- source code under src/osf/
- command-line scripts under scripts/
- configuration files under configs/
- example inputs under examples/
- public reproducibility bundle at data/processed/osf_train_bundle.pt
- pretrained checkpoint at checkpoints/osf_pretrained.pt
The examples/ directory contains small test files for:
- single-solvent inference
- copolymer inference
- binary-solvent inference
- unseen external inference
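To run several of these example files in one go, the prediction command can be assembled programmatically. The sketch below uses only the flags shown in this README (`--checkpoint`, `--input`, `--output`, `--device`). The helper name `build_predict_command` is hypothetical, and the script is assumed to be launched from the repository root.

```python
import sys


def build_predict_command(input_csv, output_csv,
                          checkpoint="checkpoints/osf_pretrained.pt",
                          device="auto"):
    """Return the argument list for one scripts/predict_osf.py run."""
    return [
        sys.executable, "scripts/predict_osf.py",
        "--checkpoint", checkpoint,
        "--input", input_csv,
        "--output", output_csv,
        "--device", device,
    ]

# Example use from the repository root:
# import subprocess
# cmd = build_predict_command("examples/binary_solvent_example.csv",
#                             "binary_predictions.csv")
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.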
Example:
python scripts/predict_osf.py --checkpoint checkpoints/osf_pretrained.pt --input examples/binary_solvent_example.csv --output binary_predictions.csv --device auto
Smoke tests are provided under run_scripts/ as a Windows batch script and a PowerShell script:
run_scripts\smoke_test_public.bat
.\run_scripts\smoke_test_public.ps1
Expected outputs:
- predictions.csv
- metrics.json
- test_predictions.csv
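After a smoke test, the presence of the expected files can be confirmed with a short check. This sketch assumes the outputs are written to the directory you pass in. Where the smoke-test scripts actually write them is not stated here, so adjust the path as needed. The helper name `missing_outputs` is hypothetical.

```python
from pathlib import Path

# File names taken from the expected outputs listed in this README.
EXPECTED_OUTPUTS = ("predictions.csv", "metrics.json", "test_predictions.csv")


def missing_outputs(directory=".", expected=EXPECTED_OUTPUTS):
    """Return the expected file names that are not present in directory."""
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]

# Example use:
# missing = missing_outputs(".")
# print("all outputs present" if not missing else f"missing: {missing}")
```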
📦 ordinal-solvation-framework
┣ 📂assets README figures and visual assets
┣ 📂checkpoints Pretrained checkpoint and checkpoint notes
┣ 📂configs Training and inference configuration files
┣ 📂data Public reproducibility bundle and data notes
┣ 📂examples Example CSV and workbook inputs for inference and schema reference
┣ 📂notebooks Exploratory, figure-generation, and SI notebooks
┣ 📂run_scripts Smoke tests and convenience run scripts
┣ 📂scripts CLI entry points for training, inference, evaluation, and data export
┣ 📂src/osf
┃ ┣ 📂analysis Ablation, runtime, SHAP, and error analysis
┃ ┣ 📂data Dataset loading, labels, splits, and processed-bundle handling
┃ ┣ 📂features Fingerprints, descriptors, scaling, and feature builders
┃ ┣ 📂inference Prediction and external inference utilities
┃ ┣ 📂model OSF architecture, ordinal aggregation, and losses
┃ ┣ 📂plotting Plotting utilities for figures, dataset visualisation, and SI
┃ ┣ 📂training Training, checkpointing, evaluation, and metrics
┃ ┗ 📜core utilities Config, constants, paths, and I/O helpers
┣ 📂tests Unit and pipeline tests
┣ 📜.gitignore
┣ 📜LICENSE
┣ 📜LICENSE_DATA.md
┣ 📜README.md
┣ 📜environment.yml
┣ 📜pyproject.toml
┗ 📜requirements.txt
If you use this code or model, please cite the associated manuscript.
