Releases · NVIDIA-BioNeMo/KERMT

KERMT v2.0.0: Contrastive KERMT

This release introduces Contrastive KERMT, a graph-transformer foundation model for ADMET (absorption, distribution, metabolism, excretion, toxicity) property prediction. It extends the v1 KERMT architecture with new pretraining objectives that produce stronger downstream representations on multi-task ADMET benchmarks.

What's new in v2

Contrastive KERMT pretraining objective. Keeps the v1 graph-transformer encoder + chemistry-specific vocabulary heads, and adds two new pretraining-only heads:
- Transformer-based SMILES reconstruction decoder.
- In-batch contrastive auxiliary classifier (cMIM).
All four objectives are jointly optimized under a single unified log-probability factorization. The decoder and contrastive head are pretraining-only and are discarded before downstream fine-tuning, so the inference-time footprint matches v1.
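
To make the in-batch contrastive idea concrete, here is a minimal sketch of a generic InfoNCE-style loss in plain Python. It is an illustration only: these notes do not spell out the cMIM objective, so the function name `in_batch_contrastive_loss`, the `temperature` parameter, and the pairing of a graph embedding with a SMILES embedding for each molecule are all assumptions, not the repository's implementation.

```python
import math


def in_batch_contrastive_loss(graph_emb, smiles_emb, temperature=0.1):
    """Mean cross-entropy of picking the matching pair within a batch.

    Hypothetical setup: graph_emb[i] and smiles_emb[i] are two views of
    molecule i; every other row in the batch acts as a negative.
    """
    def normalize(vec):
        # L2-normalize so dot products become cosine similarities.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    g = [normalize(v) for v in graph_emb]
    s = [normalize(v) for v in smiles_emb]

    total = 0.0
    for i, gi in enumerate(g):
        # Similarity of molecule i's graph view to every SMILES view.
        logits = [sum(a * b for a, b in zip(gi, sj)) / temperature for sj in s]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability assigned to the true partner (index i).
        total += log_z - logits[i]
    return total / len(g)
```

When the two views of each molecule align, the loss approaches zero. When every embedding in the batch is identical, it equals the log of the batch size, which is the chance-level value.
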
Agent skill suite. Eight SKILL.md-format skills under agent/skills/ for driving the full ADMET research lifecycle with LLM agents (Claude Code, Codex, Nemotron): environment setup, pretrain-from-scratch, continue-pretrain, add-cMIM-pretrain, fine-tune, embed, infer, and monitor. See agent/README.md for installation and use.
Training infrastructure. Mid-epoch resume, atomic checkpoint saves, configurable WandB integration, task-specific multi-task FFN heads with per-task dropout, and multi-worker data loaders.
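
As an illustration of how atomic checkpoint saves and mid-epoch resume usually fit together, the sketch below writes the state to a temporary file and then renames it over the target. A crash mid-write therefore never leaves a truncated checkpoint behind. This is not the repository's code: the helper names `save_checkpoint_atomic` and `load_resume_point`, the `epoch` and `batch_idx` keys, and the use of JSON as a stand-in for a real tensor checkpoint are all assumptions.

```python
import json
import os
import tempfile


def save_checkpoint_atomic(state, path):
    """Write `state` to `path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic swap on POSIX and Windows
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_resume_point(path):
    """Return (epoch, next_batch_idx); start from (0, 0) if no checkpoint."""
    if not os.path.exists(path):
        return 0, 0
    with open(path) as f:
        state = json.load(f)
    # Resume at the batch after the last one that was fully processed.
    return state["epoch"], state["batch_idx"] + 1
```

A training loop built on this would call `save_checkpoint_atomic` every N batches and skip ahead to the returned batch index after a restart.
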

Pretrained weights

NGC Catalog: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/resources/kermt-contrastive
Hugging Face: https://huggingface.co/nvidia/NV-KERMT-70M-v2

Both contain the same bundle: a .pt checkpoint (~282 MB) plus the three pretraining vocabulary files (pretrain_atom_vocab.json, pretrain_bond_vocab.json, pretrain_smiles_vocab.pkl). Load via the codebase in this repository.
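
Before loading the weights, it can help to confirm that a downloaded bundle is complete. The sketch below is a hypothetical helper, not part of the repository. It takes only the three vocabulary filenames listed above as given, and matches the checkpoint with `*.pt` because its exact filename is not stated here.

```python
from pathlib import Path

# Vocabulary filenames as listed in the release notes.
VOCAB_FILES = (
    "pretrain_atom_vocab.json",
    "pretrain_bond_vocab.json",
    "pretrain_smiles_vocab.pkl",
)


def check_bundle(bundle_dir):
    """Return (checkpoint_names, missing_vocab_files) for a download dir."""
    root = Path(bundle_dir)
    # The checkpoint filename is an unknown here, so match any .pt file.
    checkpoints = sorted(p.name for p in root.glob("*.pt"))
    missing = [name for name in VOCAB_FILES if not (root / name).is_file()]
    return checkpoints, missing
```

An empty `missing` list together with exactly one checkpoint name suggests the bundle is intact.
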

License

Source code (this repository): Apache License, Version 2.0. See LICENSE.
Model weights (released on NGC and Hugging Face): NVIDIA Open Model License Agreement.

Companion materials

Manuscript / preprint: Xue et al., Probabilistic Contrastive Pretraining for Multi-task ADME Property Prediction. arXiv:2606.11508

v1.0.1

What's Changed

Improvements

- Add encoder architecture args (--hidden_size, --depth, --num_attn_head, etc.) to finetune parser
- Add save_model_for_restart for finetuning training resume
- Add strict_shape_check arg to load_checkpoint
- Refactor parameter filtering to use named_parameters
- Split param_count into param_count_trainable and param_count_total

cuik-molmaker v0.2 upgrade

Bump cuik-molmaker from 0.1.1 to 0.2

Bug fixes

- Disallow inconsistent usage between finetune and predict arguments
- Fix TypeError when dynamic_depth="truncnorm" by using rvs() instead of rvs(1)
- Fix checkpoint handling, validation, and training robustness

Testing

- Add unit tests for scheduler, nn_utils, model building, and featurization
- Ensure predict integration test uses consistent args with finetune

Releases in this repository: KERMT v2.0.0 (Contrastive KERMT) and v1.0.1.