Releases: NVIDIA-BioNeMo/KERMT
KERMT v2.0.0 — Contrastive KERMT
This release introduces Contrastive KERMT, a graph-transformer foundation model for ADMET (absorption, distribution, metabolism, excretion, toxicity) property prediction. It extends the v1 KERMT architecture with new pretraining objectives that produce stronger downstream representations on multi-task ADMET benchmarks.
What's new in v2
- Contrastive KERMT pretraining objective. Keeps the v1 graph-transformer encoder + chemistry-specific vocabulary heads, and adds two new pretraining-only heads:
  - Transformer-based SMILES reconstruction decoder.
  - In-batch contrastive auxiliary classifier (cMIM).
All four objectives are jointly optimized under a single unified log-probability factorization. The decoder and contrastive head are pretraining-only and are discarded before downstream fine-tuning, so the inference-time footprint matches v1.
- Agent skill suite. Eight SKILL.md-format skills under agent/skills/ for driving the full ADMET research lifecycle with LLM agents (Claude Code, Codex, Nemotron): environment setup, pretrain-from-scratch, continue-pretrain, add-cMIM-pretrain, fine-tune, embed, infer, and monitor. See agent/README.md for installation and use.
- Training infrastructure. Mid-epoch resume, atomic checkpoint saves, configurable WandB integration, task-specific multi-task FFN heads with per-task dropout, multi-worker data loaders.
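As an illustration of the atomic checkpoint saves and mid-epoch resume listed under Training infrastructure, the usual pattern is to write to a temporary file in the same directory and then rename it over the target. The sketch below is not taken from the KERMT codebase: the function names `atomic_save` and `load_state` and the state keys are hypothetical, and `pickle` stands in for whatever serializer the repository actually uses.

```python
import os
import pickle
import tempfile


def atomic_save(state: dict, path: str) -> None:
    """Write `state` to `path` so readers never observe a partial file.

    Hypothetical helper for illustration; pickle is a stand-in serializer.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # replaces the old checkpoint in one step
    except BaseException:
        # Remove the partial temp file; the previous checkpoint is left untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path: str) -> dict:
    """Read back a checkpoint written by atomic_save."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

For mid-epoch resume, the saved state would carry a position within the epoch as well as the epoch index, for example `atomic_save({"epoch": 3, "step_in_epoch": 120}, "ckpt.pkl")`. If the process is killed during the write, the earlier checkpoint at the same path is still intact.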
Pretrained weights
- NGC Catalog: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/resources/kermt-contrastive
- Hugging Face: https://huggingface.co/nvidia/NV-KERMT-70M-v2
Both contain the same bundle: a .pt checkpoint (~282 MB) plus the three pretraining vocabulary files (pretrain_atom_vocab.json, pretrain_bond_vocab.json, pretrain_smiles_vocab.pkl). Load via the codebase in this repository.
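Before pointing the repository's loading code at a downloaded bundle, it can help to confirm that all four files are present. The following is a minimal sketch under stated assumptions: the helper `locate_bundle` is hypothetical, the checkpoint is matched by its .pt extension because its exact filename is not given here, and the check only inspects the directory rather than loading any weights.

```python
from pathlib import Path

# Vocabulary filenames as listed in the release notes.
EXPECTED_VOCAB_FILES = (
    "pretrain_atom_vocab.json",
    "pretrain_bond_vocab.json",
    "pretrain_smiles_vocab.pkl",
)


def locate_bundle(bundle_dir: str) -> dict:
    """Return paths to the checkpoint and vocab files, or raise if any is missing.

    Hypothetical helper for illustration; not part of the KERMT codebase.
    """
    root = Path(bundle_dir)
    missing = [name for name in EXPECTED_VOCAB_FILES if not (root / name).is_file()]
    # Assumption: the bundle holds one checkpoint, identified by its extension.
    checkpoints = sorted(root.glob("*.pt"))
    if not checkpoints:
        missing.append("*.pt checkpoint")
    if missing:
        raise FileNotFoundError(f"Bundle at {root} is missing: {', '.join(missing)}")
    return {
        "checkpoint": checkpoints[0],
        "vocab": {name: root / name for name in EXPECTED_VOCAB_FILES},
    }
```

The returned paths can then be passed to whichever loading entry point the repository provides.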
License
- Source code (this repository): Apache License, Version 2.0. See LICENSE.
- Model weights (released on NGC and Hugging Face): NVIDIA Open Model License Agreement.
Companion materials
- Manuscript / preprint: Xue et al., Probabilistic Contrastive Pretraining for Multi-task ADME Property Prediction. arXiv:2606.11508
v1.0.1
What's Changed
Improvements
- Add encoder architecture args (--hidden_size, --depth, --num_attn_head, etc.) to finetune parser
- Add save_model_for_restart for finetuning training resume
- Add strict_shape_check arg to load_checkpoint
- Refactor parameter filtering to use named_parameters
- Split param_count into param_count_trainable and param_count_total
cuik-molmaker v0.2 upgrade
- Bump cuik-molmaker from 0.1.1 to 0.2
Bug fixes
- Disallow inconsistent usage between finetune and predict arguments
- Fix TypeError when dynamic_depth="truncnorm" by using rvs() instead of rvs(1)
- Fix checkpoint handling, validation, and training robustness
Testing
- Add unit tests for scheduler, nn_utils, model building, and featurization
- Ensure predict integration test uses consistent args with finetune