Overview

The model registry is the single source of truth for every MLIP model available in MLIP Arena. It lives at:
mlip_arena/models/registry.yaml
At import time, mlip_arena.models reads the file and exposes the parsed content as the REGISTRY dict:
from mlip_arena.models import REGISTRY

print(REGISTRY["MACE-MP(M)"])
# {'module': 'externals', 'class': 'MACE_MP_Medium', 'family': 'mace-mp', ...}
REGISTRY is then used to populate MLIPMap and MLIPEnum by dynamically importing each model class.
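The dynamic-import step can be pictured with a minimal sketch. It is not the actual mlip_arena implementation: the helper load_models, the stand-in registry, and the choice to skip entries that fail to import are all assumptions made for illustration, and standard-library modules stand in for mlip_arena.models.{module}.{family} so that the snippet runs without any model package installed.

```python
import importlib
from enum import Enum

# Stand-in registry (hypothetical). Real entries resolve to
# mlip_arena.models.<module>.<family>; stdlib modules are used here
# only so the sketch runs anywhere.
registry = {
    "Text": {"module": "mime", "family": "text", "class": "MIMEText"},
    "Missing": {"module": "mime", "family": "no_such_module", "class": "Nope"},
}


def load_models(registry, base_package):
    """Import each registered class, skipping entries whose import fails."""
    loaded = {}
    for name, meta in registry.items():
        module_path = f"{base_package}.{meta['module']}.{meta['family']}"
        try:
            module = importlib.import_module(module_path)
            loaded[name] = getattr(module, meta["class"])
        except (ImportError, AttributeError):
            # Assumed behaviour: an entry whose module or class cannot be
            # resolved (e.g. its package is not installed) is left out.
            continue
    return loaded


# A name -> class map, and an Enum built from it with the functional API.
model_map = load_models(registry, base_package="email")
ModelEnum = Enum("ModelEnum", model_map)
print(list(ModelEnum.__members__))
```

Only entries that import cleanly end up in the map and the enum, which is why a missing package shows up as a missing enum member rather than an import error.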

Registry fields

Each top-level key in registry.yaml is the canonical model name. Its value is a dict with the following fields:

Identification and loading

module (string, required): Sub-package under mlip_arena.models that contains the model file. Currently always "externals".
class (string, required): Name of the Python class to import from mlip_arena.models.{module}.{family}. This class is used as the calculator.
family (string, required): Module filename (without .py) inside the mlip_arena/models/{module}/ directory. For example, "mace-mp" maps to mace-mp.py.
package (string, required): Pip-installable package and version required for this model (e.g. "mace-torch==0.3.9").
checkpoint (string): Model checkpoint identifier: a filename, version tag, or URL used by the class constructor to load weights.
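As an illustration of the required fields, the following sketch checks a single entry before it is added to the registry. The helper missing_required is hypothetical and not part of mlip_arena; it only encodes the four fields marked required above.

```python
# The four fields marked "required" in the schema above.
REQUIRED_FIELDS = ("module", "class", "family", "package")


def missing_required(entry):
    """Return the required registry fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not entry.get(field)]


entry = {
    "module": "externals",
    "class": "MyModel",
    "family": "myfamily",
    # "package" is intentionally left out to show a failing check.
}
print(missing_required(entry))  # ['package']
```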

Provenance

username (string): Hugging Face Hub username of the model maintainer.
last-update (string, ISO 8601): Timestamp of the last update to the registry entry.
datetime (string, ISO 8601): Datetime associated with the model release or upload.
date (string, YYYY-MM-DD): Publication or release date of the model.
github (string, URL): Link to the upstream GitHub repository.
doi (string, URL): DOI or arXiv link to the associated paper.
license (string): SPDX license identifier (e.g. "MIT", "Apache-2.0", "GPL-3.0-only"). null if not specified.

Training data

datasets (string[]): List of training dataset names (e.g. ["MPTrj", "Alexandria"]). Common values:
  • MPTrj — Materials Project trajectories
  • Alexandria — Alexandria crystal dataset
  • OMat — Open Materials dataset (Meta)
  • MPF — Materials Project forces
  • MP22 — Materials Project 2022
  • OC20 / OC22 — Open Catalyst datasets
  • SPICE — Small molecule and protein interaction dataset
  • Proprietary — non-public data

Benchmark tasks

gpu-tasks (string[]): Benchmark tasks the model participates in on GPU. Supported task identifiers:
  • homonuclear-diatomics — diatomic molecule potential energy curves
  • stability — thermodynamic stability prediction
  • combustion — combustion reaction MD
  • eos_bulk — bulk equation of state
  • wbm_ev — WBM energy-volume curves
cpu-tasks (string[]): Benchmark tasks the model participates in on CPU (typically lighter tasks). Common value: eos_alloy.
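To see which models take part in a given task, the two task lists can be inverted into a task-to-models index. This is a sketch only: models_by_task is a hypothetical helper, and ModelA and ModelB are made-up entries rather than real registry content.

```python
from collections import defaultdict


def models_by_task(registry):
    """Map each task identifier to the models that list it (GPU or CPU)."""
    index = defaultdict(list)
    for name, meta in registry.items():
        for key in ("gpu-tasks", "cpu-tasks"):
            # A missing or null task list counts as "no tasks".
            for task in meta.get(key) or []:
                index[task].append(name)
    return dict(index)


# Hypothetical entries, reduced to the two task fields.
sample_registry = {
    "ModelA": {"gpu-tasks": ["homonuclear-diatomics", "eos_bulk"]},
    "ModelB": {"gpu-tasks": ["eos_bulk"], "cpu-tasks": ["eos_alloy"]},
}

index = models_by_task(sample_registry)
print(index["eos_bulk"])
```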

Capabilities

prediction (string): Compact string listing which physical quantities the model outputs:

| Code | Quantity |
| --- | --- |
| E | Energy |
| F | Forces |
| S | Stress |
| M | Magnetic moments |

Example: "EFSM" means energy + forces + stress + magnetic moments.
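The prediction code can be expanded one letter at a time. The sketch below assumes the four codes listed above are the only valid ones; decode_prediction is a hypothetical helper, not a function exported by mlip_arena.

```python
# Letter codes from the table above.
QUANTITIES = {
    "E": "energy",
    "F": "forces",
    "S": "stress",
    "M": "magnetic moments",
}


def decode_prediction(code):
    """Expand a prediction string such as "EFS" into quantity names."""
    unknown = [letter for letter in code if letter not in QUANTITIES]
    if unknown:
        raise ValueError(f"unknown prediction code(s): {unknown}")
    return [QUANTITIES[letter] for letter in code]


print(decode_prediction("EFSM"))
# ['energy', 'forces', 'stress', 'magnetic moments']
```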
nvt (boolean): Whether the model supports NVT (constant volume/temperature) molecular dynamics.
npt (boolean): Whether the model supports NPT (constant pressure/temperature) molecular dynamics. Some models have known issues with NPT (see inline comments in registry.yaml).
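The nvt and npt flags make it straightforward to pick models for a given MD ensemble. The sketch below uses a hypothetical helper, models_supporting, on two entries reduced to their flags; the flag values mirror the model table on this page.

```python
def models_supporting(registry, ensemble):
    """Names of models whose entry sets the given flag ("nvt" or "npt") to true."""
    if ensemble not in ("nvt", "npt"):
        raise ValueError(f"unsupported ensemble flag: {ensemble!r}")
    return [name for name, meta in registry.items() if meta.get(ensemble) is True]


# Entries reduced to the two capability flags.
registry = {
    "MACE-MP(M)": {"nvt": True, "npt": True},
    "eqV2(OMat)": {"nvt": True, "npt": False},
}

print(models_supporting(registry, "npt"))  # ['MACE-MP(M)']
```

Comparing against True explicitly means an entry with a missing or null flag is treated as unsupported rather than raising an error.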

Complete model table

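| Model | Family | Prediction | NVT | NPT | License | Datasets |
| --- | --- | --- | --- | --- | --- | --- |
| MACE-MP(M) | mace-mp | EFS | Yes | Yes | MIT | MPTrj |
| CHGNet | chgnet | EFSM | Yes | Yes | BSD-3-Clause | MPTrj |
| M3GNet | matgl | EFS | Yes | Yes | BSD-3-Clause | MPF |
| MatterSim | mattersim | EFS | Yes | Yes | MIT | MPTrj, Alexandria, Proprietary |
| ORBv2 | orb | EFS | Yes | Yes | Apache-2.0 | MPTrj, Alexandria |
| SevenNet | sevennet | EFS | Yes | Yes | GPL-3.0-only | MPTrj |
| eqV2(OMat) | fairchem | EFS | Yes | No | Modified Apache-2.0 | OMat, MPTrj, Alexandria |
| MACE-MPA | mace-mp | EFS | Yes | Yes | MIT | MPTrj, Alexandria |
| eSEN | fairchem | EFS | Yes | Yes | Modified Apache-2.0 | OMat, MPTrj, Alexandria |
| EquiformerV2(OC22) | equiformer | EF | Yes | No | | OC22 |
| EquiformerV2(OC20) | equiformer | EF | Yes | No | | OC20 |
| eSCN(OC20) | escn | EF | Yes | No | | OC20 |
| MACE-OFF(M) | mace-off | EFS | Yes | Yes | ASL | SPICE |
| ANI2x | ani | EFS | Yes | Yes | MIT | |
| ALIGNN | alignn | EFS | Yes | Yes | | MP22 |
| DeepMD | deepmd | EFS | Yes | Yes | | MPTrj |
| ORB | orb | EFS | Yes | Yes | Apache-2.0 | MPTrj, Alexandria |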
EF models output energy and forces only. EFS models additionally output stress, and EFSM models additionally output stress and magnetic moments. Models without a listed license have null in the registry.

Adding a new model

1. Create the calculator file

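Add a new Python file under mlip_arena/models/externals/ named after the model family (e.g. myfamily.py). Define a class that wraps the model and implements an ASE-compatible calculate method, for example by subclassing ase.calculators.calculator.Calculator.
# mlip_arena/models/externals/myfamily.py
from ase.calculators.calculator import Calculator, all_changes

from mlip_arena.models.utils import get_freer_device


class MyModel(Calculator):
    # Quantities this calculator returns; keep in sync with the registry "prediction" field.
    implemented_properties = ["energy", "forces", "stress"]

    def __init__(self, checkpoint=None, device=None, **kwargs):
        super().__init__(**kwargs)
        # Use the requested device, otherwise let get_freer_device() pick one.
        self.device = str(device or get_freer_device())
        # load weights, build model ...

    def calculate(self, atoms=None, properties=None, system_changes=all_changes):
        # Let the ASE base class store the atoms before computing.
        super().calculate(atoms, properties, system_changes)
        # compute and populate self.results
        ...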

2. Add an entry to registry.yaml

Append a new entry to mlip_arena/models/registry.yaml following the schema above:
MyModel:
  module: externals
  class: MyModel
  family: myfamily
  package: my-package==1.0.0
  checkpoint: my-checkpoint-v1.pt
  username: your-hf-username
  last-update: 2025-01-01T00:00:00
  datetime: 2025-01-01T00:00:00
  datasets:
    - MPTrj
  gpu-tasks:
    - homonuclear-diatomics
  prediction: EFS
  nvt: true
  npt: true
  license: MIT
  github: https://github.com/your-org/your-repo
  doi: https://arxiv.org/abs/xxxx.xxxxx
  date: 2025-01-01

3. Verify the model loads

Install the required package and confirm your model appears in MLIPEnum:
from mlip_arena.models import MLIPEnum

assert "MyModel" in MLIPEnum.__members__, "Model not loaded — check package install and class path"
print(MLIPEnum["MyModel"].value)