Tasks - MLIP Arena

A task in MLIP Arena is one operation on one input structure that produces a result for one sample. Tasks are Python functions decorated with Prefect’s @task, which gives them caching, retry logic, and the ability to run concurrently inside a flow.

What a task is

Tasks are defined in mlip_arena/tasks/<module>.py and exported from mlip_arena/tasks/__init__.py as uppercase names (OPT, EOS, MD, etc.). Each task:

Accepts an atoms: Atoms structure and a calculator: BaseCalculator.
Applies one physical simulation or property calculation.
Returns a dictionary of results.
Caches results keyed on task source code plus all input parameters.

from mlip_arena.tasks import OPT, EOS, MD
from mlip_arena.models import MLIPEnum
from mlip_arena.tasks.utils import get_calculator
from ase.build import bulk

atoms = bulk("Cu", "fcc", a=3.6) * (3, 3, 3)
calc = get_calculator(MLIPEnum["MACE-MP(M)"])

# Run a single structure optimization
result = OPT(atoms=atoms, calculator=calc)
print(result["atoms"])     # relaxed Atoms
print(result["steps"])     # optimizer steps taken
print(result["converged"]) # bool

Tasks vs flows

	Task	Flow
Decorator	`@task`	`@flow`
Scope	One structure, one operation	Many tasks, many structures
Caching	Yes (`TASK_SOURCE + INPUTS`)	No built-in caching
Parallelism	Via `.submit()` from inside a flow	Dispatches tasks to workers
Typical use	Single calculation	Benchmark over all models

Tasks can call other tasks directly (serial) or submit them (parallel). Flows are the entry point that Prefect orchestrates and tracks as a single unit of work.

Prefect @task decorator and caching

Every MLIP Arena task uses the TASK_SOURCE + INPUTS cache policy. This means a task is only re-executed when either its source code or its input arguments change:

# mlip_arena/tasks/optimize.py (lines 53–55)
from prefect.cache_policies import INPUTS, TASK_SOURCE

@task(
    name="OPT",
    task_run_name=_generate_task_run_name,
    cache_policy=TASK_SOURCE + INPUTS,
)
def run(
    atoms: Atoms,
    calculator: BaseCalculator,
    optimizer: Optimizer | str = BFGSLineSearch,
    filter: Filter | str | None = None,
    criterion: dict | None = None,
    symmetry: bool = False,
):
    ...

The same pattern appears on every task (EOS, MD, PHONON, NEB, ELASTICITY).

When a benchmark runs over many models and one of them fails, re-running the flow reuses the cached results of the calculations that already completed instead of repeating them. To force re-execution, set refresh_cache=True on the task, for example with OPT.with_options(refresh_cache=True).
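To make the "source plus inputs" idea concrete, here is a minimal standard-library sketch of how such a cache key could be derived. It is an illustration only and not Prefect's implementation: the cache_key helper and the relax and relax_v2 functions are hypothetical, and compiled bytecode is used as a stand-in for the task source.

```python
import hashlib
import json


def cache_key(func, **inputs):
    """Illustrative 'task source + inputs' key (not Prefect's actual algorithm)."""
    code = func.__code__
    # Stand-in for the task source: bytecode, constants, and referenced names.
    source_part = repr((code.co_code, code.co_consts, code.co_names))
    # Sort keys so the order in which arguments are passed does not matter.
    inputs_part = json.dumps(inputs, sort_keys=True, default=repr)
    return hashlib.sha256((source_part + inputs_part).encode()).hexdigest()


def relax(fmax=0.05, steps=100):
    # Hypothetical task body.
    return {"fmax": fmax, "steps": steps}


def relax_v2(fmax=0.05, steps=100):
    # Same signature, different body: the "source" part of the key changes.
    return {"fmax": fmax, "steps": steps, "version": 2}


base = cache_key(relax, fmax=0.05, steps=100)
same = base == cache_key(relax, steps=100, fmax=0.05)
changed_input = base != cache_key(relax, fmax=0.01, steps=100)
changed_source = cache_key(relax, fmax=0.05) != cache_key(relax_v2, fmax=0.05)
print(same, changed_input, changed_source)
```

Identical code and inputs give the same key, so a stored result can be reused. Changing either an argument or the function body gives a new key, which corresponds to the task being executed again.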

Task chaining: EOS calls OPT internally

Tasks are composable. EOS calls OPT as a subtask — first for a full relaxation, then for constrained relaxations at each strained volume:

# mlip_arena/tasks/eos.py (lines 80–127)
from mlip_arena.tasks.optimize import run as OPT

@task(name="EOS", cache_policy=TASK_SOURCE + INPUTS)
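def run(atoms, calculator, max_abs_strain=0.1, npoints=11, concurrent=True):
    # Step 1: Full relaxation
    # (cache_opt comes from the full signature, which is abbreviated in this excerpt)
    OPT_ = OPT.with_options(refresh_cache=not cache_opt)
    # return_state=True makes the call return a Prefect State, so .result() is valid below
    state = OPT_(atoms=atoms, calculator=calculator, filter="FrechetCell", return_state=True)
    relaxed = state.result()["atoms"]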
    c0 = relaxed.get_cell()

    # Step 2: Strained relaxations (concurrent or serial)
    factors = np.linspace(1 - max_abs_strain, 1 + max_abs_strain, npoints) ** (1/3)
    if concurrent:
        futures = []
        for f in factors:
            atoms_strained = relaxed.copy()
            atoms_strained.set_cell(c0 * f, scale_atoms=True)
            future = OPT_.submit(atoms=atoms_strained, calculator=calculator, filter=None)
            futures.append(future)
        wait(futures)
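The factors line in the excerpt takes evenly spaced volume ratios and converts them to linear cell scale factors with a cube root, since scaling all three cell vectors by f scales the volume by f cubed. The sketch below reproduces that arithmetic with the standard library only; linear_scale_factors is a hypothetical helper name, and it assumes npoints is at least 2.

```python
def linear_scale_factors(max_abs_strain=0.1, npoints=11):
    """Cube roots of evenly spaced volume ratios (mirrors linspace(...) ** (1/3))."""
    lo, hi = 1 - max_abs_strain, 1 + max_abs_strain
    step = (hi - lo) / (npoints - 1)
    volume_ratios = [lo + i * step for i in range(npoints)]
    return [v ** (1 / 3) for v in volume_ratios]


factors = linear_scale_factors()

# Cubing each linear factor recovers the volume ratio it was built from.
volume_ratios = [f ** 3 for f in factors]
print([round(v, 2) for v in volume_ratios])
print(round(factors[0], 4), round(factors[-1], 4))
```

With the defaults, the volume ratios run from 0.9 to 1.1, so the cell edges change by only about 3.5% at the extremes even though the volume changes by 10%.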

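With concurrent=True, the OPT runs for all volume points are submitted up front rather than executed one after another, which can cut wall-clock time substantially when running with a Prefect worker pool. The strained relaxations are independent of each other, so they can run in any order.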
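The submit-then-wait shape of the concurrent branch can be sketched with the standard library's concurrent.futures, used here as an analogy rather than as Prefect itself. The relax_at_scale function and its toy energy are hypothetical placeholders for a constrained OPT run.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def relax_at_scale(scale):
    """Hypothetical stand-in for one constrained OPT run at a given cell scale."""
    volume_ratio = scale ** 3
    # Toy energy with its minimum at the unstrained volume (illustration only).
    return {"scale": scale, "energy": (volume_ratio - 1.0) ** 2}


scales = [0.97, 0.99, 1.0, 1.01, 1.03]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Submit every point first, then block until all of them have finished.
    futures = [pool.submit(relax_at_scale, s) for s in scales]
    wait(futures)
    # Collect results in submission order, independent of completion order.
    results = [f.result() for f in futures]

best = min(results, key=lambda r: r["energy"])
print(best["scale"])
```

Submitting everything before waiting is what allows the points to overlap in time. Calling the function directly inside the loop would serialize them.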

Available tasks

OPT — Structure optimization

Relax atomic positions and/or cell parameters using ASE optimizers (BFGS, FIRE, LBFGS, etc.) and filters (UnitCell, FrechetCell, ExpCell).

EOS — Equation of state

Compute the energy-volume relationship and fit a Birch-Murnaghan equation of state to extract the bulk modulus. Chains OPT internally.
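For reference, the sketch below evaluates the commonly used third-order Birch-Murnaghan energy-volume form and checks that its curvature at the equilibrium volume recovers the bulk modulus. It assumes the third-order form, which the text above does not specify, and it is not the fitting routine used by the EOS task. The parameter values are made up for illustration (energies in eV, volumes in Å^3, bulk modulus in eV/Å^3).

```python
def birch_murnaghan_energy(v, e0, v0, b0, b0_prime):
    """Third-order Birch-Murnaghan E(V)."""
    eta = (v0 / v) ** (2 / 3)
    return e0 + 9 * v0 * b0 / 16 * (
        (eta - 1) ** 3 * b0_prime + (eta - 1) ** 2 * (6 - 4 * eta)
    )


# Hypothetical parameters, chosen only to exercise the formula.
e0, v0, b0, b0_prime = -3.7, 12.0, 0.9, 4.5

# The bulk modulus is B = V * d^2E/dV^2 evaluated at V0; estimate the
# second derivative with a central finite difference.
h = 1e-3 * v0
curvature = (
    birch_murnaghan_energy(v0 + h, e0, v0, b0, b0_prime)
    - 2 * birch_murnaghan_energy(v0, e0, v0, b0, b0_prime)
    + birch_murnaghan_energy(v0 - h, e0, v0, b0, b0_prime)
) / h ** 2
b0_estimate = v0 * curvature
print(round(b0_estimate, 3))
```

In a real fit the four parameters would be obtained by least squares against the computed energy-volume points, and the fitted b0 would be reported as the bulk modulus.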

MD — Molecular dynamics

Run NVE, NVT, or NPT simulations with flexible dynamics (VelocityVerlet, Langevin, Nose-Hoover, Berendsen) and temperature/pressure schedules.

PHONON — Phonon calculation

Compute phonon band structure and density of states driven by the phonopy library. Requires phonopy to be installed.

NEB — Nudged elastic band

Find minimum energy paths and transition states between two known endpoint structures.

NEB_FROM_ENDPOINTS

Convenience wrapper around NEB that performs linear or IDPP image interpolation from two endpoint structures.

ELASTICITY

Calculate the full elastic tensor via strain perturbations.

The task registry.yaml structure

mlip_arena/tasks/registry.yaml registers benchmarks (not individual tasks) for the leaderboard. Each entry maps a human-readable benchmark name to its Streamlit page and display category:

# mlip_arena/tasks/registry.yaml
Homonuclear diatomics:
  category: Fundamentals
  task-page: homonuclear-diatomics   # serve/tasks/<task-page>.py
  task-layout: wide
  rank-page: homonuclear-diatomics
  last-update: 2024-09-19

Equation of state:
  category: Fundamentals
  task-page: eos_bulk
  task-layout: wide
  rank-page: eos_bulk
  last-update: 2025-04-29

Stability:
  category: Molecular Dynamics
  task-page: stability
  task-layout: wide
  rank-page: stability

Field	Description
`category`	Navigation group in the leaderboard sidebar
`task-page`	Filename (without `.py`) of the Streamlit page under `serve/tasks/`
`task-layout`	Page layout: `wide` or `centered`
`rank-page`	Filename of the ranking/leaderboard page
`last-update`	Date of the most recent results update (absent from some entries, such as Stability above)
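As a rough sketch of how these fields could be consumed, the snippet below groups benchmarks by category, the way a sidebar might. The registry dict is a hand-written stand-in for the parsed YAML with values copied from the entries above, group_by_category is a hypothetical helper, and dates are kept as plain strings even though a YAML loader may return date objects.

```python
from collections import defaultdict

# Hand-written stand-in for the parsed registry (subset of fields).
registry = {
    "Homonuclear diatomics": {
        "category": "Fundamentals",
        "task-page": "homonuclear-diatomics",
        "last-update": "2024-09-19",
    },
    "Equation of state": {
        "category": "Fundamentals",
        "task-page": "eos_bulk",
        "last-update": "2025-04-29",
    },
    "Stability": {
        "category": "Molecular Dynamics",
        "task-page": "stability",
        # This entry has no last-update field.
    },
}


def group_by_category(entries):
    """Map each category to (benchmark name, task page, last update or None)."""
    grouped = defaultdict(list)
    for name, fields in entries.items():
        grouped[fields["category"]].append(
            (name, fields["task-page"], fields.get("last-update"))
        )
    return dict(grouped)


sidebar = group_by_category(registry)
for category, benchmarks in sidebar.items():
    print(category, [name for name, _, _ in benchmarks])
```

Reading last-update with .get() keeps entries without that field from raising a KeyError.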

Importing tasks

All tasks are available from mlip_arena.tasks:

from mlip_arena.tasks import OPT, EOS, MD, PHONON, NEB, NEB_FROM_ENDPOINTS, ELASTICITY

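The __init__.py wraps the imports in a single try/except, so a missing optional dependency (e.g., phonopy) is logged as a warning instead of raising when the package is imported. Because all imports share one try block, any import listed after the failing one is skipped as well; in the excerpt below PHONON comes last, so a missing phonopy leaves the other tasks available: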

# mlip_arena/tasks/__init__.py (lines 16–27)
try:
    from .elasticity import run as ELASTICITY
    from .eos import run as EOS
    from .md import run as MD
    from .neb import run as NEB
    from .neb import run_from_endpoints as NEB_FROM_ENDPOINTS
    from .optimize import run as OPT
    from .phonon import run as PHONON
except (ImportError, TypeError, NameError) as e:
    logger.warning(e)
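With a single try block, a failure part-way through leaves every later name undefined. If stricter isolation were wanted, one option is to guard each import separately. The sketch below shows that pattern with importlib on ordinary modules; it is a hypothetical alternative, not what mlip_arena does, and import_optional and the fake module name are invented for the example.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def import_optional(names):
    """Import each module on its own so one failure does not hide the rest."""
    loaded = {}
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError as exc:
            # Log and continue instead of aborting the remaining imports.
            logger.warning("skipping %s: %s", name, exc)
    return loaded


# "not_a_real_module_xyz" is a deliberately missing module for the demo.
modules = import_optional(["json", "not_a_real_module_xyz", "math"])
print(sorted(modules))
```

The trade-off is a little more boilerplate in exchange for each task's availability depending only on its own dependencies.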
