API Documentation

Public API

Public API for BATTER.

This module collects the stable entry points intended for external consumption. They fall into four broad categories:

  • Configuration helpers – load and dump RunConfig / SimulationConfig objects.

  • Execution – orchestrate complete workflows from a YAML definition.

  • Portable results – inspect and copy artifacts produced by a run.

  • Utilities – clone the state of an execution for reproducibility.

Typical usage

Run a workflow from a top-level YAML:

from batter.api import run_from_yaml
run_from_yaml("examples/mabfe_example.yaml")

Inspect FE records stored in a work directory:

from batter.api import list_fe_runs, load_fe_run
runs = list_fe_runs("work/adrb2")
latest = runs.iloc[-1]["run_id"]
# pass ``ligand`` when the run contains more than one ligand
record = load_fe_run("work/adrb2", latest, ligand="LIG1")

Run FE analysis on an existing execution:

from batter.api import run_analysis_from_execution
run_analysis_from_execution("work/adrb2", latest, ligand="LIG1")

For more examples, refer to docs/getting_started.rst and the tutorials.

class batter.api.ArtifactStore(root: Path | str, manifest_name: str = 'manifest.json')[source]

Bases: object

Portable store with a relocatable root and JSON manifest.

Parameters:
  • root (path-like) – Store root directory (e.g., a run’s work directory).

  • manifest_name (str) – File name for the manifest JSON under root (default: “manifest.json”).

Examples

>>> from pathlib import Path
>>> from batter.api import ArtifactStore
>>> store = ArtifactStore("work/at1r_aai")
>>> p = store.put_file(Path("results.txt"), name="fe/latest", dst_rel=Path("fe/results.txt"))
>>> store.save_manifest()
>>> # move directory to a new cluster...
>>> store2 = ArtifactStore("new_root/at1r_aai"); store2.load_manifest()
>>> store2.path("fe/latest")
new_root/at1r_aai/fe/results.txt

list_artifacts(*, prefix: str | None = None, kind: Literal['file', 'dir', None] = None) List[Artifact][source]

Inspect manifest entries, optionally filtering by name or kind.

Parameters:
  • prefix (str, optional) – When provided, only artifacts whose logical name starts with prefix are returned.

  • kind ({‘file’, ‘dir’, None}, optional) – Restrict results to files or directories. None (default) returns both.

Returns:

Matching artifacts in alphabetical order.

Return type:

list[Artifact]
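
The filtering described above can be pictured with a small stand-in. The sketch below is not BATTER's implementation: the Artifact dataclass, its name and kind fields, and the filter_artifacts helper are hypothetical, chosen only to illustrate prefix and kind filtering followed by an alphabetical sort.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Artifact:
    """Hypothetical manifest entry; the real Artifact class may differ."""
    name: str
    kind: str  # "file" or "dir"


def filter_artifacts(
    entries: List[Artifact],
    *,
    prefix: Optional[str] = None,
    kind: Optional[str] = None,
) -> List[Artifact]:
    """Keep entries matching prefix/kind, sorted by logical name."""
    selected = [
        a
        for a in entries
        if (prefix is None or a.name.startswith(prefix))
        and (kind is None or a.kind == kind)
    ]
    return sorted(selected, key=lambda a: a.name)


entries = [
    Artifact("fe/latest", "file"),
    Artifact("equil/traj", "dir"),
    Artifact("fe/archive", "dir"),
]
fe_only = filter_artifacts(entries, prefix="fe/")
fe_files = filter_artifacts(entries, prefix="fe/", kind="file")
```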

load_manifest() None[source]

Load the manifest JSON from root.

path(name: str) Path[source]

Resolve an artifact name to an absolute path under the current root.

put_dir(src_dir: Path, name: str, dst_rel: Path | None = None, overwrite_manifest_entry: bool = False) Path[source]

Copy a directory under the store and record it in the manifest.

Notes

  • No per-file hashing; use put_file() for critical files.

put_file(src: Path, name: str, dst_rel: Path | None = None, overwrite_manifest_entry: bool = False) Path[source]

Copy a file under the store and record it in the manifest.

Parameters:
  • src (path-like) – Source file path (must exist and be a file).

  • name (str) – Logical artifact name to register under.

  • dst_rel (path-like, optional) – Relative destination path. Defaults to name.replace('/', '_').

  • overwrite_manifest_entry (bool) – If True, allows replacing an existing manifest entry with the same name.

Returns:

Absolute destination path.

Return type:

pathlib.Path
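
As a quick illustration of the documented default for dst_rel, the snippet below applies name.replace('/', '_') to a logical name. The default_dst_rel helper is hypothetical; only the replace rule comes from the parameter description above.

```python
from pathlib import Path
from typing import Optional


def default_dst_rel(name: str, dst_rel: Optional[Path] = None) -> Path:
    """Hypothetical helper: fall back to the flattened logical name."""
    if dst_rel is not None:
        return Path(dst_rel)
    return Path(name.replace("/", "_"))


flattened = default_dst_rel("fe/latest")
explicit = default_dst_rel("fe/latest", Path("fe/results.txt"))
```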

rebase(new_root: Path | str) ArtifactStore[source]

Create a new store view with the same manifest but a different root.

Parameters:

new_root (path-like) – Target root directory.

Returns:

New store pointing to new_root.

Return type:

ArtifactStore
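
One way to picture a relocatable store is a manifest that maps logical names to paths relative to the root, so that only the root changes when the directory moves. The MiniStore class below is a hypothetical stand-in written for illustration, not the ArtifactStore implementation.

```python
from pathlib import Path
from typing import Dict


class MiniStore:
    """Hypothetical stand-in: logical names map to root-relative paths."""

    def __init__(self, root, manifest: Dict[str, str]):
        self.root = Path(root)
        self.manifest = dict(manifest)

    def path(self, name: str) -> Path:
        # Resolve the stored relative path against the current root.
        return self.root / self.manifest[name]

    def rebase(self, new_root) -> "MiniStore":
        # Same manifest, different root; the original store is untouched.
        return MiniStore(new_root, self.manifest)


store = MiniStore("work/at1r_aai", {"fe/latest": "fe/results.txt"})
moved = store.rebase("new_root/at1r_aai")
```

Because the manifest never stores the root, rebasing is a cheap copy of the mapping rather than a rewrite of every entry.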

save_manifest() Path[source]

Write the manifest JSON under root (atomic).
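
An atomic write is commonly done by writing to a temporary file in the same directory and then renaming it over the target. The sketch below shows that general pattern with the standard library; whether save_manifest() uses exactly this approach is an assumption, and write_json_atomic is a hypothetical name.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> Path:
    """Write JSON to a temp file, then swap it into place with os.replace."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
        os.replace(tmp_name, path)  # atomic when both are on one filesystem
    except BaseException:
        # Remove the partial temp file if anything went wrong.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return path


with tempfile.TemporaryDirectory() as root:
    target = write_json_atomic(
        Path(root) / "manifest.json", {"fe/latest": "fe/results.txt"}
    )
    loaded = json.loads(target.read_text(encoding="utf-8"))
    leftovers = [p.name for p in Path(root).iterdir() if p.suffix == ".tmp"]
```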

class batter.api.FERecord(*, run_id: str, ligand: str, mol_name: str, system_name: str, fe_type: str, temperature: float, method: Literal['mbar', 'ti']='mbar', total_dG: float, total_se: float = 0.0, components: List[str] = <factory>, created_at: str = <factory>, windows: List[WindowResult] = <factory>, canonical_smiles: str | None = None, original_name: str | None = None, original_path: str | None = None, protocol: str = 'abfe', analysis_start_step: int | None = None, n_bootstraps: int | None = None, include_in_analysis: bool = True, status: Literal['success', 'failed', 'unbound']='success')[source]

Bases: BaseModel

A full FE result bundle (portable, versioned).

Parameters:
  • run_id (str) – Unique run identifier.

  • ligand (str) – Ligand identifier.

  • mol_name (str) – Molecule resname.

  • system_name (str) – Logical system name.

  • fe_type (str) – Protocol type (e.g., ‘uno_rest’, ‘asfe’).

  • temperature (float) – Simulation temperature (K).

  • method ({“mbar”, “ti”}) – Integration method.

  • total_dG (float) – Total free energy (kcal/mol).

  • total_se (float) – Standard error (kcal/mol).

  • components (list[str]) – Active components in this run.

  • created_at (str) – ISO-8601 timestamp (UTC, Z-suffix).

  • windows (list[WindowResult]) – Per-window results.

  • canonical_smiles (str, optional) – Canonicalised ligand SMILES captured during parameterization.

  • original_name (str, optional) – Original ligand identifier or title when known.

  • original_path (str, optional) – Source path of the ligand before staging.

  • protocol (str) – Logical protocol used to generate the result (e.g., "abfe").

  • analysis_start_step (int, optional) – First production step included in analysis.

  • n_bootstraps (int, optional) – Number of MBAR bootstrap resamples used during analysis.

  • include_in_analysis (bool) – Whether downstream aggregate analyses, such as Cinnabar export, should use this record.

  • status ({“success”, “failed”, “unbound”}) – Final status recorded for the ligand.
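
For reference, a created_at value in the documented style (ISO-8601, UTC, Z suffix) can be produced with the standard library. The utc_timestamp helper is hypothetical and the second-level precision is an assumption; the record may store a different resolution.

```python
from datetime import datetime, timezone


def utc_timestamp(moment: datetime) -> str:
    """Format an aware datetime as ISO-8601 in UTC with a trailing 'Z'."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


stamp = utc_timestamp(datetime(2024, 1, 1, 12, 30, 0, tzinfo=timezone.utc))
now_stamp = utc_timestamp(datetime.now(timezone.utc))
```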

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

analysis_start_step: int | None
canonical_smiles: str | None
components: List[str]
created_at: str
fe_type: str
include_in_analysis: bool
ligand: str
method: Literal['mbar', 'ti']
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

mol_name: str
n_bootstraps: int | None
original_name: str | None
original_path: str | None
protocol: str
run_id: str
status: Literal['success', 'failed', 'unbound']
system_name: str
temperature: float
total_dG: float
total_se: float
windows: List[WindowResult]
class batter.api.FEResultsRepository(store: ArtifactStore)[source]

Bases: object

index() DataFrame[source]
ligand_dir(run_id: str, ligand: str) Path[source]
load(run_id: str, ligand: str) FERecord[source]
record_failure(run_id: str, ligand: str, system_name: str, temperature: float, *, status: Literal['failed', 'unbound'], reason: str | None = None, canonical_smiles: str | None = None, original_name: str | None = None, original_path: str | None = None, protocol: str = 'abfe', analysis_start_step: int | None = None, n_bootstraps: int | None = None) None[source]
save(rec: FERecord, copy_from: Path | None = None) None[source]
set_analysis_inclusion(*, run_id: str, ligand: str, include: bool, analysis_start_step: int | None = None, n_bootstraps: int | None = None) int[source]

Set include_in_analysis for matching rows in results/index.csv.
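
Conceptually, this is a row update on a CSV index. The sketch below uses the csv module on an in-memory table; the column set is trimmed to three documented fields, the set_inclusion helper is hypothetical, and treating the returned integer as the number of matching rows is an assumption based on the int return type.

```python
import csv
import io
from typing import Tuple

# Trimmed, made-up index contents for illustration only.
INDEX_CSV = """run_id,ligand,include_in_analysis
run-1,LIG1,True
run-1,LIG2,True
run-2,LIG1,True
"""

FIELDS = ["run_id", "ligand", "include_in_analysis"]


def set_inclusion(
    text: str, *, run_id: str, ligand: str, include: bool
) -> Tuple[str, int]:
    """Return the updated CSV text and the number of matching rows."""
    rows = list(csv.DictReader(io.StringIO(text)))
    matched = 0
    for row in rows:
        if row["run_id"] == run_id and row["ligand"] == ligand:
            row["include_in_analysis"] = str(include)
            matched += 1
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue(), matched


new_text, n_matched = set_inclusion(
    INDEX_CSV, run_id="run-1", ligand="LIG2", include=False
)
```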

class batter.api.RunConfig(*, version: int = 1, protocol: Literal['abfe', 'rbfe', 'asfe', 'md']='abfe', backend: Literal['local', 'slurm']='local', create: CreateArgs, fe_sim: Dict[str, ~typing.Any] | ~batter.config.run.FESimArgs | ~batter.config.run.MDSimArgs=<factory>, run: RunSection, rbfe: RBFENetworkArgs | None = None)[source]

Bases: BaseModel

Top-level YAML config.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

backend: Literal['local', 'slurm']
create: CreateArgs
fe_sim: Dict[str, Any] | FESimArgs | MDSimArgs
classmethod load(path: Path | str) RunConfig[source]

Load and validate a run configuration from disk.

Parameters:

path (str or pathlib.Path) – Location of the YAML file to parse.

Returns:

Fully validated configuration object.

Return type:

RunConfig

model_config = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

classmethod model_validate_yaml(yaml_text: str) RunConfig[source]

Validate a run configuration from an in-memory YAML string.

Parameters:

yaml_text (str) – Raw YAML content describing the run configuration.

Returns:

Validated configuration model.

Return type:

RunConfig

protocol: Literal['abfe', 'rbfe', 'asfe', 'md']
rbfe: RBFENetworkArgs | None
resolved_sim_config() SimulationConfig[source]

Build the effective simulation configuration for this run.

Returns:

Simulation parameters derived from create and fe_sim sections.

Return type:

SimulationConfig

run: RunSection
version: int
with_base_dir(base_dir: Path) RunConfig[source]

Return a copy with relative paths resolved against base_dir.
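
Resolving relative paths against a base directory usually means leaving absolute paths alone and joining the rest onto the base. The sketch below shows that rule with a hypothetical resolve_against helper; how RunConfig treats each individual field is not covered here, and the example paths are made up.

```python
from pathlib import Path
from typing import Optional


def resolve_against(base_dir: Path, value: Optional[Path]) -> Optional[Path]:
    """Join relative paths onto base_dir; pass through None and absolute paths."""
    if value is None:
        return None
    value = Path(value)
    return value if value.is_absolute() else Path(base_dir) / value


base = Path("/projects/adrb2")
relative = resolve_against(base, Path("inputs/protein.pdb"))
absolute = resolve_against(base, Path("/data/ligands/lig1.sdf"))
missing = resolve_against(base, None)
```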

class batter.api.SimSystem(name: str, root: Path, topology: Path | None = None, coordinates: Path | None = None, protein: Path | None = None, ligands: Tuple[Path, ...]=(), lipid_mol: Tuple[str, ...]=(), other_mol: Tuple[str, ...]=(), anchors: Tuple[str, ...]=(), meta: SystemMeta = <factory>)[source]

Bases: object

Immutable descriptor of a simulation system and its on-disk artifacts.

Parameters:
  • name (str) – Logical system name (e.g., "AT1R_AAI").

  • root (pathlib.Path) – Working directory where artifacts live. This directory is considered relocatable; other modules should store relative paths when possible.

  • topology (pathlib.Path, optional) – Path to an explicit topology (e.g., AMBER PRMTOP). May be None if the builder generates it later.

  • coordinates (pathlib.Path, optional) – Coordinates or restart file (e.g., RST7/INPCRD).

  • protein (pathlib.Path, optional) – Input protein structure file (PDB/mmCIF).

  • ligands (tuple[pathlib.Path, …]) – One or more ligand structure files.

  • lipid_mol (tuple[str, …]) – Lipid names present in the system (e.g., ("POPC",)).

  • other_mol (tuple[str, …]) – Other cofactors present in the system.

  • anchors (tuple[str, …]) – Anchor atoms in the form "RESID@ATOM" (e.g., "85@CA").

  • meta (SystemMeta) – Free-form metadata bundle for provenance (e.g., software versions).
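
The "RESID@ATOM" anchor form splits cleanly on the @ character. The parse_anchor helper below is hypothetical and only illustrates the documented string format; treating the residue id as a plain integer is an assumption based on the "85@CA" example.

```python
from typing import Tuple


def parse_anchor(anchor: str) -> Tuple[int, str]:
    """Split 'RESID@ATOM' into an integer residue id and an atom name."""
    resid, sep, atom = anchor.partition("@")
    if not sep or not resid.isdigit() or not atom:
        raise ValueError(f"expected 'RESID@ATOM', got {anchor!r}")
    return int(resid), atom


resid, atom = parse_anchor("85@CA")
```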

anchors: Tuple[str, ...]
coordinates: Path | None
ligands: Tuple[Path, ...]
lipid_mol: Tuple[str, ...]
meta: SystemMeta
name: str
other_mol: Tuple[str, ...]
path(*parts: str | Path) Path[source]

Join root with the provided path segments.

Parameters:

*parts (str or Path) – Relative path components appended in order.

Returns:

Absolute path pointing inside root.

Return type:

pathlib.Path

protein: Path | None
root: Path
topology: Path | None
with_artifacts(**kw) SimSystem[source]

Return a new SimSystem with updated artifact attributes.

Examples

>>> from pathlib import Path
>>> from batter.api import SimSystem
>>> sys = SimSystem(name="X", root=Path("work/X"))
>>> sys2 = sys.with_artifacts(topology=Path("work/X/top.prmtop"))

with_meta(**updates: Any) SimSystem[source]

Return a copy of the system with merged metadata.

Parameters:

**updates – Keyword arguments forwarded to SystemMeta.merge().

Returns:

Copy of the system containing the updated metadata bundle.

Return type:

SimSystem
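
The copy-on-update style of with_artifacts() and with_meta() can be illustrated with a frozen dataclass. MiniSystem below is a hypothetical stand-in carrying only three of the documented fields; it is not the SimSystem class itself.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class MiniSystem:
    """Hypothetical immutable descriptor with a relocatable root."""
    name: str
    root: Path
    topology: Optional[Path] = None

    def path(self, *parts) -> Path:
        # Join root with the given segments, in order.
        return self.root.joinpath(*parts)

    def with_artifacts(self, **kw) -> "MiniSystem":
        # dataclasses.replace builds a new instance; self is unchanged.
        return replace(self, **kw)


base = MiniSystem(name="X", root=Path("work/X"))
updated = base.with_artifacts(topology=Path("work/X/top.prmtop"))

try:
    base.name = "Y"  # frozen instances reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```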

class batter.api.SimulationConfig(*, system_name: str, fe_type: ~typing.Literal['custom', 'rest', 'sdr', 'dd', 'sdr-rest', 'express', 'relative', 'uno', 'uno_com', 'uno_rest', 'self', 'uno_dd', 'dd-rest', 'asfe', 'md'], dec_int: ~typing.Literal['mbar', 'ti'] = 'mbar', remd: ~typing.Literal['yes', 'no'] = 'no', remd_nstlim: int = 100, slurm_header_dir: ~pathlib.Path = <factory>, infe: bool = False, p1: str = '', p2: str = '', p3: str = '', other_mol: ~typing.List[str] = <factory>, lipid_mol: ~typing.List[str] = <factory>, solv_shell: float | None = 15.0, rocklin_correction: ~typing.Literal['yes', 'no'] = 'no', release_eq: ~typing.List[float] = <factory>, ti_points: int | None = 0, lambdas: ~typing.List[float] = <factory>, component_windows: ~typing.Dict[str, ~typing.List[float]] = <factory>, sdr_dist: float | None = 0.0, dec_method: str | None = None, blocks: int = 0, unbound_threshold: ~typing.Annotated[float, ~annotated_types.Ge(ge=0)] = 8.0, analysis_start_step: ~typing.Annotated[int, ~annotated_types.Ge(ge=0)] = 0, n_bootstraps: ~typing.Annotated[int, ~annotated_types.Ge(ge=0)] = 0, lig_distance_force: float = 0.0, lig_angle_force: float = 0.0, lig_dihcf_force: float = 0.0, rec_com_force: float = 0.0, lig_com_force: float = 0.0, water_model: ~typing.Literal['SPCE', 'TIP4PEW', 'TIP3P', 'TIP3PF', 'OPC'] = 'TIP3P', buffer_x: float = 10.0, buffer_y: float = 10.0, buffer_z: float = 15.0, lig_buffer: float = 10.0, neutralize_only: ~typing.Literal['yes', 'no'] = 'no', cation: str = 'Na+', anion: str = 'Cl-', ion_conc: float = 0.15, hmr: ~typing.Literal['yes', 'no'] = 'no', enable_mcwat: ~typing.Literal['yes', 'no'] = 'yes', temperature: float = 298.15, eq_steps: int = 1000000, n_steps_dict: ~typing.Dict[str, int] = <factory>, l1_x: float | None = None, l1_y: float | None = None, l1_z: float | None = None, l1_range: float | None = None, min_adis: float | None = None, max_adis: float | None = None, ntpr: int = 100, ntwr: int = 10000, ntwe: int = 0, ntwx: int = 2500, cut: float = 9.0, gamma_ln: float = 1.0, barostat: ~typing.Literal[1, 2] = 2, dt: float = 0.004, all_atoms: ~typing.Literal['yes', 'no'] = 'no', receptor_ff: str = 'protein.ff14SB', ligand_ff: str = 'gaff2', lipid_ff: str = 'lipid21', ligand_dict: ~typing.Dict[str, ~typing.Any] = <factory>, rng: int = 0, ion_def: ~typing.List[~typing.Any] = <factory>, dic_n_steps: ~typing.Dict[str, int] = <factory>, rest: ~typing.List[float] = <factory>, neut: str = '', protein_align: str = 'name CA', receptor_segment: str | None = None, components: ~typing.List[str] = <factory>, component_lambdas: ~typing.Dict[str, ~typing.List[float]] = <factory>, membrane_simulation: bool = True)[source]

Bases: BaseModel

Simulation configuration for ABFE/ASFE/RBFE workflows. Values are fed by RunConfig.resolved_sim_config(), which merges create: and fe_sim:.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

all_atoms: Literal['yes', 'no']
analysis_start_step: int
anion: str
barostat: Literal[1, 2]
blocks: int
buffer_x: float
buffer_y: float
buffer_z: float
cation: str
component_lambdas: Dict[str, List[float]]
component_windows: Dict[str, List[float]]
components: List[str]
cut: float
dec_int: Literal['mbar', 'ti']
dec_method: str | None
dic_n_steps: Dict[str, int]
dt: float
enable_mcwat: Literal['yes', 'no']
eq_steps: int
fe_type: Literal['custom', 'rest', 'sdr', 'dd', 'sdr-rest', 'express', 'relative', 'uno', 'uno_com', 'uno_rest', 'self', 'uno_dd', 'dd-rest', 'asfe', 'md']
classmethod from_sections(create: CreateArgs, fe: FESimArgs, *, protocol: str | None = None, fe_type: str | None = None, slurm_header_dir: Path | None = None, run_remd: str | bool | None = None) SimulationConfig[source]

Construct a SimulationConfig from run sections.

Parameters:
  • create (CreateArgs) – System creation inputs taken from the create YAML section.

  • fe (FESimArgs) – Free-energy simulation overrides from the fe_sim section.

  • run_remd ({“yes”, “no”}, optional) – Whether REMD execution is enabled (controls submission only; REMD inputs are always written during preparation).

Returns:

Fully merged simulation configuration ready for downstream use.

Return type:

SimulationConfig

gamma_ln: float
hmr: Literal['yes', 'no']
infe: bool
ion_conc: float
ion_def: List[Any]
l1_range: float | None
l1_x: float | None
l1_y: float | None
l1_z: float | None
lambdas: List[float]
lig_angle_force: float
lig_buffer: float
lig_com_force: float
lig_dihcf_force: float
lig_distance_force: float
ligand_dict: Dict[str, Any]
ligand_ff: str
lipid_ff: str
lipid_mol: List[str]
max_adis: float | None
membrane_simulation: bool
min_adis: float | None
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

n_bootstraps: int
n_steps_dict: Dict[str, int]
neut: str
neutralize_only: Literal['yes', 'no']
ntpr: int
ntwe: int
ntwr: int
ntwx: int
other_mol: List[str]
p1: str
p2: str
p3: str
protein_align: str
rec_com_force: float
receptor_ff: str
receptor_segment: str | None
release_eq: List[float]
remd: Literal['yes', 'no']
remd_nstlim: int
rest: List[float]
rng: int
rocklin_correction: Literal['yes', 'no']
sdr_dist: float | None
slurm_header_dir: Path
solv_shell: float | None
system_name: str
temperature: float
ti_points: int | None
to_dict() Dict[str, Any][source]
unbound_threshold: float
water_model: Literal['SPCE', 'TIP4PEW', 'TIP3P', 'TIP3PF', 'OPC']
class batter.api.WindowResult(*, component: str, lam: float, dG: float, dG_se: float = 0.0, n_samples: int = 0, meta: Dict[str, ~typing.Any]=<factory>)[source]

Bases: BaseModel

Result for a single lambda window/component.

Parameters:
  • component (str) – Component key (e.g., ‘e’, ‘v’, ‘z’).

  • lam (float) – Lambda value in [0, 1].

  • dG (float) – Free-energy increment (kcal/mol).

  • dG_se (float) – Standard error (kcal/mol).

  • n_samples (int) – Samples (or effective sample size).

  • meta (dict) – Extra metadata.
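
To show how per-window records might be consumed, the sketch below sums increments per component and combines standard errors in quadrature. Both choices are assumptions made for illustration (additive increments, statistically independent windows); the Window dataclass is a hypothetical stand-in and BATTER's own aggregation may differ.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Window:
    """Hypothetical stand-in mirroring a few WindowResult fields."""
    component: str
    lam: float
    dG: float
    dG_se: float = 0.0


def summarize(windows: List[Window]) -> Dict[str, Tuple[float, float]]:
    """Per component: (sum of dG, sqrt of summed squared standard errors)."""
    total: Dict[str, float] = defaultdict(float)
    variance: Dict[str, float] = defaultdict(float)
    for w in windows:
        total[w.component] += w.dG
        variance[w.component] += w.dG_se ** 2  # assumes independent windows
    return {c: (total[c], math.sqrt(variance[c])) for c in total}


windows = [
    Window("e", 0.0, 1.0, 0.3),
    Window("e", 0.5, 2.0, 0.4),
    Window("v", 0.0, -0.5, 0.0),
]
summary = summarize(windows)
```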

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

component: str
dG: float
dG_se: float
lam: float
meta: Dict[str, Any]
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

n_samples: int
batter.api.clone_execution(work_dir: Path, src_run_id: str, dst_run_id: str | None = None, *, dst_root: Path | None = None, mode: Literal['copy', 'hardlink', 'symlink'] = 'hardlink', only_equil: bool = True, reset_states: bool = True, overwrite: bool = False) Path[source]
batter.api.dump_run_config(cfg: RunConfig, path: Path | str) None[source]

Serialize a run configuration to YAML.

Parameters:
  • cfg (RunConfig) – Configuration object to export.

  • path (str or pathlib.Path) – Destination path for the YAML file.

batter.api.list_fe_runs(work_dir: str | Path) pd.DataFrame[source]

Return an index of FE runs contained in a portable work directory.

Parameters:

work_dir (str or Path) – Path to the root directory of a BATTER execution (portable layout).

Returns:

DataFrame with one row per stored FE run. Columns include run_id, ligand, mol_name, system_name, temperature, total_dG, total_se, canonical_smiles, original_name, original_path, protocol, analysis_start_step, n_bootstraps, status, failure_reason, and created_at.

Return type:

pandas.DataFrame

batter.api.load_fe_run(work_dir: str | Path, run_id: str, ligand: str | None = None) FERecord[source]

Load a single FE record by run_id from a portable work directory.

Parameters:
  • work_dir (str or Path) – Root directory of the BATTER execution.

  • run_id (str) – Identifier of the FE run to load (as returned by list_fe_runs()).

  • ligand (str, optional) – Ligand identifier when multiple ligands were processed in the run. If omitted, the sole ligand is selected automatically; a ValueError is raised when multiple matches exist.

Returns:

Structured record containing total ΔG, standard error, components, and per-window results.

Return type:

FERecord
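
The ligand-selection rule can be sketched as follows. The select_ligand helper and its inputs are hypothetical; the behaviour when a named ligand is absent, or when a run has no ligands at all, is an assumption (a KeyError and a ValueError here).

```python
from typing import List, Optional


def select_ligand(available: List[str], ligand: Optional[str] = None) -> str:
    """Pick the requested ligand, or the only one when unambiguous."""
    if ligand is not None:
        if ligand not in available:
            raise KeyError(f"ligand {ligand!r} not found")
        return ligand
    if len(available) == 1:
        return available[0]
    raise ValueError(
        f"expected exactly one ligand, found {len(available)}; pass ligand=..."
    )


only = select_ligand(["LIG1"])
chosen = select_ligand(["LIG1", "LIG2"], ligand="LIG2")
try:
    select_ligand(["LIG1", "LIG2"])
    ambiguous_error = None
except ValueError as exc:
    ambiguous_error = str(exc)
```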

batter.api.load_run_config(path: Path | str) RunConfig[source]

Read a run-level YAML file and return a validated configuration.

Parameters:

path (str or pathlib.Path) – Location of the run YAML file.

Returns:

Parsed run configuration.

Return type:

RunConfig

batter.api.load_sim_config(path: Path | str) SimulationConfig

Load a simulation configuration from YAML.

Parameters:

path (str or pathlib.Path) – Path to the simulation YAML file.

Returns:

Validated simulation configuration.

Return type:

SimulationConfig

batter.api.read_cinnabar_outputs(bundle_dir: str | Path, *, require_absolute: bool = False)[source]

Read a generated Cinnabar export bundle from disk.

Parameters:
  • bundle_dir (str or Path) – Directory containing cinnabar_relative.csv and optional absolute and SFC correction CSVs produced by the Cinnabar export.

  • require_absolute (bool, optional) – When True, raise if the bundle does not contain cinnabar_absolute.csv.

Returns:

Relative and absolute tables. Each table includes uncorrected columns and SFC correction columns when those outputs are present, with free-energy units stored in a unit column. The *_uncorrected columns are sourced from Cinnabar’s CSVs, and the *_cycle_closure columns are sourced from the SFC CSVs.

Return type:

tuple[pandas.DataFrame, pandas.DataFrame]

batter.api.run_analysis_from_execution(work_dir: str | Path, run_id: str | None = None, *, ligand: str | None = None, components: Sequence[str] | None = None, n_workers: int | None = None, analysis_start_step: int | None = None, n_bootstraps: int | None = None, overwrite: bool = True, raise_on_error: bool = True) None[source]

Run FE analysis for a finished or partially finished execution.

Parameters:
  • work_dir (str or Path) – Root directory containing the portable execution store.

  • run_id (str, optional) – Identifier of the execution (e.g., run-20240101). When omitted, the most recently modified execution under <work_dir>/executions is used.

  • ligand (str, optional) – Ligand identifier to target when only a subset should be analyzed.

  • components (sequence of str, optional) – Components to include during analysis (overrides sim_cfg.components).

  • n_workers (int, optional) – Number of worker processes requested for the analysis handler.

  • analysis_start_step (int, optional) – First production step to include in analysis (per window); overrides config.

  • n_bootstraps (int, optional) – Number of MBAR bootstrap resamples; overrides config.

  • overwrite (bool, optional) – When True (default), overwrite any existing analysis results for the run_id. When False, skip ligands that already have analysis outputs.

  • raise_on_error (bool, optional) – When True (default), propagate errors raised by the analysis handler. Set to False to log the failure and continue with other ligands.
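
Picking the most recently modified execution can be done by comparing directory modification times. The sketch below builds a throwaway layout under a temporary directory; latest_execution is a hypothetical helper, the run names are made up, and the actual selection logic in BATTER may differ.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def latest_execution(work_dir: Path) -> Optional[str]:
    """Return the name of the newest directory under <work_dir>/executions."""
    exec_root = Path(work_dir) / "executions"
    if not exec_root.is_dir():
        return None
    runs = [p for p in exec_root.iterdir() if p.is_dir()]
    if not runs:
        return None
    return max(runs, key=lambda p: p.stat().st_mtime).name


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    for i, run_id in enumerate(["run-20240101", "run-20240215"]):
        run_dir = work / "executions" / run_id
        run_dir.mkdir(parents=True)
        # Pin modification times so the ordering is deterministic.
        os.utime(run_dir, (1_700_000_000 + i, 1_700_000_000 + i))
    newest = latest_execution(work)
    empty = latest_execution(work / "missing")
```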

batter.api.run_from_yaml(path: Path | str, on_failure: Literal['prune', 'raise', 'retry'] = None, run_overrides: Dict[str, Any] | None = None) None[source]

Execute a BATTER workflow described by a YAML file.

batter.api.save_sim_config(cfg: SimulationConfig, path: Path | str) None

Write a simulation configuration to YAML.

Parameters:
  • cfg (SimulationConfig) – Configuration object to serialise.

  • path (str or pathlib.Path) – Output file path for the YAML representation.

Config Modules

class batter.config.run.CreateArgs(*, system_name: str | None = 'unnamed_system', protein_input: Path | None = None, system_input: Path | None = None, system_coordinate: Path | None = None, protein_align: str | None = 'name CA', ligand_paths: dict[str, ~pathlib.Path | str]=<factory>, ligand_input: Path | None = None, ligand_ff: str = 'gaff2', retain_lig_prot: bool = True, param_method: Literal['amber', 'openff']='amber', param_charge: str = 'am1bcc', param_outdir: Path | None = None, anchor_atoms: list[str] = <factory>, lipid_mol: list[str] = <factory>, other_mol: list[str] = <factory>, overwrite: bool = True, extra_restraints: str | None = None, extra_restraint_fc: float = 10.0, extra_conformation_restraints: Path | None = None, receptor_ff: str = 'protein.ff14SB', lipid_ff: str = 'lipid21', solv_shell: float = 15.0, cation: str = 'Na+', anion: str = 'Cl-', ion_conc: float = 0.15, neutralize_only: Literal['yes', 'no']='no', water_model: str = 'TIP3P', l1_range: float = 6.0, min_adis: float = 3.0, max_adis: float = 7.0)[source]

Bases: BaseModel

Inputs for system creation and staging.

Notes

This section mirrors the create block in the run YAML file.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

anchor_atoms: list[str]
anion: str
cation: str
extra_conformation_restraints: Path | None
extra_restraint_fc: float
extra_restraints: str | None
ion_conc: float
l1_range: float
ligand_ff: str
ligand_input: Path | None
ligand_paths: dict[str, Path | str]
lipid_ff: str
lipid_mol: list[str]
max_adis: float
min_adis: float
model_config = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

neutralize_only: Literal['yes', 'no']
other_mol: list[str]
overwrite: bool
param_charge: str
param_method: Literal['amber', 'openff']
param_outdir: Path | None
protein_align: str | None
protein_input: Path | None
receptor_ff: str
resolve_paths(base: Path) CreateArgs[source]

Return a copy where path fields are resolved to absolute paths relative to base.

retain_lig_prot: bool
solv_shell: float
system_coordinate: Path | None
system_input: Path | None
system_name: str | None
water_model: str
class batter.config.run.FESimArgs(*, dec_int: str = 'mbar', remd: RemdArgs = <factory>, rocklin_correction: Literal['yes', 'no']='no', lambdas: List[float] = <factory>, component_lambdas: Dict[str, ~typing.List[float]]=<factory>, blocks: int = 0, lig_buffer: float = 15.0, lig_distance_force: float = 5.0, lig_angle_force: float = 250.0, lig_dihcf_force: float = 0.0, rec_com_force: float = 10.0, lig_com_force: float = 10.0, buffer_x: float = 20.0, buffer_y: float = 20.0, buffer_z: float = 20.0, eq_steps: Annotated[int, ~annotated_types.Ge(ge=0)] = 1000000, n_steps: Dict[str, int]=<factory>, ntpr: int = 100, ntwr: int = 2500, ntwe: int = 0, ntwx: int = 25000, cut: float = 9.0, gamma_ln: float = 1.0, dt: float = 0.004, hmr: Literal['yes', 'no']='no', enable_mcwat: Literal['yes', 'no']='yes', temperature: float = 298.15, barostat: int = 2, unbound_threshold: Annotated[float, ~annotated_types.Ge(ge=0)] = 8.0, analysis_start_step: Annotated[int, ~annotated_types.Ge(ge=0)] = 0, n_bootstraps: Annotated[int, ~annotated_types.Ge(ge=0)] = 0)[source]

Bases: BaseModel

Free-energy simulation knobs loaded from the fe_sim section.

The fields feed directly into batter.config.simulation.SimulationConfig overrides. fe_type is resolved internally from protocol rather than being set by users.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

analysis_start_step: int
barostat: int
blocks: int
buffer_x: float
buffer_y: float
buffer_z: float
component_lambdas: Dict[str, List[float]]
cut: float
dec_int: str
dt: float
enable_mcwat: Literal['yes', 'no']
eq_steps: int
gamma_ln: float
hmr: Literal['yes', 'no']
lambdas: List[float]
lig_angle_force: float
lig_buffer: float
lig_com_force: float
lig_dihcf_force: float
lig_distance_force: float
model_config = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

n_bootstraps: int
n_steps: Dict[str, int]
ntpr: int
ntwe: int
ntwr: int
ntwx: int
rec_com_force: float
remd: RemdArgs
rocklin_correction: Literal['yes', 'no']
temperature: float
unbound_threshold: float
class batter.config.run.KartografMapperArgs(*, atom_max_distance: float = 0.95, map_exact_ring_matches_only: bool = True, allow_partial_fused_rings: bool = True, allow_bond_breaks: bool = False, filter_element_changes: bool = True, filter_mismatched_attached_h_count: bool = False)[source]

Bases: BaseModel

Kartograf atom mapper option overrides for RBFE.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

allow_bond_breaks: bool
allow_partial_fused_rings: bool
atom_max_distance: float
filter_element_changes: bool
filter_mismatched_attached_h_count: bool
map_exact_ring_matches_only: bool
model_config = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class batter.config.run.LomapMapperArgs(*, time: int | None = None, threed: bool | None = None, max3d: float | None = None, element_change: bool | None = None, shift: bool | None = None)[source]

Bases: BaseModel

LoMap atom mapper option overrides for RBFE.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

element_change: bool | None
max3d: float | None
model_config = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

shift: bool | None
threed: bool | None
time: int | None
class batter.config.run.MDSimArgs(*, dt: float = 0.004, temperature: float = 298.15, eq_steps: Annotated[int, Ge(ge=0)] = 100000, ntpr: int = 100, ntwr: int = 10000, ntwe: int = 0, ntwx: int = 25000, cut: float = 9.0, gamma_ln: float = 1.0, barostat: int = 2, hmr: Literal['yes', 'no'] = 'yes', enable_mcwat: Literal['yes', 'no'] = 'yes')[source]

Bases: BaseModel

Simulation overrides used when protocol == "md".

These runs reuse the equilibration steps from ABFE but never schedule FE windows, so only generic MD knobs are required (no lambdas, SDR restraints, etc.).

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

barostat: int
cut: float
dt: float
enable_mcwat: Literal['yes', 'no']
eq_steps: int
gamma_ln: float
hmr: Literal['yes', 'no']
model_config = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

ntpr: int
ntwe: int
ntwr: int
ntwx: int
temperature: float
class batter.config.run.RBFENetworkArgs(*, mapping: str | None = 'default', atom_mapper: Literal['kartograf', 'lomap']='kartograf', kartograf: KartografMapperArgs = <factory>, lomap: LomapMapperArgs = <factory>, konnektor_layout: str | None = None, both_directions: bool = False, mapping_file: Path | None = None)[source]

Bases: BaseModel

RBFE network mapping controls.

Users can specify a mapping strategy by name (mapping) or provide an explicit mapping file (mapping_file).

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

atom_mapper: Literal['kartograf', 'lomap']
both_directions: bool
kartograf: KartografMapperArgs
konnektor_layout: str | None
lomap: LomapMapperArgs
mapping: str | None
mapping_file: Path | None
model_config = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

resolve_paths(base: Path) RBFENetworkArgs[source]
class batter.config.run.RunConfig(*, version: int = 1, protocol: Literal['abfe', 'rbfe', 'asfe', 'md']='abfe', backend: Literal['local', 'slurm']='local', create: CreateArgs, fe_sim: Dict[str, ~typing.Any] | ~batter.config.run.FESimArgs | ~batter.config.run.MDSimArgs=<factory>, run: RunSection, rbfe: RBFENetworkArgs | None = None)[source]

Bases: BaseModel

Top-level YAML config.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

backend: Literal['local', 'slurm']
create: CreateArgs
fe_sim: Dict[str, Any] | FESimArgs | MDSimArgs
classmethod load(path: Path | str) RunConfig[source]

Load and validate a run configuration from disk.

Parameters:

path (str or pathlib.Path) – Location of the YAML file to parse.

Returns:

Fully validated configuration object.

Return type:

RunConfig

model_config = {'extra': 'forbid'}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

classmethod model_validate_yaml(yaml_text: str) RunConfig[source]

Validate a run configuration from an in-memory YAML string.

Parameters:

yaml_text (str) – Raw YAML content describing the run configuration.

Returns:

Validated configuration model.

Return type:

RunConfig

protocol: Literal['abfe', 'rbfe', 'asfe', 'md']
rbfe: RBFENetworkArgs | None
resolved_sim_config() SimulationConfig[source]

Build the effective simulation configuration for this run.

Returns:

Simulation parameters derived from create and fe_sim sections.

Return type:

SimulationConfig

run: RunSection
version: int
with_base_dir(base_dir: Path) RunConfig[source]

Return a copy with relative paths resolved against base_dir.

class batter.config.run.RunSection(*, output_folder: Path, system_type: Literal['MABFE', 'MASFE'] | None=None, only_fe_preparation: bool = False, on_failure: Literal['raise', 'prune', 'retry']='raise', max_workers: int | None = None, max_active_jobs: Annotated[int | None, ~annotated_types.Ge(ge=0)] = 1000, batch_mode: bool = False, batch_gpus: Annotated[int | None, ~annotated_types.Ge(ge=0)] = None, batch_gpus_per_task: Annotated[int, ~annotated_types.Ge(ge=1)] = 1, batch_srun_extra: List[str] = <factory>, dry_run: bool = False, clean_failures: bool = False, remd: Literal['yes', 'no']='no', run_id: str = 'auto', allow_run_id_mismatch: bool = False, slurm_header_dir: Path | None = None, email_sender: str = 'nobody@stanford.edu', email_on_completion: str | None = None, slurm: SlurmConfig = <factory>)[source]

Bases: BaseModel

Run-related settings, including where outputs land.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

allow_run_id_mismatch: bool
batch_gpus: int | None
batch_gpus_per_task: int
batch_mode: bool
batch_srun_extra: List[str]
clean_failures: bool
dry_run: bool
email_on_completion: str | None
email_sender: str
max_active_jobs: int | None
max_workers: int | None
model_config = {'extra': 'forbid'}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

on_failure: Literal['raise', 'prune', 'retry']
only_fe_preparation: bool
output_folder: Path
remd: Literal['yes', 'no']
resolve_paths(base: Path) RunSection[source]

Return a copy in which output_folder is an absolute path, with relative values resolved against base.
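The path resolution described above can be sketched with pathlib alone. This is only an illustration: the helper name `resolve_output_folder` is hypothetical, and whether BATTER additionally normalises `..` segments or symlinks is not stated here.

```python
from pathlib import Path


def resolve_output_folder(output_folder: Path, base: Path) -> Path:
    """Hypothetical helper: anchor a relative folder at base.

    Absolute folders are returned untouched (an assumption).
    """
    if output_folder.is_absolute():
        return output_folder
    return base / output_folder


resolved = resolve_output_folder(Path("work/adrb2"), Path("/data/project"))
```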

run_id: str
slurm: SlurmConfig
slurm_header_dir: Path | None
system_type: Literal['MABFE', 'MASFE'] | None
class batter.config.run.SlurmConfig(*, partition: str | None = None, time: str | None = None, nodes: int | None = None, ntasks_per_node: int | None = None, mem_per_cpu: str | None = None, gres: str | None = None, account: str | None = None, qos: str | None = None, constraint: str | None = None, extra_sbatch: List[str] = <factory>)[source]

Bases: BaseModel

SLURM-specific configuration.

Parameters:
  • partition (str, optional) – SLURM partition/queue name.

  • time (str, optional) – Walltime in the HH:MM:SS format.

  • nodes (int, optional) – Number of nodes to request.

  • ntasks_per_node (int, optional) – Number of tasks per node.

  • mem_per_cpu (str, optional) – Memory per CPU (e.g., 16G).

  • gres (str, optional) – Generic resource string (e.g., GPU spec).

  • account (str, optional) – Account to charge for jobs.

  • qos (str, optional) – QoS string if required by the cluster.

  • constraint (str, optional) – Constraint string passed to sbatch.

  • extra_sbatch (list[str]) – Additional arguments appended to the sbatch submission command.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

account: str | None
constraint: str | None
extra_sbatch: List[str]
gres: str | None
mem_per_cpu: str | None
model_config = {'extra': 'ignore'}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

nodes: int | None
ntasks_per_node: int | None
partition: str | None
qos: str | None
time: str | None
to_sbatch_flags() List[str][source]

Produce a flat list of sbatch command-line flags.

Returns:

Sequence suitable for passing to subprocess.run().

Return type:

list of str
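To illustrate the shape of the returned list, the sketch below builds sbatch flags from a plain dictionary. It is not BATTER's implementation: the helper name `sbatch_flags_from`, the `--key=value` formatting, and the position of the `extra_sbatch` entries are assumptions. Only the long option names are standard sbatch options.

```python
from typing import Dict, List, Optional, Sequence

# Field name -> standard sbatch long option.
_SBATCH_OPTIONS = {
    "partition": "--partition",
    "time": "--time",
    "nodes": "--nodes",
    "ntasks_per_node": "--ntasks-per-node",
    "mem_per_cpu": "--mem-per-cpu",
    "gres": "--gres",
    "account": "--account",
    "qos": "--qos",
    "constraint": "--constraint",
}


def sbatch_flags_from(fields: Dict[str, Optional[object]],
                      extra_sbatch: Sequence[str] = ()) -> List[str]:
    """Hypothetical helper: skip unset fields, then append extra flags."""
    flags = [
        f"{option}={fields[name]}"
        for name, option in _SBATCH_OPTIONS.items()
        if fields.get(name) is not None
    ]
    flags.extend(extra_sbatch)
    return flags


flags = sbatch_flags_from(
    {"partition": "gpu", "time": "02:00:00", "nodes": 1, "gres": "gpu:1"},
    extra_sbatch=["--exclusive"],
)
```

A flat list of this kind can be concatenated with `["sbatch"]` and a script path before being handed to subprocess.run().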

class batter.config.simulation.SimulationConfig(*, system_name: str, fe_type: ~typing.Literal['custom', 'rest', 'sdr', 'dd', 'sdr-rest', 'express', 'relative', 'uno', 'uno_com', 'uno_rest', 'self', 'uno_dd', 'dd-rest', 'asfe', 'md'], dec_int: ~typing.Literal['mbar', 'ti'] = 'mbar', remd: ~typing.Literal['yes', 'no'] = 'no', remd_nstlim: int = 100, slurm_header_dir: ~pathlib.Path = <factory>, infe: bool = False, p1: str = '', p2: str = '', p3: str = '', other_mol: ~typing.List[str] = <factory>, lipid_mol: ~typing.List[str] = <factory>, solv_shell: float | None = 15.0, rocklin_correction: ~typing.Literal['yes', 'no'] = 'no', release_eq: ~typing.List[float] = <factory>, ti_points: int | None = 0, lambdas: ~typing.List[float] = <factory>, component_windows: ~typing.Dict[str, ~typing.List[float]] = <factory>, sdr_dist: float | None = 0.0, dec_method: str | None = None, blocks: int = 0, unbound_threshold: ~typing.Annotated[float, ~annotated_types.Ge(ge=0)] = 8.0, analysis_start_step: ~typing.Annotated[int, ~annotated_types.Ge(ge=0)] = 0, n_bootstraps: ~typing.Annotated[int, ~annotated_types.Ge(ge=0)] = 0, lig_distance_force: float = 0.0, lig_angle_force: float = 0.0, lig_dihcf_force: float = 0.0, rec_com_force: float = 0.0, lig_com_force: float = 0.0, water_model: ~typing.Literal['SPCE', 'TIP4PEW', 'TIP3P', 'TIP3PF', 'OPC'] = 'TIP3P', buffer_x: float = 10.0, buffer_y: float = 10.0, buffer_z: float = 15.0, lig_buffer: float = 10.0, neutralize_only: ~typing.Literal['yes', 'no'] = 'no', cation: str = 'Na+', anion: str = 'Cl-', ion_conc: float = 0.15, hmr: ~typing.Literal['yes', 'no'] = 'no', enable_mcwat: ~typing.Literal['yes', 'no'] = 'yes', temperature: float = 298.15, eq_steps: int = 1000000, n_steps_dict: ~typing.Dict[str, int] = <factory>, l1_x: float | None = None, l1_y: float | None = None, l1_z: float | None = None, l1_range: float | None = None, min_adis: float | None = None, max_adis: float | None = None, ntpr: int = 100, ntwr: int = 10000, ntwe: int = 0, ntwx: int = 2500, cut: float = 9.0, gamma_ln: float = 1.0, barostat: ~typing.Literal[1, 2] = 2, dt: float = 0.004, all_atoms: ~typing.Literal['yes', 'no'] = 'no', receptor_ff: str = 'protein.ff14SB', ligand_ff: str = 'gaff2', lipid_ff: str = 'lipid21', ligand_dict: ~typing.Dict[str, ~typing.Any] = <factory>, rng: int = 0, ion_def: ~typing.List[~typing.Any] = <factory>, dic_n_steps: ~typing.Dict[str, int] = <factory>, rest: ~typing.List[float] = <factory>, neut: str = '', protein_align: str = 'name CA', receptor_segment: str | None = None, components: ~typing.List[str] = <factory>, component_lambdas: ~typing.Dict[str, ~typing.List[float]] = <factory>, membrane_simulation: bool = True)[source]

Bases: BaseModel

Simulation configuration for ABFE/ASFE/RBFE workflows. Values are fed by RunConfig.resolved_sim_config(), which merges create: and fe_sim:.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

all_atoms: Literal['yes', 'no']
analysis_start_step: int
anion: str
barostat: Literal[1, 2]
blocks: int
buffer_x: float
buffer_y: float
buffer_z: float
cation: str
component_lambdas: Dict[str, List[float]]
component_windows: Dict[str, List[float]]
components: List[str]
cut: float
dec_int: Literal['mbar', 'ti']
dec_method: str | None
dic_n_steps: Dict[str, int]
dt: float
enable_mcwat: Literal['yes', 'no']
eq_steps: int
fe_type: Literal['custom', 'rest', 'sdr', 'dd', 'sdr-rest', 'express', 'relative', 'uno', 'uno_com', 'uno_rest', 'self', 'uno_dd', 'dd-rest', 'asfe', 'md']
classmethod from_sections(create: CreateArgs, fe: FESimArgs, *, protocol: str | None = None, fe_type: str | None = None, slurm_header_dir: Path | None = None, run_remd: str | bool | None = None) SimulationConfig[source]

Construct a SimulationConfig from run sections.

Parameters:
  • create (CreateArgs) – System creation inputs taken from the create YAML section.

  • fe (FESimArgs) – Free-energy simulation overrides from the fe_sim section.

  • run_remd ({"yes", "no"}, optional) – Whether REMD execution is enabled (controls submission only; REMD inputs are always written during preparation).

Returns:

Fully merged simulation configuration ready for downstream use.

Return type:

SimulationConfig

gamma_ln: float
hmr: Literal['yes', 'no']
infe: bool
ion_conc: float
ion_def: List[Any]
l1_range: float | None
l1_x: float | None
l1_y: float | None
l1_z: float | None
lambdas: List[float]
lig_angle_force: float
lig_buffer: float
lig_com_force: float
lig_dihcf_force: float
lig_distance_force: float
ligand_dict: Dict[str, Any]
ligand_ff: str
lipid_ff: str
lipid_mol: List[str]
max_adis: float | None
membrane_simulation: bool
min_adis: float | None
model_config = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

n_bootstraps: int
n_steps_dict: Dict[str, int]
neut: str
neutralize_only: Literal['yes', 'no']
ntpr: int
ntwe: int
ntwr: int
ntwx: int
other_mol: List[str]
p1: str
p2: str
p3: str
protein_align: str
rec_com_force: float
receptor_ff: str
receptor_segment: str | None
release_eq: List[float]
remd: Literal['yes', 'no']
remd_nstlim: int
rest: List[float]
rng: int
rocklin_correction: Literal['yes', 'no']
sdr_dist: float | None
slurm_header_dir: Path
solv_shell: float | None
system_name: str
temperature: float
ti_points: int | None
to_dict() Dict[str, Any][source]
unbound_threshold: float
water_model: Literal['SPCE', 'TIP4PEW', 'TIP3P', 'TIP3PF', 'OPC']
batter.config.utils.coerce_yes_no(value: Any) str | None[source]

Normalize boolean-like values into "yes" or "no".

Parameters:

value – Input flag provided by the user. Supported types include bool, numeric scalars, or strings such as "true" and "0".

Returns:

"yes" or "no" when the flag can be interpreted. None is returned unchanged to preserve optional semantics.

Return type:

str or None

Raises:

ValueError – If the value cannot be coerced into a boolean switch.
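The coercion rules above can be illustrated with a small stand-alone function. This is a sketch rather than BATTER's code: the name `coerce_yes_no_sketch` and the exact sets of accepted spellings are assumptions chosen for the example.

```python
from typing import Any, Optional

# Assumed spellings; the real helper may accept a different set.
_TRUE = {"yes", "y", "true", "on", "1"}
_FALSE = {"no", "n", "false", "off", "0"}


def coerce_yes_no_sketch(value: Any) -> Optional[str]:
    """Illustrative only: map boolean-like input to "yes"/"no"."""
    if value is None:
        return None  # preserve optional semantics
    if isinstance(value, (bool, int, float)):
        return "yes" if value else "no"
    if isinstance(value, str):
        token = value.strip().lower()
        if token in _TRUE:
            return "yes"
        if token in _FALSE:
            return "no"
    raise ValueError(f"Cannot interpret {value!r} as a yes/no flag")
```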

batter.config.utils.expand_env_vars(data: Any, *, base_dir: Path | None = None) Any[source]

Recursively expand environment variables in a YAML-derived structure.

Parameters:
  • data – Parsed YAML content to normalise.

  • base_dir (Path, optional) – Base directory for resolving relative (./) paths.

Returns:

Structure with string values expanded.

Return type:

Any
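A recursive expansion of this kind can be sketched with the standard library. The function below is an assumption-laden illustration: it covers only dictionaries, lists, and strings, omits the base_dir handling for ./ paths, and uses a made-up variable name `BATTER_DEMO_ROOT`.

```python
import os
from typing import Any


def expand_env_vars_sketch(data: Any) -> Any:
    """Recursively apply os.path.expandvars to every string value."""
    if isinstance(data, str):
        return os.path.expandvars(data)
    if isinstance(data, dict):
        return {key: expand_env_vars_sketch(val) for key, val in data.items()}
    if isinstance(data, list):
        return [expand_env_vars_sketch(item) for item in data]
    return data  # numbers, booleans, and None pass through unchanged


os.environ["BATTER_DEMO_ROOT"] = "/scratch/demo"  # hypothetical variable
expanded = expand_env_vars_sketch(
    {
        "run": {"output_folder": "$BATTER_DEMO_ROOT/work"},
        "steps": [1, "${BATTER_DEMO_ROOT}/x"],
    }
)
```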

batter.config.utils.normalize_optional_path(value: Any) Path | None[source]

Resolve optional path-like values into pathlib.Path objects.

Parameters:

value – Path candidate that may be None or an empty string. Strings may contain environment variables or ~.

Returns:

Expanded path when provided; None if the value is empty.

Return type:

pathlib.Path or None
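As a rough illustration of the behaviour described above, the sketch below treats None and blank strings as "no path" and expands environment variables and ~ otherwise. The function name and the demo variable `BATTER_DEMO_DIR` are hypothetical, and the real helper may differ in details such as whitespace handling.

```python
import os
from pathlib import Path
from typing import Any, Optional


def normalize_optional_path_sketch(value: Any) -> Optional[Path]:
    """Return an expanded Path, or None for None/empty input."""
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    return Path(os.path.expanduser(os.path.expandvars(text)))


os.environ["BATTER_DEMO_DIR"] = "/tmp/demo"  # hypothetical variable
resolved = normalize_optional_path_sketch("$BATTER_DEMO_DIR/lig.sdf")
```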

batter.config.utils.sanitize_ligand_name(name: str) str[source]

Convert a ligand identifier into a filesystem-safe token.

Parameters:

name (str) – Original ligand identifier, often derived from filenames or keys.

Returns:

Uppercase alphanumeric token with unsafe characters replaced by underscores.

Return type:

str
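One way to obtain a token of the described form is a single regular-expression substitution. The sketch below is not the library's implementation; the function name is hypothetical and details such as collapsing repeated underscores are left out.

```python
import re


def sanitize_ligand_name_sketch(name: str) -> str:
    """Uppercase the name and replace each non-alphanumeric character with '_'."""
    return re.sub(r"[^A-Z0-9]", "_", name.strip().upper())


token = sanitize_ligand_name_sketch("lig-1.sdf")
```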

batter.config.utils.sanitize_user_ligand_name(name: str) str[source]

Sanitize and validate a user-provided ligand identifier.

Reserved names that conflict with BATTER directory layout are rejected.

RBFE Helpers#

RBFE network helpers.

class batter.rbfe.RBFENetwork(ligands: Tuple[str, ...], pairs: Tuple[Tuple[str, str], ...])[source]

Record the RBFE simulation mapping as ligand pairs.

Parameters:
  • ligands (Sequence[str]) – Ordered ligand identifiers participating in the network.

  • pairs (Sequence[tuple[str, str]]) – Directed pairs describing simulations to run (reference, target).

static default_mapping(ligands: Sequence[str]) List[Tuple[str, str]][source]

Default RBFE mapping: first ligand paired to each subsequent ligand.
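The default layout is a star network centred on the first ligand. A minimal stand-alone sketch follows; the function name `star_mapping` is hypothetical and returning an empty list for fewer than two ligands is an assumption.

```python
from typing import List, Sequence, Tuple


def star_mapping(ligands: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair the first ligand (reference) with every later ligand (target)."""
    if len(ligands) < 2:
        return []  # assumed behaviour for degenerate input
    reference = ligands[0]
    return [(reference, target) for target in ligands[1:]]


pairs = star_mapping(["LIG1", "LIG2", "LIG3"])
```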

classmethod from_ligands(ligands: Sequence[str], mapping_fn: Callable[[Sequence[str]], Iterable[Tuple[str, str]]] | None = None) RBFENetwork[source]

Build an RBFE network from ligand identifiers and a mapping function.

Parameters:
  • ligands (Sequence[str]) – Ordered ligand identifiers.

  • mapping_fn (callable, optional) – Function that returns iterable of (ref, target) pairs. When omitted, defaults to mapping the first ligand to all others.

ligands: Tuple[str, ...]
pairs: Tuple[Tuple[str, str], ...]
to_mapping() dict[source]

Return a JSON-serializable mapping payload.

batter.rbfe.draw_explicit_konnektor_network(pairs: Sequence[Sequence[str] | tuple[str, str]], ligand_files: Mapping[str, Path], plot_path: Path, hmr: bool = True, atom_mapper: str = 'kartograf', kartograf_options: Any | None = None, lomap_options: Any | None = None) None[source]

Build an explicit Konnektor network from pairs and draw it.

batter.rbfe.filter_element_changes(molA: rdkit.Chem.Mol, molB: rdkit.Chem.Mol, mapping: dict[int, int]) dict[int, int][source]

Force a mapping to exclude any alchemical element changes in the core.

batter.rbfe.filter_mismatched_attached_h_count(molA: rdkit.Chem.Mol, molB: rdkit.Chem.Mol, mapping: dict[int, int]) dict[int, int][source]

Exclude mapped heavy-atom pairs where the number of directly attached H differs. This helps avoid HMR mass mismatches for ‘common/core’ atoms.

batter.rbfe.konnektor_pairs(ligands: Sequence[str], ligand_files: Mapping[str, Path], layout: str | None = None, plot_path: Path | None = None, hmr: bool = True, atom_mapper: str = 'kartograf', kartograf_options: Any | None = None, lomap_options: Any | None = None) List[Tuple[str, str]][source]

Build RBFE pairs using Konnektor network planners.

batter.rbfe.load_mapping_file(path: Path) List[Tuple[str, str]][source]

Load RBFE mapping pairs from a file.

Supported formats:
  • JSON/YAML: list of pairs, or dict with 'pairs'/'edges', or adjacency mapping.

  • Text: one pair per line, separated by ‘~’, ‘,’ or whitespace.
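The plain-text format listed above can be parsed in a few lines. The sketch below handles only the text variant and is not BATTER's parser: the name `parse_pair_lines`, the skipping of blank lines, and the error raised for malformed lines are assumptions.

```python
import re
from typing import List, Tuple


def parse_pair_lines(text: str) -> List[Tuple[str, str]]:
    """Parse one pair per line, separated by '~', ',' or whitespace."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumed: blank lines are ignored
        tokens = [tok for tok in re.split(r"[~,\s]+", line) if tok]
        if len(tokens) != 2:
            raise ValueError(f"Expected two ligand names, got: {raw!r}")
        pairs.append((tokens[0], tokens[1]))
    return pairs


parsed = parse_pair_lines("LIG1~LIG2\nLIG1, LIG3\n\nLIG2 LIG3\n")
```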

batter.rbfe.resolve_mapping_fn(name: str | None) Callable[[Sequence[str]], Iterable[Tuple[str, str]]][source]

Resolve a mapping function by name.

Orchestrator Modules#

batter.orchestrate.run#

Top-level orchestration entry for BATTER runs.

This module wires: YAML (RunConfig) → shared system build → bulk ligand staging → single param job (“param_ligands”) → per-ligand pipelines → FE record save.

batter.orchestrate.run.run_from_yaml(path: Path | str, on_failure: Literal['prune', 'raise', 'retry'] = None, run_overrides: Dict[str, Any] | None = None) None[source]

Execute a BATTER workflow described by a YAML file.

Selection helpers for choosing the correct pipeline implementation.

batter.orchestrate.pipeline_utils.select_pipeline(protocol: str, sim_cfg: SimulationConfig, only_fe_prep: bool, *, sys_params: SystemParams | dict | None = None, partition: str | None = None) Pipeline[source]

Return the protocol-specific pipeline for a run.

Parameters:
  • protocol (str) – Name of the requested protocol ("abfe", "rbfe", "asfe", or "md").

  • sim_cfg (SimulationConfig) – Validated simulation configuration produced by RunConfig.

  • only_fe_prep (bool) – When True, truncate the pipeline after FE preparation steps.

  • sys_params (SystemParams or dict, optional) – Extra parameters passed to system-level pipeline steps.

Returns:

Pipeline instance tailored to the requested protocol.

Return type:

Pipeline

Raises:

ValueError – If the protocol name is not recognised.

Utilities for configuring execution backends used by the orchestrator.

batter.orchestrate.backend.register_local_handlers(backend: LocalBackend) None[source]

Register built-in pipeline handlers on the local backend.

Parameters:

backend (LocalBackend) – Backend instance that should receive the default handler mapping.

Raises:

RuntimeError – If optional handler dependencies (for example openff-toolkit) are missing.

Execution Modules#

Interfaces shared by execution backends.

class batter.exec.base.ExecBackend(*args, **kwargs)[source]

Protocol implemented by execution backends.

name: str
run(step: Step, system: SimSystem, params: Dict) ExecResult[source]

Execute step for system.

Parameters:
  • step (Step) – Step metadata as produced by the pipeline.

  • system (SimSystem) – Simulation system descriptor.

  • params (dict) – Backend-specific parameters, potentially including resources.

Returns:

Execution artifacts and job identifiers.

Return type:

ExecResult

class batter.exec.base.Resources(time: str | None = None, cpus: int | None = None, gpus: int | None = None, mem: str | None = None, partition: str | None = None, account: str | None = None, extra: Mapping[str, str]=<factory>)[source]

Resource hints supplied to execution backends.

Parameters:
  • time (str, optional) – Walltime (e.g., "02:00:00").

  • cpus (int, optional) – CPU cores per task.

  • gpus (int, optional) – Number of GPUs required.

  • mem (str, optional) – Memory request (e.g., "16G").

  • partition (str, optional) – Scheduler partition or queue.

  • account (str, optional) – Scheduler account.

  • extra (Mapping[str, str], optional) – Backend-specific SBATCH-style flags.

account: str | None
cpus: int | None
extra: Mapping[str, str]
gpus: int | None
mem: str | None
partition: str | None
time: str | None

Execution backend for running pipelines locally.

class batter.exec.local.LocalBackend(max_workers: int | None = None)[source]

In-process execution backend with optional parallel orchestration.

Parameters:

max_workers (int, optional) – Maximum number of worker processes to use when run_parallel() is invoked. None lets the backend auto-detect resources; 0 or 1 forces serial execution.

name: str = 'local'
register(step_name: str, handler: Callable[[Step, SimSystem, Mapping], ExecResult]) None[source]

Register a callable to execute step_name.

Parameters:
  • step_name (str) – Identifier of the step (matches batter.pipeline.step.Step.name).

  • handler (Callable[[Step, SimSystem, Mapping], ExecResult]) – Function responsible for executing the step.

run(step: Step, system: SimSystem, params: Mapping) ExecResult[source]

Execute step for system on the local machine.

Parameters:
  • step – Pipeline step metadata.

  • system – Simulation system descriptor.

  • params – Step parameters, typically generated by the orchestration layer.

Returns:

Artifacts and job identifiers (empty for local execution).

Return type:

ExecResult
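The register/run pair follows a handler-registry pattern, sketched below with plain Python types. `MiniLocalBackend` is a toy stand-in, not LocalBackend itself: steps and systems are reduced to strings, results to dictionaries, and the error raised for an unknown step is an assumption.

```python
from typing import Any, Callable, Dict, Mapping

Handler = Callable[[str, str, Mapping[str, Any]], Dict[str, Any]]


class MiniLocalBackend:
    """Toy registry mapping step names to callables (illustrative only)."""

    name = "local"

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, step_name: str, handler: Handler) -> None:
        self._handlers[step_name] = handler

    def run(self, step_name: str, system: str,
            params: Mapping[str, Any]) -> Dict[str, Any]:
        if step_name not in self._handlers:
            raise KeyError(f"No handler registered for step {step_name!r}")
        return self._handlers[step_name](step_name, system, params)


backend = MiniLocalBackend()
backend.register(
    "prepare_equil",
    lambda step, system, params: {"step": step, "system": system},
)
result = backend.run("prepare_equil", "LIG1", {})
```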

run_parallel(pipeline: Pipeline, systems: Iterable[SimSystem], *, max_workers: int | None = None, description: str = '', batch_size: str | int = 'auto', verbose: int = 10, prefer: str = 'processes', backend: str | None = None) Dict[str, Mapping[str, ExecResult]][source]

Execute pipeline for multiple systems in parallel.

Parameters:
  • pipeline – Pipeline object providing the sequence of steps to execute.

  • systems (Iterable[SimSystem]) – Collection of systems to process.

  • max_workers (int, optional) – Override the configured worker cap; None falls back to the value provided at construction time.

  • description (str, optional) – Human-readable label used in debug logging.

  • batch_size, verbose, prefer, backend – Joblib configuration knobs forwarded to joblib.Parallel.

Returns:

Mapping of system.name to per-step results.

Return type:

dict

Raises:

RuntimeError – When one or more systems fail.
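The contract described above (a mapping of system name to per-step results, with a RuntimeError when any system fails) can be sketched without BATTER. Note the substitution: this example uses threads from concurrent.futures instead of joblib so that it stays within the standard library, and the names `run_all`, `prepare`, and `equil` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def run_all(systems: Iterable[str],
            steps: List[Callable[[str], str]],
            max_workers: int = 2) -> Dict[str, Dict[str, str]]:
    """Run every step for every system; raise if any system fails."""

    def run_one(system: str) -> Dict[str, str]:
        return {step.__name__: step(system) for step in steps}

    results: Dict[str, Dict[str, str]] = {}
    failures: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_one, name) for name in systems}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # collect and report after all finish
                failures[name] = exc
    if failures:
        raise RuntimeError(f"{len(failures)} system(s) failed: {sorted(failures)}")
    return results


def prepare(system: str) -> str:
    return f"{system}:prepared"


def equil(system: str) -> str:
    return f"{system}:equilibrated"


results = run_all(["LIG1", "LIG2"], [prepare, equil])
```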

Execution backend that submits steps to Slurm via sbatch.

class batter.exec.slurm.SlurmBackend(*args, **kwargs)[source]

Slurm backend that materializes lightweight job scripts.

name: str = 'slurm'
run(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Submit step to Slurm.

Parameters:
  • step (Step) – Pipeline step metadata.

  • system (SimSystem) – Simulation system whose root directory stores scripts and logs.

  • params (dict) – Backend-specific options. Recognised keys include resources, env (exported variables), and payload (shell snippet).

Returns:

Artifacts referencing the generated script and log paths together with the submitted job identifier (if available).

Return type:

ExecResult

class batter.exec.slurm_mgr.SlurmJobManager(poll_s: float = 60.0, max_retries: int = 3, resubmit_backoff_s: float = 30.0, registry_file: Path | None = None, dry_run: bool = False, sbatch_flags: Sequence[str] | None = None, submit_retry_limit: int = 3, submit_retry_delay_s: float = 60.0, max_active_jobs: int | None = None, partition: str | None = None, batch_mode: bool = False, batch_gpus: int | None = None, gpus_per_task: int = 1, srun_extra: Sequence[str] | None = None, stage: str | None = None, header_root: Path | None = None, **_ignored: Any)[source]

Submit, monitor, and resubmit Slurm jobs for BATTER executions.

Parameters:
  • poll_s (float, optional) – Poll interval (seconds) between status checks.

  • max_retries (int, optional) – Maximum automatic resubmissions per workdir (excluding TIMEOUT and COMPLETED-without-sentinel).

  • resubmit_backoff_s (float, optional) – Sleep before resubmitting a job after detecting termination/missing state.

  • registry_file (pathlib.Path, optional) – JSONL queue file for cross-process coordination.

  • dry_run (bool, optional) – When True, do not submit; record that submission would occur.

  • sbatch_flags (Sequence[str], optional) – Global sbatch flags appended to every submission.

  • submit_retry_limit (int, optional) – Number of retries for the submission command itself.

  • submit_retry_delay_s (float, optional) – Delay between submission retries.

  • max_active_jobs (int, optional) – Cap on concurrent jobs for the user (checked via one squeue -u call).

  • partition (str, optional) – Partition filter used by max_active_jobs checks.

Other Parameters:
  • batch_mode, batch_gpus, gpus_per_task, srun_extra, stage, header_root – Accepted for compatibility with older code paths. This manager does not implement batch execution; values are stored/ignored.

  • **_ignored – Extra kwargs are accepted and ignored for compatibility.

add(spec: SlurmJobSpec) None[source]

Queue spec for later submission and optionally persist to registry.

Parameters:

spec (SlurmJobSpec) – Job specification.

clear() None[source]

Clear in-memory queue/retry book and remove on-disk registry if present.

ensure_running(spec: SlurmJobSpec) None[source]

Ensure the spec is submitted or already done/active.

Parameters:

spec (SlurmJobSpec) – Job spec.

Notes

This method does not register specs; it’s a one-off submit-if-needed.

jobs() List[SlurmJobSpec][source]

Return merged in-memory + registry specs (dedup by workdir).

set_stage(stage: str | None) None[source]

Set the active stage filter for registry loading/submission.

Parameters:

stage (str or None) – Stage key such as equil, fe_equil, fe, etc. If None, stage filtering is disabled.

wait_all() None[source]

Submit/monitor all registered jobs and block until completion.

wait_for_slot(poll_s: float | None = None, user: str | None = None, partition: str | None = None) None[source]

Block until active jobs drop below max_active_jobs.

Parameters:
  • poll_s (float, optional) – Polling interval in seconds (defaults to poll_s).

  • user (str, optional) – Unix username (defaults to $USER).

  • partition (str, optional) – Partition to filter on (defaults to manager partition).
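The blocking behaviour amounts to a polling loop. In the sketch below the job count comes from an injected callable rather than a squeue call, so it runs anywhere; the function name, the `sleep` parameter, and the returned poll count are assumptions made for illustration.

```python
import time
from typing import Callable


def wait_for_slot_sketch(count_active: Callable[[], int],
                         max_active_jobs: int,
                         poll_s: float = 60.0,
                         sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until count_active() < max_active_jobs; return polls performed."""
    polls = 0
    while True:
        polls += 1
        if count_active() < max_active_jobs:
            return polls
        sleep(poll_s)


# Simulated queue: 5 active jobs, then 4, then 2.
counts = iter([5, 4, 2])
polls = wait_for_slot_sketch(lambda: next(counts), max_active_jobs=3,
                             poll_s=0.0, sleep=lambda seconds: None)
```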

wait_until_done(specs: Iterable[SlurmJobSpec]) None[source]

Legacy interface: monitor a given set until complete.

class batter.exec.slurm_mgr.SlurmJobSpec(workdir: Path, script_rel: str = 'SLURMM-run', finished_name: str = 'FINISHED', failed_name: str = 'FAILED', name: str | None = None, stage: str | None = None, body_rel: str | None = None, header_name: str | None = None, header_template: Path | None = None, header_root: Path | None = None, batch_script: Path | None = None, extra_sbatch: Sequence[str] = <factory>, extra_env: Dict[str, str]=<factory>, submit_dir: Path | None = None, alt_script_names: Sequence[str] = ('SLURMM-run', 'SLURMM-Run', 'slurmm-run', 'run.sh'))[source]

Descriptor for a Slurm job managed by SlurmJobManager.

Parameters:
  • workdir (pathlib.Path) – Working directory containing submission scripts and sentinel files.

  • script_rel (str, optional) – Preferred relative submission script path.

  • finished_name (str, optional) – Sentinel file name indicating success.

  • failed_name (str, optional) – Sentinel file name indicating failure.

  • name (str, optional) – Friendly display name.

  • stage (str, optional) – Logical stage used for registry filtering.

  • extra_sbatch (Sequence[str], optional) – Extra sbatch flags (job-specific).

  • extra_env (dict, optional) – Extra environment variables to export (job-specific).

  • submit_dir (pathlib.Path, optional) – Directory to submit from (defaults to workdir).

Notes

The remaining fields are legacy compatibility fields used by older BATTER versions and/or existing registry entries. The manager may ignore them.

alt_script_names: Sequence[str] = ('SLURMM-run', 'SLURMM-Run', 'slurmm-run', 'run.sh')
batch_script: Path | None = None
body_rel: str | None = None
extra_env: Dict[str, str]
extra_sbatch: Sequence[str]
failed_name: str = 'FAILED'
failed_path() Path[source]

Sentinel path signalling failure.

finished_name: str = 'FINISHED'
finished_path() Path[source]

Sentinel path signalling successful completion.

header_name: str | None = None
header_root: Path | None = None
header_template: Path | None = None
jobid_path() Path[source]

Path containing the most recent Slurm job identifier.

name: str | None = None
resolve_script_abs() Path[source]

Return the absolute path to the submission script.

Returns:

Existing script path if found, otherwise the preferred path.

Return type:

pathlib.Path

script_arg() str[source]

Return the submission-script path argument for sbatch.

Returns:

Script path relative to submit_dir when possible.

Return type:

str

script_rel: str = 'SLURMM-run'
stage: str | None = None
submit_dir: Path | None = None
workdir: Path
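Sentinel files make job state checkable from the filesystem alone. The sketch below mirrors the finished_name and failed_name defaults documented above, but `MiniJobSpec` and `sentinel_state` are hypothetical, and placing sentinels directly under workdir and checking FINISHED before FAILED are assumptions.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class MiniJobSpec:
    """Reduced stand-in for a job spec (illustrative only)."""
    workdir: Path
    finished_name: str = "FINISHED"
    failed_name: str = "FAILED"

    def finished_path(self) -> Path:
        return self.workdir / self.finished_name

    def failed_path(self) -> Path:
        return self.workdir / self.failed_name


def sentinel_state(spec: MiniJobSpec) -> str:
    """Classify a work directory from its sentinel files."""
    if spec.finished_path().exists():
        return "finished"
    if spec.failed_path().exists():
        return "failed"
    return "pending"


with tempfile.TemporaryDirectory() as tmp:
    spec = MiniJobSpec(Path(tmp))
    before = sentinel_state(spec)
    spec.finished_path().touch()
    after = sentinel_state(spec)
```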

Helpers for constructing AMBER mdin control files.

class batter.exec.amber.mdin.AmberMdin(*, cut: float = 9.0, ioutfm: int = 1, ntb: int = 1, ntxo: int = 2)[source]

Mutable representation of an AMBER mdin file.

Parameters:
  • cut (float, optional) – Non-bonded cutoff in Å (default: 9.0).

  • ioutfm (int, optional) – Output format flag (1 → NetCDF).

  • ntb (int, optional) – Periodic boundary condition flag.

  • ntxo (int, optional) – Restart write format.

add_block(name: str, params: Dict[str, object] | None = None) None[source]

Append a named control block.

add_raw(line: str) None[source]

Append a raw line verbatim to the output.

apply_defaults(*, cut: float = 9.0, ioutfm: int = 1, ntb: int = 1, ntxo: int = 2) None[source]

Initialise with a baseline cntrl block.

override_block(block_name: str, param_dict: Dict[str, object]) None[source]

Merge param_dict into an existing block or create the block.

save(filename: str | Path) None[source]

Write the mdin file to filename.

to_string() str[source]

Render the mdin contents as text.

update_param(block_name: str, key: str, value: object) None[source]

Update a single parameter within block_name.
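For readers unfamiliar with the mdin layout, the sketch below renders Fortran-namelist blocks of the kind AMBER reads (a title line, then `&name ... /` groups). `MiniMdin` is a hypothetical miniature, not AmberMdin: its title handling, whitespace, and error behaviour are assumptions, and string values would need quoting that is omitted here.

```python
from typing import Dict, List, Optional, Tuple


class MiniMdin:
    """Tiny namelist writer used only to illustrate the file layout."""

    def __init__(self, title: str = "generated input") -> None:
        self.title = title
        self.blocks: List[Tuple[str, Dict[str, object]]] = []

    def add_block(self, name: str,
                  params: Optional[Dict[str, object]] = None) -> None:
        self.blocks.append((name, dict(params or {})))

    def update_param(self, block_name: str, key: str, value: object) -> None:
        for name, params in self.blocks:
            if name == block_name:
                params[key] = value
                return
        raise KeyError(block_name)  # assumed: unknown blocks are an error

    def to_string(self) -> str:
        lines = [self.title]
        for name, params in self.blocks:
            lines.append(f" &{name}")
            lines.extend(f"  {key} = {value}," for key, value in params.items())
            lines.append(" /")
        return "\n".join(lines) + "\n"


mdin = MiniMdin()
mdin.add_block("cntrl", {"cut": 9.0, "ioutfm": 1, "ntb": 1, "ntxo": 2})
mdin.update_param("cntrl", "nstlim", 50000)
text = mdin.to_string()
```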

batter.exec.amber.mdin.apply_disang(mdin: AmberMdin, *, filename: str = 'disang.rest') None[source]

Reference a DISANG restraint file.

batter.exec.amber.mdin.apply_membrane_npt(mdin: AmberMdin, *, temp: float = 298.15, steps: int = 50000, barostat: int = 2, dt: float = 0.004) None[source]

Configure semi-isotropic NPT suitable for membranes.

batter.exec.amber.mdin.apply_minimization(mdin: AmberMdin, *, steps: int = 5000) None[source]

Enable energy minimisation for steps iterations.

batter.exec.amber.mdin.apply_npt(mdin: AmberMdin, *, temp: float = 298.15, steps: int = 50000, barostat: int = 2, dt: float = 0.004) None[source]

Configure standard NPT dynamics.

batter.exec.amber.mdin.apply_restraints(mdin: AmberMdin, *, mask: str, weight: float = 50.0) None[source]

Add positional restraints.

batter.exec.amber.mdin.apply_ti(mdin: AmberMdin, *, lbd_val: float, timask1: str, timask2: str, scmask1: str, scmask2: str, crgmask: str) None[source]

Configure thermodynamic integration (TI) parameters.

batter.exec.amber.mdin.apply_wt_end(mdin: AmberMdin) None[source]

Append the &wt type='END' control line.

Slurm-backed equilibration handler.

batter.exec.handlers.equil.equil_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Submit and register the equilibration job with the Slurm manager.

Parameters:
  • step (Step) – Pipeline step metadata (unused but provided for symmetry).

  • system (SimSystem) – Simulation system descriptor.

  • params (dict) – Raw handler payload; validated into StepPayload.

Returns:

Result containing either existing artifacts (when already finished) or the work directory to be monitored by the manager.

Return type:

ExecResult

Raises:
  • FileNotFoundError – If the expected submission script is missing.

  • RuntimeError – When payload['job_mgr'] is not a SlurmJobManager.

Handlers that queue free-energy equilibration and production jobs.

batter.exec.handlers.fe.fe_equil_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Queue equilibration jobs for each component of a ligand.

Parameters:
  • step, system (ignored) – Included for parity with the handler signature.

  • params (dict) – Handler payload containing the job manager and configuration values.

Returns:

Number of jobs enqueued (without waiting for completion).

Return type:

ExecResult

batter.exec.handlers.fe.fe_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Queue production jobs for each component/window combination.

Parameters:
  • step, system (ignored) – Provided for handler API compatibility.

  • params (dict) – Handler payload containing the job manager and configuration values.

Returns:

Number of jobs enqueued (without waiting for completion).

Return type:

ExecResult

Run post-processing analysis on free-energy simulations.

batter.exec.handlers.fe_analysis.analyze_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Run FE analysis for a ligand rooted at <system.root>/fe.

Parameters:
  • step (Step) – Pipeline metadata (unused).

  • system (SimSystem) – Simulation system descriptor.

  • params (dict) – Handler payload validated into StepPayload.

Returns:

Mapping with the generated Results.dat and optional timeseries artefacts.

Return type:

ExecResult

Parameterise ligands and populate per-ligand artifacts.

batter.exec.handlers.param_ligands.copy_ligand_params(src_dir: Path, child_dir: Path, residue_name: str) None[source]

Copy lig.* artifacts into child_dir/params using residue_name.

batter.exec.handlers.param_ligands.param_ligands(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Run the ligand parametrisation pipeline and index results.

Parameters:
  • step (Step) – Pipeline metadata (unused).

  • system (SimSystem) – Simulation system descriptor.

  • params (dict) – Handler payload validated into StepPayload.

Returns:

Mapping containing the parameter store path, JSON index, manifest, and raw hashes.

Return type:

ExecResult

Prepare equilibration inputs for a ligand.

batter.exec.handlers.prepare_equil.prepare_equil_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Build equilibration inputs for the current ligand.

Parameters:
  • step (Step) – Pipeline step metadata (unused).

  • system (SimSystem) – Simulation system descriptor.

  • params (dict) – Handler payload validated into StepPayload.

Returns:

Contains the output directory and any generated metadata.

Return type:

ExecResult

Prepare alchemical FE inputs for a ligand.

batter.exec.handlers.prepare_fe.prepare_fe_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Construct the initial FE directory layout for a ligand.

Parameters:
  • step (Step) – Pipeline metadata (unused).

  • system (SimSystem) – Simulation system descriptor.

  • params (dict) – Handler payload validated into StepPayload.

Returns:

Metadata describing the generated directories.

Return type:

ExecResult

batter.exec.handlers.prepare_fe.prepare_fe_windows_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Expand FE windows for each requested component:
  • copies <comp>-1 to <comp>-2, <comp>-3, … (depending on lambda schedule)

  • keeps run scripts consistent in each window (builders call write_run_file)

  • writes artifacts/fe/windows.json summarizing windows

Builders reuse the same interface; this handler iterates over the requested components and asks for per-window builds by calling each builder with win >= 1.
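
The window expansion described above can be sketched with the standard library alone. This is a minimal illustration, not the handler's implementation: the helper names `expand_windows` and `write_windows_summary`, the `fe` directory location, and the JSON schema (component mapped to window names) are all assumptions made for the example.

```python
from __future__ import annotations

import json
import shutil
import tempfile
from pathlib import Path


def expand_windows(fe_dir: Path, comp: str, n_windows: int) -> list[str]:
    """Copy <comp>-1 to <comp>-2 ... <comp>-N and return all window names."""
    template = fe_dir / f"{comp}-1"
    names = [template.name]
    for win in range(2, n_windows + 1):
        target = fe_dir / f"{comp}-{win}"
        if not target.exists():  # keep windows that were already expanded
            shutil.copytree(template, target)
        names.append(target.name)
    return names


def write_windows_summary(root: Path, windows: dict[str, list[str]]) -> Path:
    """Write artifacts/fe/windows.json (hypothetical schema: comp -> names)."""
    out = root / "artifacts" / "fe" / "windows.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(windows, indent=2))
    return out


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "fe" / "z-1").mkdir(parents=True)
    (root / "fe" / "z-1" / "run.sh").write_text("echo run\n")
    names = expand_windows(root / "fe", "z", n_windows=3)
    summary = json.loads(write_windows_summary(root, {"z": names}).read_text())
    copied = (root / "fe" / "z-3" / "run.sh").read_text()
```

Copying the first window as a template keeps the run script identical in every window, which mirrors the consistency requirement stated in the list above.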

Prepare complex systems (protein/ligand/membrane) for simulations.

batter.exec.handlers.system_prep.system_prep(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Prepare a system by aligning components and generating reference structures.

Parameters:
  • step (Step) – Pipeline metadata (unused).

  • system (SimSystem) – Simulation system descriptor.

  • params (dict) – Handler payload validated into StepPayload.

Returns:

Paths to generated reference structures and a metadata dictionary with anchor and membrane information.

Return type:

ExecResult

Minimal system-preparation handler for MASFE workflows.

batter.exec.handlers.system_prep_masfe.system_prep_masfe(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]

Prepare a MASFE solvation system by staging ligands and overrides.

Parameters:
  • step (Step) – Pipeline metadata (unused).

  • system (SimSystem) – Simulation system descriptor.

  • params (dict) – Handler payload validated into StepPayload.

Returns:

Manifest of staged ligands and paths to generated files.

Return type:

ExecResult

Parameterisation Modules#

Ligand parameterisation helpers for GAFF/GAFF2 and OpenFF workflows.

class batter.param.ligand.LigandFactory[source]

Factory that chooses the appropriate loader/processor by file extension.

create_ligand(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None) LigandProcessing[source]

Instantiate a concrete LigandProcessing subclass.

Parameters:
  • ligand_file, index, output_dir, ligand_name, charge, retain_lig_prot, ligand_ff, unique_mol_names – Forwarded to the underlying processor.

Returns:

Processor configured for the detected file type.

Return type:

LigandProcessing

Raises:

ValueError – If the file extension is unsupported.
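
Dispatch by file extension can be sketched as a lookup table. The registry below is hypothetical: it maps suffixes to the names of the three processor classes documented in this module, and uses strings instead of the classes themselves so the example runs without BATTER installed. Case-insensitive matching is an assumption.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical registry: suffix -> processor class name.
_PROCESSORS = {
    ".sdf": "SDF_LigandProcessing",
    ".mol2": "MOL2_LigandProcessing",
    ".pdb": "PDB_LigandProcessing",
}


def pick_processor(ligand_file: str | Path) -> str:
    """Return the processor name for a ligand file, or raise ValueError."""
    suffix = Path(ligand_file).suffix.lower()
    try:
        return _PROCESSORS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported ligand file extension: {suffix!r}") from None


chosen = pick_processor("ligands/LIG1.SDF")
```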

class batter.param.ligand.LigandProcessing(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None)[source]

Base class for ligand processing and parameterization.

It loads a ligand, determines a unique residue/name, estimates the charge, and generates AMBER/OpenFF parameters.

Parameters:
  • ligand_file – Input ligand path (SDF/MOL2/PDB depending on subclass).

  • index – 1-based index used for stable name generation.

  • output_dir – Output folder for generated files.

  • ligand_name – Optional preferred name; it is converted to a unique three-character name.

  • charge – Charge method for OpenFF pre-charge or quick estimate (e.g., "am1bcc").

  • retain_lig_prot – If True, keep hydrogen atoms from input.

  • ligand_ff – Either "gaff", "gaff2", or an OpenFF release such as "openff-2.2.0".

  • unique_mol_names – Existing names to avoid collisions.

Variables:
  • ligand_object (SmallMoleculeComponent)

  • openff_molecule (Molecule)

  • ligand_charge (float) – Estimated total charge (integer-valued).

  • atomnames (list[str]) – Atom names extracted from generated PDB (AMBER path).

fetch_from_existing_db(database: str | Path) bool[source]

Search and copy ligand artifacts from a local database.

Parameters:

database – Directory containing <name>.(frcmod|lib|prmtop|inpcrd|mol2|pdb|json|sdf).

Returns:

True if a full, matching entry was found and copied.

Return type:

bool
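
A lookup of this kind might be sketched as follows. The suffix list comes from the parameter description above; treating every suffix as required, and the helper name `copy_if_complete`, are assumptions of this example. The check that the stored entry actually matches the ligand is omitted.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path

# Suffixes listed in the docstring; requiring all of them is an assumption.
REQUIRED_SUFFIXES = ("frcmod", "lib", "prmtop", "inpcrd", "mol2", "pdb", "json", "sdf")


def copy_if_complete(database: Path, name: str, output_dir: Path) -> bool:
    """Copy <name>.* into output_dir only when every expected file exists."""
    sources = [database / f"{name}.{ext}" for ext in REQUIRED_SUFFIXES]
    if not all(src.is_file() for src in sources):
        return False
    output_dir.mkdir(parents=True, exist_ok=True)
    for src in sources:
        shutil.copy2(src, output_dir / src.name)
    return True


with tempfile.TemporaryDirectory() as tmp:
    db, out = Path(tmp) / "db", Path(tmp) / "out"
    db.mkdir()
    for ext in REQUIRED_SUFFIXES:
        (db / f"LIG.{ext}").write_text("placeholder")
    (db / "XYZ.mol2").write_text("placeholder")  # deliberately incomplete entry
    found_complete = copy_if_complete(db, "LIG", out)
    found_partial = copy_if_complete(db, "XYZ", out)
    copied_names = sorted(p.name for p in out.iterdir())
```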

property ligand_sdf_path: str

Path to the canonicalised SDF stored on disk.

Type:

str

property name: str

Three-character residue name used for generated artifacts.

Type:

str

prepare_ligand_parameters() None[source]

Generate parameters using either AMBER (GAFF/GAFF2) or OpenFF path.

Notes

  • OpenFF path first creates AMBER artifacts for tleap-based system build.

  • Writes a <name>.json metadata file to the output folder.

prepare_ligand_parameters_amberff(charge_method: str = 'bcc') None[source]

Prepare ligand parameters using AMBER (GAFF/GAFF2): mol2/frcmod/lib/prmtop.

Parameters:

charge_method – Antechamber charge method (e.g., "bcc" or "gas").

prepare_ligand_parameters_openff() None[source]

Prepare ligand parameters using OpenFF toolkit (and AMBER bootstrap).

Behavior#

  • Runs a fast AMBER bootstrap (GAFF2 + gas charges) so tleap artifacts exist.

  • Generates an OpenFF prmtop for downstream use with OpenMM/OpenFF.

property smiles: str

Canonical SMILES with explicit hydrogens.

Type:

str

to_dict() Dict[str, Any][source]
class batter.param.ligand.MOL2_LigandProcessing(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None)[source]
class batter.param.ligand.PDB_LigandProcessing(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None)[source]
class batter.param.ligand.SDF_LigandProcessing(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None)[source]
batter.param.ligand.batch_ligand_process(ligand_paths: Sequence[str | Path] | Mapping[str, str | Path], output_path: str | Path, retain_lig_prot: bool = True, ligand_ph: float = 7.0, ligand_ff: str = 'gaff2', charge_method: str = 'am1bcc', overwrite: bool = False, run_with_slurm: bool = False, max_slurm_jobs: int = 50, run_with_slurm_kwargs: Dict[str, Any] | None = None, job_extra_directives: List[str] | None = None, on_failure: Literal['prune', 'retry', 'raise'] | None = None) Tuple[List[str], Dict[str, Tuple[str, str]]][source]

Parameterise ligands into a content-addressed store.

Artifacts for each ligand are written under:

<output_path>/<hash_id>/*

where hash_id = sha256(canonical_smiles + ligand_ff + retain).hexdigest()[:12].

Parameters:
  • ligand_paths – List of file paths or mapping {alias: path}. In the mapping form, only the file path (not the alias) affects hashing.

  • output_path – Output directory for the content-addressed store.

  • retain_lig_prot – Whether to retain hydrogens from inputs.

  • ligand_ph – Target protonation pH (reserved for future use).

  • ligand_ff – Force field ("gaff"/"gaff2" or a valid OpenFF release name).

  • charge_method – Charge method for ligand.

  • overwrite – If True, re-parameterize even if <hash_id> already exists.

  • run_with_slurm – If True, distribute parametrization with Dask+SLURM.

  • max_slurm_jobs, run_with_slurm_kwargs, job_extra_directives – SLURM/Dask configuration.

Returns:

  • list of str – Hash identifiers in processing order (duplicates preserved).

  • dict – Mapping from the provided input path to (hash_id, canonical_smiles).
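
The hash_id formula given for batch_ligand_process can be illustrated with hashlib. How the three inputs are serialised before hashing is not stated in this reference, so the plain string concatenation and UTF-8 encoding below are assumptions, and the resulting identifiers may differ from those BATTER produces.

```python
from __future__ import annotations

import hashlib


def ligand_hash_id(canonical_smiles: str, ligand_ff: str, retain_lig_prot: bool) -> str:
    """First 12 hex characters of SHA-256 over the concatenated inputs."""
    # Assumption: fields are joined as text and encoded as UTF-8.
    payload = f"{canonical_smiles}{ligand_ff}{retain_lig_prot}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


hid = ligand_hash_id("CCO", "gaff2", True)
hid_other_ff = ligand_hash_id("CCO", "gaff", True)
```

Because the key depends only on chemistry and settings, two input files that describe the same molecule share one store entry, which is why the returned list can contain duplicate identifiers.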

Pipeline Modules#

class batter.pipeline.pipeline.Pipeline(steps: List[Step])[source]

Bases: object

Directed acyclic pipeline of Step objects.

Parameters:

steps (list[Step]) – Steps that form a DAG. Dependencies are given by Step.requires.

Notes

  • A simple topological sort is performed before execution.

  • Backends must implement a run(step, system) -> ExecResult method.

adjacency() Dict[str, List[str]][source]

Return the adjacency list describing the DAG.

Returns:

Mapping of each step to the steps that depend on it.

Return type:

dict[str, list[str]]
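
The direction of the adjacency list (prerequisite to dependents) is the reverse of Step.requires. A minimal sketch with a stand-in `StepSketch` class, which is hypothetical and only carries the two fields needed here:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StepSketch:
    """Stand-in for a pipeline step: a name plus prerequisite names."""
    name: str
    requires: tuple[str, ...] = ()


def adjacency(steps: list[StepSketch]) -> dict[str, list[str]]:
    """Map each step name to the names of steps that depend on it."""
    adj: dict[str, list[str]] = {s.name: [] for s in steps}
    for s in steps:
        for dep in s.requires:
            adj[dep].append(s.name)  # KeyError if dep is not a known step
    return adj


steps = [
    StepSketch("system_prep"),
    StepSketch("param_ligands", ("system_prep",)),
    StepSketch("prepare_equil", ("system_prep", "param_ligands")),
]
adj = adjacency(steps)
```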

dependencies(step_name: str) List[str][source]

Retrieve the declared dependencies for step_name.

Parameters:

step_name (str) – Step identifier.

Returns:

Names of prerequisite steps.

Return type:

list[str]

Raises:

KeyError – If step_name does not exist in the pipeline.

describe() List[Dict[str, Any]][source]

Return a serialisable summary of the pipeline.

Returns:

Each entry contains name, requires, and payload_type keys.

Return type:

list of dict

ordered_steps() List[Step][source]

Return steps in execution order.

run(backend, system) Dict[str, ExecResult][source]

Execute steps in topological order.

Parameters:
  • backend – Object providing run(step, system) -> ExecResult.

  • system – The SimSystem descriptor.

Returns:

Mapping from step name to execution result.

Return type:

dict[str, ExecResult]

Raises:

RuntimeError – If a required dependency has not been produced.
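
Execution in topological order can be sketched with the standard-library graphlib module. This is an illustration of the technique, not the Pipeline implementation: the function name `run_in_order` and the use of a plain callable as the backend are assumptions.

```python
from __future__ import annotations

from graphlib import TopologicalSorter
from typing import Callable


def run_in_order(requires: dict[str, list[str]], backend: Callable[[str], str]) -> dict[str, str]:
    """Run steps so that every prerequisite finishes before its dependents."""
    results: dict[str, str] = {}
    for name in TopologicalSorter(requires).static_order():
        if name not in requires:
            continue  # named as a dependency but never declared as a step
        missing = [dep for dep in requires[name] if dep not in results]
        if missing:
            raise RuntimeError(f"{name!r} is missing results for {missing}")
        results[name] = backend(name)
    return results


requires = {"fe": ["prepare_fe"], "prepare_fe": ["prepare_equil"], "prepare_equil": []}
results = run_in_order(requires, backend=lambda name: f"done:{name}")
order = list(results)
```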

class batter.pipeline.pipeline.PipelineState(results: Dict[str, ~batter.pipeline.step.ExecResult]=<factory>)[source]

Bases: object

In-memory state of a pipeline execution.

Variables:

results (dict[str, ExecResult]) – Per-step execution results.

results: Dict[str, ExecResult]
class batter.pipeline.step.ExecResult(job_ids: List[str] = <factory>, artifacts: Mapping[str, ~typing.Any]=<factory>)[source]

Execution result returned by a backend.

Parameters:
  • job_ids (list[str]) – Scheduler or process identifiers (may be empty for local runs).

  • artifacts (Mapping[str, Any]) – Named outputs (paths, metrics, small JSON blobs).

artifacts: Mapping[str, Any]
job_ids: List[str]
class batter.pipeline.step.Step(name: str, requires: List[str] = <factory>, payload: Any = None)[source]

One unit of work in the pipeline.

Parameters:
  • name (str) – Unique step name (e.g., "prepare_fe").

  • requires (list[str]) – Names of steps that must complete before this step can run.

  • payload (Any, optional) – Typed payload consumed by the backend. Typically a StepPayload.

Notes

  • Steps are immutable descriptors. Execution is handled by a backend.

  • The backend decides how to interpret params (e.g., templates, flags).

name: str
property params: Any

Backwards-compatible alias for payload.

payload: Any
replace(**updates: Any) Step[source]

Return a new Step with selected attributes updated.

Parameters:

**updates – Keyword overrides for any of the dataclass fields (name, requires, or payload).

Returns:

Fresh step instance containing the requested updates.

Return type:

Step
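
For an immutable descriptor, an update method of this kind is commonly built on dataclasses.replace. The sketch below assumes a frozen dataclass with the three documented fields; `StepSketch` is a stand-in, and whether the real Step is frozen is not confirmed by this reference.

```python
from __future__ import annotations

import dataclasses
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class StepSketch:
    name: str
    requires: list[str] = field(default_factory=list)
    payload: Any = None

    def replace(self, **updates: Any) -> StepSketch:
        """Return a new instance; the original is left untouched."""
        return dataclasses.replace(self, **updates)


base = StepSketch(name="prepare_fe")
updated = base.replace(requires=["prepare_equil"])
```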

requires: List[str]
class batter.pipeline.payloads.StepPayload(*, sim: SimulationConfig | None = None, sys_params: SystemParams | None = None, **extra_data: Any)[source]

Typed payload passed to pipeline step handlers.

The payload binds the SimulationConfig and SystemParams objects used by most handlers while permitting arbitrary extra values for backwards compatibility or specialised needs.

Parameters:
  • sim (SimulationConfig, optional) – Resolved simulation configuration for the step.

  • sys_params (SystemParams, optional) – Shared system-level parameters.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

copy_with(**updates: Any) StepPayload[source]

Create a new StepPayload with additional updates.

Parameters:

**updates – Keyword overrides applied to the current payload.

Returns:

New payload containing the merged data.

Return type:

StepPayload

get(item: str, default: Any = None) Any[source]

Safe lookup for a payload value with a default.

Parameters:
  • item (str) – Key to fetch.

  • default (Any, optional) – Value returned when the key is missing or None.

Returns:

Requested value or the default.

Return type:

Any
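
The "missing or None" rule differs from dict.get, which returns a stored None as-is. A minimal sketch of the documented behaviour over a plain mapping (the helper name `get_or_default` is hypothetical):

```python
from __future__ import annotations

from typing import Any, Mapping


def get_or_default(data: Mapping[str, Any], item: str, default: Any = None) -> Any:
    """Return data[item] unless it is absent or None, else the default."""
    value = data.get(item)
    return default if value is None else value


payload = {"sim": None, "n_workers": 4}
from_none = get_or_default(payload, "sim", "fallback")      # stored None
from_missing = get_or_default(payload, "missing", "fallback")  # absent key
from_present = get_or_default(payload, "n_workers", 1)
plain_dict_get = payload.get("sim", "fallback")  # dict.get keeps the None
```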

model_config = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

sim: SimulationConfig | None
sys_params: SystemParams | None
to_mapping() Dict[str, Any][source]

Convert the payload (including extras) to a plain dictionary.

Returns:

Merged representation of fields and extras.

Return type:

dict[str, Any]

class batter.pipeline.payloads.SystemParams(*, param_outdir: Path | None = None, system_name: str | None = None, protein_input: Path | None = None, system_input: Path | None = None, system_coordinate: Path | None = None, ligand_paths: Dict[str, ~pathlib.Path]=<factory>, yaml_dir: Path | None = None, anchor_atoms: tuple[str, ...]=(), extra_restraints: str | None = None, extra_restraint_fc: float | None = None, extra_conformation_restraints: Path | None = None, **extra_data: Any)[source]

System-level inputs shared by multiple pipeline steps.

This wrapper normalises common fields (paths, anchor atoms, etc.) while still allowing arbitrary extra keys. Paths are converted to pathlib.Path instances, making downstream usage safer.

Parameters:
  • param_outdir (Path, optional) – Directory where ligand parameter outputs should be written.

  • system_name (str, optional) – Logical system name propagated to child steps.

  • protein_input, system_input, system_coordinate (Path, optional) – Paths to the protein topology/coordinate inputs if supplied.

  • ligand_paths (dict[str, Path]) – Mapping of ligand identifiers to staged files.

  • yaml_dir (Path, optional) – Directory containing the originating YAML (useful for resolving relatives).

  • anchor_atoms (tuple[str, …]) – Anchor atom labels used for restraint placement.

  • extra_restraints (str, optional) – Optional positional restraint selection string.

  • extra_restraint_fc (float, optional) – Force constant (kcal/mol/Å^2) applied to extra_restraints.

  • extra_conformation_restraints (Path, optional) – Path to a conformational restraint JSON file.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

anchor_atoms: tuple[str, ...]
copy_with(**updates: Any) SystemParams[source]

Create a new SystemParams with additional updates.

Parameters:

**updates – Keyword overrides applied atop the existing data.

Returns:

A new instance incorporating the updates.

Return type:

SystemParams

extra_conformation_restraints: Path | None
extra_restraint_fc: float | None
extra_restraints: str | None
get(item: str, default: Any = None) Any[source]

Safe lookup for a field or extra value with a default.

Parameters:
  • item (str) – Key to fetch.

  • default (Any, optional) – Value returned when the key is missing or None.

Returns:

Requested value or the default.

Return type:

Any

ligand_paths: Dict[str, Path]
model_config = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

param_outdir: Path | None
protein_input: Path | None
system_coordinate: Path | None
system_input: Path | None
system_name: str | None
to_mapping() Dict[str, Any][source]

Convert the model (including extras) to a plain dictionary.

Returns:

Merged view of standard fields and extras.

Return type:

dict[str, Any]

yaml_dir: Path | None

Runtime Modules#

class batter.runtime.portable.Artifact(name: str, relpath: Path, kind: Literal['file', 'dir']='file', sha256: str = '', size: int = 0, meta: Dict[str, ~typing.Any]=<factory>)[source]

Bases: object

A single artifact tracked by the manifest.

Parameters:
  • name (str) – Logical name (e.g., “fe/index” or “traj/lig1.zarr”).

  • relpath (pathlib.Path) – Path relative to the store root.

  • kind ({"file", "dir"}) – File or directory artifact.

  • sha256 (str) – SHA-256 of the file (empty for directories).

  • size (int) – Size in bytes (files only; 0 for directories).

  • meta (dict) – Free-form metadata (component, lambda, etc.).

kind: Literal['file', 'dir']
meta: Dict[str, Any]
name: str
relpath: Path
sha256: str
size: int
class batter.runtime.portable.ArtifactManifest[source]

Bases: object

In-memory manifest for a portable artifact store.

Notes

  • Paths are relative to enable rebasing the store to a new root.

  • Serialize with to_dict() / from_dict().

add(art: Artifact, overwrite: bool = False) None[source]
exists(name: str) bool[source]
classmethod from_dict(d: Dict[str, Any]) ArtifactManifest[source]
get(name: str) Artifact[source]
items() List[Artifact][source]

Return all registered artifacts sorted by name.

Returns:

Snapshot of the manifest contents.

Return type:

list[Artifact]

names() List[str][source]
to_dict() Dict[str, Any][source]
class batter.runtime.portable.ArtifactStore(root: Path | str, manifest_name: str = 'manifest.json')[source]

Bases: object

Portable store with a relocatable root and JSON manifest.

Parameters:
  • root (path-like) – Store root directory (e.g., a run’s work directory).

  • manifest_name (str) – File name for the manifest JSON under root (default: “manifest.json”).

Examples

>>> store = ArtifactStore("work/at1r_aai")
>>> p = store.put_file(Path("results.txt"), name="fe/latest", dst_rel=Path("fe/results.txt"))
>>> store.save_manifest()
>>> # move directory to a new cluster...
>>> store2 = ArtifactStore("new_root/at1r_aai"); store2.load_manifest()
>>> store2.path("fe/latest")
new_root/at1r_aai/fe/results.txt
list_artifacts(*, prefix: str | None = None, kind: Literal['file', 'dir', None] = None) List[Artifact][source]

Inspect manifest entries, optionally filtering by name or kind.

Parameters:
  • prefix (str, optional) – When provided, only artifacts whose logical name starts with prefix are returned.

  • kind ({‘file’, ‘dir’, None}, optional) – Restrict results to files or directories. None (default) returns both.

Returns:

Matching artifacts in alphabetical order.

Return type:

list[Artifact]
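
The filtering rules can be sketched over a list of simple records. `ArtifactSketch` is a stand-in with only the two fields that matter for filtering; the real Artifact carries more.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactSketch:
    name: str
    kind: str = "file"  # "file" or "dir"


def list_artifacts(items, *, prefix: str | None = None, kind: str | None = None):
    """Filter by name prefix and kind, then sort alphabetically by name."""
    selected = [
        a for a in items
        if (prefix is None or a.name.startswith(prefix))
        and (kind is None or a.kind == kind)
    ]
    return sorted(selected, key=lambda a: a.name)


items = [
    ArtifactSketch("traj/lig1.zarr", "dir"),
    ArtifactSketch("fe/latest"),
    ArtifactSketch("fe/index"),
]
fe_names = [a.name for a in list_artifacts(items, prefix="fe/")]
dir_names = [a.name for a in list_artifacts(items, kind="dir")]
all_names = [a.name for a in list_artifacts(items)]
```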

load_manifest() None[source]

Load the manifest JSON from root.

path(name: str) Path[source]

Resolve an artifact name to an absolute path under the current root.

put_dir(src_dir: Path, name: str, dst_rel: Path | None = None, overwrite_manifest_entry: bool = False) Path[source]

Copy a directory under the store and record it in the manifest.

Notes

  • No per-file hashing; use put_file() for critical files.

put_file(src: Path, name: str, dst_rel: Path | None = None, overwrite_manifest_entry: bool = False) Path[source]

Copy a file under the store and record it in the manifest.

Parameters:
  • src (path-like) – Source file path (must exist and be a file).

  • name (str) – Logical artifact name to register under.

  • dst_rel (path-like, optional) – Relative destination path. Defaults to name.replace('/', '_').

  • overwrite_manifest_entry (bool) – If True, allows replacing an existing manifest entry with the same name.

Returns:

Absolute destination path.

Return type:

pathlib.Path
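
The bookkeeping behind a call like this (copy, hash, record size) can be sketched with hashlib and shutil. This stand-in returns the manifest entry as a dict rather than the destination path, and the entry layout is an assumption; only the default destination rule name.replace('/', '_') is taken from the description above.

```python
from __future__ import annotations

import hashlib
import shutil
import tempfile
from pathlib import Path


def put_file(root: Path, src: Path, name: str, dst_rel: Path | None = None) -> dict:
    """Copy src under root and return a manifest-style entry for it."""
    if not src.is_file():
        raise FileNotFoundError(src)
    rel = Path(dst_rel) if dst_rel is not None else Path(name.replace("/", "_"))
    dst = root / rel
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    data = dst.read_bytes()
    return {
        "name": name,
        "relpath": rel.as_posix(),  # stored relative so the root can move
        "kind": "file",
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "results.txt"
    src.write_bytes(b"dG = -9.1\n")
    store_root = Path(tmp) / "store"
    entry = put_file(store_root, src, name="fe/latest")
    explicit = put_file(store_root, src, name="fe/copy", dst_rel=Path("fe/results.txt"))
```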

rebase(new_root: Path | str) ArtifactStore[source]

Create a new store view with the same manifest but a different root.

Parameters:

new_root (path-like) – Target root directory.

Returns:

New store pointing to new_root.

Return type:

ArtifactStore
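
Rebasing works because the manifest holds relative paths only. A minimal sketch, assuming a manifest reduced to a name-to-relative-path mapping; `StoreView` is hypothetical and, unlike the documented path(), does not make the result absolute.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class StoreView:
    root: Path
    manifest: dict  # logical name -> path relative to root

    def path(self, name: str) -> Path:
        """Resolve a logical name against the current root."""
        return self.root / self.manifest[name]

    def rebase(self, new_root: str | Path) -> StoreView:
        """Same manifest, different root; nothing is copied or rewritten."""
        return StoreView(Path(new_root), self.manifest)


store = StoreView(Path("work/at1r_aai"), {"fe/latest": "fe/results.txt"})
moved = store.rebase("new_root/at1r_aai")
old_path = store.path("fe/latest")
new_path = moved.path("fe/latest")
```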

save_manifest() Path[source]

Write the manifest JSON under root (atomic).
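
One common way to make such a write atomic is to write a temporary file in the same directory and then swap it into place with os.replace. Whether ArtifactStore uses exactly this approach is not stated here; the sketch and the helper name `save_json_atomic` are illustrative.

```python
from __future__ import annotations

import json
import os
import tempfile
from pathlib import Path


def save_json_atomic(path: Path, data: dict) -> Path:
    """Write JSON so readers see either the old file or the complete new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live in the target directory for the swap to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, indent=2, sort_keys=True)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)  # do not leave partial files behind
        raise
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = save_json_atomic(Path(tmp) / "manifest.json", {"fe/latest": "fe/results.txt"})
    loaded = json.loads(target.read_text(encoding="utf-8"))
    leftovers = sorted(p.name for p in Path(tmp).iterdir())
```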

class batter.runtime.fe_repo.FERecord(*, run_id: str, ligand: str, mol_name: str, system_name: str, fe_type: str, temperature: float, method: Literal['mbar', 'ti']='mbar', total_dG: float, total_se: float = 0.0, components: List[str] = <factory>, created_at: str = <factory>, windows: List[WindowResult] = <factory>, canonical_smiles: str | None = None, original_name: str | None = None, original_path: str | None = None, protocol: str = 'abfe', analysis_start_step: int | None = None, n_bootstraps: int | None = None, include_in_analysis: bool = True, status: Literal['success', 'failed', 'unbound']='success')[source]

A full FE result bundle (portable, versioned).

Parameters:
  • run_id (str) – Unique run identifier.

  • ligand (str) – Ligand identifier.

  • mol_name (str) – Molecule resname.

  • system_name (str) – Logical system name.

  • fe_type (str) – Protocol type (e.g., ‘uno_rest’, ‘asfe’).

  • temperature (float) – Simulation temperature (K).

  • method ({"mbar", "ti"}) – Integration method.

  • total_dG (float) – Total free energy (kcal/mol).

  • total_se (float) – Standard error (kcal/mol).

  • components (list[str]) – Active components in this run.

  • created_at (str) – ISO-8601 timestamp (UTC, Z-suffix).

  • windows (list[WindowResult]) – Per-window results.

  • canonical_smiles (str, optional) – Canonicalised ligand SMILES captured during parameterization.

  • original_name (str, optional) – Original ligand identifier or title when known.

  • original_path (str, optional) – Source path of the ligand before staging.

  • protocol (str) – Logical protocol used to generate the result (e.g., "abfe").

  • analysis_start_step (int, optional) – First production step included in analysis.

  • n_bootstraps (int, optional) – Number of MBAR bootstrap resamples used during analysis.

  • include_in_analysis (bool) – Whether downstream aggregate analyses, such as Cinnabar export, should use this record.

  • status ({"success", "failed", "unbound"}) – Final status recorded for the ligand.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

analysis_start_step: int | None
canonical_smiles: str | None
components: List[str]
created_at: str
fe_type: str
include_in_analysis: bool
ligand: str
method: Literal['mbar', 'ti']
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

mol_name: str
n_bootstraps: int | None
original_name: str | None
original_path: str | None
protocol: str
run_id: str
status: Literal['success', 'failed', 'unbound']
system_name: str
temperature: float
total_dG: float
total_se: float
windows: List[WindowResult]
class batter.runtime.fe_repo.FEResultsRepository(store: ArtifactStore)[source]
index() DataFrame[source]
ligand_dir(run_id: str, ligand: str) Path[source]
load(run_id: str, ligand: str) FERecord[source]
record_failure(run_id: str, ligand: str, system_name: str, temperature: float, *, status: Literal['failed', 'unbound'], reason: str | None = None, canonical_smiles: str | None = None, original_name: str | None = None, original_path: str | None = None, protocol: str = 'abfe', analysis_start_step: int | None = None, n_bootstraps: int | None = None) None[source]
save(rec: FERecord, copy_from: Path | None = None) None[source]
set_analysis_inclusion(*, run_id: str, ligand: str, include: bool, analysis_start_step: int | None = None, n_bootstraps: int | None = None) int[source]

Set include_in_analysis for matching rows in results/index.csv.

class batter.runtime.fe_repo.WindowResult(*, component: str, lam: float, dG: float, dG_se: float = 0.0, n_samples: int = 0, meta: Dict[str, ~typing.Any]=<factory>)[source]

Result for a single lambda window/component.

Parameters:
  • component (str) – Component key (e.g., ‘e’, ‘v’, ‘z’).

  • lam (float) – Lambda value in [0, 1].

  • dG (float) – Free-energy increment (kcal/mol).

  • dG_se (float) – Standard error (kcal/mol).

  • n_samples (int) – Samples (or effective sample size).

  • meta (dict) – Extra metadata.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

component: str
dG: float
dG_se: float
lam: float
meta: Dict[str, Any]
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

n_samples: int

Systems Modules#

class batter.systems.core.CreateSystemLike(*args, **kwargs)[source]

Structural typing interface for inputs to a system builder.

Notes

This Protocol is intentionally minimal to avoid import cycles with Pydantic models. Any object with these attributes (e.g., a Pydantic model instance) satisfies the protocol.

anchor_atoms: Sequence[str]
ligand_ff: str
ligand_paths: Sequence[Path]
lipid_mol: Sequence[str]
other_mol: Sequence[str]
overwrite: bool
protein_input: Path | None
retain_lig_prot: bool
system_coordinate: Path | None
system_name: str
system_topology: Path | None
class batter.systems.core.SimSystem(name: str, root: Path, topology: Path | None = None, coordinates: Path | None = None, protein: Path | None = None, ligands: Tuple[Path, ...]=(), lipid_mol: Tuple[str, ...]=(), other_mol: Tuple[str, ...]=(), anchors: Tuple[str, ...]=(), meta: SystemMeta = <factory>)[source]

Immutable descriptor of a simulation system and its on-disk artifacts.

Parameters:
  • name (str) – Logical system name (e.g., "AT1R_AAI").

  • root (pathlib.Path) – Working directory where artifacts live. This directory is considered relocatable; other modules should store relative paths when possible.

  • topology (pathlib.Path, optional) – Path to an explicit topology (e.g., AMBER PRMTOP). May be None if the builder generates it later.

  • coordinates (pathlib.Path, optional) – Coordinates or restart file (e.g., RST7/INPCRD).

  • protein (pathlib.Path, optional) – Input protein structure file (PDB/mmCIF).

  • ligands (tuple[pathlib.Path, …]) – One or more ligand structure files.

  • lipid_mol (tuple[str, …]) – Lipid names present in the system (e.g., ("POPC",)).

  • other_mol (tuple[str, …]) – Other cofactors present in the system.

  • anchors (tuple[str, …]) – Anchor atoms in the form "RESID@ATOM" (e.g., "85@CA").

  • meta (SystemMeta) – Free-form metadata bundle for provenance (e.g., software versions).
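
The "RESID@ATOM" anchor format can be split with str.partition. The helper `parse_anchor` is hypothetical, and the assumption that RESID is always a non-negative integer comes from the "85@CA" example rather than from a stated rule.

```python
from __future__ import annotations


def parse_anchor(anchor: str) -> tuple[int, str]:
    """Split an anchor such as "85@CA" into (residue id, atom name)."""
    resid, sep, atom = anchor.partition("@")
    # Assumption: residue ids are plain non-negative integers.
    if not sep or not resid.isdigit() or not atom:
        raise ValueError(f"Expected 'RESID@ATOM', got {anchor!r}")
    return int(resid), atom


anchors = ("85@CA", "120@CB", "251@CA")
parsed = [parse_anchor(a) for a in anchors]
```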

anchors: Tuple[str, ...]
coordinates: Path | None
ligands: Tuple[Path, ...]
lipid_mol: Tuple[str, ...]
meta: SystemMeta
name: str
other_mol: Tuple[str, ...]
path(*parts: str | Path) Path[source]

Join root with the provided path segments.

Parameters:

*parts (str or Path) – Relative path components appended in order.

Returns:

Absolute path pointing inside root.

Return type:

pathlib.Path

protein: Path | None
root: Path
topology: Path | None
with_artifacts(**kw) SimSystem[source]

Return a new SimSystem with updated artifact attributes.

Examples

>>> sys = SimSystem(name="X", root=Path("work/X"))
>>> sys2 = sys.with_artifacts(topology=Path("work/X/top.prmtop"))
with_meta(**updates: Any) SimSystem[source]

Return a copy of the system with merged metadata.

Parameters:

**updates – Keyword arguments forwarded to SystemMeta.merge().

Returns:

Copy of the system containing the updated metadata bundle.

Return type:

SimSystem

class batter.systems.core.SystemBuilder(*args, **kwargs)[source]

Interface for creating or updating on-disk artifacts for a system.

build(system: SimSystem, args: CreateSystemLike) SimSystem[source]

Materialize artifacts for system using args, returning an updated SimSystem. Implementations must be idempotent: calling build twice with the same inputs must produce the same state without corrupting outputs.
class batter.systems.core.SystemMeta(ligand: str | None = None, residue_name: str | None = None, mode: str | None = None, param_dir_dict: Dict[str, str]=<factory>, extras: Dict[str, ~typing.Any]=<factory>)[source]

Structured metadata attached to a SimSystem.

Parameters:
  • ligand (str, optional) – Ligand identifier associated with the system (if applicable).

  • residue_name (str, optional) – Residue name used for the ligand.

  • mode (str, optional) – High-level mode indicator (e.g., "MABFE" vs "MASFE").

  • param_dir_dict (dict[str, str]) – Mapping from residue names to parameter storage directories.

  • extras (dict[str, Any]) – Additional context stored alongside the known fields.

extras: Dict[str, Any]
classmethod from_mapping(data: Mapping[str, Any] | None) SystemMeta[source]

Construct a SystemMeta from a mapping-like object.

Parameters:

data (mapping or None) – Source metadata. If already a SystemMeta, it is returned.

Returns:

Normalised metadata object.

Return type:

SystemMeta

get(key: str, default: Any = None) Any[source]

Retrieve a value by key with an optional default.

Parameters:
  • key (str) – Metadata key.

  • default (Any, optional) – Value returned when the key is missing.

Returns:

Stored value or the default.

Return type:

Any

ligand: str | None
merge(**updates: Any) SystemMeta[source]

Create a new SystemMeta with updated values.

Parameters:

**updates – Keyword overrides applied to the existing metadata.

Returns:

New instance containing the merged metadata.

Return type:

SystemMeta

mode: str | None
param_dir_dict: Dict[str, str]
residue_name: str | None
to_dict() Dict[str, Any][source]

Convert the metadata to a plain dictionary.

Returns:

All known fields plus extra entries.

Return type:

dict[str, Any]
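
A metadata bundle with known fields plus free-form extras can be sketched as a frozen dataclass. `MetaSketch` is a reduced stand-in with two known fields, and routing unrecognised keys into extras during merge is an assumption about how such a merge might behave, not a statement about SystemMeta.

```python
from __future__ import annotations

import dataclasses
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class MetaSketch:
    ligand: str | None = None
    mode: str | None = None
    extras: dict[str, Any] = field(default_factory=dict)

    def merge(self, **updates: Any) -> MetaSketch:
        """Return a copy; known keys update fields, the rest go to extras."""
        known = {f.name for f in dataclasses.fields(self)} - {"extras"}
        core = {k: v for k, v in updates.items() if k in known}
        extra = {k: v for k, v in updates.items() if k not in known}
        return dataclasses.replace(self, extras={**self.extras, **extra}, **core)

    def to_dict(self) -> dict[str, Any]:
        """Flatten known fields and extras into one plain dictionary."""
        return {"ligand": self.ligand, "mode": self.mode, **self.extras}


base = MetaSketch(ligand="LIG1")
merged = base.merge(mode="MABFE", pose="p1")
flat = merged.to_dict()
```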

class batter.systems.mabfe.MABFEBuilder(*args, **kwargs)[source]

Builder for membrane/absolute free-energy (MABFE) systems.

This builder prepares a shared working directory under system.root and, optionally, stages all ligands at once into per-ligand subfolders.

Directory layout (relative to system.root):

inputs/           # canonical copies of user-provided inputs
artifacts/        # files produced by builders (e.g., PRMTOP, RST7)
simulations/
  <LIG1>/inputs/ligand.<ext>
          artifacts/
  <LIG2>/inputs/ligand.<ext>
          artifacts/
  ...
build(system: SimSystem, args: CreateSystemLike) SimSystem[source]

Prepare the shared system area (stage protein/topology/coordinates/inputs).

Uses the actual suffixes from user inputs (no hard-coded extensions).

build_all_ligands(parent: SimSystem, lig_paths: Sequence[Path], overwrite: bool = False) Dict[str, SimSystem][source]

Stage all ligands at once under parent.root/simulations/<NAME>/....

Ligands are copied as inputs/ligand.<ext> using each source’s suffix.

static make_child_for_ligand(parent: SimSystem, lig_name: str, lig_src: Path) SimSystem[source]

Create a single per-ligand child system under simulations/<NAME>/ with ligand.<ext>.
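
The per-ligand layout can be reproduced with pathlib and shutil. This sketch takes an explicit name-to-path mapping, whereas build_all_ligands receives a sequence of paths; the function name `stage_ligands` and the skip-unless-overwrite behaviour are assumptions for the example.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def stage_ligands(root: Path, lig_paths: dict[str, Path], overwrite: bool = False) -> dict[str, Path]:
    """Copy each ligand to simulations/<NAME>/inputs/ligand.<ext>."""
    staged: dict[str, Path] = {}
    for name, src in lig_paths.items():
        child = root / "simulations" / name
        (child / "inputs").mkdir(parents=True, exist_ok=True)
        (child / "artifacts").mkdir(exist_ok=True)
        dst = child / "inputs" / f"ligand{src.suffix}"  # keep the source suffix
        if overwrite or not dst.exists():
            shutil.copy2(src, dst)
        staged[name] = dst
    return staged


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "aai.sdf"
    src.write_text("mol v1")
    root = Path(tmp) / "work"
    staged = stage_ligands(root, {"AAI": src})
    src.write_text("mol v2")
    stage_ligands(root, {"AAI": src})  # repeated call leaves the copy alone
    kept = staged["AAI"].read_text()
    stage_ligands(root, {"AAI": src}, overwrite=True)
    refreshed = staged["AAI"].read_text()
    rel = staged["AAI"].relative_to(root).as_posix()
```

Skipping existing files on a repeated call is one simple way to keep staging idempotent, in line with the requirement placed on SystemBuilder.build.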

batter.systems.mabfe.make_ligand_subsystem(parent: SimSystem, lig_name: str, lig_src: Path) SimSystem[source]
batter.systems.mabfe.prepare_subsystems_for_ligands(parent: SimSystem, lig_paths: Iterable[Path]) Dict[str, SimSystem][source]
class batter.systems.masfe.MASFEBuilder(*args, **kwargs)[source]

Builder for membrane-free (solvation) absolute free-energy (MASFE) systems.

This builder prepares a shared working directory under system.root and, optionally, stages all ligands at once into per-ligand subfolders.

Differences vs MABFE:

  • No protein/topology/coordinates are required or staged.

  • The resulting SimSystem stores None for protein, topology, and coordinates.

Directory layout (relative to system.root):

inputs/           # canonical copies of user-provided ligand inputs
artifacts/        # files produced by builders
simulations/
  <LIG1>/inputs/ligand.<ext>
          artifacts/
  <LIG2>/inputs/ligand.<ext>
          artifacts/
  ...
build(system: SimSystem, args: CreateSystemLike) SimSystem[source]

Prepare the shared system area (stage ligand inputs).

Uses the actual suffixes from user inputs (no hard-coded extensions).

build_all_ligands(parent: SimSystem, lig_paths: Sequence[Path], overwrite: bool = False) Dict[str, SimSystem][source]

Stage all ligands at once under parent.root/simulations/<NAME>/....

Ligands are copied as inputs/ligand.<ext> using each source’s suffix.

static make_child_for_ligand(parent: SimSystem, lig_name: str, lig_src: Path) SimSystem[source]

Create a single per-ligand child system under simulations/<NAME>/ with ligand.<ext>.

batter.systems.masfe.make_ligand_subsystem_masfe(parent: SimSystem, lig_name: str, lig_src: Path) SimSystem[source]
batter.systems.masfe.prepare_subsystems_for_ligands_masfe(parent: SimSystem, lig_paths: Iterable[Path]) Dict[str, SimSystem][source]

Analysis Modules#

class batter.analysis.analysis.BoreschAnalysis(disangfile, k_r, k_a, temperature)[source]

Bases: FEAnalysisBase

Initialize the Boresch analysis with the disang file and parameters.

Parameters:
  • disangfile (str) – The path to the disang file containing the anchor atoms.

  • k_r (float) – The force constant for the translation restraint.

  • k_a (float) – The force constant shared by the angle and dihedral restraints. A single value is used for both here, although in principle they could differ.

  • temperature (float) – The temperature in Kelvin for the analysis.

static fe_int(r1_0, a1_0, t1_0, a2_0, t2_0, t3_0, k_r, k_a, temperature)[source]

Calculate the analytical free energy of the Boresch restraint (adapted from BAT.py).
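For orientation, the widely used closed-form (harmonic-limit) expression for the free energy of releasing a set of Boresch restraints to the 1 M standard state is sketched below. This is the textbook formula, offered as an assumption about what "analytical" means here. `fe_int` itself may evaluate the integrals differently, so the numbers need not match. The function name and argument names are illustrative. Force constants are expected in the U = 0.5·K·x² convention, so constants defined with AMBER's U = k·x² convention would be doubled first.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)
V0 = 1660.54          # volume per molecule at 1 M, in cubic angstroms


def boresch_release_dg(r0, theta_a, theta_b, k_r, k_theta_a, k_theta_b,
                       k_phi_a, k_phi_b, k_phi_c, temperature=298.15):
    """Closed-form free energy (kcal/mol) of releasing Boresch restraints.

    r0 is in angstroms and the two angles are in degrees. Force constants
    follow U = 0.5 * K * x**2 (kcal/mol/A^2 for k_r, kcal/mol/rad^2 for
    the angle and dihedral terms).
    """
    kt = R_KCAL * temperature
    k_prod = k_r * k_theta_a * k_theta_b * k_phi_a * k_phi_b * k_phi_c
    numerator = 8.0 * math.pi ** 2 * V0 * math.sqrt(k_prod)
    denominator = (r0 ** 2
                   * math.sin(math.radians(theta_a))
                   * math.sin(math.radians(theta_b))
                   * (2.0 * math.pi * kt) ** 3)
    return -kt * math.log(numerator / denominator)


# Single shared angular constant, mirroring the k_r / k_a split above.
k_a = 100.0
dg = boresch_release_dg(5.0, 90.0, 90.0, 10.0, k_a, k_a, k_a, k_a, k_a)
```

With these illustrative inputs the result is about -10.2 kcal/mol. The negative sign reflects that removing the restraint lowers the free energy.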

plot_convergence(ax=None, **kwargs)[source]

No convergence plot is available for analytical results.

run_analysis()[source]

Run the analytical analysis for the Boresch restraint.

class batter.analysis.analysis.FEAnalysisBase[source]

Bases: ABC

Minimal interface shared across component analysis routines.

Variables:

results (dict) – Storage for the scalar FE, uncertainty, convergence tables, and FE time series generated by subclasses.

property convergence
dump(filename='results.json')[source]

Store results to JSON (omit heavy convergence tables).

property fe
property fe_error
property fe_timeseries
abstractmethod plot_convergence(ax=None, **kwargs)[source]
abstractmethod run_analysis()[source]
class batter.analysis.analysis.MBARAnalysis(lig_folder: str, component: str, windows: List[int], temperature: float, energy_unit: str = 'kcal/mol', analysis_start_step: int = 0, detect_equil: bool = True, n_bootstraps: int = 0, n_jobs: int = 6, load: bool = False, dt: float = 0.0, ntwx: int | None = None)[source]

Bases: FEAnalysisBase

Post-process a single component with alchemlyb.estimators.MBAR.

Parameters:
  • lig_folder (str) – Absolute path to the ligand work directory.

  • component (str) – Component identifier (e.g., "e" or "m").

  • windows (list[int]) – Lambda windows present for the component.

  • temperature (float) – Simulation temperature in Kelvin.

  • energy_unit ({“kcal/mol”, “kJ/mol”, “kT”}, optional) – Output energy unit. Internally every value is accumulated in units of kT and converted before publishing the results.

  • analysis_start_step (int, optional) – Discard frames with step <= this value before analysis.

  • detect_equil (bool, optional) – When True, the equilibration time of each window is detected and the pre-equilibration portion is discarded.

  • n_bootstraps (int, optional) – Number of bootstrap samples handed to MBAR.

  • n_jobs (int, optional) – Level of joblib parallelism when parsing windows.

  • load (bool, optional) – When True reuse cached *_df_list.pickle files if available.
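The `energy_unit` note says values are accumulated in kT and converted on output. A minimal sketch of that conversion follows. The helper name `from_kt` and the constants are this example's own, not taken from BATTER.

```python
R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)
KCAL_TO_KJ = 4.184    # thermochemical calorie


def from_kt(value_kt: float, temperature: float,
            unit: str = "kcal/mol") -> float:
    """Convert a free energy expressed in kT into the requested unit."""
    kt_kcal = R_KCAL * temperature
    if unit == "kT":
        return value_kt
    if unit == "kcal/mol":
        return value_kt * kt_kcal
    if unit == "kJ/mol":
        return value_kt * kt_kcal * KCAL_TO_KJ
    raise ValueError(f"unsupported energy unit: {unit!r}")


one_kt_kcal = from_kt(1.0, 298.15)           # roughly 0.59 kcal/mol
one_kt_kj = from_kt(1.0, 298.15, "kJ/mol")   # roughly 2.48 kJ/mol
```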

property data_list: List[DataFrame]
get_mbar_data() None[source]

Parse and cache the non-reduced potentials for all lambda windows.

Notes

The concatenated dataframe is stored in self._u_df while the list of per-window frames is available via data_list.

plot_block_convergence(ax=None, **kwargs)[source]
plot_convergence(save_path: str | None = None, title: str | None = None)[source]
plot_overlap_matrix(ax=None, **kwargs)[source]
plot_time_convergence(ax=None, **kwargs)[source]
run_analysis() None[source]
property u_df: DataFrame
class batter.analysis.analysis.RESTMBARAnalysis(lig_folder: str, component: str, windows: List[int], temperature: float, energy_unit: str = 'kcal/mol', analysis_start_step: int = 0, detect_equil: bool = True, n_bootstraps: int = 0, n_jobs: int = 6, load: bool = False, dt: float = 0.0, ntwx: int | None = None)[source]

Bases: MBARAnalysis

MBAR analysis variant for restraint components that require cpptraj traces.

class batter.analysis.analysis.SilenceAlchemlybOnly[source]

Bases: object

batter.analysis.analysis.analyze_lig_task(lig_path: str, lig: str, components: List[str], rest: Tuple[float, float, float, float, float], temperature: float, water_model: str, component_windows_dict: Dict[str, List[int]], rocklin_correction: bool = False, analysis_start_step: int = 0, raise_on_error: bool = True, mol: str = 'LIG', n_workers: int = 4, n_bootstraps: int = 0, dt: float = 0.0, ntwx: int = 0)[source]

Analyze one ligand under lig_path for the requested components.

batter.analysis.analysis.generate_results_rest(md_sim_files: List[str], comp: str, blocks: int = 5, top: str = 'full') None[source]

Build a cpptraj input on the fly from the restraints.in template in the current working directory, swapping the topology to ../{comp}-1/{top}.prmtop and appending the trajin lines.

Helpers for converting BATTER RBFE results into Cinnabar FEMap objects.

class batter.analysis.cinnabar.CinnabarConversionResult(femap: 'Any', edge_summary: 'pd.DataFrame', raw_signed: 'pd.DataFrame', merge_bidirectional: 'bool' = True, exp_summary: 'pd.DataFrame | None' = None, absolute_summary: 'pd.DataFrame | None' = None, absolute_warning: 'str | None' = None, ligand_assets: 'dict[str, dict[str, str]]'=<factory>, edge_assets: 'dict[str, dict[str, str]]'=<factory>)[source]

Bases: object

absolute_summary: DataFrame | None = None
absolute_warning: str | None = None
edge_assets: dict[str, dict[str, str]]
edge_summary: DataFrame
exp_summary: DataFrame | None = None
femap: Any
ligand_assets: dict[str, dict[str, str]]
merge_bidirectional: bool = True
raw_signed: DataFrame
batter.analysis.cinnabar.auto_write_rbfe_cinnabar_for_run(work_dir: str | Path, run_id: str, *, out_dir: str | Path | None = None, combine_by_run_first: bool = True, merge_bidirectional: bool = True, write_plots: bool = True, write_cycle_closure: bool = True, absolute_offset: float = 0.0) dict[str, Any][source]

Write a per-run RBFE Cinnabar bundle plus a replicate-aware follow-up note.

batter.analysis.cinnabar.build_batter_rbfe_cinnabar(work_dir: str | Path, *, run_ids: Sequence[str] | None = None, ligands: Sequence[str] | None = None, edge_separator: str = '~', uncertainty_mode: Literal['ivw', 'sample', 'max'] = 'max', combine_by_run_first: bool = True, merge_bidirectional: bool = True, experimental_df: DataFrame | None = None, exp_ligand_column: str = 'ligand', exp_abfe_column: str = 'abfe', exp_error_column: str | None = None, exp_status_column: str | None = None, exp_success_value: str = 'success', exp_temperature_column: str | None = None, source: str = 'BATTER_RBFE', exp_source: str = 'experiment', exp_value_unit: Any = 'kcal/mol', exp_error_unit: Any = None) CinnabarConversionResult[source]
batter.analysis.cinnabar.build_batter_rbfe_cinnabar_by_run(work_dir: str | Path, *, run_ids: Sequence[str] | None = None, ligands: Sequence[str] | None = None, edge_separator: str = '~', uncertainty_mode: Literal['ivw', 'sample', 'max'] = 'max', combine_by_run_first: bool = True, merge_bidirectional: bool = True, experimental_df: DataFrame | None = None, exp_ligand_column: str = 'ligand', exp_abfe_column: str = 'abfe', exp_error_column: str | None = None, exp_status_column: str | None = None, exp_success_value: str = 'success', exp_temperature_column: str | None = None, source: str = 'BATTER_RBFE', exp_source: str = 'experiment', exp_value_unit: Any = 'kcal/mol', exp_error_unit: Any = None) dict[str, CinnabarConversionResult][source]
batter.analysis.cinnabar.build_batter_rbfe_cinnabar_from_runs(runs: Sequence[tuple[str | Path, str]], *, ligands: Sequence[str] | None = None, edge_separator: str = '~', uncertainty_mode: Literal['ivw', 'sample', 'max'] = 'max', combine_by_run_first: bool = True, merge_bidirectional: bool = True, experimental_df: DataFrame | None = None, exp_ligand_column: str = 'ligand', exp_abfe_column: str = 'abfe', exp_error_column: str | None = None, exp_status_column: str | None = None, exp_success_value: str = 'success', exp_temperature_column: str | None = None, source: str = 'BATTER_RBFE', exp_source: str = 'experiment', exp_value_unit: Any = 'kcal/mol', exp_error_unit: Any = None) CinnabarConversionResult[source]
batter.analysis.cinnabar.convert_cinnabar_outputs_to_csv(bundle_dir: str | Path, out_dir: str | Path, *, relative_name: str = 'relative.csv', absolute_name: str = 'absolute.csv', require_absolute: bool = False) dict[str, Path][source]

Load a Cinnabar bundle directory and rewrite merged relative/absolute CSVs.

batter.analysis.cinnabar.dataframe_to_cinnabar(rbfe_df: DataFrame, *, ligand_column: str = 'ligand', dg_column: str = 'total_dG', se_column: str = 'total_se', run_column: str = 'run_id', status_column: str = 'status', success_value: str = 'success', temperature_column: str = 'temperature', edge_separator: str = '~', source: str = 'BATTER_RBFE', uncertainty_mode: Literal['ivw', 'sample', 'max'] = 'max', combine_by_run_first: bool = True, merge_bidirectional: bool = True, experimental_df: DataFrame | None = None, exp_ligand_column: str = 'ligand', exp_abfe_column: str = 'abfe', exp_error_column: str | None = None, exp_status_column: str | None = None, exp_success_value: str = 'success', exp_temperature_column: str | None = None, exp_source: str = 'experiment', exp_value_unit: Any = 'kcal/mol', exp_error_unit: Any = None) CinnabarConversionResult[source]

Convert an RBFE dataframe into a Cinnabar FEMap and summary tables.

batter.analysis.cinnabar.load_batter_rbfe_results(work_dir: str | Path, *, run_ids: Sequence[str] | None = None, ligands: Sequence[str] | None = None, edge_separator: str = '~') DataFrame[source]

Load stored BATTER FE records and keep only RBFE-like edge rows.

batter.analysis.cinnabar.load_batter_rbfe_results_from_runs(runs: Sequence[tuple[str | Path, str]], *, ligands: Sequence[str] | None = None, edge_separator: str = '~') DataFrame[source]

Load RBFE rows from explicit (work_dir, run_id) inputs.

batter.analysis.cinnabar.read_cinnabar_outputs(bundle_dir: str | Path, *, require_absolute: bool = False) tuple[DataFrame, DataFrame][source]

Read merged relative and absolute Cinnabar tables from an export bundle.

The *_uncorrected columns are copied from Cinnabar’s original relative and absolute CSVs. The *_cycle_closure columns are merged from the SFC outputs, cycle_closure_edges.csv and cycle_closure_nodes.csv.

batter.analysis.cinnabar.summarize_directionality(edge_summary: DataFrame) dict[str, Any][source]

Summarize whether an edge table contains reciprocal directional pairs.

batter.analysis.cinnabar.write_cinnabar_outputs(result: CinnabarConversionResult, out_dir: str | Path, *, method_name: str = 'BATTER', target_name: str = '', write_plots: bool = True, absolute_offset: float = 0.0, write_cycle_closure: bool = True) dict[str, Path][source]

Write stable on-disk outputs for a converted Cinnabar bundle.

State-function based free-energy correction for RBFE networks.

Acknowledgement#

This module implements the matrix-based State-Function Based Free Energy Correction (SFC) workflow for BATTER’s analysis API, following the article and supporting information cited below.

Reference#

Liu, R.; Lai, Y.; Yao, Y.; Huang, W.; Zhong, Y.; Luo, H.-B.; Li, Z. State Function-Based Correction: A Simple and Efficient Free-Energy Correction Algorithm for Large-Scale Relative Binding Free-Energy Calculations. J. Phys. Chem. Lett. 2025, 16, 23, 5763-5768. doi:10.1021/acs.jpclett.5c01119

The historical cycle_closure_* function names are kept for compatibility with the existing BATTER Cinnabar integration. They now run SFC/WSFC rather than the earlier cycle-enumeration WCC algorithm.

class batter.analysis.cycle_closure.CycleClosureEdge(label_a: str, label_b: str, ddg: float, uncertainties: tuple[float, ...] = ())[source]

Bases: object

One directed RBFE edge used as SFC input.

Parameters:
  • label_a, label_b – Ligand labels defining the edge direction.

  • ddg (float) – Relative free energy for label_a -> label_b.

  • uncertainties (tuple[float, …]) – Optional standard-error columns. Each supplied column creates one WSFC estimate using uncertainty-derived weights.

ddg: float
label_a: str
label_b: str
uncertainties: tuple[float, ...] = ()
class batter.analysis.cycle_closure.CycleClosureResult(reference: str, reference_free_energy: float, node_results: DataFrame, edge_results: DataFrame, cycles: tuple[tuple[str, ...], ...] = (), iterations: tuple[int, ...] = (), converged: tuple[bool, ...] = (), method: str = 'sfc', schemes: tuple[str, ...] = ())[source]

Bases: object

SFC result tables and metadata.

converged: tuple[bool, ...] = ()
cycles: tuple[tuple[str, ...], ...] = ()
edge_results: DataFrame
iterations: tuple[int, ...] = ()
method: str = 'sfc'
node_results: DataFrame
reference: str
reference_free_energy: float
schemes: tuple[str, ...] = ()
batter.analysis.cycle_closure.StateFunctionCorrectionEdge

alias of CycleClosureEdge

batter.analysis.cycle_closure.StateFunctionCorrectionResult

alias of CycleClosureResult

batter.analysis.cycle_closure.calculate_cycle_closure(edges: Iterable[CycleClosureEdge | Sequence[object]], *, reference: str | None = None, reference_free_energy: float = 0.0, reference_weight: float = 1000000.0, require_cycles: bool | None = None, **_compat_kwargs) CycleClosureResult[source]

Run SFC/WSFC correction on an RBFE graph.

require_cycles and extra keyword arguments are accepted for compatibility with the previous WCC implementation. SFC does not enumerate cycles and can operate on any connected RBFE graph.
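To make the idea concrete: one way to enforce state-function consistency without enumerating cycles is a weighted least-squares fit of per-ligand free energies to the edge values, with the reference ligand pinned by a large weight. The sketch below follows that reading and is an assumption, not BATTER's implementation. The inverse-variance weights, the helper names, and the use of `reference_weight` as a soft anchor are all illustrative. It uses plain Python so it runs without extra dependencies.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[str, str, float, float]  # (label_a, label_b, ddg, std error)


def solve_linear(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares_nodes(edges: Sequence[Edge], reference: str,
                        reference_free_energy: float = 0.0,
                        reference_weight: float = 1.0e6) -> Dict[str, float]:
    """Fit node free energies G so that G[b] - G[a] matches each edge ddg."""
    labels = sorted({lab for a, b, _, _ in edges for lab in (a, b)})
    idx = {lab: i for i, lab in enumerate(labels)}
    n = len(labels)
    normal = [[0.0] * n for _ in range(n)]  # weighted normal equations
    rhs = [0.0] * n
    for a, b, ddg, err in edges:
        w = 1.0 / err ** 2  # assumption: inverse-variance weighting
        ia, ib = idx[a], idx[b]
        normal[ia][ia] += w
        normal[ib][ib] += w
        normal[ia][ib] -= w
        normal[ib][ia] -= w
        rhs[ib] += w * ddg
        rhs[ia] -= w * ddg
    r = idx[reference]
    normal[r][r] += reference_weight  # soft anchor on the reference ligand
    rhs[r] += reference_weight * reference_free_energy
    return dict(zip(labels, solve_linear(normal, rhs)))


# A three-ligand cycle whose raw edges miss closure by 0.3 kcal/mol.
edges = [("A", "B", 1.0, 1.0), ("B", "C", 2.0, 1.0), ("A", "C", 3.3, 1.0)]
nodes = least_squares_nodes(edges, reference="A")
corrected = {(a, b): nodes[b] - nodes[a] for a, b, _, _ in edges}
```

With equal weights the 0.3 kcal/mol closure error is spread evenly, so the corrected edges become 1.1, 2.1, and 3.2 and sum consistently around the cycle. Because the fit works on node values, it also runs on a connected graph with no cycles at all, in which case the edges are returned unchanged.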

batter.analysis.cycle_closure.calculate_state_function_correction(edges: Iterable[CycleClosureEdge | Sequence[object]], *, reference: str | None = None, reference_free_energy: float = 0.0, reference_weight: float = 1000000.0, require_cycles: bool | None = None, **_compat_kwargs) CycleClosureResult

Run SFC/WSFC correction on an RBFE graph.

require_cycles and extra keyword arguments are accepted for compatibility with the previous WCC implementation. SFC does not enumerate cycles and can operate on any connected RBFE graph.

batter.analysis.cycle_closure.cycle_closure_from_dataframe(df: DataFrame, *, label_a_col: str = 'labelA', label_b_col: str = 'labelB', ddg_col: str | None = None, uncertainty_cols: Sequence[str] | None = None, reference: str | None = None, reference_free_energy: float = 0.0, **kwargs) CycleClosureResult[source]

Build SFC input from a dataframe and run the correction.

batter.analysis.cycle_closure.cycle_closure_from_file(path: str | Path, *, reference: str | None = None, reference_free_energy: float = 0.0, **kwargs) CycleClosureResult[source]

Read an SFC-style input file and run state-function correction.

batter.analysis.cycle_closure.read_cycle_closure_file(path: str | Path) DataFrame[source]

Read a whitespace-delimited SFC input file.

The first three columns are named labelA, labelB, and ddG. Additional columns are treated as standard-error columns named std1, std2, etc.
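The column convention can be illustrated with a small standard-library parser. The real function returns a DataFrame, whereas this sketch returns a list of dictionaries to stay dependency-free. The function name, the sample rows, and the decision to skip lines with fewer than three fields are assumptions.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]


def parse_sfc_lines(text: str) -> List[Row]:
    """Name whitespace-delimited columns labelA, labelB, ddG, std1, std2..."""
    rows: List[Row] = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # assumption: ignore blank or incomplete lines
        row: Row = {
            "labelA": fields[0],
            "labelB": fields[1],
            "ddG": float(fields[2]),
        }
        # Any further columns are standard errors, numbered from 1.
        for i, value in enumerate(fields[3:], start=1):
            row[f"std{i}"] = float(value)
        rows.append(row)
    return rows


sample = """\
ligA ligB  1.20 0.15
ligB ligC -0.40 0.20
"""
rows = parse_sfc_lines(sample)
```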

batter.analysis.cycle_closure.read_state_function_correction_file(path: str | Path) DataFrame

Read a whitespace-delimited SFC input file.

The first three columns are named labelA, labelB, and ddG. Additional columns are treated as standard-error columns named std1, std2, etc.

batter.analysis.cycle_closure.state_function_correction_from_dataframe(df: DataFrame, *, label_a_col: str = 'labelA', label_b_col: str = 'labelB', ddg_col: str | None = None, uncertainty_cols: Sequence[str] | None = None, reference: str | None = None, reference_free_energy: float = 0.0, **kwargs) CycleClosureResult

Build SFC input from a dataframe and run the correction.

batter.analysis.cycle_closure.state_function_correction_from_file(path: str | Path, *, reference: str | None = None, reference_free_energy: float = 0.0, **kwargs) CycleClosureResult

Read an SFC-style input file and run state-function correction.

Utilities for inspecting replica-exchange simulations.

class batter.analysis.remd.RemdLog(inputfile: str)[source]

Bases: object

Read and analyse AMBER remlog files.

The parser reconstructs the replica ↔ state mapping at each exchange step and reports high-level metrics such as average single-pass duration and the number of round trips.

Parameters:

inputfile (str) – Path to the remlog text file produced by AMBER.

analyze() Dict[str, float | List[float]][source]

Summarise the replica trajectory.

Returns:

Dictionary with the same keys as get_remd_info().

Return type:

dict

classmethod get_remd_info(inputfile: str) Dict[str, float | List[float]][source]

Convenience helper that parses and analyses a remlog file.

Parameters:

inputfile (str) – Path to the remlog text file.

Returns:

Same structure as analyze().

Return type:

dict
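As an illustration of the round-trip metric, the sketch below counts traversals for a single replica given its sequence of state indices. The definition used (lowest state, then highest state, then back to the lowest) is a common convention and an assumption here. `RemdLog` may count differently, and the function name is this example's own.

```python
from typing import Optional, Sequence


def count_round_trips(states: Sequence[int], n_states: int) -> int:
    """Count lowest -> highest -> lowest traversals of one replica."""
    low, high = 0, n_states - 1
    trips = 0
    # None: lowest state not visited yet.
    # "up": left the lowest state and is looking for the highest.
    # "down": reached the highest state and is heading back.
    phase: Optional[str] = None
    for state in states:
        if state == low:
            if phase == "down":
                trips += 1  # completed a full traversal
            phase = "up"
        elif state == high and phase == "up":
            phase = "down"
    return trips


walk = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 0]
n_trips = count_round_trips(walk, n_states=4)
partial = count_round_trips([1, 2, 3, 2, 1, 0, 1], n_states=4)
```

The first walk completes two round trips. The second never finishes one, because it starts mid-ladder and only touches the lowest state once.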

batter.analysis.remd.plot_trajectory(replica_trajectory, figsize=(10, 6), alpha=0.8, linewidth=1.5, subplot=False, ncols=4)[source]

Visualise the replica walk through thermodynamic states.

Parameters:
  • replica_trajectory (numpy.ndarray) – Array of shape (n_replica, n_step + 1) containing state indices.

  • figsize (tuple, optional) – Base figure size. When subplot=True the width/height apply to each panel instead of the aggregate.

  • alpha (float, optional) – Line transparency used for individual replica traces.

  • linewidth (float, optional) – Width of trajectory lines.

  • subplot (bool, optional) – When True, render one subplot per replica; otherwise plot all replicas on a shared axis.

  • ncols (int, optional) – Number of subplot columns when subplot=True.

Small numerical helpers used across batter.analysis.

batter.analysis.utils.MakeChunksWithSize(istart: int, istop: int, size: int) List[List[int]][source]

Build index chunks covering [istart, istop) with approximately size elements.

Parameters:
  • istart (int) – Starting index (inclusive).

  • istop (int) – Stopping index (exclusive).

  • size (int) – Target chunk size prior to merging trailing fragments.

Returns:

Collection of contiguous index lists.

Return type:

list[list[int]]

batter.analysis.utils.MakeGroupedChunks(ene: ndarray, size: int) List[List[int]][source]

Merge adjacent chunks when their means are statistically indistinguishable.

Parameters:
  • ene (numpy.ndarray) – One-dimensional array containing the energy trace used for grouping.

  • size (int) – Requested minimum chunk size prior to the adaptive merge step.

Returns:

List of index groups representing contiguous frames with similar means.

Return type:

list[list[int]]

batter.analysis.utils.SizedChunks(lst: Iterable[int], n: int) Generator[List[int], None, None][source]

Yield successive n-sized chunks from an iterable.

Parameters:
  • lst (Iterable[int]) – Source iterable that should be partitioned. The iterable is consumed, so pass a sequence (e.g. range) if it needs to be reused.

  • n (int) – Requested chunk size.

Yields:

list[int] – Consecutive slices of length n (the final chunk may be shorter).
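A minimal sketch of this chunking behaviour, written against the description above rather than copied from the BATTER source. The lowercase name `sized_chunks` and the `ValueError` for non-positive sizes are choices made for this example.

```python
from itertools import islice
from typing import Generator, Iterable, List


def sized_chunks(lst: Iterable[int], n: int) -> Generator[List[int], None, None]:
    """Yield consecutive lists of up to n items from any iterable."""
    if n <= 0:
        raise ValueError("chunk size must be positive")
    iterator = iter(lst)  # consumed once, as noted for the lst parameter
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


chunks = list(sized_chunks(range(7), 3))
```

Seven items split with `n=3` give two full chunks and a final chunk of one element.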

batter.analysis.utils.exclude_outliers(df: DataFrame, iclam: int) DataFrame[source]

Remove energy spikes that would destabilise MBAR fits.

Parameters:
  • df (pandas.DataFrame) – Reduced potential values with time points along the rows and lambda states in the columns.

  • iclam (int) – Index of the reference lambda column. The algorithm analyses this column to decide which trajectory chunks should be discarded.

Returns:

Filtered dataframe with the same columns as df but potentially fewer rows if outliers were detected.

Return type:

pandas.DataFrame

Notes

The implementation mirrors the heuristics used in the original fe-toolkit scripts: frames are chunked into blocks of roughly 200 samples, grouped via a Welch t-test, and discarded whenever any lambda exhibits a value more than 1000 kcal/mol below the reference median (after correcting for mixed-precision offsets).