API Documentation#
Public API#
Public API for BATTER.
This module collects the stable entry points intended for external consumption. They fall into four broad categories:
Configuration helpers – load and dump RunConfig/SimulationConfig objects.
Execution – orchestrate complete workflows from a YAML definition.
Portable results – inspect and copy artifacts produced by a run.
Utilities – clone the state of an execution for reproducibility.
Typical usage#
Run a workflow from a top-level YAML:
from batter.api import run_from_yaml
run_from_yaml("examples/mabfe_example.yaml")
Inspect FE records stored in a work directory:
from batter.api import list_fe_runs, load_fe_run
runs = list_fe_runs("work/adrb2")
latest = runs.iloc[-1]["run_id"]
# pass ``ligand`` when the run contains more than one ligand
record = load_fe_run("work/adrb2", latest, ligand="LIG1")
Run FE analysis on an existing execution:
from batter.api import run_analysis_from_execution
run_analysis_from_execution("work/adrb2", latest, ligand="LIG1")
For more examples, refer to docs/getting_started.rst and the tutorials.
- class batter.api.ArtifactStore(root: Path | str, manifest_name: str = 'manifest.json')[source]
Bases: object
Portable store with a relocatable root and JSON manifest.
- Parameters:
root (path-like) – Store root directory (e.g., a run’s work directory).
manifest_name (str) – File name for the manifest JSON under root (default: “manifest.json”).
Examples
>>> from pathlib import Path
>>> from batter.api import ArtifactStore
>>> store = ArtifactStore("work/at1r_aai")
>>> p = store.put_file(Path("results.txt"), name="fe/latest", dst_rel=Path("fe/results.txt"))
>>> store.save_manifest()
>>> # move directory to a new cluster...
>>> store2 = ArtifactStore("new_root/at1r_aai"); store2.load_manifest()
>>> store2.path("fe/latest")
new_root/at1r_aai/fe/results.txt
- list_artifacts(*, prefix: str | None = None, kind: Literal['file', 'dir', None] = None) List[Artifact][source]
Inspect manifest entries, optionally filtering by name or kind.
- Parameters:
prefix (str, optional) – When provided, only artifacts whose logical name starts with prefix are returned.
kind ({‘file’, ‘dir’, None}, optional) – Restrict results to files or directories. None (default) returns both.
- Returns:
Matching artifacts in alphabetical order.
- Return type:
list[Artifact]
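The filtering rules described for list_artifacts() can be sketched in plain Python. This is an illustration only: the Artifact tuple and the filter_artifacts helper below are hypothetical stand-ins and are not BATTER's implementation.

```python
from typing import List, NamedTuple, Optional


class Artifact(NamedTuple):
    """Hypothetical stand-in for a manifest entry (not BATTER's class)."""
    name: str
    kind: str  # "file" or "dir"


def filter_artifacts(entries: List[Artifact], *, prefix: Optional[str] = None,
                     kind: Optional[str] = None) -> List[Artifact]:
    """Keep entries matching the optional prefix and kind, sorted by name."""
    selected = [
        a for a in entries
        if (prefix is None or a.name.startswith(prefix))
        and (kind is None or a.kind == kind)
    ]
    return sorted(selected, key=lambda a: a.name)


entries = [
    Artifact("fe/results", "file"),
    Artifact("eq/traj", "dir"),
    Artifact("fe/latest", "file"),
]
fe_files = filter_artifacts(entries, prefix="fe/", kind="file")
```

Leaving both filters at None returns every entry, which mirrors the documented default of returning files and directories alike.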
- load_manifest() None[source]
Load the manifest JSON from root.
- path(name: str) Path[source]
Resolve an artifact name to an absolute path under the current root.
- put_dir(src_dir: Path, name: str, dst_rel: Path | None = None, overwrite_manifest_entry: bool = False) Path[source]
Copy a directory under the store and record it in the manifest.
Notes
No per-file hashing is performed; use put_file() for critical files.
- put_file(src: Path, name: str, dst_rel: Path | None = None, overwrite_manifest_entry: bool = False) Path[source]
Copy a file under the store and record it in the manifest.
- Parameters:
src (path-like) – Source file path (must exist and be a file).
name (str) – Logical artifact name to register under.
dst_rel (path-like, optional) – Relative destination path. Defaults to name.replace('/', '_').
overwrite_manifest_entry (bool) – If True, allows replacing an existing manifest entry with the same name.
- Returns:
Absolute destination path.
- Return type:
pathlib.Path
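The documented default for dst_rel flattens the logical name into a single file name. A minimal sketch of that rule, using a hypothetical helper name (default_dst_rel is not part of BATTER):

```python
from pathlib import Path


def default_dst_rel(name: str) -> Path:
    """Mirror the documented default: slashes in the logical name become underscores."""
    return Path(name.replace("/", "_"))


flat = default_dst_rel("fe/latest")
```

Passing an explicit dst_rel, as in the class example, keeps a nested layout instead of the flattened name.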
- rebase(new_root: Path | str) ArtifactStore[source]
Create a new store view with the same manifest but a different root.
- Parameters:
new_root (path-like) – Target root directory.
- Returns:
New store pointing to new_root.
- Return type:
ArtifactStore
- save_manifest() Path[source]
Write the manifest JSON under root (atomic).
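An atomic manifest write is commonly done by writing to a temporary file in the target directory and renaming it into place. The sketch below shows that general pattern with the standard library; write_json_atomic is a hypothetical helper, and whether BATTER uses exactly this approach is an assumption.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> Path:
    """Write JSON to a temp file next to the target, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return path


with tempfile.TemporaryDirectory() as root:
    # Storing paths relative to the root is what keeps a store relocatable.
    manifest = write_json_atomic(Path(root) / "manifest.json",
                                 {"fe/latest": "fe/results.txt"})
    reloaded = json.loads(manifest.read_text(encoding="utf-8"))
```

Readers of the manifest then see either the old file or the new one, never a partially written file.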
- class batter.api.FERecord(*, run_id: str, ligand: str, mol_name: str, system_name: str, fe_type: str, temperature: float, method: Literal['mbar', 'ti']='mbar', total_dG: float, total_se: float = 0.0, components: List[str] = <factory>, created_at: str = <factory>, windows: List[WindowResult] = <factory>, canonical_smiles: str | None = None, original_name: str | None = None, original_path: str | None = None, protocol: str = 'abfe', analysis_start_step: int | None = None, n_bootstraps: int | None = None, include_in_analysis: bool = True, status: Literal['success', 'failed', 'unbound']='success')[source]
Bases: BaseModel
A full FE result bundle (portable, versioned).
- Parameters:
run_id (str) – Unique run identifier.
ligand (str) – Ligand identifier.
mol_name (str) – Molecule resname.
system_name (str) – Logical system name.
fe_type (str) – Protocol type (e.g., ‘uno_rest’, ‘asfe’).
temperature (float) – Simulation temperature (K).
method ({“mbar”, “ti”}) – Integration method.
total_dG (float) – Total free energy (kcal/mol).
total_se (float) – Standard error (kcal/mol).
components (list[str]) – Active components in this run.
created_at (str) – ISO-8601 timestamp (UTC, Z-suffix).
windows (list[WindowResult]) – Per-window results.
canonical_smiles (str, optional) – Canonicalised ligand SMILES captured during parameterization.
original_name (str, optional) – Original ligand identifier or title when known.
original_path (str, optional) – Source path of the ligand before staging.
protocol (str) – Logical protocol used to generate the result (e.g., "abfe").
analysis_start_step (int, optional) – First production step included in analysis.
n_bootstraps (int, optional) – Number of MBAR bootstrap resamples used during analysis.
include_in_analysis (bool) – Whether downstream aggregate analyses, such as Cinnabar export, should use this record.
status ({“success”, “failed”, “unbound”}) – Final status recorded for the ligand.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- analysis_start_step: int | None
- canonical_smiles: str | None
- components: List[str]
- created_at: str
- fe_type: str
- include_in_analysis: bool
- ligand: str
- method: Literal['mbar', 'ti']
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- mol_name: str
- n_bootstraps: int | None
- original_name: str | None
- original_path: str | None
- protocol: str
- run_id: str
- status: Literal['success', 'failed', 'unbound']
- system_name: str
- temperature: float
- total_dG: float
- total_se: float
- windows: List[WindowResult]
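The created_at field is documented as an ISO-8601 timestamp in UTC with a Z suffix. One way to produce such a string with the standard library is sketched below; utc_timestamp is a hypothetical helper, and the second-level precision is an assumption, since the exact format BATTER emits is not shown here.

```python
from datetime import datetime, timezone
from typing import Optional


def utc_timestamp(moment: Optional[datetime] = None) -> str:
    """Format a timezone-aware datetime as ISO-8601 in UTC with a trailing Z."""
    moment = moment or datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


stamp = utc_timestamp(datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc))
```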
- class batter.api.FEResultsRepository(store: ArtifactStore)[source]
Bases: object
- index() DataFrame[source]
- ligand_dir(run_id: str, ligand: str) Path[source]
- load(run_id: str, ligand: str) FERecord[source]
- record_failure(run_id: str, ligand: str, system_name: str, temperature: float, *, status: Literal['failed', 'unbound'], reason: str | None = None, canonical_smiles: str | None = None, original_name: str | None = None, original_path: str | None = None, protocol: str = 'abfe', analysis_start_step: int | None = None, n_bootstraps: int | None = None) None[source]
- save(rec: FERecord, copy_from: Path | None = None) None[source]
- set_analysis_inclusion(*, run_id: str, ligand: str, include: bool, analysis_start_step: int | None = None, n_bootstraps: int | None = None) int[source]
Set include_in_analysis for matching rows in results/index.csv.
- class batter.api.RunConfig(*, version: int = 1, protocol: Literal['abfe', 'rbfe', 'asfe', 'md']='abfe', backend: Literal['local', 'slurm']='local', create: CreateArgs, fe_sim: Dict[str, ~typing.Any] | ~batter.config.run.FESimArgs | ~batter.config.run.MDSimArgs=<factory>, run: RunSection, rbfe: RBFENetworkArgs | None = None)[source]
Bases: BaseModel
Top-level YAML config.
- backend: Literal['local', 'slurm']
- create: CreateArgs
- classmethod load(path: Path | str) RunConfig[source]
Load and validate a run configuration from disk.
- Parameters:
path (str or pathlib.Path) – Location of the YAML file to parse.
- Returns:
Fully validated configuration object.
- Return type:
RunConfig
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- classmethod model_validate_yaml(yaml_text: str) RunConfig[source]
Validate a run configuration from an in-memory YAML string.
- Parameters:
yaml_text (str) – Raw YAML content describing the run configuration.
- Returns:
Validated configuration model.
- Return type:
RunConfig
- protocol: Literal['abfe', 'rbfe', 'asfe', 'md']
- rbfe: RBFENetworkArgs | None
- resolved_sim_config() SimulationConfig[source]
Build the effective simulation configuration for this run.
- Returns:
Simulation parameters derived from the create and fe_sim sections.
- Return type:
SimulationConfig
- run: RunSection
- version: int
- class batter.api.SimSystem(name: str, root: Path, topology: Path | None = None, coordinates: Path | None = None, protein: Path | None = None, ligands: Tuple[Path, ...]=(), lipid_mol: Tuple[str, ...]=(), other_mol: Tuple[str, ...]=(), anchors: Tuple[str, ...]=(), meta: SystemMeta = <factory>)[source]
Bases: object
Immutable descriptor of a simulation system and its on-disk artifacts.
- Parameters:
name (str) – Logical system name (e.g., "AT1R_AAI").
root (pathlib.Path) – Working directory where artifacts live. This directory is considered relocatable; other modules should store relative paths when possible.
topology (pathlib.Path, optional) – Path to an explicit topology (e.g., AMBER PRMTOP). May be None if the builder generates it later.
coordinates (pathlib.Path, optional) – Coordinates or restart file (e.g., RST7/INPCRD).
protein (pathlib.Path, optional) – Input protein structure file (PDB/mmCIF).
ligands (tuple[pathlib.Path, …]) – One or more ligand structure files.
lipid_mol (tuple[str, …]) – Lipid names present in the system (e.g., ("POPC",)).
other_mol (tuple[str, …]) – Other cofactors present in the system.
anchors (tuple[str, …]) – Anchor atoms in the form "RESID@ATOM" (e.g., "85@CA").
meta (SystemMeta) – Free-form metadata bundle for provenance (e.g., software versions).
- anchors: Tuple[str, ...]
- coordinates: Path | None
- ligands: Tuple[Path, ...]
- lipid_mol: Tuple[str, ...]
- meta: SystemMeta
- name: str
- other_mol: Tuple[str, ...]
- path(*parts: str | Path) Path[source]
Join root with the provided path segments.
- Parameters:
*parts (str or Path) – Relative path components appended in order.
- Returns:
Absolute path pointing inside root.
- Return type:
pathlib.Path
- protein: Path | None
- root: Path
- topology: Path | None
- with_artifacts(**kw) SimSystem[source]
Return a new SimSystem with updated artifact attributes.
Examples
>>> sys = SimSystem(name="X", root=Path("work/X"))
>>> sys2 = sys.with_artifacts(topology=Path("work/X/top.prmtop"))
- with_meta(**updates: Any) SimSystem[source]
Return a copy of the system with merged metadata.
- Parameters:
**updates – Keyword arguments forwarded to SystemMeta.merge().
- Returns:
Copy of the system containing the updated metadata bundle.
- Return type:
SimSystem
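The copy-on-update behaviour of with_artifacts() and with_meta() is the usual pattern for immutable descriptors. The sketch below reproduces it with a frozen dataclass; MiniSystem is a hypothetical, reduced stand-in and not BATTER's SimSystem, whose internals may differ.

```python
from dataclasses import dataclass, replace
from pathlib import Path
from typing import Optional, Tuple


@dataclass(frozen=True)
class MiniSystem:
    """Hypothetical, reduced stand-in for SimSystem (illustration only)."""
    name: str
    root: Path
    topology: Optional[Path] = None
    anchors: Tuple[str, ...] = ()

    def path(self, *parts: str) -> Path:
        """Join root with the given relative segments."""
        return self.root.joinpath(*parts)

    def with_artifacts(self, **kw) -> "MiniSystem":
        """Return a new instance; the original is left untouched."""
        return replace(self, **kw)


base = MiniSystem(name="X", root=Path("work/X"))
updated = base.with_artifacts(topology=base.path("top.prmtop"))
```

Because the dataclass is frozen, assigning to base.topology directly raises an error, so every change has to go through a method that returns a new object.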
- class batter.api.SimulationConfig(*, system_name: str, fe_type: ~typing.Literal['custom', 'rest', 'sdr', 'dd', 'sdr-rest', 'express', 'relative', 'uno', 'uno_com', 'uno_rest', 'self', 'uno_dd', 'dd-rest', 'asfe', 'md'], dec_int: ~typing.Literal['mbar', 'ti'] = 'mbar', remd: ~typing.Literal['yes', 'no'] = 'no', remd_nstlim: int = 100, slurm_header_dir: ~pathlib.Path = <factory>, infe: bool = False, p1: str = '', p2: str = '', p3: str = '', other_mol: ~typing.List[str] = <factory>, lipid_mol: ~typing.List[str] = <factory>, solv_shell: float | None = 15.0, rocklin_correction: ~typing.Literal['yes', 'no'] = 'no', release_eq: ~typing.List[float] = <factory>, ti_points: int | None = 0, lambdas: ~typing.List[float] = <factory>, component_windows: ~typing.Dict[str, ~typing.List[float]] = <factory>, sdr_dist: float | None = 0.0, dec_method: str | None = None, blocks: int = 0, unbound_threshold: ~typing.Annotated[float, ~annotated_types.Ge(ge=0)] = 8.0, analysis_start_step: ~typing.Annotated[int, ~annotated_types.Ge(ge=0)] = 0, n_bootstraps: ~typing.Annotated[int, ~annotated_types.Ge(ge=0)] = 0, lig_distance_force: float = 0.0, lig_angle_force: float = 0.0, lig_dihcf_force: float = 0.0, rec_com_force: float = 0.0, lig_com_force: float = 0.0, water_model: ~typing.Literal['SPCE', 'TIP4PEW', 'TIP3P', 'TIP3PF', 'OPC'] = 'TIP3P', buffer_x: float = 10.0, buffer_y: float = 10.0, buffer_z: float = 15.0, lig_buffer: float = 10.0, neutralize_only: ~typing.Literal['yes', 'no'] = 'no', cation: str = 'Na+', anion: str = 'Cl-', ion_conc: float = 0.15, hmr: ~typing.Literal['yes', 'no'] = 'no', enable_mcwat: ~typing.Literal['yes', 'no'] = 'yes', temperature: float = 298.15, eq_steps: int = 1000000, n_steps_dict: ~typing.Dict[str, int] = <factory>, l1_x: float | None = None, l1_y: float | None = None, l1_z: float | None = None, l1_range: float | None = None, min_adis: float | None = None, max_adis: float | None = None, ntpr: int = 100, ntwr: int = 10000, ntwe: int = 0, ntwx: int = 2500, 
cut: float = 9.0, gamma_ln: float = 1.0, barostat: ~typing.Literal[1, 2] = 2, dt: float = 0.004, all_atoms: ~typing.Literal['yes', 'no'] = 'no', receptor_ff: str = 'protein.ff14SB', ligand_ff: str = 'gaff2', lipid_ff: str = 'lipid21', ligand_dict: ~typing.Dict[str, ~typing.Any] = <factory>, rng: int = 0, ion_def: ~typing.List[~typing.Any] = <factory>, dic_n_steps: ~typing.Dict[str, int] = <factory>, rest: ~typing.List[float] = <factory>, neut: str = '', protein_align: str = 'name CA', receptor_segment: str | None = None, components: ~typing.List[str] = <factory>, component_lambdas: ~typing.Dict[str, ~typing.List[float]] = <factory>, membrane_simulation: bool = True)[source]
Bases: BaseModel
Simulation configuration for ABFE/ASFE/RBFE workflows. Values are fed by RunConfig.resolved_sim_config(), which merges create: and fe_sim:.
- all_atoms: Literal['yes', 'no']
- analysis_start_step: int
- anion: str
- barostat: Literal[1, 2]
- blocks: int
- buffer_x: float
- buffer_y: float
- buffer_z: float
- cation: str
- component_lambdas: Dict[str, List[float]]
- component_windows: Dict[str, List[float]]
- components: List[str]
- cut: float
- dec_int: Literal['mbar', 'ti']
- dec_method: str | None
- dic_n_steps: Dict[str, int]
- dt: float
- enable_mcwat: Literal['yes', 'no']
- eq_steps: int
- fe_type: Literal['custom', 'rest', 'sdr', 'dd', 'sdr-rest', 'express', 'relative', 'uno', 'uno_com', 'uno_rest', 'self', 'uno_dd', 'dd-rest', 'asfe', 'md']
- classmethod from_sections(create: CreateArgs, fe: FESimArgs, *, protocol: str | None = None, fe_type: str | None = None, slurm_header_dir: Path | None = None, run_remd: str | bool | None = None) SimulationConfig[source]
Construct a SimulationConfig from run sections.
- Parameters:
create (CreateArgs) – System creation inputs taken from the create YAML section.
fe (FESimArgs) – Free-energy simulation overrides from the fe_sim section.
run_remd ({“yes”, “no”}, optional) – Whether REMD execution is enabled (controls submission only; REMD inputs are always written during preparation).
- Returns:
Fully merged simulation configuration ready for downstream use.
- Return type:
SimulationConfig
- gamma_ln: float
- hmr: Literal['yes', 'no']
- infe: bool
- ion_conc: float
- ion_def: List[Any]
- l1_range: float | None
- l1_x: float | None
- l1_y: float | None
- l1_z: float | None
- lambdas: List[float]
- lig_angle_force: float
- lig_buffer: float
- lig_com_force: float
- lig_dihcf_force: float
- lig_distance_force: float
- ligand_dict: Dict[str, Any]
- ligand_ff: str
- lipid_ff: str
- lipid_mol: List[str]
- max_adis: float | None
- membrane_simulation: bool
- min_adis: float | None
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- n_bootstraps: int
- n_steps_dict: Dict[str, int]
- neut: str
- neutralize_only: Literal['yes', 'no']
- ntpr: int
- ntwe: int
- ntwr: int
- ntwx: int
- other_mol: List[str]
- p1: str
- p2: str
- p3: str
- protein_align: str
- rec_com_force: float
- receptor_ff: str
- receptor_segment: str | None
- release_eq: List[float]
- remd: Literal['yes', 'no']
- remd_nstlim: int
- rest: List[float]
- rng: int
- rocklin_correction: Literal['yes', 'no']
- sdr_dist: float | None
- slurm_header_dir: Path
- solv_shell: float | None
- system_name: str
- temperature: float
- ti_points: int | None
- to_dict() Dict[str, Any][source]
- unbound_threshold: float
- water_model: Literal['SPCE', 'TIP4PEW', 'TIP3P', 'TIP3PF', 'OPC']
- class batter.api.WindowResult(*, component: str, lam: float, dG: float, dG_se: float = 0.0, n_samples: int = 0, meta: Dict[str, ~typing.Any]=<factory>)[source]
Bases: BaseModel
Result for a single lambda window/component.
- Parameters:
component (str) – Component key (e.g., ‘e’, ‘v’, ‘z’).
lam (float) – Lambda value in [0, 1].
dG (float) – Free-energy increment (kcal/mol).
dG_se (float) – Standard error (kcal/mol).
n_samples (int) – Samples (or effective sample size).
meta (dict) – Extra metadata.
- component: str
- dG: float
- dG_se: float
- lam: float
- meta: Dict[str, Any]
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- n_samples: int
- batter.api.clone_execution(work_dir: Path, src_run_id: str, dst_run_id: str | None = None, *, dst_root: Path | None = None, mode: Literal['copy', 'hardlink', 'symlink'] = 'hardlink', only_equil: bool = True, reset_states: bool = True, overwrite: bool = False) Path[source]
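The mode argument of clone_execution accepts 'copy', 'hardlink', and 'symlink'. How BATTER applies these internally is not documented here, so the sketch below only illustrates what the three strategies usually mean at the level of a single file; place_file is a hypothetical helper.

```python
import os
import shutil
import tempfile
from pathlib import Path


def place_file(src: Path, dst: Path, mode: str = "hardlink") -> Path:
    """Materialise src at dst by copying, hard-linking, or symlinking."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    if mode == "copy":
        shutil.copy2(src, dst)          # independent copy, uses extra disk space
    elif mode == "hardlink":
        os.link(src, dst)               # same data blocks, same filesystem only
    elif mode == "symlink":
        os.symlink(src.resolve(), dst)  # pointer that breaks if src is removed
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return dst


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "run-a" / "eq.rst7"
    src.parent.mkdir()
    src.write_text("coordinates")
    copied = place_file(src, Path(tmp) / "run-b" / "eq.rst7", mode="copy")
    linked = place_file(src, Path(tmp) / "run-c" / "eq.rst7", mode="symlink")
    copied_text = copied.read_text()
    linked_is_symlink = linked.is_symlink()
```

Hard links are cheap and survive deletion of the source, but they cannot cross filesystems, which is the usual reason for offering copy and symlink as alternatives.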
- batter.api.dump_run_config(cfg: RunConfig, path: Path | str) None[source]
Serialize a run configuration to YAML.
- Parameters:
cfg (RunConfig) – Configuration object to export.
path (str or pathlib.Path) – Destination path for the YAML file.
- batter.api.list_fe_runs(work_dir: str | Path) pd.DataFrame[source]
Return an index of FE runs contained in a portable work directory.
- Parameters:
work_dir (str or Path) – Path to the root directory of a BATTER execution (portable layout).
- Returns:
DataFrame with one row per stored FE run. Columns include run_id, ligand, mol_name, system_name, temperature, total_dG, total_se, canonical_smiles, original_name, original_path, protocol, analysis_start_step, n_bootstraps, status, failure_reason, and created_at.
- Return type:
pandas.DataFrame
- batter.api.load_fe_run(work_dir: str | Path, run_id: str, ligand: str | None = None) FERecord[source]
Load a single FE record by run_id from a portable work directory.
- Parameters:
work_dir (str or Path) – Root directory of the BATTER execution.
run_id (str) – Identifier of the FE run to load (as returned by list_fe_runs()).
ligand (str, optional) – Ligand identifier when multiple ligands were processed in the run. If omitted, the sole ligand is selected automatically; a ValueError is raised when multiple matches exist.
- Returns:
Structured record containing total ΔG, standard error, components, and per-window results.
- Return type:
FERecord
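The ligand-selection rule for load_fe_run() can be sketched over plain dictionaries. select_record is a hypothetical helper, the row layout is reduced to two columns, and the KeyError for a missing run is an assumption; only the ValueError for ambiguous matches is stated in the description above.

```python
from typing import Dict, List, Optional


def select_record(rows: List[Dict[str, str]], run_id: str,
                  ligand: Optional[str] = None) -> Dict[str, str]:
    """Pick the single row for run_id, using ligand to disambiguate."""
    matches = [
        r for r in rows
        if r["run_id"] == run_id and (ligand is None or r["ligand"] == ligand)
    ]
    if not matches:
        # Assumption: the real function may signal a missing run differently.
        raise KeyError(f"no FE record for run_id={run_id!r}, ligand={ligand!r}")
    if len(matches) > 1:
        raise ValueError(f"run {run_id!r} has several ligands; pass ligand=...")
    return matches[0]


rows = [
    {"run_id": "run-1", "ligand": "LIG1"},
    {"run_id": "run-1", "ligand": "LIG2"},
    {"run_id": "run-2", "ligand": "LIG1"},
]
only = select_record(rows, "run-2")                   # sole ligand, no hint needed
chosen = select_record(rows, "run-1", ligand="LIG2")  # hint resolves the ambiguity
```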
- batter.api.load_run_config(path: Path | str) RunConfig[source]
Read a run-level YAML file and return a validated configuration.
- Parameters:
path (str or pathlib.Path) – Location of the run YAML file.
- Returns:
Parsed run configuration.
- Return type:
RunConfig
- batter.api.load_sim_config(path: Path | str) SimulationConfig
Load a simulation configuration from YAML.
- Parameters:
path (str or pathlib.Path) – Path to the simulation YAML file.
- Returns:
Validated simulation configuration.
- Return type:
SimulationConfig
- batter.api.read_cinnabar_outputs(bundle_dir: str | Path, *, require_absolute: bool = False)[source]
Read a generated Cinnabar export bundle from disk.
- Parameters:
bundle_dir (str or Path) – Directory containing cinnabar_relative.csv and optional absolute and SFC correction CSVs produced by the Cinnabar export.
require_absolute (bool, optional) – When True, raise if the bundle does not contain cinnabar_absolute.csv.
- Returns:
Relative and absolute tables. Each table includes uncorrected columns and SFC correction columns when those outputs are present, with free-energy units stored in a unit column. The *_uncorrected columns are sourced from Cinnabar’s CSVs, and the *_cycle_closure columns are sourced from the SFC CSVs.
- Return type:
tuple[pandas.DataFrame, pandas.DataFrame]
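The require_absolute behaviour can be illustrated with the standard csv module. This sketch returns lists of row dictionaries rather than DataFrames, the read_bundle helper and the demo column names are hypothetical, and the exception type is an assumption; only the two file names come from the description above.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple

Rows = List[Dict[str, str]]


def _read_csv(path: Path) -> Rows:
    """Read a CSV file into a list of row dictionaries."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def read_bundle(bundle_dir: Path, *, require_absolute: bool = False) -> Tuple[Rows, Rows]:
    """Return (relative, absolute) rows; absolute is empty when the file is absent."""
    relative = _read_csv(bundle_dir / "cinnabar_relative.csv")
    absolute_path = bundle_dir / "cinnabar_absolute.csv"
    if absolute_path.exists():
        return relative, _read_csv(absolute_path)
    if require_absolute:
        # Assumption: the real function may raise a different exception type.
        raise FileNotFoundError(absolute_path)
    return relative, []


with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    # Hypothetical column names, chosen only for this demonstration.
    (bundle / "cinnabar_relative.csv").write_text(
        "ligand_a,ligand_b,ddG\nLIG1,LIG2,-0.8\n", encoding="utf-8")
    relative_rows, absolute_rows = read_bundle(bundle)
```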
- batter.api.run_analysis_from_execution(work_dir: str | Path, run_id: str | None = None, *, ligand: str | None = None, components: Sequence[str] | None = None, n_workers: int | None = None, analysis_start_step: int | None = None, n_bootstraps: int | None = None, overwrite: bool = True, raise_on_error: bool = True) None[source]
Run FE analysis for a partially or fully finished execution.
- Parameters:
work_dir (str or Path) – Root directory containing the portable execution store.
run_id (str, optional) – Identifier of the execution (e.g., run-20240101). When omitted, the most recently modified execution under <work_dir>/executions is used.
ligand (str, optional) – Ligand identifier to target when only a subset should be analyzed.
components (sequence of str, optional) – Components to include during analysis (overrides sim_cfg.components).
n_workers (int, optional) – Number of worker processes requested for the analysis handler.
analysis_start_step (int, optional) – First production step to include in analysis (per window); overrides config.
n_bootstraps (int, optional) – Number of MBAR bootstrap resamples; overrides config.
overwrite (bool, optional) – When True (default), overwrite any existing analysis results for the run_id. When False, skip ligands that already have analysis outputs.
raise_on_error (bool, optional) – When True (default), propagate errors raised by the analysis handler. Set to False to log the failure and continue with other ligands.
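When run_id is omitted, the most recently modified execution under <work_dir>/executions is used. A minimal sketch of that lookup follows; latest_execution is a hypothetical helper, and treating every subdirectory as an execution is an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def latest_execution(work_dir: Path) -> Optional[str]:
    """Return the name of the most recently modified directory under executions/."""
    exec_root = work_dir / "executions"
    if not exec_root.is_dir():
        return None
    runs = [p for p in exec_root.iterdir() if p.is_dir()]
    if not runs:
        return None
    return max(runs, key=lambda p: p.stat().st_mtime).name


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    for age, run_id in enumerate(["run-old", "run-new"]):
        run_dir = work / "executions" / run_id
        run_dir.mkdir(parents=True)
        # Set explicit modification times so the ordering is deterministic.
        os.utime(run_dir, (1_700_000_000 + age, 1_700_000_000 + age))
    picked = latest_execution(work)
```

Passing run_id explicitly avoids surprises when an older execution directory has been touched more recently than the one intended.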
- batter.api.run_from_yaml(path: Path | str, on_failure: Literal['prune', 'raise', 'retry'] = None, run_overrides: Dict[str, Any] | None = None) None[source]
Execute a BATTER workflow described by a YAML file.
- batter.api.save_sim_config(cfg: SimulationConfig, path: Path | str) None
Write a simulation configuration to YAML.
- Parameters:
cfg (SimulationConfig) – Configuration object to serialise.
path (str or pathlib.Path) – Output file path for the YAML representation.
Config Modules#
- class batter.config.run.CreateArgs(*, system_name: str | None = 'unnamed_system', protein_input: Path | None = None, system_input: Path | None = None, system_coordinate: Path | None = None, protein_align: str | None = 'name CA', ligand_paths: dict[str, ~pathlib.Path | str]=<factory>, ligand_input: Path | None = None, ligand_ff: str = 'gaff2', retain_lig_prot: bool = True, param_method: Literal['amber', 'openff']='amber', param_charge: str = 'am1bcc', param_outdir: Path | None = None, anchor_atoms: list[str] = <factory>, lipid_mol: list[str] = <factory>, other_mol: list[str] = <factory>, overwrite: bool = True, extra_restraints: str | None = None, extra_restraint_fc: float = 10.0, extra_conformation_restraints: Path | None = None, receptor_ff: str = 'protein.ff14SB', lipid_ff: str = 'lipid21', solv_shell: float = 15.0, cation: str = 'Na+', anion: str = 'Cl-', ion_conc: float = 0.15, neutralize_only: Literal['yes', 'no']='no', water_model: str = 'TIP3P', l1_range: float = 6.0, min_adis: float = 3.0, max_adis: float = 7.0)[source]
Bases: BaseModel
Inputs for system creation and staging.
Notes
This section mirrors the create block in the run YAML file.
- anchor_atoms: list[str]
- anion: str
- cation: str
- extra_conformation_restraints: Path | None
- extra_restraint_fc: float
- extra_restraints: str | None
- ion_conc: float
- l1_range: float
- ligand_ff: str
- ligand_input: Path | None
- ligand_paths: dict[str, Path | str]
- lipid_ff: str
- lipid_mol: list[str]
- max_adis: float
- min_adis: float
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- neutralize_only: Literal['yes', 'no']
- other_mol: list[str]
- overwrite: bool
- param_charge: str
- param_method: Literal['amber', 'openff']
- param_outdir: Path | None
- protein_align: str | None
- protein_input: Path | None
- receptor_ff: str
- resolve_paths(base: Path) CreateArgs[source]
Return a copy in which relative path fields are resolved to absolute paths against base.
- retain_lig_prot: bool
- solv_shell: float
- system_coordinate: Path | None
- system_input: Path | None
- system_name: str | None
- water_model: str
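The resolve_paths() method listed above resolves path fields against a base directory. The per-field rule can be sketched as follows; resolve_against is a hypothetical helper, and leaving already-absolute paths unchanged is an assumption about the intended behaviour.

```python
from pathlib import Path
from typing import Optional


def resolve_against(base: Path, value: Optional[Path]) -> Optional[Path]:
    """Anchor a relative path at base; keep absolute paths and None as they are."""
    if value is None:
        return None
    value = Path(value)
    return value if value.is_absolute() else base / value


base = Path("/projects/adrb2")
relative = resolve_against(base, Path("inputs/ligand.sdf"))
absolute = resolve_against(base, Path("/data/protein.pdb"))
missing = resolve_against(base, None)
```

Resolving against the directory that holds the YAML file is a common choice, because it lets a configuration refer to inputs stored next to it.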
- class batter.config.run.FESimArgs(*, dec_int: str = 'mbar', remd: RemdArgs = <factory>, rocklin_correction: Literal['yes', 'no']='no', lambdas: List[float] = <factory>, component_lambdas: Dict[str, ~typing.List[float]]=<factory>, blocks: int = 0, lig_buffer: float = 15.0, lig_distance_force: float = 5.0, lig_angle_force: float = 250.0, lig_dihcf_force: float = 0.0, rec_com_force: float = 10.0, lig_com_force: float = 10.0, buffer_x: float = 20.0, buffer_y: float = 20.0, buffer_z: float = 20.0, eq_steps: Annotated[int, ~annotated_types.Ge(ge=0)] = 1000000, n_steps: Dict[str, int]=<factory>, ntpr: int = 100, ntwr: int = 2500, ntwe: int = 0, ntwx: int = 25000, cut: float = 9.0, gamma_ln: float = 1.0, dt: float = 0.004, hmr: Literal['yes', 'no']='no', enable_mcwat: Literal['yes', 'no']='yes', temperature: float = 298.15, barostat: int = 2, unbound_threshold: Annotated[float, ~annotated_types.Ge(ge=0)] = 8.0, analysis_start_step: Annotated[int, ~annotated_types.Ge(ge=0)] = 0, n_bootstraps: Annotated[int, ~annotated_types.Ge(ge=0)] = 0)[source]
Bases: BaseModel
Free-energy simulation knobs loaded from the fe_sim section.
The fields feed directly into batter.config.simulation.SimulationConfig overrides. fe_type is resolved internally from protocol rather than being set by users.
- analysis_start_step: int
- barostat: int
- blocks: int
- buffer_x: float
- buffer_y: float
- buffer_z: float
- component_lambdas: Dict[str, List[float]]
- cut: float
- dec_int: str
- dt: float
- enable_mcwat: Literal['yes', 'no']
- eq_steps: int
- gamma_ln: float
- hmr: Literal['yes', 'no']
- lambdas: List[float]
- lig_angle_force: float
- lig_buffer: float
- lig_com_force: float
- lig_dihcf_force: float
- lig_distance_force: float
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- n_bootstraps: int
- n_steps: Dict[str, int]
- ntpr: int
- ntwe: int
- ntwr: int
- ntwx: int
- rec_com_force: float
- remd: RemdArgs
- rocklin_correction: Literal['yes', 'no']
- temperature: float
- unbound_threshold: float
- class batter.config.run.KartografMapperArgs(*, atom_max_distance: float = 0.95, map_exact_ring_matches_only: bool = True, allow_partial_fused_rings: bool = True, allow_bond_breaks: bool = False, filter_element_changes: bool = True, filter_mismatched_attached_h_count: bool = False)[source]
Bases: BaseModel
Kartograf atom mapper option overrides for RBFE.
- allow_bond_breaks: bool
- allow_partial_fused_rings: bool
- atom_max_distance: float
- filter_element_changes: bool
- filter_mismatched_attached_h_count: bool
- map_exact_ring_matches_only: bool
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class batter.config.run.LomapMapperArgs(*, time: int | None = None, threed: bool | None = None, max3d: float | None = None, element_change: bool | None = None, shift: bool | None = None)[source]
Bases: BaseModel
LoMap atom mapper option overrides for RBFE.
- element_change: bool | None
- max3d: float | None
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- shift: bool | None
- threed: bool | None
- time: int | None
- class batter.config.run.MDSimArgs(*, dt: float = 0.004, temperature: float = 298.15, eq_steps: Annotated[int, Ge(ge=0)] = 100000, ntpr: int = 100, ntwr: int = 10000, ntwe: int = 0, ntwx: int = 25000, cut: float = 9.0, gamma_ln: float = 1.0, barostat: int = 2, hmr: Literal['yes', 'no'] = 'yes', enable_mcwat: Literal['yes', 'no'] = 'yes')[source]
Bases: BaseModel
Simulation overrides used when protocol == "md".
These runs reuse the equilibration steps from ABFE but never schedule FE windows, so only generic MD knobs are required (no lambdas, SDR restraints, etc.).
- barostat: int
- cut: float
- dt: float
- enable_mcwat: Literal['yes', 'no']
- eq_steps: int
- gamma_ln: float
- hmr: Literal['yes', 'no']
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- ntpr: int
- ntwe: int
- ntwr: int
- ntwx: int
- temperature: float
- class batter.config.run.RBFENetworkArgs(*, mapping: str | None = 'default', atom_mapper: Literal['kartograf', 'lomap']='kartograf', kartograf: KartografMapperArgs = <factory>, lomap: LomapMapperArgs = <factory>, konnektor_layout: str | None = None, both_directions: bool = False, mapping_file: Path | None = None)[source]
Bases: BaseModel
RBFE network mapping controls.
Users can specify a mapping strategy by name (mapping) or provide an explicit mapping file (mapping_file).
- atom_mapper: Literal['kartograf', 'lomap']
- both_directions: bool
- kartograf: KartografMapperArgs
- konnektor_layout: str | None
- lomap: LomapMapperArgs
- mapping: str | None
- mapping_file: Path | None
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- resolve_paths(base: Path) RBFENetworkArgs[source]
- class batter.config.run.RunConfig(*, version: int = 1, protocol: Literal['abfe', 'rbfe', 'asfe', 'md']='abfe', backend: Literal['local', 'slurm']='local', create: CreateArgs, fe_sim: Dict[str, ~typing.Any] | ~batter.config.run.FESimArgs | ~batter.config.run.MDSimArgs=<factory>, run: RunSection, rbfe: RBFENetworkArgs | None = None)[source]
Bases: BaseModel
Top-level YAML config.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- backend: Literal['local', 'slurm']
- create: CreateArgs
- classmethod load(path: Path | str) RunConfig[source]
Load and validate a run configuration from disk.
- Parameters:
path (str or pathlib.Path) – Location of the YAML file to parse.
- Returns:
Fully validated configuration object.
- Return type:
RunConfig
- model_config = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- classmethod model_validate_yaml(yaml_text: str) RunConfig[source]
Validate a run configuration from an in-memory YAML string.
- Parameters:
yaml_text (str) – Raw YAML content describing the run configuration.
- Returns:
Validated configuration model.
- Return type:
RunConfig
- protocol: Literal['abfe', 'rbfe', 'asfe', 'md']
- rbfe: RBFENetworkArgs | None
- resolved_sim_config() SimulationConfig[source]
Build the effective simulation configuration for this run.
- Returns:
Simulation parameters derived from the create and fe_sim sections.
- Return type:
SimulationConfig
- run: RunSection
- version: int
- class batter.config.run.RunSection(*, output_folder: Path, system_type: Literal['MABFE', 'MASFE'] | None=None, only_fe_preparation: bool = False, on_failure: Literal['raise', 'prune', 'retry']='raise', max_workers: int | None = None, max_active_jobs: Annotated[int | None, ~annotated_types.Ge(ge=0)] = 1000, batch_mode: bool = False, batch_gpus: Annotated[int | None, ~annotated_types.Ge(ge=0)] = None, batch_gpus_per_task: Annotated[int, ~annotated_types.Ge(ge=1)] = 1, batch_srun_extra: List[str] = <factory>, dry_run: bool = False, clean_failures: bool = False, remd: Literal['yes', 'no']='no', run_id: str = 'auto', allow_run_id_mismatch: bool = False, slurm_header_dir: Path | None = None, email_sender: str = 'nobody@stanford.edu', email_on_completion: str | None = None, slurm: SlurmConfig = <factory>)[source]
Bases: BaseModel
Run-related settings, including where outputs land.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- allow_run_id_mismatch: bool
- batch_gpus: int | None
- batch_gpus_per_task: int
- batch_mode: bool
- batch_srun_extra: List[str]
- clean_failures: bool
- dry_run: bool
- email_on_completion: str | None
- email_sender: str
- max_active_jobs: int | None
- max_workers: int | None
- model_config = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- on_failure: Literal['raise', 'prune', 'retry']
- only_fe_preparation: bool
- output_folder: Path
- remd: Literal['yes', 'no']
- resolve_paths(base: Path) RunSection[source]
Return a copy where output_folder is made absolute, resolved relative to base.
- run_id: str
- slurm: SlurmConfig
- slurm_header_dir: Path | None
- system_type: Literal['MABFE', 'MASFE'] | None
- class batter.config.run.SlurmConfig(*, partition: str | None = None, time: str | None = None, nodes: int | None = None, ntasks_per_node: int | None = None, mem_per_cpu: str | None = None, gres: str | None = None, account: str | None = None, qos: str | None = None, constraint: str | None = None, extra_sbatch: List[str] = <factory>)[source]
Bases: BaseModel
SLURM-specific configuration.
- Parameters:
partition (str, optional) – SLURM partition/queue name.
time (str, optional) – Walltime in the HH:MM:SS format.
nodes (int, optional) – Number of nodes to request.
ntasks_per_node (int, optional) – Number of tasks per node.
mem_per_cpu (str, optional) – Memory per CPU (e.g., 16G).
gres (str, optional) – Generic resource string (e.g., GPU spec).
account (str, optional) – Account to charge for jobs.
qos (str, optional) – QoS string if required by the cluster.
constraint (str, optional) – Constraint string passed to sbatch.
extra_sbatch (list[str]) – Additional arguments appended to the sbatch submission command.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- account: str | None
- constraint: str | None
- extra_sbatch: List[str]
- gres: str | None
- mem_per_cpu: str | None
- model_config = {'extra': 'ignore'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- nodes: int | None
- ntasks_per_node: int | None
- partition: str | None
- qos: str | None
- time: str | None
- to_sbatch_flags() List[str][source]
Produce a flat list of sbatch command-line flags.
- Returns:
Sequence suitable for passing to subprocess.run().
- Return type:
list of str
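As a rough illustration of what flattening a SLURM configuration into sbatch flags can look like, the sketch below uses a hypothetical SlurmOptions dataclass that mirrors the documented field names. It is not BATTER's implementation: the `--name=value` rendering and the position of the extra arguments are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SlurmOptions:
    """Hypothetical stand-in for SlurmConfig (illustration only)."""

    partition: Optional[str] = None
    time: Optional[str] = None
    nodes: Optional[int] = None
    ntasks_per_node: Optional[int] = None
    mem_per_cpu: Optional[str] = None
    gres: Optional[str] = None
    account: Optional[str] = None
    qos: Optional[str] = None
    constraint: Optional[str] = None
    extra_sbatch: List[str] = field(default_factory=list)

    def to_sbatch_flags(self) -> List[str]:
        # Each set field maps to the sbatch long option of the same name,
        # with underscores turned into hyphens; unset fields are skipped.
        names = ["partition", "time", "nodes", "ntasks_per_node",
                 "mem_per_cpu", "gres", "account", "qos", "constraint"]
        flags = []
        for name in names:
            value = getattr(self, name)
            if value is not None:
                flags.append(f"--{name.replace('_', '-')}={value}")
        # Assumption: extra arguments are appended verbatim at the end.
        return flags + list(self.extra_sbatch)


flags = SlurmOptions(partition="gpu", time="02:00:00", gres="gpu:1",
                     extra_sbatch=["--exclusive"]).to_sbatch_flags()
```

A flat list of this shape can be spliced directly into an argument vector such as ["sbatch", *flags, "job.sh"], where job.sh is a placeholder script name.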
- class batter.config.simulation.SimulationConfig(*, system_name: str, fe_type: ~typing.Literal['custom', 'rest', 'sdr', 'dd', 'sdr-rest', 'express', 'relative', 'uno', 'uno_com', 'uno_rest', 'self', 'uno_dd', 'dd-rest', 'asfe', 'md'], dec_int: ~typing.Literal['mbar', 'ti'] = 'mbar', remd: ~typing.Literal['yes', 'no'] = 'no', remd_nstlim: int = 100, slurm_header_dir: ~pathlib.Path = <factory>, infe: bool = False, p1: str = '', p2: str = '', p3: str = '', other_mol: ~typing.List[str] = <factory>, lipid_mol: ~typing.List[str] = <factory>, solv_shell: float | None = 15.0, rocklin_correction: ~typing.Literal['yes', 'no'] = 'no', release_eq: ~typing.List[float] = <factory>, ti_points: int | None = 0, lambdas: ~typing.List[float] = <factory>, component_windows: ~typing.Dict[str, ~typing.List[float]] = <factory>, sdr_dist: float | None = 0.0, dec_method: str | None = None, blocks: int = 0, unbound_threshold: ~typing.Annotated[float, ~annotated_types.Ge(ge=0)] = 8.0, analysis_start_step: ~typing.Annotated[int, ~annotated_types.Ge(ge=0)] = 0, n_bootstraps: ~typing.Annotated[int, ~annotated_types.Ge(ge=0)] = 0, lig_distance_force: float = 0.0, lig_angle_force: float = 0.0, lig_dihcf_force: float = 0.0, rec_com_force: float = 0.0, lig_com_force: float = 0.0, water_model: ~typing.Literal['SPCE', 'TIP4PEW', 'TIP3P', 'TIP3PF', 'OPC'] = 'TIP3P', buffer_x: float = 10.0, buffer_y: float = 10.0, buffer_z: float = 15.0, lig_buffer: float = 10.0, neutralize_only: ~typing.Literal['yes', 'no'] = 'no', cation: str = 'Na+', anion: str = 'Cl-', ion_conc: float = 0.15, hmr: ~typing.Literal['yes', 'no'] = 'no', enable_mcwat: ~typing.Literal['yes', 'no'] = 'yes', temperature: float = 298.15, eq_steps: int = 1000000, n_steps_dict: ~typing.Dict[str, int] = <factory>, l1_x: float | None = None, l1_y: float | None = None, l1_z: float | None = None, l1_range: float | None = None, min_adis: float | None = None, max_adis: float | None = None, ntpr: int = 100, ntwr: int = 10000, ntwe: int = 0, ntwx: 
int = 2500, cut: float = 9.0, gamma_ln: float = 1.0, barostat: ~typing.Literal[1, 2] = 2, dt: float = 0.004, all_atoms: ~typing.Literal['yes', 'no'] = 'no', receptor_ff: str = 'protein.ff14SB', ligand_ff: str = 'gaff2', lipid_ff: str = 'lipid21', ligand_dict: ~typing.Dict[str, ~typing.Any] = <factory>, rng: int = 0, ion_def: ~typing.List[~typing.Any] = <factory>, dic_n_steps: ~typing.Dict[str, int] = <factory>, rest: ~typing.List[float] = <factory>, neut: str = '', protein_align: str = 'name CA', receptor_segment: str | None = None, components: ~typing.List[str] = <factory>, component_lambdas: ~typing.Dict[str, ~typing.List[float]] = <factory>, membrane_simulation: bool = True)[source]
Bases: BaseModel
Simulation configuration for ABFE/ASFE/RBFE workflows. Values are fed by RunConfig.resolved_sim_config(), which merges create: and fe_sim:.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- all_atoms: Literal['yes', 'no']
- analysis_start_step: int
- anion: str
- barostat: Literal[1, 2]
- blocks: int
- buffer_x: float
- buffer_y: float
- buffer_z: float
- cation: str
- component_lambdas: Dict[str, List[float]]
- component_windows: Dict[str, List[float]]
- components: List[str]
- cut: float
- dec_int: Literal['mbar', 'ti']
- dec_method: str | None
- dic_n_steps: Dict[str, int]
- dt: float
- enable_mcwat: Literal['yes', 'no']
- eq_steps: int
- fe_type: Literal['custom', 'rest', 'sdr', 'dd', 'sdr-rest', 'express', 'relative', 'uno', 'uno_com', 'uno_rest', 'self', 'uno_dd', 'dd-rest', 'asfe', 'md']
- classmethod from_sections(create: CreateArgs, fe: FESimArgs, *, protocol: str | None = None, fe_type: str | None = None, slurm_header_dir: Path | None = None, run_remd: str | bool | None = None) SimulationConfig[source]
Construct a SimulationConfig from run sections.
- Parameters:
create (CreateArgs) – System creation inputs taken from the create YAML section.
fe (FESimArgs) – Free-energy simulation overrides from the fe_sim section.
run_remd ({"yes", "no"}, optional) – Whether REMD execution is enabled (controls submission only; REMD inputs are always written during preparation).
- Returns:
Fully merged simulation configuration ready for downstream use.
- Return type:
SimulationConfig
- gamma_ln: float
- hmr: Literal['yes', 'no']
- infe: bool
- ion_conc: float
- ion_def: List[Any]
- l1_range: float | None
- l1_x: float | None
- l1_y: float | None
- l1_z: float | None
- lambdas: List[float]
- lig_angle_force: float
- lig_buffer: float
- lig_com_force: float
- lig_dihcf_force: float
- lig_distance_force: float
- ligand_dict: Dict[str, Any]
- ligand_ff: str
- lipid_ff: str
- lipid_mol: List[str]
- max_adis: float | None
- membrane_simulation: bool
- min_adis: float | None
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- n_bootstraps: int
- n_steps_dict: Dict[str, int]
- neut: str
- neutralize_only: Literal['yes', 'no']
- ntpr: int
- ntwe: int
- ntwr: int
- ntwx: int
- other_mol: List[str]
- p1: str
- p2: str
- p3: str
- protein_align: str
- rec_com_force: float
- receptor_ff: str
- receptor_segment: str | None
- release_eq: List[float]
- remd: Literal['yes', 'no']
- remd_nstlim: int
- rest: List[float]
- rng: int
- rocklin_correction: Literal['yes', 'no']
- sdr_dist: float | None
- slurm_header_dir: Path
- solv_shell: float | None
- system_name: str
- temperature: float
- ti_points: int | None
- to_dict() Dict[str, Any][source]
- unbound_threshold: float
- water_model: Literal['SPCE', 'TIP4PEW', 'TIP3P', 'TIP3PF', 'OPC']
- batter.config.utils.coerce_yes_no(value: Any) str | None[source]
Normalize boolean-like values into "yes" or "no".
- Parameters:
value – Input flag provided by the user. Supported types include bool, numeric scalars, or strings such as "true" and "0".
- Returns:
"yes" or "no" when the flag can be interpreted. None is returned unchanged to preserve optional semantics.
- Return type:
str or None
- Raises:
ValueError – If the value cannot be coerced into a boolean switch.
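The normalisation described above can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: the exact set of accepted string spellings and the rule that any non-zero number means "yes" are guesses.

```python
from typing import Any, Optional


def coerce_yes_no_sketch(value: Any) -> Optional[str]:
    """Illustrative only: map boolean-like inputs to "yes"/"no"."""
    if value is None:
        return None  # preserve optional semantics
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "yes" if value else "no"
    if isinstance(value, (int, float)):
        # Assumption: any non-zero number counts as "yes".
        return "yes" if value != 0 else "no"
    if isinstance(value, str):
        token = value.strip().lower()
        # Assumption: these spellings are accepted; the real list may differ.
        if token in {"yes", "y", "true", "on", "1"}:
            return "yes"
        if token in {"no", "n", "false", "off", "0"}:
            return "no"
    raise ValueError(f"cannot interpret {value!r} as a yes/no flag")
```

Returning None untouched lets callers distinguish "flag not set" from an explicit "no".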
- batter.config.utils.expand_env_vars(data: Any, *, base_dir: Path | None = None) Any[source]
Recursively expand environment variables in a YAML-derived structure.
- Parameters:
data – Parsed YAML content to normalise.
base_dir (Path, optional) – Base directory for resolving relative (./) paths.
- Returns:
Structure with string values expanded.
- Return type:
Any
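A minimal sketch of recursive expansion over a YAML-derived structure is shown below. It assumes that only dicts, lists, and strings need handling and that only strings starting with "./" are resolved against base_dir; the real function may apply different rules. BATTER_DEMO_ROOT is a made-up variable used purely for the demonstration.

```python
import os
from pathlib import Path
from typing import Any, Optional


def expand_env_vars_sketch(data: Any, *, base_dir: Optional[Path] = None) -> Any:
    """Illustrative only: expand $VARS in every string of a nested structure."""
    if isinstance(data, dict):
        return {k: expand_env_vars_sketch(v, base_dir=base_dir) for k, v in data.items()}
    if isinstance(data, list):
        return [expand_env_vars_sketch(v, base_dir=base_dir) for v in data]
    if isinstance(data, str):
        text = os.path.expandvars(data)
        # Assumption: a leading "./" marks a path relative to base_dir.
        if base_dir is not None and text.startswith("./"):
            return str(base_dir / text[2:])
        return text
    return data  # numbers, booleans, None pass through unchanged


os.environ["BATTER_DEMO_ROOT"] = "/data"  # hypothetical variable for the demo
expanded = expand_env_vars_sketch(
    {"protein": "$BATTER_DEMO_ROOT/protein.pdb", "ligands": ["./lig1.sdf", 3]},
    base_dir=Path("/work"),
)
```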
- batter.config.utils.normalize_optional_path(value: Any) Path | None[source]
Resolve optional path-like values into pathlib.Path objects.
- Parameters:
value – Path candidate that may be None or an empty string. Strings may contain environment variables or ~.
- Returns:
Expanded path when provided; None if the value is empty.
- Return type:
pathlib.Path or None
- batter.config.utils.sanitize_ligand_name(name: str) str[source]
Convert a ligand identifier into a filesystem-safe token.
- Parameters:
name (str) – Original ligand identifier, often derived from filenames or keys.
- Returns:
Uppercase alphanumeric token with unsafe characters replaced by underscores.
- Return type:
str
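One way to produce such a token is sketched below. It is an illustrative re-implementation, and the rule that each unsafe character becomes exactly one underscore is an assumption.

```python
import re


def sanitize_ligand_name_sketch(name: str) -> str:
    """Illustrative only: build an uppercase, filesystem-safe token."""
    # Assumption: every character outside [A-Za-z0-9] becomes one underscore.
    safe = re.sub(r"[^A-Za-z0-9]", "_", name.strip())
    return safe.upper()


token = sanitize_ligand_name_sketch("lig 1/a")
```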
- batter.config.utils.sanitize_user_ligand_name(name: str) str[source]
Sanitize and validate a user-provided ligand identifier.
Reserved names that conflict with BATTER directory layout are rejected.
RBFE Helpers#
RBFE network helpers.
- class batter.rbfe.RBFENetwork(ligands: Tuple[str, ...], pairs: Tuple[Tuple[str, str], ...])[source]
Record the RBFE simulation mapping as ligand pairs.
- Parameters:
ligands (Sequence[str]) – Ordered ligand identifiers participating in the network.
pairs (Sequence[tuple[str, str]]) – Directed pairs describing simulations to run (reference, target).
- static default_mapping(ligands: Sequence[str]) List[Tuple[str, str]][source]
Default RBFE mapping: first ligand paired to each subsequent ligand.
- classmethod from_ligands(ligands: Sequence[str], mapping_fn: Callable[[Sequence[str]], Iterable[Tuple[str, str]]] | None = None) RBFENetwork[source]
Build an RBFE network from ligand identifiers and a mapping function.
- Parameters:
ligands (Sequence[str]) – Ordered ligand identifiers.
mapping_fn (callable, optional) – Function that returns iterable of (ref, target) pairs. When omitted, defaults to mapping the first ligand to all others.
- ligands: Tuple[str, ...]
- pairs: Tuple[Tuple[str, str], ...]
- to_mapping() dict[source]
Return a JSON-serializable mapping payload.
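The default star-shaped mapping (first ligand paired with each subsequent ligand) can be sketched as below. NetworkSketch is a hypothetical stand-in for RBFENetwork; the handling of fewer than two ligands and the payload keys returned by to_mapping are assumptions rather than the documented schema.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Pair = Tuple[str, str]


def star_mapping(ligands: Sequence[str]) -> List[Pair]:
    """First ligand acts as the reference for every subsequent ligand."""
    if len(ligands) < 2:
        return []  # assumption: fewer than two ligands yields no pairs
    reference = ligands[0]
    return [(reference, target) for target in ligands[1:]]


@dataclass(frozen=True)
class NetworkSketch:
    """Hypothetical stand-in for RBFENetwork (illustration only)."""

    ligands: Tuple[str, ...]
    pairs: Tuple[Pair, ...]

    @classmethod
    def from_ligands(
        cls,
        ligands: Sequence[str],
        mapping_fn: Optional[Callable[[Sequence[str]], Iterable[Pair]]] = None,
    ) -> "NetworkSketch":
        fn = mapping_fn or star_mapping
        return cls(tuple(ligands), tuple((a, b) for a, b in fn(ligands)))

    def to_mapping(self) -> dict:
        # Assumption: these keys are illustrative, not BATTER's payload schema.
        return {"ligands": list(self.ligands), "pairs": [list(p) for p in self.pairs]}


network = NetworkSketch.from_ligands(["LIG1", "LIG2", "LIG3"])
payload_text = json.dumps(network.to_mapping())
```

Passing a custom mapping_fn, for example one that pairs consecutive ligands, changes the topology without touching the container class.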
- batter.rbfe.draw_explicit_konnektor_network(pairs: Sequence[Sequence[str] | tuple[str, str]], ligand_files: Mapping[str, Path], plot_path: Path, hmr: bool = True, atom_mapper: str = 'kartograf', kartograf_options: Any | None = None, lomap_options: Any | None = None) None[source]
Build an explicit Konnektor network from pairs and draw it.
- batter.rbfe.filter_element_changes(molA: rdkit.Chem.Mol, molB: rdkit.Chem.Mol, mapping: dict[int, int]) dict[int, int][source]
Force a mapping to exclude any alchemical element changes in the core.
- batter.rbfe.filter_mismatched_attached_h_count(molA: rdkit.Chem.Mol, molB: rdkit.Chem.Mol, mapping: dict[int, int]) dict[int, int][source]
Exclude mapped heavy-atom pairs where the number of directly attached H differs. This helps avoid HMR mass mismatches for ‘common/core’ atoms.
- batter.rbfe.konnektor_pairs(ligands: Sequence[str], ligand_files: Mapping[str, Path], layout: str | None = None, plot_path: Path | None = None, hmr: bool = True, atom_mapper: str = 'kartograf', kartograf_options: Any | None = None, lomap_options: Any | None = None) List[Tuple[str, str]][source]
Build RBFE pairs using Konnektor network planners.
- batter.rbfe.load_mapping_file(path: Path) List[Tuple[str, str]][source]
Load RBFE mapping pairs from a file.
- Supported formats:
JSON/YAML: list of pairs, or dict with 'pairs'/'edges', or adjacency mapping.
Text: one pair per line, separated by ‘~’, ‘,’ or whitespace.
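For the plain-text format, a parser along the following lines would accept all three separators. This is a sketch, not the library's parser: skipping blank lines and "#" comments and rejecting lines that do not contain exactly two names are assumptions.

```python
import re
from typing import List, Tuple


def parse_pair_lines(text: str) -> List[Tuple[str, str]]:
    """Illustrative only: read one ligand pair per line."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        # Split on '~', ',' or whitespace, in any combination.
        parts = [p for p in re.split(r"[~,\s]+", line) if p]
        if len(parts) != 2:
            raise ValueError(f"expected two ligand names, got: {raw!r}")
        pairs.append((parts[0], parts[1]))
    return pairs


pairs = parse_pair_lines("LIG1~LIG2\nLIG1, LIG3\n\nLIG2 LIG3\n")
```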
- batter.rbfe.resolve_mapping_fn(name: str | None) Callable[[Sequence[str]], Iterable[Tuple[str, str]]][source]
Resolve a mapping function by name.
Orchestrator Modules#
batter.orchestrate.run#
Top-level orchestration entry for BATTER runs.
This module wires: YAML (RunConfig) → shared system build → bulk ligand staging → single param job (“param_ligands”) → per-ligand pipelines → FE record save.
- batter.orchestrate.run.run_from_yaml(path: Path | str, on_failure: Literal['prune', 'raise', 'retry'] = None, run_overrides: Dict[str, Any] | None = None) None[source]
Execute a BATTER workflow described by a YAML file.
Selection helpers for choosing the correct pipeline implementation.
- batter.orchestrate.pipeline_utils.select_pipeline(protocol: str, sim_cfg: SimulationConfig, only_fe_prep: bool, *, sys_params: SystemParams | dict | None = None, partition: str | None = None) Pipeline[source]
Return the protocol-specific pipeline for a run.
- Parameters:
protocol (str) – Name of the requested protocol ("abfe", "rbfe", "asfe", or "md").
sim_cfg (SimulationConfig) – Validated simulation configuration produced by RunConfig.
only_fe_prep (bool) – When True, truncate the pipeline after FE preparation steps.
sys_params (SystemParams or dict, optional) – Extra parameters passed to system-level pipeline steps.
- Returns:
Pipeline instance tailored to the requested protocol.
- Return type:
Pipeline
- Raises:
ValueError – If the protocol name is not recognised.
Utilities for configuring execution backends used by the orchestrator.
- batter.orchestrate.backend.register_local_handlers(backend: LocalBackend) None[source]
Register built-in pipeline handlers on the local backend.
- Parameters:
backend (LocalBackend) – Backend instance that should receive the default handler mapping.
- Raises:
RuntimeError – If optional handler dependencies (for example openff-toolkit) are missing.
Execution Modules#
Interfaces shared by execution backends.
- class batter.exec.base.ExecBackend(*args, **kwargs)[source]
Protocol implemented by execution backends.
- name: str
- run(step: Step, system: SimSystem, params: Dict) ExecResult[source]
Execute step for system.
- Parameters:
step (Step) – Step metadata as produced by the pipeline.
system (SimSystem) – Simulation system descriptor.
params (dict) – Backend-specific parameters, potentially including resources.
- Returns:
Execution artifacts and job identifiers.
- Return type:
ExecResult
- class batter.exec.base.Resources(time: str | None = None, cpus: int | None = None, gpus: int | None = None, mem: str | None = None, partition: str | None = None, account: str | None = None, extra: Mapping[str, str]=<factory>)[source]
Resource hints supplied to execution backends.
- Parameters:
time (str, optional) – Walltime (e.g., "02:00:00").
cpus (int, optional) – CPU cores per task.
gpus (int, optional) – Number of GPUs required.
mem (str, optional) – Memory request (e.g., "16G").
partition (str, optional) – Scheduler partition or queue.
account (str, optional) – Scheduler account.
extra (Mapping[str, str], optional) – Backend-specific SBATCH-style flags.
- account: str | None
- cpus: int | None
- extra: Mapping[str, str]
- gpus: int | None
- mem: str | None
- partition: str | None
- time: str | None
Execution backend for running pipelines locally.
- class batter.exec.local.LocalBackend(max_workers: int | None = None)[source]
In-process execution backend with optional parallel orchestration.
- Parameters:
max_workers (int, optional) – Maximum number of worker processes to use when run_parallel() is invoked. None lets the backend auto-detect resources; 0 or 1 forces serial execution.
- name: str = 'local'
- register(step_name: str, handler: Callable[[Step, SimSystem, Mapping], ExecResult]) None[source]
Register a callable to execute step_name.
- Parameters:
step_name (str) – Identifier of the step (matches batter.pipeline.step.Step.name).
handler (Callable[[Step, SimSystem, Mapping], ExecResult]) – Function responsible for executing the step.
- run(step: Step, system: SimSystem, params: Mapping) ExecResult[source]
Execute step for system on the local machine.
- Parameters:
step – Pipeline step metadata.
system – Simulation system descriptor.
params – Step parameters, typically generated by the orchestration layer.
- Returns:
Artifacts and job identifiers (empty for local execution).
- Return type:
ExecResult
- run_parallel(pipeline: Pipeline, systems: Iterable[SimSystem], *, max_workers: int | None = None, description: str = '', batch_size: str | int = 'auto', verbose: int = 10, prefer: str = 'processes', backend: str | None = None) Dict[str, Mapping[str, ExecResult]][source]
Execute pipeline for multiple systems in parallel.
- Parameters:
pipeline – Pipeline object providing the sequence of steps to execute.
systems (Iterable[SimSystem]) – Collection of systems to process.
max_workers (int, optional) – Override the configured worker cap; None falls back to the value provided at construction time.
description (str, optional) – Human-readable label used in debug logging.
batch_size, verbose, prefer, backend – Joblib configuration knobs forwarded to joblib.Parallel.
- Returns:
Mapping of system.name to per-step results.
- Return type:
dict
- Raises:
RuntimeError – When one or more systems fail.
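The register/run pairing of the local backend follows a handler-registry pattern, which can be sketched without BATTER installed. In the sketch below, RegistryBackend is a hypothetical class, plain strings stand in for Step and SimSystem, and a dict stands in for ExecResult; the error raised for an unregistered step is an assumption.

```python
from typing import Any, Callable, Dict, Mapping

# Simplified handler signature: (step name, system name, params) -> result dict.
Handler = Callable[[str, str, Mapping[str, Any]], Dict[str, Any]]


class RegistryBackend:
    """Hypothetical stand-in illustrating the register/run pattern."""

    name = "local"

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, step_name: str, handler: Handler) -> None:
        # A later registration for the same step name replaces the earlier one.
        self._handlers[step_name] = handler

    def run(self, step_name: str, system: str, params: Mapping[str, Any]) -> Dict[str, Any]:
        handler = self._handlers.get(step_name)
        if handler is None:
            # Assumption: an unknown step is reported as a KeyError.
            raise KeyError(f"no handler registered for step {step_name!r}")
        return handler(step_name, system, params)


def demo_handler(step: str, system: str, params: Mapping[str, Any]) -> Dict[str, Any]:
    return {"step": step, "system": system, "n": len(params)}


backend = RegistryBackend()
backend.register("prepare_equil", demo_handler)
result = backend.run("prepare_equil", "LIG1", {"a": 1, "b": 2})
```

Keeping handlers in a name-keyed table is what allows register_local_handlers to install the built-in steps on a backend instance in one call.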
Execution backend that submits steps to Slurm via sbatch.
- class batter.exec.slurm.SlurmBackend(*args, **kwargs)[source]
Slurm backend that materializes lightweight job scripts.
- name: str = 'slurm'
- run(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Submit step to Slurm.
- Parameters:
step (Step) – Pipeline step metadata.
system (SimSystem) – Simulation system whose root directory stores scripts and logs.
params (dict) – Backend-specific options. Recognised keys include resources, env (exported variables), and payload (shell snippet).
- Returns:
Artifacts referencing the generated script and log paths together with the submitted job identifier (if available).
- Return type:
ExecResult
- class batter.exec.slurm_mgr.SlurmJobManager(poll_s: float = 60.0, max_retries: int = 3, resubmit_backoff_s: float = 30.0, registry_file: Path | None = None, dry_run: bool = False, sbatch_flags: Sequence[str] | None = None, submit_retry_limit: int = 3, submit_retry_delay_s: float = 60.0, max_active_jobs: int | None = None, partition: str | None = None, batch_mode: bool = False, batch_gpus: int | None = None, gpus_per_task: int = 1, srun_extra: Sequence[str] | None = None, stage: str | None = None, header_root: Path | None = None, **_ignored: Any)[source]
Submit, monitor, and resubmit Slurm jobs for BATTER executions.
- Parameters:
poll_s (float, optional) – Poll interval (seconds) between status checks.
max_retries (int, optional) – Maximum automatic resubmissions per workdir (excluding TIMEOUT and COMPLETED-without-sentinel).
resubmit_backoff_s (float, optional) – Sleep before resubmitting a job after detecting termination/missing state.
registry_file (pathlib.Path, optional) – JSONL queue file for cross-process coordination.
dry_run (bool, optional) – When True, do not submit; record that submission would occur.
sbatch_flags (Sequence[str], optional) – Global sbatch flags appended to every submission.
submit_retry_limit (int, optional) – Number of retries for the submission command itself.
submit_retry_delay_s (float, optional) – Delay between submission retries.
max_active_jobs (int, optional) – Cap on concurrent jobs for the user (checked via one squeue -u call).
partition (str, optional) – Partition filter used by max_active_jobs checks.
- Other Parameters:
batch_mode, batch_gpus, gpus_per_task, srun_extra, stage, header_root – Accepted for compatibility with older code paths. This manager does not implement batch execution; values are stored/ignored.
**_ignored – Extra kwargs are accepted and ignored for compatibility.
- add(spec: SlurmJobSpec) None[source]
Queue spec for later submission and optionally persist to registry.
- Parameters:
spec (SlurmJobSpec) – Job specification.
- clear() None[source]
Clear in-memory queue/retry book and remove on-disk registry if present.
- ensure_running(spec: SlurmJobSpec) None[source]
Ensure the spec is submitted or already done/active.
- Parameters:
spec (SlurmJobSpec) – Job spec.
Notes
This method does not register specs; it’s a one-off submit-if-needed.
- jobs() List[SlurmJobSpec][source]
Return merged in-memory + registry specs (dedup by workdir).
- set_stage(stage: str | None) None[source]
Set the active stage filter for registry loading/submission.
- Parameters:
stage (str or None) – Stage key such as equil, fe_equil, fe, etc. If None, stage filtering is disabled.
- wait_all() None[source]
Submit/monitor all registered jobs and block until completion.
- wait_for_slot(poll_s: float | None = None, user: str | None = None, partition: str | None = None) None[source]
Block until active jobs drop below max_active_jobs.
- Parameters:
poll_s (float, optional) – Polling interval in seconds (defaults to the manager's poll_s).
user (str, optional) – Unix username (defaults to $USER).
partition (str, optional) – Partition to filter on (defaults to manager partition).
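The blocking behaviour can be pictured as a simple polling loop. The sketch below is not the manager's code: count_active is a hypothetical callable standing in for the squeue query, and the sleep function is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def wait_for_slot_sketch(
    count_active: Callable[[], int],
    max_active_jobs: Optional[int],
    poll_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Illustrative only: block while the active-job count is at or above the cap.

    Returns the number of sleeps performed.
    """
    if max_active_jobs is None:
        return 0  # assumption: no cap means no waiting
    polls = 0
    while count_active() >= max_active_jobs:
        sleep(poll_s)
        polls += 1
    return polls


# Simulated queue: three active jobs, then three, then one.
counts = iter([3, 3, 1])
polls = wait_for_slot_sketch(lambda: next(counts), 2, poll_s=0.0, sleep=lambda s: None)
```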
- wait_until_done(specs: Iterable[SlurmJobSpec]) None[source]
Legacy interface: monitor a given set until complete.
- class batter.exec.slurm_mgr.SlurmJobSpec(workdir: Path, script_rel: str = 'SLURMM-run', finished_name: str = 'FINISHED', failed_name: str = 'FAILED', name: str | None = None, stage: str | None = None, body_rel: str | None = None, header_name: str | None = None, header_template: Path | None = None, header_root: Path | None = None, batch_script: Path | None = None, extra_sbatch: Sequence[str] = <factory>, extra_env: Dict[str, str]=<factory>, submit_dir: Path | None = None, alt_script_names: Sequence[str] = ('SLURMM-run', 'SLURMM-Run', 'slurmm-run', 'run.sh'))[source]
Descriptor for a Slurm job managed by SlurmJobManager.
- Parameters:
workdir (pathlib.Path) – Working directory containing submission scripts and sentinel files.
script_rel (str, optional) – Preferred relative submission script path.
finished_name (str, optional) – Sentinel file name indicating success.
failed_name (str, optional) – Sentinel file name indicating failure.
name (str, optional) – Friendly display name.
stage (str, optional) – Logical stage used for registry filtering.
extra_sbatch (Sequence[str], optional) – Extra sbatch flags (job-specific).
extra_env (dict, optional) – Extra environment variables to export (job-specific).
submit_dir (pathlib.Path, optional) – Directory to submit from (defaults to workdir).
Notes
The remaining fields are legacy compatibility fields used by older BATTER versions and/or existing registry entries. The manager may ignore them.
- alt_script_names: Sequence[str] = ('SLURMM-run', 'SLURMM-Run', 'slurmm-run', 'run.sh')
- batch_script: Path | None = None
- body_rel: str | None = None
- extra_env: Dict[str, str]
- extra_sbatch: Sequence[str]
- failed_name: str = 'FAILED'
- failed_path() Path[source]
Sentinel path signalling failure.
- finished_name: str = 'FINISHED'
- finished_path() Path[source]
Sentinel path signalling successful completion.
- header_name: str | None = None
- header_root: Path | None = None
- header_template: Path | None = None
- jobid_path() Path[source]
Path containing the most recent Slurm job identifier.
- name: str | None = None
- resolve_script_abs() Path[source]
Return the absolute path to the submission script.
- Returns:
Existing script path if found, otherwise the preferred path.
- Return type:
pathlib.Path
- script_arg() str[source]
Return the submission-script path argument for sbatch.
- Returns:
Script path relative to submit_dir when possible.
- Return type:
str
- script_rel: str = 'SLURMM-run'
- stage: str | None = None
- submit_dir: Path | None = None
- workdir: Path
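The sentinel-file convention (a FINISHED or FAILED file inside the work directory) can be sketched with a small stand-in class. JobSpecSketch and its state() helper are hypothetical; in particular, the order in which the two sentinels are checked is an assumption.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class JobSpecSketch:
    """Hypothetical stand-in for SlurmJobSpec (illustration only)."""

    workdir: Path
    finished_name: str = "FINISHED"
    failed_name: str = "FAILED"

    def finished_path(self) -> Path:
        return self.workdir / self.finished_name

    def failed_path(self) -> Path:
        return self.workdir / self.failed_name

    def state(self) -> str:
        # Assumption: success is checked first, then failure; otherwise pending.
        if self.finished_path().exists():
            return "finished"
        if self.failed_path().exists():
            return "failed"
        return "pending"


with tempfile.TemporaryDirectory() as tmp:
    spec = JobSpecSketch(Path(tmp))
    before = spec.state()          # no sentinel written yet
    spec.finished_path().touch()   # simulate a job that completed
    after = spec.state()
```

Because the state lives on disk, a monitoring process can be restarted without losing track of which jobs have already completed.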
Helpers for constructing AMBER mdin control files.
- class batter.exec.amber.mdin.AmberMdin(*, cut: float = 9.0, ioutfm: int = 1, ntb: int = 1, ntxo: int = 2)[source]
Mutable representation of an AMBER mdin file.
- Parameters:
cut (float, optional) – Non-bonded cutoff in Å (default: 9.0).
ioutfm (int, optional) – Output format flag (1 → NetCDF).
ntb (int, optional) – Periodic boundary condition flag.
ntxo (int, optional) – Restart write format.
- add_block(name: str, params: Dict[str, object] | None = None) None[source]
Append a named control block.
- add_raw(line: str) None[source]
Append a raw line verbatim to the output.
- apply_defaults(*, cut: float = 9.0, ioutfm: int = 1, ntb: int = 1, ntxo: int = 2) None[source]
Initialise with a baseline cntrl block.
- override_block(block_name: str, param_dict: Dict[str, object]) None[source]
Merge param_dict into an existing block or create the block.
- save(filename: str | Path) None[source]
Write the mdin file to filename.
- to_string() str[source]
Render the mdin contents as text.
- update_param(block_name: str, key: str, value: object) None[source]
Update a single parameter within block_name.
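To make the block-oriented model concrete, the sketch below renders named blocks as Fortran-style namelists, which is the general shape of an AMBER mdin file. MdinSketch is a hypothetical stand-in for AmberMdin; the title line, indentation, and the placement of raw lines after all blocks are assumptions.

```python
from typing import Dict, List


class MdinSketch:
    """Hypothetical stand-in for AmberMdin (illustration only)."""

    def __init__(self, *, cut: float = 9.0, ioutfm: int = 1, ntb: int = 1, ntxo: int = 2) -> None:
        self.blocks: Dict[str, Dict[str, object]] = {}  # dicts keep insertion order
        self.raw_lines: List[str] = []
        self.override_block("cntrl", {"cut": cut, "ioutfm": ioutfm, "ntb": ntb, "ntxo": ntxo})

    def override_block(self, block_name: str, param_dict: Dict[str, object]) -> None:
        # Merge into an existing block, creating it on first use.
        self.blocks.setdefault(block_name, {}).update(param_dict)

    def update_param(self, block_name: str, key: str, value: object) -> None:
        self.override_block(block_name, {key: value})

    def add_raw(self, line: str) -> None:
        self.raw_lines.append(line)

    @staticmethod
    def _fmt(value: object) -> str:
        # Namelist strings are quoted; numbers are written as-is.
        return f"'{value}'" if isinstance(value, str) else str(value)

    def to_string(self) -> str:
        lines = ["generated mdin (sketch)"]  # title line
        for name, params in self.blocks.items():
            lines.append(f" &{name}")
            lines.extend(f"   {key} = {self._fmt(val)}," for key, val in params.items())
            lines.append(" /")
        lines.extend(self.raw_lines)  # assumption: raw lines follow all blocks
        return "\n".join(lines) + "\n"


mdin = MdinSketch()
mdin.update_param("cntrl", "nstlim", 50000)
mdin.update_param("cntrl", "restraintmask", "@CA")
mdin.add_raw(" &wt type='END' /")
text = mdin.to_string()
```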
- batter.exec.amber.mdin.apply_disang(mdin: AmberMdin, *, filename: str = 'disang.rest') None[source]
Reference a DISANG restraint file.
- batter.exec.amber.mdin.apply_membrane_npt(mdin: AmberMdin, *, temp: float = 298.15, steps: int = 50000, barostat: int = 2, dt: float = 0.004) None[source]
Configure semi-isotropic NPT suitable for membranes.
- batter.exec.amber.mdin.apply_minimization(mdin: AmberMdin, *, steps: int = 5000) None[source]
Enable energy minimisation for steps iterations.
- batter.exec.amber.mdin.apply_npt(mdin: AmberMdin, *, temp: float = 298.15, steps: int = 50000, barostat: int = 2, dt: float = 0.004) None[source]
Configure standard NPT dynamics.
- batter.exec.amber.mdin.apply_restraints(mdin: AmberMdin, *, mask: str, weight: float = 50.0) None[source]
Add positional restraints.
- batter.exec.amber.mdin.apply_ti(mdin: AmberMdin, *, lbd_val: float, timask1: str, timask2: str, scmask1: str, scmask2: str, crgmask: str) None[source]
Configure thermodynamic integration (TI) parameters.
- batter.exec.amber.mdin.apply_wt_end(mdin: AmberMdin) None[source]
Append the &wt type='END' control line.
Slurm-backed equilibration handler.
- batter.exec.handlers.equil.equil_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Submit and register the equilibration job with the Slurm manager.
- Parameters:
step (Step) – Pipeline step metadata (unused but provided for symmetry).
system (SimSystem) – Simulation system descriptor.
params (dict) – Raw handler payload; validated into StepPayload.
- Returns:
Result containing either existing artifacts (when already finished) or the work directory to be monitored by the manager.
- Return type:
ExecResult
- Raises:
FileNotFoundError – If the expected submission script is missing.
RuntimeError – When payload['job_mgr'] is not a SlurmJobManager.
Handlers that queue free-energy equilibration and production jobs.
- batter.exec.handlers.fe.fe_equil_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Queue equilibration jobs for each component of a ligand.
- Parameters:
step, system (ignored) – Included for parity with the handler signature.
params (dict) – Handler payload containing the job manager and configuration values.
- Returns:
Number of jobs enqueued (without waiting for completion).
- Return type:
ExecResult
- batter.exec.handlers.fe.fe_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Queue production jobs for each component/window combination.
- Parameters:
step, system (ignored) – Provided for handler API compatibility.
params (dict) – Handler payload containing the job manager and configuration values.
- Returns:
Number of jobs enqueued (without waiting for completion).
- Return type:
ExecResult
Run post-processing analysis on free-energy simulations.
- batter.exec.handlers.fe_analysis.analyze_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Run FE analysis for a ligand rooted at <system.root>/fe.
- Parameters:
step (Step) – Pipeline metadata (unused).
system (SimSystem) – Simulation system descriptor.
params (dict) – Handler payload validated into StepPayload.
- Returns:
Mapping with the generated Results.dat and optional timeseries artefacts.
- Return type:
ExecResult
Parameterise ligands and populate per-ligand artifacts.
- batter.exec.handlers.param_ligands.copy_ligand_params(src_dir: Path, child_dir: Path, residue_name: str) None[source]
Copy lig.* artifacts into child_dir/params using residue_name.
- batter.exec.handlers.param_ligands.param_ligands(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Run the ligand parametrisation pipeline and index results.
- Parameters:
step (Step) – Pipeline metadata (unused).
system (SimSystem) – Simulation system descriptor.
params (dict) – Handler payload validated into StepPayload.
- Returns:
Mapping containing the parameter store path, JSON index, manifest, and raw hashes.
- Return type:
ExecResult
Prepare equilibration inputs for a ligand.
- batter.exec.handlers.prepare_equil.prepare_equil_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Build equilibration inputs for the current ligand.
- Parameters:
step (Step) – Pipeline step metadata (unused).
system (SimSystem) – Simulation system descriptor.
params (dict) – Handler payload validated into StepPayload.
- Returns:
Contains the output directory and any generated metadata.
- Return type:
ExecResult
Prepare alchemical FE inputs for a ligand.
- batter.exec.handlers.prepare_fe.prepare_fe_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Construct the initial FE directory layout for a ligand.
- Parameters:
step (Step) – Pipeline metadata (unused).
system (SimSystem) – Simulation system descriptor.
params (dict) – Handler payload validated into StepPayload.
- Returns:
Metadata describing the generated directories.
- Return type:
ExecResult
- batter.exec.handlers.prepare_fe.prepare_fe_windows_handler(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Expand FE windows for each requested component:
copies <comp>-1 to <comp>-2, <comp>-3, … (depending on the lambda schedule)
keeps run scripts consistent in each window (builders call write_run_file)
writes artifacts/fe/windows.json summarizing the windows
Builders re-use the same interface; this handler iterates over the components and requests per-window builds by calling the builder with win >= 1.
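The window expansion described above can be illustrated with a minimal, self-contained sketch. It is not the handler's implementation: the real handler delegates to builders, whereas this sketch uses a plain directory copy. The helper name expand_windows, the window count, and the JSON schema of windows.json are all assumptions made for illustration.

```python
import json
import shutil
import tempfile
from pathlib import Path
from typing import List


def expand_windows(fe_dir: Path, comp: str, n_windows: int) -> List[str]:
    """Copy <comp>-1 to <comp>-2 ... <comp>-<n_windows> (hypothetical helper)."""
    template = fe_dir / f"{comp}-1"
    names = [template.name]
    for win in range(2, n_windows + 1):
        target = fe_dir / f"{comp}-{win}"
        if not target.exists():  # do not clobber windows that already exist
            shutil.copytree(template, target)
        names.append(target.name)
    return names


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = root / "fe" / "z-1"
    first.mkdir(parents=True)
    (first / "run.sh").write_text("#!/bin/bash\n")

    windows = expand_windows(root / "fe", "z", n_windows=3)

    # Assumed schema: component name -> list of window directory names.
    summary_path = root / "artifacts" / "fe" / "windows.json"
    summary_path.parent.mkdir(parents=True)
    summary_path.write_text(json.dumps({"z": windows}, indent=2))

    summary = json.loads(summary_path.read_text())
    script_copied = (root / "fe" / "z-3" / "run.sh").is_file()
```

Because every window starts as a copy of the first one, the run script is identical across windows until a builder rewrites it.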
Prepare complex systems (protein/ligand/membrane) for simulations.
- batter.exec.handlers.system_prep.system_prep(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Prepare a system by aligning components and generating reference structures.
- Parameters:
step (Step) – Pipeline metadata (unused).
system (SimSystem) – Simulation system descriptor.
params (dict) – Handler payload validated into StepPayload.
- Returns:
Paths to generated reference structures and a metadata dictionary with anchor and membrane information.
- Return type:
ExecResult
Minimal system-preparation handler for MASFE workflows.
- batter.exec.handlers.system_prep_masfe.system_prep_masfe(step: Step, system: SimSystem, params: Dict[str, Any]) ExecResult[source]
Prepare a MASFE solvation system by staging ligands and overrides.
- Parameters:
step (Step) – Pipeline metadata (unused).
system (SimSystem) – Simulation system descriptor.
params (dict) – Handler payload validated into StepPayload.
- Returns:
Manifest of staged ligands and paths to generated files.
- Return type:
ExecResult
Parameterisation Modules#
Ligand parameterisation helpers for GAFF/GAFF2 and OpenFF workflows.
- class batter.param.ligand.LigandFactory[source]
Factory that chooses the appropriate loader/processor by file extension.
- create_ligand(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None) LigandProcessing[source]
Instantiate a concrete LigandProcessing subclass.
- Parameters:
ligand_file, index, output_dir, ligand_name, charge, retain_lig_prot, ligand_ff, unique_mol_names – Forwarded to the underlying processor.
- Returns:
Processor configured for the detected file type.
- Return type:
LigandProcessing
- Raises:
ValueError – If the file extension is unsupported.
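Dispatching on the file extension can be sketched as a lookup table. The classes below are hypothetical stand-ins for the SDF, MOL2, and PDB processors, and create_ligand_stub is not part of BATTER; the sketch only illustrates the selection and the ValueError for unsupported extensions.

```python
from pathlib import Path
from typing import Dict, Type, Union


class LigandStub:
    """Stand-in for a LigandProcessing subclass (hypothetical)."""

    def __init__(self, ligand_file: Path) -> None:
        self.ligand_file = ligand_file


class SdfStub(LigandStub):
    pass


class Mol2Stub(LigandStub):
    pass


class PdbStub(LigandStub):
    pass


# Assumed mapping from lower-cased suffix to processor class.
_BY_SUFFIX: Dict[str, Type[LigandStub]] = {
    ".sdf": SdfStub,
    ".mol2": Mol2Stub,
    ".pdb": PdbStub,
}


def create_ligand_stub(ligand_file: Union[str, Path]) -> LigandStub:
    """Pick a processor class from the file suffix, case-insensitively."""
    path = Path(ligand_file)
    suffix = path.suffix.lower()
    if suffix not in _BY_SUFFIX:
        raise ValueError(f"Unsupported ligand file extension: {suffix!r}")
    return _BY_SUFFIX[suffix](path)


processor = create_ligand_stub("ligands/LIG1.SDF")
```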
- class batter.param.ligand.LigandProcessing(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None)[source]
Base class for ligand processing and parameterization.
It loads a ligand, determines a unique residue/name, estimates the charge, and generates AMBER/OpenFF parameters.
- Parameters:
ligand_file – Input ligand path (SDF/MOL2/PDB depending on subclass).
index – 1-based index used for stable name generation.
output_dir – Output folder for generated files.
ligand_name – Optional preferred name; will be uniquified to 3 chars.
charge – Charge method for OpenFF pre-charge or quick estimate (e.g., "am1bcc").
retain_lig_prot – If True, keep hydrogen atoms from input.
ligand_ff – One of "gaff" or "gaff2", or an OpenFF release like "openff-2.2.0".
unique_mol_names – Existing names to avoid collisions.
- Variables:
ligand_object (SmallMoleculeComponent)
openff_molecule (Molecule)
ligand_charge (float) – Estimated total charge (integer).
atomnames (list[str]) – Atom names extracted from generated PDB (AMBER path).
- fetch_from_existing_db(database: str | Path) bool[source]
Search and copy ligand artifacts from a local database.
- Parameters:
database – Directory containing <name>.(frcmod|lib|prmtop|inpcrd|mol2|pdb|json|sdf).
- Returns:
True if a full, matching entry was found and copied.
- Return type:
bool
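The "full entry" requirement can be sketched as an all-or-nothing check over the listed extensions. The helper copy_if_complete is hypothetical, and the criterion that decides whether an entry "matches" the ligand is not described here, so the sketch omits it.

```python
import shutil
import tempfile
from pathlib import Path

# Extensions taken from the parameter description above.
EXTENSIONS = ("frcmod", "lib", "prmtop", "inpcrd", "mol2", "pdb", "json", "sdf")


def copy_if_complete(database: Path, name: str, output_dir: Path) -> bool:
    """Copy <name>.<ext> for every extension, or copy nothing at all."""
    sources = [database / f"{name}.{ext}" for ext in EXTENSIONS]
    if not all(src.is_file() for src in sources):
        return False  # partial entries are ignored
    output_dir.mkdir(parents=True, exist_ok=True)
    for src in sources:
        shutil.copy2(src, output_dir / src.name)
    return True


with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "db"
    db.mkdir()
    for ext in EXTENSIONS:
        (db / f"LIG.{ext}").write_text("placeholder\n")
    (db / "BAD.mol2").write_text("placeholder\n")  # incomplete entry

    found_complete = copy_if_complete(db, "LIG", Path(tmp) / "out_lig")
    found_partial = copy_if_complete(db, "BAD", Path(tmp) / "out_bad")
    copied_names = sorted(p.name for p in (Path(tmp) / "out_lig").iterdir())
```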
- property ligand_sdf_path: str
Path to the canonicalised SDF stored on disk.
- Type:
str
- property name: str
Three-character residue name used for generated artifacts.
- Type:
str
- prepare_ligand_parameters() None[source]
Generate parameters using either AMBER (GAFF/GAFF2) or OpenFF path.
Notes
OpenFF path first creates AMBER artifacts for tleap-based system build.
Writes a <name>.json metadata file to the output folder.
- prepare_ligand_parameters_amberff(charge_method: str = 'bcc') None[source]
Prepare ligand parameters using AMBER (GAFF/GAFF2): mol2/frcmod/lib/prmtop.
- Parameters:
charge_method – Antechamber charge method (e.g., "bcc" or "gas").
- prepare_ligand_parameters_openff() None[source]
Prepare ligand parameters using OpenFF toolkit (and AMBER bootstrap).
Behavior#
Runs a fast AMBER bootstrap (GAFF2 + gas charges) so tleap artifacts exist.
Generates an OpenFF prmtop for downstream use with OpenMM/OpenFF.
- property smiles: str
Canonical SMILES with explicit hydrogens.
- Type:
str
- to_dict() Dict[str, Any][source]
- class batter.param.ligand.MOL2_LigandProcessing(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None)[source]
- class batter.param.ligand.PDB_LigandProcessing(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None)[source]
- class batter.param.ligand.SDF_LigandProcessing(ligand_file: str | Path, index: int, output_dir: str | Path, ligand_name: str | None = None, charge: str = 'am1bcc', retain_lig_prot: bool = True, ligand_ff: str = 'gaff2', unique_mol_names: List[str] | None = None)[source]
- batter.param.ligand.batch_ligand_process(ligand_paths: Sequence[str | Path] | Mapping[str, str | Path], output_path: str | Path, retain_lig_prot: bool = True, ligand_ph: float = 7.0, ligand_ff: str = 'gaff2', charge_method: str = 'am1bcc', overwrite: bool = False, run_with_slurm: bool = False, max_slurm_jobs: int = 50, run_with_slurm_kwargs: Dict[str, Any] | None = None, job_extra_directives: List[str] | None = None, on_failure: Literal['prune', 'retry', 'raise'] | None = None) Tuple[List[str], Dict[str, Tuple[str, str]]][source]
Parameterise ligands into a content-addressed store.
Artifacts for each ligand are written under:
<output_path>/<hash_id>/*
where hash_id = sha256(canonical_smiles + ligand_ff + retain).hexdigest()[:12].
- Parameters:
ligand_paths – List of file paths or mapping {alias: path}. Only the file path affects hashing.
output_path – Output directory for the content-addressed store.
retain_lig_prot – Whether to retain hydrogens from inputs.
ligand_ph – Target protonation pH (reserved for future use).
ligand_ff – Force field ('gaff'/'gaff2' or a valid OpenFF release name).
charge_method – Charge method for ligand.
overwrite – If True, re-parameterize even if <hash_id> already exists.
run_with_slurm – If True, distribute parametrization with Dask+SLURM (same behavior as before).
max_slurm_jobs, run_with_slurm_kwargs, job_extra_directives – SLURM/Dask configuration.
- Returns:
list of str – Hash identifiers in processing order (duplicates preserved).
dict – Mapping from the provided input path to
(hash_id, canonical_smiles).
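The content-addressing formula above can be sketched with hashlib. How the three inputs are serialised before hashing is not specified here, so this sketch assumes plain string concatenation with str() applied to the boolean. The function name ligand_hash_id is hypothetical.

```python
import hashlib


def ligand_hash_id(canonical_smiles: str, ligand_ff: str, retain_lig_prot: bool) -> str:
    """Return the first 12 hex characters of a SHA-256 digest (sketch)."""
    # Assumption: inputs are concatenated as text; the real encoding may differ.
    payload = canonical_smiles + ligand_ff + str(retain_lig_prot)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


id_gaff2 = ligand_hash_id("c1ccccc1", "gaff2", True)
id_gaff2_again = ligand_hash_id("c1ccccc1", "gaff2", True)
id_gaff = ligand_hash_id("c1ccccc1", "gaff", True)
```

Identical inputs map to the same directory name, which is what lets a store skip ligands that were already parameterised unless overwrite is requested.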
Pipeline Modules#
- class batter.pipeline.pipeline.Pipeline(steps: List[Step])[source]
Bases: object
Directed acyclic pipeline of Step objects.
- Parameters:
steps (list[Step]) – Steps that form a DAG. Dependencies are given by Step.requires.
Notes
A simple topological sort is performed before execution.
Backends must implement a run(step, system) -> ExecResult method.
- adjacency() Dict[str, List[str]][source]
Return the adjacency list describing the DAG.
- Returns:
Mapping of each step to the steps that depend on it.
- Return type:
dict[str, list[str]]
- dependencies(step_name: str) List[str][source]
Retrieve the declared dependencies for step_name.
- Parameters:
step_name (str) – Step identifier.
- Returns:
Names of prerequisite steps.
- Return type:
list[str]
- Raises:
KeyError – If step_name does not exist in the pipeline.
- describe() List[Dict[str, Any]][source]
Return a serialisable summary of the pipeline.
- Returns:
Each entry contains name, requires, and payload_type keys.
- Return type:
list of dict
- ordered_steps() List[Step][source]
Return steps in execution order.
- run(backend, system) Dict[str, ExecResult][source]
Execute steps in topological order.
- Parameters:
backend – Object providing run(step, system) -> ExecResult.
system – The SimSystem descriptor.
- Returns:
Mapping from step name to execution result.
- Return type:
dict[str, ExecResult]
- Raises:
RuntimeError – If a required dependency has not been produced.
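The ordering and dependency checks described for Pipeline can be sketched with the standard library's graphlib. This is an illustration under assumptions, not the class's implementation: StepStub is a stand-in for Step, the backend is a plain callable, and the helper names are hypothetical.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class StepStub:
    """Stand-in for Step with only the fields needed here."""
    name: str
    requires: List[str] = field(default_factory=list)


def ordered_names(steps: List[StepStub]) -> List[str]:
    """Topologically sort step names so prerequisites come first."""
    graph = {s.name: set(s.requires) for s in steps}
    return list(TopologicalSorter(graph).static_order())


def adjacency(steps: List[StepStub]) -> Dict[str, List[str]]:
    """Map each step to the steps that depend on it."""
    adj: Dict[str, List[str]] = {s.name: [] for s in steps}
    for s in steps:
        for dep in s.requires:
            adj[dep].append(s.name)
    return adj


def run_all(steps: List[StepStub], backend: Callable[[StepStub], Any]) -> Dict[str, Any]:
    """Run steps in order; fail if a prerequisite produced no result."""
    by_name = {s.name: s for s in steps}
    results: Dict[str, Any] = {}
    for name in ordered_names(steps):
        if name not in by_name:
            continue  # referenced in requires but never declared
        missing = [d for d in by_name[name].requires if d not in results]
        if missing:
            raise RuntimeError(f"{name} is missing dependencies: {missing}")
        results[name] = backend(by_name[name])
    return results


steps = [
    StepStub("prepare_equil", requires=["param_ligands"]),
    StepStub("system_prep"),
    StepStub("param_ligands", requires=["system_prep"]),
]
order = ordered_names(steps)
dependents = adjacency(steps)
results = run_all(steps, backend=lambda step: f"ran {step.name}")
```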
- class batter.pipeline.pipeline.PipelineState(results: Dict[str, ~batter.pipeline.step.ExecResult]=<factory>)[source]
Bases: object
In-memory state of a pipeline execution.
- Variables:
results (dict[str, ExecResult]) – Per-step execution results.
- results: Dict[str, ExecResult]
- class batter.pipeline.step.ExecResult(job_ids: List[str] = <factory>, artifacts: Mapping[str, ~typing.Any]=<factory>)[source]
Execution result returned by a backend.
- Parameters:
job_ids (list[str]) – Scheduler or process identifiers (may be empty for local runs).
artifacts (Mapping[str, Any]) – Named outputs (paths, metrics, small JSON blobs).
- artifacts: Mapping[str, Any]
- job_ids: List[str]
- class batter.pipeline.step.Step(name: str, requires: List[str] = <factory>, payload: Any = None)[source]
One unit of work in the pipeline.
- Parameters:
name (str) – Unique step name (e.g., "prepare_fe").
requires (list[str]) – Names of steps that must complete before this step can run.
payload (Any, optional) – Typed payload consumed by the backend. Typically a StepPayload.
Notes
Steps are immutable descriptors. Execution is handled by a backend.
The backend decides how to interpret params (e.g., templates, flags).
- name: str
- property params: Any
Backwards-compatible alias for payload.
- payload: Any
- replace(**updates: Any) Step[source]
Return a new Step with selected attributes updated.
- Parameters:
**updates – Keyword overrides for any of the dataclass fields (name, requires, or payload).
- Returns:
Fresh step instance containing the requested updates.
- Return type:
Step
- requires: List[str]
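The immutable-descriptor pattern with a replace() helper and a params alias can be sketched with a frozen dataclass. StepSketch is a stand-in; whether Step itself is declared frozen is an assumption based on the note that steps are immutable.

```python
import dataclasses
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class StepSketch:
    """Stand-in for Step (assumed to be a frozen dataclass)."""
    name: str
    requires: List[str] = field(default_factory=list)
    payload: Any = None

    @property
    def params(self) -> Any:
        # Alias kept for older call sites that still read ``params``.
        return self.payload

    def replace(self, **updates: Any) -> "StepSketch":
        # dataclasses.replace builds a new instance; the original is untouched.
        return dataclasses.replace(self, **updates)


base = StepSketch("prepare_fe", payload={"lambdas": [0.0, 1.0]})
updated = base.replace(requires=["prepare_equil"])

try:
    base.name = "other"  # frozen dataclasses reject attribute assignment
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False
```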
- class batter.pipeline.payloads.StepPayload(*, sim: SimulationConfig | None = None, sys_params: SystemParams | None = None, **extra_data: Any)[source]
Typed payload passed to pipeline step handlers.
The payload binds the SimulationConfig and SystemParams objects used by most handlers while permitting arbitrary extra values for backwards compatibility or specialised needs.
- Parameters:
sim (SimulationConfig, optional) – Resolved simulation configuration for the step.
sys_params (SystemParams, optional) – Shared system-level parameters.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- copy_with(**updates: Any) StepPayload[source]
Create a new StepPayload with additional updates.
- Parameters:
**updates – Keyword overrides applied to the current payload.
- Returns:
New payload containing the merged data.
- Return type:
StepPayload
- get(item: str, default: Any = None) Any[source]
Safe lookup for a payload value with a default.
- Parameters:
item (str) – Key to fetch.
default (Any, optional) – Value returned when the key is missing or None.
- Returns:
Requested value or the default.
- Return type:
Any
- model_config = {'arbitrary_types_allowed': True, 'extra': 'allow'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- sim: SimulationConfig | None
- sys_params: SystemParams | None
- to_mapping() Dict[str, Any][source]
Convert the payload (including extras) to a plain dictionary.
- Returns:
Merged representation of fields and extras.
- Return type:
dict[str, Any]
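The lookup semantics documented for get() (the default is returned when the key is missing or its value is None) together with copy_with() and to_mapping() can be sketched without Pydantic. PayloadSketch is a hypothetical plain-Python stand-in, not the real model.

```python
from typing import Any, Dict


class PayloadSketch:
    """Plain-Python stand-in for StepPayload (no validation performed)."""

    def __init__(self, sim: Any = None, sys_params: Any = None, **extra: Any) -> None:
        self.sim = sim
        self.sys_params = sys_params
        self._extra: Dict[str, Any] = dict(extra)

    def to_mapping(self) -> Dict[str, Any]:
        # Declared fields first, then any extra keys.
        return {"sim": self.sim, "sys_params": self.sys_params, **self._extra}

    def get(self, item: str, default: Any = None) -> Any:
        value = self.to_mapping().get(item)
        # A stored None is treated the same as a missing key.
        return default if value is None else value

    def copy_with(self, **updates: Any) -> "PayloadSketch":
        return PayloadSketch(**{**self.to_mapping(), **updates})


payload = PayloadSketch(sim="resolved-config", lambdas=None)
fallback = payload.get("lambdas", [0.0, 1.0])
patched = payload.copy_with(lambdas=[0.5])
```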
- class batter.pipeline.payloads.SystemParams(*, param_outdir: Path | None = None, system_name: str | None = None, protein_input: Path | None = None, system_input: Path | None = None, system_coordinate: Path | None = None, ligand_paths: Dict[str, ~pathlib.Path]=<factory>, yaml_dir: Path | None = None, anchor_atoms: tuple[str, ...]=(), extra_restraints: str | None = None, extra_restraint_fc: float | None = None, extra_conformation_restraints: Path | None = None, **extra_data: Any)[source]
System-level inputs shared by multiple pipeline steps.
This wrapper normalises common fields (paths, anchor atoms, etc.) while still allowing arbitrary extra keys. Paths are converted to pathlib.Path instances, making downstream usage safer.
- Parameters:
param_outdir (Path, optional) – Directory where ligand parameter outputs should be written.
system_name (str, optional) – Logical system name propagated to child steps.
protein_input, system_input, system_coordinate (Path, optional) – Paths to the protein topology/coordinate inputs if supplied.
ligand_paths (dict[str, Path]) – Mapping of ligand identifiers to staged files.
yaml_dir (Path, optional) – Directory containing the originating YAML (useful for resolving relatives).
anchor_atoms (tuple[str, …]) – Anchor atom labels used for restraint placement.
extra_restraints (str, optional) – Optional positional restraint selection string.
extra_restraint_fc (float, optional) – Force constant (kcal/mol/Å^2) applied to extra_restraints.
extra_conformation_restraints (Path, optional) – Path to a conformational restraint JSON file.
- anchor_atoms: tuple[str, ...]
- copy_with(**updates: Any) SystemParams[source]
Create a new SystemParams with additional updates.
- Parameters:
**updates – Keyword overrides applied atop the existing data.
- Returns:
A new instance incorporating the updates.
- Return type:
SystemParams
- extra_conformation_restraints: Path | None
- extra_restraint_fc: float | None
- extra_restraints: str | None
- get(item: str, default: Any = None) Any[source]
Safe lookup for a field or extra value with a default.
- Parameters:
item (str) – Key to fetch.
default (Any, optional) – Value returned when the key is missing or None.
- Returns:
Requested value or the default.
- Return type:
Any
- ligand_paths: Dict[str, Path]
- model_config = {'arbitrary_types_allowed': True, 'extra': 'allow'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- param_outdir: Path | None
- protein_input: Path | None
- system_coordinate: Path | None
- system_input: Path | None
- system_name: str | None
- to_mapping() Dict[str, Any][source]
Convert the model (including extras) to a plain dictionary.
- Returns:
Merged view of standard fields and extras.
- Return type:
dict[str, Any]
- yaml_dir: Path | None
Runtime Modules#
- class batter.runtime.portable.Artifact(name: str, relpath: Path, kind: Literal['file', 'dir']='file', sha256: str = '', size: int = 0, meta: Dict[str, ~typing.Any]=<factory>)[source]
Bases: object
A single artifact tracked by the manifest.
- Parameters:
name (str) – Logical name (e.g., “fe/index” or “traj/lig1.zarr”).
relpath (pathlib.Path) – Path relative to the store root.
kind ({“file”,”dir”}) – File or directory artifact.
sha256 (str) – SHA-256 of the file (empty for directories).
size (int) – Size in bytes (files only; 0 for directories).
meta (dict) – Free-form metadata (component, lambda, etc.).
- kind: Literal['file', 'dir']
- meta: Dict[str, Any]
- name: str
- relpath: Path
- sha256: str
- size: int
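The sha256 and size fields can be computed in one pass over a file. The sketch below uses a hypothetical helper, describe_file, and reads in chunks so that large files are never loaded into memory at once; the chunk size is an arbitrary choice.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple


def describe_file(path: Path, chunk_size: int = 1 << 20) -> Tuple[str, int]:
    """Return (hex SHA-256 digest, size in bytes) for a file."""
    digest = hashlib.sha256()
    size = 0
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "results.txt"
    sample.write_bytes(b"dG = -7.2 kcal/mol\n")
    sha256_hex, n_bytes = describe_file(sample)
```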
- class batter.runtime.portable.ArtifactManifest[source]
Bases: object
In-memory manifest for a portable artifact store.
Notes
Paths are relative to enable rebasing the store to a new root.
Serialize with to_dict() / from_dict().
- add(art: Artifact, overwrite: bool = False) None[source]
- exists(name: str) bool[source]
- classmethod from_dict(d: Dict[str, Any]) ArtifactManifest[source]
- get(name: str) Artifact[source]
- items() List[Artifact][source]
Return all registered artifacts sorted by name.
- Returns:
Snapshot of the manifest contents.
- Return type:
list[Artifact]
- names() List[str][source]
- to_dict() Dict[str, Any][source]
- class batter.runtime.portable.ArtifactStore(root: Path | str, manifest_name: str = 'manifest.json')[source]
Bases: object
Portable store with a relocatable root and JSON manifest.
- Parameters:
root (path-like) – Store root directory (e.g., a run’s work directory).
manifest_name (str) – File name for the manifest JSON under root (default: “manifest.json”).
Examples
>>> store = ArtifactStore("work/at1r_aai")
>>> p = store.put_file(Path("results.txt"), name="fe/latest", dst_rel=Path("fe/results.txt"))
>>> store.save_manifest()
>>> # move directory to a new cluster...
>>> store2 = ArtifactStore("new_root/at1r_aai"); store2.load_manifest()
>>> store2.path("fe/latest")
new_root/at1r_aai/fe/results.txt
- list_artifacts(*, prefix: str | None = None, kind: Literal['file', 'dir', None] = None) List[Artifact][source]
Inspect manifest entries, optionally filtering by name or kind.
- Parameters:
prefix (str, optional) – When provided, only artifacts whose logical name starts with prefix are returned.
kind ({‘file’, ‘dir’, None}, optional) – Restrict results to files or directories. None (default) returns both.
- Returns:
Matching artifacts in alphabetical order.
- Return type:
list[Artifact]
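The filtering rules can be sketched over plain dictionaries. The entries and the helper filter_artifacts are hypothetical; the real method operates on Artifact objects held by the manifest.

```python
from typing import Dict, List, Optional


def filter_artifacts(
    entries: List[Dict[str, str]],
    *,
    prefix: Optional[str] = None,
    kind: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Keep entries matching the optional prefix and kind, sorted by name."""
    selected = [
        e
        for e in entries
        if (prefix is None or e["name"].startswith(prefix))
        and (kind is None or e["kind"] == kind)
    ]
    return sorted(selected, key=lambda e: e["name"])


entries = [
    {"name": "traj/lig1.zarr", "kind": "dir"},
    {"name": "fe/latest", "kind": "file"},
    {"name": "fe/index", "kind": "file"},
]
fe_names = [e["name"] for e in filter_artifacts(entries, prefix="fe/")]
dir_names = [e["name"] for e in filter_artifacts(entries, kind="dir")]
```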
- load_manifest() None[source]
Load the manifest JSON from root.
- path(name: str) Path[source]
Resolve an artifact name to an absolute path under the current root.
- put_dir(src_dir: Path, name: str, dst_rel: Path | None = None, overwrite_manifest_entry: bool = False) Path[source]
Copy a directory under the store and record it in the manifest.
Notes
No per-file hashing; use put_file() for critical files.
- put_file(src: Path, name: str, dst_rel: Path | None = None, overwrite_manifest_entry: bool = False) Path[source]
Copy a file under the store and record it in the manifest.
- Parameters:
src (path-like) – Source file path (must exist and be a file).
name (str) – Logical artifact name to register under.
dst_rel (path-like, optional) – Relative destination path. Defaults to name.replace('/', '_').
overwrite_manifest_entry (bool) – If True, allows replacing an existing manifest entry with the same name.
- Returns:
Absolute destination path.
- Return type:
pathlib.Path
- rebase(new_root: Path | str) ArtifactStore[source]
Create a new store view with the same manifest but a different root.
- Parameters:
new_root (path-like) – Target root directory.
- Returns:
New store pointing to new_root.
- Return type:
ArtifactStore
- save_manifest() Path[source]
Write the manifest JSON under root (atomic).
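One common way to make a manifest write atomic is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that technique with os.replace; it is an assumption about how atomicity could be achieved, not a description of ArtifactStore internals. The manifest schema and the helper name are also hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_manifest_atomic(
    root: Path, manifest: Dict[str, Any], manifest_name: str = "manifest.json"
) -> Path:
    """Write JSON to a temp file beside the target, then rename into place."""
    target = Path(root) / manifest_name
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(manifest, fh, indent=2, sort_keys=True)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, target)  # readers never see a half-written file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


with tempfile.TemporaryDirectory() as tmp:
    old_root = Path(tmp) / "work"
    old_root.mkdir()
    manifest = {"fe/latest": {"relpath": "fe/results.txt", "kind": "file"}}
    written = save_manifest_atomic(old_root, manifest)
    reloaded = json.loads(written.read_text(encoding="utf-8"))
    leftovers = [p.name for p in old_root.iterdir() if p.name.endswith(".tmp")]

# Relative paths let the same manifest resolve under a different root.
rebased = Path("new_root") / reloaded["fe/latest"]["relpath"]
```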
- class batter.runtime.fe_repo.FERecord(*, run_id: str, ligand: str, mol_name: str, system_name: str, fe_type: str, temperature: float, method: Literal['mbar', 'ti']='mbar', total_dG: float, total_se: float = 0.0, components: List[str] = <factory>, created_at: str = <factory>, windows: List[WindowResult] = <factory>, canonical_smiles: str | None = None, original_name: str | None = None, original_path: str | None = None, protocol: str = 'abfe', analysis_start_step: int | None = None, n_bootstraps: int | None = None, include_in_analysis: bool = True, status: Literal['success', 'failed', 'unbound']='success')[source]
A full FE result bundle (portable, versioned).
- Parameters:
run_id (str) – Unique run identifier.
ligand (str) – Ligand identifier.
mol_name (str) – Molecule resname.
system_name (str) – Logical system name.
fe_type (str) – Protocol type (e.g., ‘uno_rest’, ‘asfe’).
temperature (float) – Simulation temperature (K).
method ({“mbar”,”ti”}) – Integration method.
total_dG (float) – Total free energy (kcal/mol).
total_se (float) – Standard error (kcal/mol).
components (list[str]) – Active components in this run.
created_at (str) – ISO-8601 timestamp (UTC, Z-suffix).
windows (list[WindowResult]) – Per-window results.
canonical_smiles (str, optional) – Canonicalised ligand SMILES captured during parameterization.
original_name (str, optional) – Original ligand identifier or title when known.
original_path (str, optional) – Source path of the ligand before staging.
protocol (str) – Logical protocol used to generate the result (e.g., "abfe").
analysis_start_step (int, optional) – First production step included in analysis.
n_bootstraps (int, optional) – Number of MBAR bootstrap resamples used during analysis.
include_in_analysis (bool) – Whether downstream aggregate analyses, such as Cinnabar export, should use this record.
status ({“success”,”failed”,”unbound”}) – Final status recorded for the ligand.
- analysis_start_step: int | None
- canonical_smiles: str | None
- components: List[str]
- created_at: str
- fe_type: str
- include_in_analysis: bool
- ligand: str
- method: Literal['mbar', 'ti']
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- mol_name: str
- n_bootstraps: int | None
- original_name: str | None
- original_path: str | None
- protocol: str
- run_id: str
- status: Literal['success', 'failed', 'unbound']
- system_name: str
- temperature: float
- total_dG: float
- total_se: float
- windows: List[WindowResult]
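The created_at format (ISO-8601, UTC, Z suffix) can be produced with the standard library. The helper utc_timestamp is hypothetical, and second-level precision is an assumption; the record may store finer resolution.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def utc_timestamp(now: Optional[datetime] = None) -> str:
    """Format a timezone-aware datetime as ISO-8601 UTC with a Z suffix."""
    moment = now if now is not None else datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


fixed = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
stamp = utc_timestamp(fixed)

# A non-UTC input is converted to UTC before formatting.
plus_two = datetime(2024, 1, 2, 5, 4, 5, tzinfo=timezone(timedelta(hours=2)))
stamp_converted = utc_timestamp(plus_two)
```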
- class batter.runtime.fe_repo.FEResultsRepository(store: ArtifactStore)[source]
- index() DataFrame[source]
- ligand_dir(run_id: str, ligand: str) Path[source]
- load(run_id: str, ligand: str) FERecord[source]
- record_failure(run_id: str, ligand: str, system_name: str, temperature: float, *, status: Literal['failed', 'unbound'], reason: str | None = None, canonical_smiles: str | None = None, original_name: str | None = None, original_path: str | None = None, protocol: str = 'abfe', analysis_start_step: int | None = None, n_bootstraps: int | None = None) None[source]
- save(rec: FERecord, copy_from: Path | None = None) None[source]
- set_analysis_inclusion(*, run_id: str, ligand: str, include: bool, analysis_start_step: int | None = None, n_bootstraps: int | None = None) int[source]
Set include_in_analysis for matching rows in results/index.csv.
- class batter.runtime.fe_repo.WindowResult(*, component: str, lam: float, dG: float, dG_se: float = 0.0, n_samples: int = 0, meta: Dict[str, ~typing.Any]=<factory>)[source]
Result for a single lambda window/component.
- Parameters:
component (str) – Component key (e.g., ‘e’, ‘v’, ‘z’).
lam (float) – Lambda value in [0, 1].
dG (float) – Free-energy increment (kcal/mol).
dG_se (float) – Standard error (kcal/mol).
n_samples (int) – Samples (or effective sample size).
meta (dict) – Extra metadata.
- component: str
- dG: float
- dG_se: float
- lam: float
- meta: Dict[str, Any]
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- n_samples: int
Systems Modules#
- class batter.systems.core.CreateSystemLike(*args, **kwargs)[source]
Structural typing interface for inputs to a system builder.
Notes
This Protocol is intentionally minimal to avoid import cycles with Pydantic models. Any object with these attributes (e.g., a Pydantic model instance) satisfies the protocol.
- anchor_atoms: Sequence[str]
- ligand_ff: str
- ligand_paths: Sequence[Path]
- lipid_mol: Sequence[str]
- other_mol: Sequence[str]
- overwrite: bool
- protein_input: Path | None
- retain_lig_prot: bool
- system_coordinate: Path | None
- system_name: str
- system_topology: Path | None
- class batter.systems.core.SimSystem(name: str, root: Path, topology: Path | None = None, coordinates: Path | None = None, protein: Path | None = None, ligands: Tuple[Path, ...]=(), lipid_mol: Tuple[str, ...]=(), other_mol: Tuple[str, ...]=(), anchors: Tuple[str, ...]=(), meta: SystemMeta = <factory>)[source]
Immutable descriptor of a simulation system and its on-disk artifacts.
- Parameters:
name (str) – Logical system name (e.g., "AT1R_AAI").
root (pathlib.Path) – Working directory where artifacts live. This directory is considered relocatable; other modules should store relative paths when possible.
topology (pathlib.Path, optional) – Path to an explicit topology (e.g., AMBER PRMTOP). May be None if the builder generates it later.
coordinates (pathlib.Path, optional) – Coordinates or restart file (e.g., RST7/INPCRD).
protein (pathlib.Path, optional) – Input protein structure file (PDB/mmCIF).
ligands (tuple[pathlib.Path, …]) – One or more ligand structure files.
lipid_mol (tuple[str, …]) – Lipid names present in the system (e.g., ("POPC",)).
other_mol (tuple[str, …]) – Other cofactors present in the system.
anchors (tuple[str, …]) – Anchor atoms in the form "RESID@ATOM" (e.g., "85@CA").
meta (SystemMeta) – Free-form metadata bundle for provenance (e.g., software versions).
- anchors: Tuple[str, ...]
- coordinates: Path | None
- ligands: Tuple[Path, ...]
- lipid_mol: Tuple[str, ...]
- meta: SystemMeta
- name: str
- other_mol: Tuple[str, ...]
- path(*parts: str | Path) Path[source]
Join root with the provided path segments.
- Parameters:
*parts (str or Path) – Relative path components appended in order.
- Returns:
Absolute path pointing inside root.
- Return type:
pathlib.Path
- protein: Path | None
- root: Path
- topology: Path | None
- with_artifacts(**kw) SimSystem[source]
Return a new SimSystem with updated artifact attributes.
Examples
>>> sys = SimSystem(name="X", root=Path("work/X"))
>>> sys2 = sys.with_artifacts(topology=Path("work/X/top.prmtop"))
- with_meta(**updates: Any) SimSystem[source]
Return a copy of the system with merged metadata.
- Parameters:
**updates – Keyword arguments forwarded to SystemMeta.merge().
- Returns:
Copy of the system containing the updated metadata bundle.
- Return type:
SimSystem
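The copy-on-update behaviour of an immutable system descriptor can be sketched with a frozen dataclass. SimSystemSketch is a reduced stand-in with only a few of the documented fields, and the use of dataclasses.replace is an assumption about how such copies could be made.

```python
import dataclasses
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class SimSystemSketch:
    """Reduced stand-in for SimSystem (subset of fields only)."""
    name: str
    root: Path
    topology: Optional[Path] = None
    anchors: Tuple[str, ...] = ()

    def path(self, *parts: Union[str, Path]) -> Path:
        # Join the root with each segment in order.
        return self.root.joinpath(*parts)

    def with_artifacts(self, **kw: object) -> "SimSystemSketch":
        # Returns a new object; the original descriptor is left untouched.
        return dataclasses.replace(self, **kw)


system = SimSystemSketch(name="X", root=Path("work/X"))
system2 = system.with_artifacts(topology=Path("work/X/top.prmtop"))
window_dir = system.path("fe", "z-1")
```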
- class batter.systems.core.SystemBuilder(*args, **kwargs)[source]
Interface for creating or updating on-disk artifacts for a system.
- build(system: SimSystem, args: CreateSystemLike) SimSystem[source]
Materialize artifacts for system using args, returning an updated SimSystem. Implementations must be idempotent: calling build twice with the same inputs must produce the same state without corrupting outputs.
- class batter.systems.core.SystemMeta(ligand: str | None = None, residue_name: str | None = None, mode: str | None = None, param_dir_dict: Dict[str, str]=<factory>, extras: Dict[str, ~typing.Any]=<factory>)[source]
Structured metadata attached to a SimSystem.
- Parameters:
ligand (str, optional) – Ligand identifier associated with the system (if applicable).
residue_name (str, optional) – Residue name used for the ligand.
mode (str, optional) – High-level mode indicator (e.g., "MABFE" vs "MASFE").
param_dir_dict (dict[str, str]) – Mapping from residue names to parameter storage directories.
extras (dict[str, Any]) – Additional context stored alongside the known fields.
- extras: Dict[str, Any]
- classmethod from_mapping(data: Mapping[str, Any] | None) SystemMeta[source]
Construct a SystemMeta from a mapping-like object.
- Parameters:
data (mapping or None) – Source metadata. If already a SystemMeta, it is returned.
- Returns:
Normalised metadata object.
- Return type:
SystemMeta
- get(key: str, default: Any = None) Any[source]
Retrieve a value by key with an optional default.
- Parameters:
key (str) – Metadata key.
default (Any, optional) – Value returned when the key is missing.
- Returns:
Stored value or the default.
- Return type:
Any
- ligand: str | None
- merge(**updates: Any) SystemMeta[source]
Create a new SystemMeta with updated values.
- Parameters:
**updates – Keyword overrides applied to the existing metadata.
- Returns:
New instance containing the merged metadata.
- Return type:
SystemMeta
- mode: str | None
- param_dir_dict: Dict[str, str]
- residue_name: str | None
- to_dict() Dict[str, Any][source]
Convert the metadata to a plain dictionary.
- Returns:
All known fields plus extra entries.
- Return type:
dict[str, Any]
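A metadata bundle with known fields plus free-form extras can be sketched as follows. MetaSketch is a stand-in with a subset of the fields. Routing unknown keys into extras during merge() is an assumption about the behaviour, not something stated above.

```python
import dataclasses
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class MetaSketch:
    """Reduced stand-in for SystemMeta."""
    ligand: Optional[str] = None
    mode: Optional[str] = None
    extras: Dict[str, Any] = field(default_factory=dict)

    def merge(self, **updates: Any) -> "MetaSketch":
        known = {f.name for f in dataclasses.fields(self)} - {"extras"}
        fields_ = {k: v for k, v in updates.items() if k in known}
        # Assumption: keys that are not declared fields land in ``extras``.
        unknown = {k: v for k, v in updates.items() if k not in known}
        return dataclasses.replace(self, extras={**self.extras, **unknown}, **fields_)

    def to_dict(self) -> Dict[str, Any]:
        data: Dict[str, Any] = {"ligand": self.ligand, "mode": self.mode}
        data.update(self.extras)
        return data


meta = MetaSketch(ligand="LIG1")
merged = meta.merge(mode="MABFE", box="cubic")
flat = merged.to_dict()
```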
- class batter.systems.mabfe.MABFEBuilder(*args, **kwargs)[source]
Builder for membrane/absolute free-energy (MABFE) systems.
This builder prepares a shared working directory under system.root and, optionally, stages all ligands at once into per-ligand subfolders.
Directory layout (relative to system.root):
inputs/ # canonical copies of user-provided inputs
artifacts/ # files produced by builders (e.g., PRMTOP, RST7)
simulations/
  <LIG1>/
    inputs/ligand.<ext>
    artifacts/
  <LIG2>/
    inputs/ligand.<ext>
    artifacts/
  ...
- build(system: SimSystem, args: CreateSystemLike) SimSystem[source]
Prepare the shared system area (stage protein/topology/coordinates/inputs).
Uses the actual suffixes from user inputs (no hard-coded extensions).
- build_all_ligands(parent: SimSystem, lig_paths: Sequence[Path], overwrite: bool = False) Dict[str, SimSystem][source]
Stage all ligands at once under parent.root/simulations/<NAME>/....
Ligands are copied as inputs/ligand.<ext> using each source’s suffix.
- static make_child_for_ligand(parent: SimSystem, lig_name: str, lig_src: Path) SimSystem[source]
Create a single per-ligand child system under simulations/<NAME>/ with ligand.<ext>.
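The per-ligand staging layout can be sketched with pathlib and shutil. The helper stage_ligands is hypothetical, and deriving <NAME> from the upper-cased file stem is an assumption; the builder may name ligands differently.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, Sequence


def stage_ligands(root: Path, lig_paths: Sequence[Path], overwrite: bool = False) -> Dict[str, Path]:
    """Copy each ligand to simulations/<NAME>/inputs/ligand.<ext>."""
    staged: Dict[str, Path] = {}
    for src in lig_paths:
        name = src.stem.upper()  # assumed naming rule
        child = root / "simulations" / name
        (child / "inputs").mkdir(parents=True, exist_ok=True)
        (child / "artifacts").mkdir(exist_ok=True)
        dst = child / "inputs" / f"ligand{src.suffix}"  # keep the source suffix
        if overwrite or not dst.exists():
            shutil.copy2(src, dst)
        staged[name] = child
    return staged


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "work"
    sources = [Path(tmp) / "lig1.sdf", Path(tmp) / "lig2.mol2"]
    for src in sources:
        src.write_text("placeholder\n")

    staged = stage_ligands(root, sources)
    staged_files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("ligand.*")
    )
```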
- batter.systems.mabfe.make_ligand_subsystem(parent: SimSystem, lig_name: str, lig_src: Path) SimSystem[source]
- batter.systems.mabfe.prepare_subsystems_for_ligands(parent: SimSystem, lig_paths: Iterable[Path]) Dict[str, SimSystem][source]
- class batter.systems.masfe.MASFEBuilder(*args, **kwargs)[source]
Builder for membrane-free (solvation) absolute free-energy (MASFE) systems.
This builder prepares a shared working directory under system.root and, optionally, stages all ligands at once into per-ligand subfolders.
Differences vs MABFE:
No protein/topology/coordinates are required or staged.
The resulting SimSystem stores None for protein, topology, and coordinates.
Directory layout (relative to system.root):
inputs/ # canonical copies of user-provided ligand inputs
artifacts/ # files produced by builders
simulations/
  <LIG1>/
    inputs/ligand.<ext>
    artifacts/
  <LIG2>/
    inputs/ligand.<ext>
    artifacts/
  ...
- build(system: SimSystem, args: CreateSystemLike) SimSystem[source]
Prepare the shared system area (stage ligand inputs).
Uses the actual suffixes from user inputs (no hard-coded extensions).
- build_all_ligands(parent: SimSystem, lig_paths: Sequence[Path], overwrite: bool = False) Dict[str, SimSystem][source]
Stage all ligands at once under parent.root/simulations/<NAME>/.... Ligands are copied as inputs/ligand.<ext> using each source’s suffix.
- static make_child_for_ligand(parent: SimSystem, lig_name: str, lig_src: Path) SimSystem[source]
Create a single per-ligand child system under simulations/<NAME>/ with ligand.<ext>.
- batter.systems.masfe.make_ligand_subsystem_masfe(parent: SimSystem, lig_name: str, lig_src: Path) SimSystem[source]
- batter.systems.masfe.prepare_subsystems_for_ligands_masfe(parent: SimSystem, lig_paths: Iterable[Path]) Dict[str, SimSystem][source]
Analysis Modules#
- class batter.analysis.analysis.BoreschAnalysis(disangfile, k_r, k_a, temperature)[source]
Bases: FEAnalysisBase
Initialize the Boresch analysis with the disang file and parameters.
- Parameters:
disangfile (str) – The path to the disang file containing the anchor atoms.
k_r (float) – The force constant for the translation restraint.
k_a (float) – The force constant for the angle and dihedral restraints. The same value is applied to both, although in general they need not be equal.
temperature (float) – The temperature in Kelvin for the analysis.
- static fe_int(r1_0, a1_0, t1_0, a2_0, t2_0, t3_0, k_r, k_a, temperature)[source]
Calculate the analytical free energy of the Boresch restraint. Adapted from BAT.py.
- plot_convergence(ax=None, **kwargs)[source]
Not applicable: analytical results have no convergence to plot.
- run_analysis()[source]
Run the analytical analysis for Boresch restraint.
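For orientation, the closed-form expression commonly used for six harmonic Boresch-style restraints can be sketched as follows. This is an illustration under stated assumptions and is not taken from `fe_int`: the helper name `boresch_analytical_dg` is hypothetical, every restraint is assumed to have the form U = 0.5 K (x - x0)^2, force constants are in kcal/mol/Å² and kcal/mol/rad², angles are in degrees, and the standard-state volume is taken as 1660 Å³. If the restraint energy is defined without the factor 0.5, the force constants must be doubled first.

```python
import math

R_KCAL = 1.987204259e-3  # gas constant in kcal/(mol K)
V0 = 1660.0              # standard-state volume in cubic angstroms (1 mol/L)


def boresch_analytical_dg(r0, theta_a, theta_b, k_r, k_a, temperature):
    """Closed-form restraint free energy in kcal/mol (hypothetical helper).

    One distance constant k_r is used, and k_a is shared by the two angles
    and the three dihedrals, matching the k_r / k_a split described above.
    """
    kt = R_KCAL * temperature
    force_product = k_r * k_a ** 5  # one distance, two angles, three dihedrals
    numerator = 8.0 * math.pi ** 2 * V0 * math.sqrt(force_product)
    denominator = (
        r0 ** 2
        * math.sin(math.radians(theta_a))
        * math.sin(math.radians(theta_b))
        * (2.0 * math.pi * kt) ** 3
    )
    return -kt * math.log(numerator / denominator)


dg = boresch_analytical_dg(5.0, 90.0, 90.0, k_r=20.0, k_a=200.0, temperature=298.15)
dg_stiff = boresch_analytical_dg(5.0, 90.0, 90.0, k_r=40.0, k_a=400.0, temperature=298.15)
print(f"{dg:.2f} kcal/mol, stiffer restraints: {dg_stiff:.2f} kcal/mol")
```

In this form the equilibrium dihedral values do not enter the result, and stiffer restraints give a more negative value.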
- class batter.analysis.analysis.FEAnalysisBase[source]
Bases: ABC
Minimal interface shared across component analysis routines.
- Variables:
results (dict) – Storage for the scalar FE, uncertainty, convergence tables, and FE time series generated by subclasses.
- property convergence
- dump(filename='results.json')[source]
Store results to JSON (omit heavy convergence tables).
- property fe
- property fe_error
- property fe_timeseries
- abstractmethod plot_convergence(ax=None, **kwargs)[source]
- abstractmethod run_analysis()[source]
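The shape of such an interface can be sketched with a stand-alone class. This is a hypothetical stand-in, not the BATTER class: the name `SketchAnalysisBase`, the subclass `ConstantAnalysis`, and the keys used in `results` (`fe`, `fe_error`, `convergence`) are assumptions chosen for illustration.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SketchAnalysisBase(ABC):
    """Hypothetical stand-in for an analysis base class."""

    def __init__(self):
        self.results = {}

    @property
    def fe(self):
        return self.results["fe"]

    @property
    def fe_error(self):
        return self.results["fe_error"]

    @abstractmethod
    def run_analysis(self):
        """Subclasses fill self.results here."""

    def dump(self, filename="results.json"):
        # Leave out the heavy convergence table so the JSON stays small.
        light = {k: v for k, v in self.results.items() if k != "convergence"}
        Path(filename).write_text(json.dumps(light, indent=2))


class ConstantAnalysis(SketchAnalysisBase):
    def run_analysis(self):
        self.results = {"fe": -1.5, "fe_error": 0.1, "convergence": [[0.0] * 100]}


analysis = ConstantAnalysis()
analysis.run_analysis()
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "results.json"
    analysis.dump(out)
    saved = json.loads(out.read_text())
print(saved)
```

Subclasses only need to implement the abstract methods and populate `results`; the properties and `dump` then work unchanged.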
- class batter.analysis.analysis.MBARAnalysis(lig_folder: str, component: str, windows: List[int], temperature: float, energy_unit: str = 'kcal/mol', analysis_start_step: int = 0, detect_equil: bool = True, n_bootstraps: int = 0, n_jobs: int = 6, load: bool = False, dt: float = 0.0, ntwx: int | None = None)[source]
Bases: FEAnalysisBase
Post-process a single component with alchemlyb.estimators.MBAR.
- Parameters:
lig_folder (str) – Absolute path to the ligand work directory.
component (str) – Component identifier (e.g., "e" or "m").
windows (list[int]) – Lambda windows present for the component.
temperature (float) – Simulation temperature in Kelvin.
energy_unit ({“kcal/mol”, “kJ/mol”, “kT”}, optional) – Output energy unit. Internally every value is accumulated in units of kT and converted before publishing the results.
analysis_start_step (int, optional) – Discard frames with step <= this value before analysis.
detect_equil (bool, optional) – When True, the equilibration time of each window is detected and the pre-equilibrated portion is discarded.
n_bootstraps (int, optional) – Number of bootstrap samples handed to MBAR.
n_jobs (int, optional) – Level of joblib parallelism when parsing windows.
load (bool, optional) – When True, reuse cached *_df_list.pickle files if available.
- property data_list: List[DataFrame]
- get_mbar_data() None[source]
Parse and cache the reduced potentials for all lambda windows.
Notes
The concatenated dataframe is stored in self._u_df, while the list of per-window frames is available via data_list.
- plot_block_convergence(ax=None, **kwargs)[source]
- plot_convergence(save_path: str | None = None, title: str | None = None)[source]
- plot_overlap_matrix(ax=None, **kwargs)[source]
- plot_time_convergence(ax=None, **kwargs)[source]
- run_analysis() None[source]
- property u_df: DataFrame
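The `energy_unit` conversion mentioned in the parameter list amounts to multiplying a value in kT by the gas constant and the temperature. A minimal sketch, assuming a hypothetical helper `from_kt` that is not part of BATTER:

```python
# Gas constant in the two supported molar energy units, per kelvin.
GAS_CONSTANT = {"kcal/mol": 1.987204259e-3, "kJ/mol": 8.314462618e-3}


def from_kt(value_kt, temperature, energy_unit="kcal/mol"):
    """Convert a free energy expressed in kT into the requested unit."""
    if energy_unit == "kT":
        return value_kt
    return value_kt * GAS_CONSTANT[energy_unit] * temperature


one_kt_kcal = from_kt(1.0, 298.15, "kcal/mol")
one_kt_kj = from_kt(1.0, 298.15, "kJ/mol")
print(f"1 kT at 298.15 K = {one_kt_kcal:.4f} kcal/mol = {one_kt_kj:.4f} kJ/mol")
```

Accumulating in kT and converting once at the end avoids mixing units between windows.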
- class batter.analysis.analysis.RESTMBARAnalysis(lig_folder: str, component: str, windows: List[int], temperature: float, energy_unit: str = 'kcal/mol', analysis_start_step: int = 0, detect_equil: bool = True, n_bootstraps: int = 0, n_jobs: int = 6, load: bool = False, dt: float = 0.0, ntwx: int | None = None)[source]
Bases: MBARAnalysis
MBAR analysis variant for restraint components that require cpptraj traces.
- class batter.analysis.analysis.SilenceAlchemlybOnly[source]
Bases: object
- batter.analysis.analysis.analyze_lig_task(lig_path: str, lig: str, components: List[str], rest: Tuple[float, float, float, float, float], temperature: float, water_model: str, component_windows_dict: Dict[str, List[int]], rocklin_correction: bool = False, analysis_start_step: int = 0, raise_on_error: bool = True, mol: str = 'LIG', n_workers: int = 4, n_bootstraps: int = 0, dt: float = 0.0, ntwx: int = 0)[source]
Analyze one ligand under lig_path for the requested components.
- batter.analysis.analysis.generate_results_rest(md_sim_files: List[str], comp: str, blocks: int = 5, top: str = 'full') None[source]
Build a cpptraj input on the fly from the ‘restraints.in’ template in the current working directory, swapping the topology to ../{comp}-1/{top}.prmtop and appending the trajin lines.
Helpers for converting BATTER RBFE results into Cinnabar FEMap objects.
- class batter.analysis.cinnabar.CinnabarConversionResult(femap: 'Any', edge_summary: 'pd.DataFrame', raw_signed: 'pd.DataFrame', merge_bidirectional: 'bool' = True, exp_summary: 'pd.DataFrame | None' = None, absolute_summary: 'pd.DataFrame | None' = None, absolute_warning: 'str | None' = None, ligand_assets: 'dict[str, dict[str, str]]'=<factory>, edge_assets: 'dict[str, dict[str, str]]'=<factory>)[source]
Bases: object
- absolute_summary: DataFrame | None = None
- absolute_warning: str | None = None
- edge_assets: dict[str, dict[str, str]]
- edge_summary: DataFrame
- exp_summary: DataFrame | None = None
- femap: Any
- ligand_assets: dict[str, dict[str, str]]
- merge_bidirectional: bool = True
- raw_signed: DataFrame
- batter.analysis.cinnabar.auto_write_rbfe_cinnabar_for_run(work_dir: str | Path, run_id: str, *, out_dir: str | Path | None = None, combine_by_run_first: bool = True, merge_bidirectional: bool = True, write_plots: bool = True, write_cycle_closure: bool = True, absolute_offset: float = 0.0) dict[str, Any][source]
Write a per-run RBFE Cinnabar bundle plus a replicate-aware follow-up note.
- batter.analysis.cinnabar.build_batter_rbfe_cinnabar(work_dir: str | Path, *, run_ids: Sequence[str] | None = None, ligands: Sequence[str] | None = None, edge_separator: str = '~', uncertainty_mode: Literal['ivw', 'sample', 'max'] = 'max', combine_by_run_first: bool = True, merge_bidirectional: bool = True, experimental_df: DataFrame | None = None, exp_ligand_column: str = 'ligand', exp_abfe_column: str = 'abfe', exp_error_column: str | None = None, exp_status_column: str | None = None, exp_success_value: str = 'success', exp_temperature_column: str | None = None, source: str = 'BATTER_RBFE', exp_source: str = 'experiment', exp_value_unit: Any = 'kcal/mol', exp_error_unit: Any = None) CinnabarConversionResult[source]
- batter.analysis.cinnabar.build_batter_rbfe_cinnabar_by_run(work_dir: str | Path, *, run_ids: Sequence[str] | None = None, ligands: Sequence[str] | None = None, edge_separator: str = '~', uncertainty_mode: Literal['ivw', 'sample', 'max'] = 'max', combine_by_run_first: bool = True, merge_bidirectional: bool = True, experimental_df: DataFrame | None = None, exp_ligand_column: str = 'ligand', exp_abfe_column: str = 'abfe', exp_error_column: str | None = None, exp_status_column: str | None = None, exp_success_value: str = 'success', exp_temperature_column: str | None = None, source: str = 'BATTER_RBFE', exp_source: str = 'experiment', exp_value_unit: Any = 'kcal/mol', exp_error_unit: Any = None) dict[str, CinnabarConversionResult][source]
- batter.analysis.cinnabar.build_batter_rbfe_cinnabar_from_runs(runs: Sequence[tuple[str | Path, str]], *, ligands: Sequence[str] | None = None, edge_separator: str = '~', uncertainty_mode: Literal['ivw', 'sample', 'max'] = 'max', combine_by_run_first: bool = True, merge_bidirectional: bool = True, experimental_df: DataFrame | None = None, exp_ligand_column: str = 'ligand', exp_abfe_column: str = 'abfe', exp_error_column: str | None = None, exp_status_column: str | None = None, exp_success_value: str = 'success', exp_temperature_column: str | None = None, source: str = 'BATTER_RBFE', exp_source: str = 'experiment', exp_value_unit: Any = 'kcal/mol', exp_error_unit: Any = None) CinnabarConversionResult[source]
- batter.analysis.cinnabar.convert_cinnabar_outputs_to_csv(bundle_dir: str | Path, out_dir: str | Path, *, relative_name: str = 'relative.csv', absolute_name: str = 'absolute.csv', require_absolute: bool = False) dict[str, Path][source]
Load a Cinnabar bundle directory and rewrite merged relative/absolute CSVs.
- batter.analysis.cinnabar.dataframe_to_cinnabar(rbfe_df: DataFrame, *, ligand_column: str = 'ligand', dg_column: str = 'total_dG', se_column: str = 'total_se', run_column: str = 'run_id', status_column: str = 'status', success_value: str = 'success', temperature_column: str = 'temperature', edge_separator: str = '~', source: str = 'BATTER_RBFE', uncertainty_mode: Literal['ivw', 'sample', 'max'] = 'max', combine_by_run_first: bool = True, merge_bidirectional: bool = True, experimental_df: DataFrame | None = None, exp_ligand_column: str = 'ligand', exp_abfe_column: str = 'abfe', exp_error_column: str | None = None, exp_status_column: str | None = None, exp_success_value: str = 'success', exp_temperature_column: str | None = None, exp_source: str = 'experiment', exp_value_unit: Any = 'kcal/mol', exp_error_unit: Any = None) CinnabarConversionResult[source]
Convert an RBFE dataframe into a Cinnabar FEMap and summary tables.
- batter.analysis.cinnabar.load_batter_rbfe_results(work_dir: str | Path, *, run_ids: Sequence[str] | None = None, ligands: Sequence[str] | None = None, edge_separator: str = '~') DataFrame[source]
Load stored BATTER FE records and keep only RBFE-like edge rows.
- batter.analysis.cinnabar.load_batter_rbfe_results_from_runs(runs: Sequence[tuple[str | Path, str]], *, ligands: Sequence[str] | None = None, edge_separator: str = '~') DataFrame[source]
Load RBFE rows from explicit (work_dir, run_id) inputs.
- batter.analysis.cinnabar.read_cinnabar_outputs(bundle_dir: str | Path, *, require_absolute: bool = False) tuple[DataFrame, DataFrame][source]
Read merged relative and absolute Cinnabar tables from an export bundle.
The *_uncorrected columns are copied from Cinnabar’s original relative and absolute CSVs. The *_cycle_closure columns are merged from the SFC outputs, cycle_closure_edges.csv and cycle_closure_nodes.csv.
- batter.analysis.cinnabar.summarize_directionality(edge_summary: DataFrame) dict[str, Any][source]
Summarize whether an edge table contains reciprocal directional pairs.
- batter.analysis.cinnabar.write_cinnabar_outputs(result: CinnabarConversionResult, out_dir: str | Path, *, method_name: str = 'BATTER', target_name: str = '', write_plots: bool = True, absolute_offset: float = 0.0, write_cycle_closure: bool = True) dict[str, Path][source]
Write stable on-disk outputs for a converted Cinnabar bundle.
State-function based free-energy correction for RBFE networks.
Acknowledgement#
This module implements the matrix-based State-Function Based Free Energy Correction (SFC) workflow for BATTER’s analysis API, following the article and supporting information cited below.
Reference#
Liu, R.; Lai, Y.; Yao, Y.; Huang, W.; Zhong, Y.; Luo, H.-B.; Li, Z. State Function-Based Correction: A Simple and Efficient Free-Energy Correction Algorithm for Large-Scale Relative Binding Free-Energy Calculations. J. Phys. Chem. Lett. 2025, 16, 23, 5763-5768. doi:10.1021/acs.jpclett.5c01119
The historical cycle_closure_* function names are kept for compatibility
with the existing BATTER Cinnabar integration. They now run SFC/WSFC rather
than the earlier cycle-enumeration WCC algorithm.
- class batter.analysis.cycle_closure.CycleClosureEdge(label_a: str, label_b: str, ddg: float, uncertainties: tuple[float, ...] = ())[source]
Bases: object
One directed RBFE edge used as SFC input.
- Parameters:
label_a, label_b – Ligand labels defining the edge direction.
ddg (float) – Relative free energy for label_a -> label_b.
uncertainties (tuple[float, …]) – Optional standard-error columns. Each supplied column creates one WSFC estimate using uncertainty-derived weights.
- ddg: float
- label_a: str
- label_b: str
- uncertainties: tuple[float, ...] = ()
- class batter.analysis.cycle_closure.CycleClosureResult(reference: str, reference_free_energy: float, node_results: DataFrame, edge_results: DataFrame, cycles: tuple[tuple[str, ...], ...] = (), iterations: tuple[int, ...] = (), converged: tuple[bool, ...] = (), method: str = 'sfc', schemes: tuple[str, ...] = ())[source]
Bases: object
SFC result tables and metadata.
- converged: tuple[bool, ...] = ()
- cycles: tuple[tuple[str, ...], ...] = ()
- edge_results: DataFrame
- iterations: tuple[int, ...] = ()
- method: str = 'sfc'
- node_results: DataFrame
- reference: str
- reference_free_energy: float
- schemes: tuple[str, ...] = ()
- batter.analysis.cycle_closure.StateFunctionCorrectionEdge
alias of
CycleClosureEdge
- batter.analysis.cycle_closure.StateFunctionCorrectionResult
alias of
CycleClosureResult
- batter.analysis.cycle_closure.calculate_cycle_closure(edges: Iterable[CycleClosureEdge | Sequence[object]], *, reference: str | None = None, reference_free_energy: float = 0.0, reference_weight: float = 1000000.0, require_cycles: bool | None = None, **_compat_kwargs) CycleClosureResult[source]
Run SFC/WSFC correction on an RBFE graph.
require_cycles and extra keyword arguments are accepted for compatibility with the previous WCC implementation. SFC does not enumerate cycles and can operate on any connected RBFE graph.
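The idea of making edge values consistent with a state function can be illustrated with an unweighted least-squares fit of node free energies. This is a sketch under assumptions, not necessarily the matrix formulation used by this module: the helper names are hypothetical, and pinning the reference node with one heavily weighted equation is an assumed reading of the `reference_weight` argument.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares_node_energies(edges, reference, reference_free_energy=0.0,
                                reference_weight=1.0e6):
    """Fit node values G so that G[b] - G[a] is close to each edge ddG."""
    nodes = sorted({name for a, b, _ in edges for name in (a, b)})
    idx = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    normal = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for a, b, ddg in edges:  # normal equations of sum (G[b] - G[a] - ddg)**2
        i, j = idx[a], idx[b]
        normal[i][i] += 1.0
        normal[j][j] += 1.0
        normal[i][j] -= 1.0
        normal[j][i] -= 1.0
        rhs[j] += ddg
        rhs[i] -= ddg
    k = idx[reference]  # soft pin that fixes the arbitrary offset
    normal[k][k] += reference_weight
    rhs[k] += reference_weight * reference_free_energy
    return dict(zip(nodes, solve_linear(normal, rhs)))


# A three-ligand cycle whose raw edges do not close: 1.0 + 1.0 != 2.3.
edges = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "C", 2.3)]
g = least_squares_node_energies(edges, reference="A")
corrected = {(a, b): g[b] - g[a] for a, b, _ in edges}
print(corrected)
```

Because corrected edges are differences of node values, they sum to zero around any cycle by construction, which is why no cycle enumeration is needed.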
- batter.analysis.cycle_closure.calculate_state_function_correction(edges: Iterable[CycleClosureEdge | Sequence[object]], *, reference: str | None = None, reference_free_energy: float = 0.0, reference_weight: float = 1000000.0, require_cycles: bool | None = None, **_compat_kwargs) CycleClosureResult
Run SFC/WSFC correction on an RBFE graph.
require_cycles and extra keyword arguments are accepted for compatibility with the previous WCC implementation. SFC does not enumerate cycles and can operate on any connected RBFE graph.
- batter.analysis.cycle_closure.cycle_closure_from_dataframe(df: DataFrame, *, label_a_col: str = 'labelA', label_b_col: str = 'labelB', ddg_col: str | None = None, uncertainty_cols: Sequence[str] | None = None, reference: str | None = None, reference_free_energy: float = 0.0, **kwargs) CycleClosureResult[source]
Build SFC input from a dataframe and run the correction.
- batter.analysis.cycle_closure.cycle_closure_from_file(path: str | Path, *, reference: str | None = None, reference_free_energy: float = 0.0, **kwargs) CycleClosureResult[source]
Read an SFC-style input file and run state-function correction.
- batter.analysis.cycle_closure.read_cycle_closure_file(path: str | Path) DataFrame[source]
Read a whitespace-delimited SFC input file.
The first three columns are named labelA, labelB, and ddG. Additional columns are treated as standard-error columns named std1, std2, etc.
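The column naming can be illustrated with a small stand-alone parser. This is a sketch only: `parse_sfc_lines` is a hypothetical helper that returns plain dictionaries rather than a DataFrame, and it assumes the file has no header row. Skipping blank lines and lines starting with `#` is also an assumption.

```python
def parse_sfc_lines(lines):
    """Parse whitespace-delimited rows into dicts with SFC-style column names."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # assumption: blank and comment lines are ignored
        row = {"labelA": fields[0], "labelB": fields[1], "ddG": float(fields[2])}
        # Any further columns become std1, std2, ... in order.
        for i, value in enumerate(fields[3:], start=1):
            row[f"std{i}"] = float(value)
        rows.append(row)
    return rows


example = [
    "lig1  lig2   0.85  0.10  0.12",
    "lig2  lig3  -1.20  0.15  0.11",
]
rows = parse_sfc_lines(example)
print(rows[0])
```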
- batter.analysis.cycle_closure.read_state_function_correction_file(path: str | Path) DataFrame
Read a whitespace-delimited SFC input file.
The first three columns are named labelA, labelB, and ddG. Additional columns are treated as standard-error columns named std1, std2, etc.
- batter.analysis.cycle_closure.state_function_correction_from_dataframe(df: DataFrame, *, label_a_col: str = 'labelA', label_b_col: str = 'labelB', ddg_col: str | None = None, uncertainty_cols: Sequence[str] | None = None, reference: str | None = None, reference_free_energy: float = 0.0, **kwargs) CycleClosureResult
Build SFC input from a dataframe and run the correction.
- batter.analysis.cycle_closure.state_function_correction_from_file(path: str | Path, *, reference: str | None = None, reference_free_energy: float = 0.0, **kwargs) CycleClosureResult
Read an SFC-style input file and run state-function correction.
Utilities for inspecting replica-exchange simulations.
- class batter.analysis.remd.RemdLog(inputfile: str)[source]
Bases: object
Read and analyse AMBER remlog files.
The parser reconstructs the replica ↔ state mapping at each exchange step and reports high-level metrics such as average single-pass duration and the number of round trips.
- Parameters:
inputfile (str) – Path to the remlog text file produced by AMBER.
- analyze() Dict[str, float | List[float]][source]
Summarise the replica trajectory.
- Returns:
Dictionary with the same keys as get_remd_info().
- Return type:
dict
- classmethod get_remd_info(inputfile: str) Dict[str, float | List[float]][source]
Convenience helper that parses and analyses a remlog file.
- Parameters:
inputfile (str) – Path to the remlog text file.
- Returns:
Same structure as analyze().
- Return type:
dict
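To make the round-trip metric concrete, the sketch below counts round trips for a single replica given the sequence of states it visits. The definition used here (lowest state, then highest state, then back to the lowest) is one common convention and an assumption about what this class reports. `count_round_trips` is a hypothetical helper.

```python
def count_round_trips(states, n_states):
    """Count bottom -> top -> bottom traversals in one replica's state walk."""
    if n_states < 2:
        return 0
    bottom, top = 0, n_states - 1
    trips = 0
    target = None  # stays None until the replica first visits the bottom state
    for state in states:
        if target is None:
            if state == bottom:
                target = top
        elif state == target:
            if target == bottom:
                trips += 1  # returned to the bottom after reaching the top
            target = top if target == bottom else bottom
    return trips


walk = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0]
n_trips = count_round_trips(walk, n_states=4)
print(n_trips)
```

A replica that never reaches one end of the ladder contributes zero round trips, which is a quick indicator of poor mixing.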
- batter.analysis.remd.plot_trajectory(replica_trajectory, figsize=(10, 6), alpha=0.8, linewidth=1.5, subplot=False, ncols=4)[source]
Visualise the replica walk through thermodynamic states.
- Parameters:
replica_trajectory (numpy.ndarray) – Array of shape (n_replica, n_step + 1) containing state indices.
figsize (tuple, optional) – Base figure size. When subplot=True the width/height apply to each panel instead of the aggregate.
alpha (float, optional) – Line transparency used for individual replica traces.
linewidth (float, optional) – Width of trajectory lines.
subplot (bool, optional) – When True, render one subplot per replica; otherwise plot all replicas on a shared axis.
ncols (int, optional) – Number of subplot columns when subplot=True.
Small numerical helpers used across batter.analysis.
- batter.analysis.utils.MakeChunksWithSize(istart: int, istop: int, size: int) List[List[int]][source]
Build index chunks covering [istart, istop) with approximately size elements.
- Parameters:
istart (int) – Starting index (inclusive).
istop (int) – Stopping index (exclusive).
size (int) – Target chunk size prior to merging trailing fragments.
- Returns:
Collection of contiguous index lists.
- Return type:
list[list[int]]
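A minimal sketch of this kind of chunking is shown below. It is not the library implementation: `make_chunks_with_size` is a hypothetical name, and the rule that a trailing fragment shorter than half the target size is merged into the previous chunk is an assumption made for illustration.

```python
def make_chunks_with_size(istart, istop, size):
    """Split [istart, istop) into contiguous index lists of about `size` items."""
    chunks = [
        list(range(lo, min(lo + size, istop)))
        for lo in range(istart, istop, size)
    ]
    # Assumed rule: fold a short trailing fragment into its predecessor.
    if len(chunks) > 1 and len(chunks[-1]) < size // 2:
        tail = chunks.pop()
        chunks[-1].extend(tail)
    return chunks


merged = make_chunks_with_size(0, 11, 5)
kept = make_chunks_with_size(0, 13, 5)
print([len(c) for c in merged], [len(c) for c in kept])
```

In the first call the single leftover index is absorbed by the second chunk; in the second call the three leftover indices are long enough to remain a chunk of their own.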
- batter.analysis.utils.MakeGroupedChunks(ene: ndarray, size: int) List[List[int]][source]
Merge adjacent chunks when their means are statistically indistinguishable.
- Parameters:
ene (numpy.ndarray) – One-dimensional array containing the energy trace used for grouping.
size (int) – Requested minimum chunk size prior to the adaptive merge step.
- Returns:
List of index groups representing contiguous frames with similar means.
- Return type:
list[list[int]]
- batter.analysis.utils.SizedChunks(lst: Iterable[int], n: int) Generator[List[int], None, None][source]
Yield successive n-sized chunks from an iterable.
- Parameters:
lst (Iterable[int]) – Source iterable that should be partitioned. The iterable is consumed, so pass a sequence (e.g. range) if it needs to be reused.
n (int) – Requested chunk size.
- Yields:
list[int] – Consecutive slices of length n (the final chunk may be shorter).
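The note about the iterable being consumed can be demonstrated with a generic chunking generator built on `itertools.islice`. This is a sketch of the general technique, with `sized_chunks` as a hypothetical name; it is not claimed to match the library code.

```python
from itertools import islice


def sized_chunks(iterable, n):
    """Yield lists of up to n items until the iterable is exhausted."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


# A range is a reusable sequence, so it survives being chunked.
from_range = list(sized_chunks(range(7), 3))

# A generator is a one-shot iterable: nothing is left after chunking.
gen = (i for i in range(4))
from_gen = list(sized_chunks(gen, 2))
leftover = list(gen)
print(from_range, from_gen, leftover)
```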
- batter.analysis.utils.exclude_outliers(df: DataFrame, iclam: int) DataFrame[source]
Remove energy spikes that would destabilise MBAR fits.
- Parameters:
df (pandas.DataFrame) – Reduced potential values with time points along the rows and lambda states in the columns.
iclam (int) – Index of the reference lambda column. The algorithm analyses this column to decide which trajectory chunks should be discarded.
- Returns:
Filtered dataframe with the same columns as df but potentially fewer rows if outliers were detected.
- Return type:
pandas.DataFrame
Notes
The implementation mirrors the heuristics used in the original fe-toolkit scripts: frames are chunked into ~200-sample blocks, grouped via a Welch t-test, and discarded whenever any lambda exhibits a value more than 3σ + 1000 kcal/mol below the reference median (after correcting for mixed precision offsets).