Cookbook

Task-oriented recipes for the workflows that landed in v4.x. Each section is self-contained — copy, paste, modify the inputs, run.

For background on the underlying CLI flags and Python API, see the programmatic API guide and the main README.

1. One file → one structured result

The lowest-friction path for notebooks and scripts. Replaces the older 15-positional-arg calc_bbe() constructor.

from goodvibes import compute_thermo

r = compute_thermo("ethane.log", QH=True, spc="TZ", temperature=313.15)

print(f"qh-G(T) = {r.qh_gibbs_free_energy:.6f} Hartree")
print(f"point group: {r.point_group}, σ = {r.symmno}")
print(f"level of theory (auto-detected): {r.level_of_theory}")
print(f"frequency scale factor (auto-applied): {r.bbe.scale_fac}")

compute_thermo returns a frozen ThermoResult dataclass with every attribute calc_bbe produces, plus r.bbe and r.qcdata for advanced reads. Defaults match the CLI: gas-phase concentration (P/RT), auto-lookup of the frequency scaling factor from the level of theory via the Truhlar database.

2. Batch a directory with parallel parsing → DataFrame

The most common notebook workflow: parse hundreds of conformers, filter, sort, export.

import glob
from goodvibes import compute_batch, to_dataframe
from goodvibes.constants import KCAL_TO_AU

paths = sorted(glob.glob("conformers/*.log"))
results = compute_batch(paths, jobs=8)        # 8 worker processes

df = to_dataframe(results)
df = df.sort_values("qh_gibbs_free_energy")
# Hartree -> kcal/mol; this assumes KCAL_TO_AU is the ~627.5 kcal/mol-per-Hartree factor.
df["ΔG_kcal"] = (df.qh_gibbs_free_energy - df.qh_gibbs_free_energy.min()) * KCAL_TO_AU

# Drop conformers more than 3 kcal/mol above the lowest
keep = df[df["ΔG_kcal"] < 3.0]
print(f"{len(keep)} of {len(df)} conformers within 3 kcal/mol of the lowest")
keep[["name", "qh_gibbs_free_energy", "ΔG_kcal"]].to_csv("survivors.csv", index=False)

jobs=0 uses all CPU cores. Output preserves input order. Pandas is optional — install with pip install goodvibes[full].

The same thing from the shell, no Python:

goodvibes conformers/*.log --jobs 8 --csv all_thermo.csv

3. N-way selectivity (replaces `--ee`)

The v4.1 redesign generalizes --ee a:b (2-bucket only) to N-way selectivity. Each bucket is named explicitly with --label NAME=PATTERN (repeatable). The patterns are fnmatch globs against the input filenames — no filesystem walks.

The example fixture in goodvibes/examples/selectivity/ is a Diels–Alder TS set: 8 transition states across two regiochemistries (1,2- vs 1,4-) and two diastereomers (exo / endo).

2-way (exo vs endo)

cd goodvibes/examples/selectivity
goodvibes DA_*.out --label exo='*_exo_*' --label endo='*_endo_*'

Selectivity, Boltzmann-averaged (gibbs, T = 298.15 K)
       Species   Files   Population (%)   ΔΔG (kcal/mol)
       exo           4             2.56            2.156
★      endo          4            97.44            0.000

Ratio exo:endo = 3:97   Major: endo   excess = 94.88%   ΔΔG = 2.16 kcal/mol

Selectivity, Lowest conformer only (gibbs, T = 298.15 K)
       Species   Files   Population (%)   ΔΔG (kcal/mol)
       exo           1             1.84            2.355
★      endo          1            98.16            0.000

Ratio exo:endo = 2:98   Major: endo   excess = 96.31%   ΔΔG = 2.36 kcal/mol

The two tables answer different questions: the Boltzmann-averaged table shows the selectivity once you average over conformers; the lowest-conformer table shows what the selectivity would be if only the most stable TS in each species mattered. The gap between them tells you how much of the selectivity is driven by conformer mixing. In this fixture, mixing lowers ΔΔG from 2.36 to 2.16 kcal/mol.
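To make the difference between the two tables concrete, here is a minimal sketch of both calculations, assuming each species' population is proportional to the summed Boltzmann weights exp(-G/RT) of its conformers. The function names and the conformer energies are illustrative and are not part of GoodVibes; the only number taken from the output above is the 97.44:2.56 ratio.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def species_populations(energies, temperature=298.15, lowest_only=False):
    """Return {species: population in %} from free energies in kcal/mol.

    energies maps a species name to a list of conformer free energies.
    With lowest_only=True only the most stable conformer of each species counts.
    """
    rt = R_KCAL * temperature
    # Shift by the global minimum so the exponentials stay well-scaled.
    ref = min(min(gs) for gs in energies.values())
    weights = {}
    for name, gs in energies.items():
        used = [min(gs)] if lowest_only else gs
        weights[name] = sum(math.exp(-(g - ref) / rt) for g in used)
    total = sum(weights.values())
    return {name: 100.0 * w / total for name, w in weights.items()}

def ddg_from_ratio(major, minor, temperature=298.15):
    """ΔΔG (kcal/mol) implied by a major:minor population ratio."""
    return R_KCAL * temperature * math.log(major / minor)

# Hypothetical conformer energies (kcal/mol), not taken from the fixture.
demo = {"exo": [2.4, 2.5, 2.9, 3.1], "endo": [0.0, 1.2, 1.8, 2.0]}
averaged = species_populations(demo)
lowest = species_populations(demo, lowest_only=True)
print("Boltzmann-averaged:", averaged)
print("lowest only:       ", lowest)
print(f"ΔΔG for 97.44:2.56 = {ddg_from_ratio(97.44, 2.56):.3f} kcal/mol")
```

The last line converts the Boltzmann-averaged populations printed above back into a free-energy gap, which should agree with the ΔΔG column to within rounding.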

4-way (regio × stereo)

goodvibes DA_*.out \
  --label exo_12='*_exo_12*'   --label endo_12='*_endo_12*' \
  --label exo_14='*_exo_14*'   --label endo_14='*_endo_14*'

For N > 2 the summary line drops excess and ΔΔG (those are 2-bucket concepts) and just reports the ratio — Ratio exo_12:endo_12:exo_14:endo_14 = 2:97:0:0.

Per-species subdirectories

If your conformers are organized into one directory per species, --label patterns are matched against the immediate parent directory’s basename in addition to the file’s basename. So a layout like

selectivity_separated/
  exo/
    DA_exo_12_i.out
    DA_exo_12_ii.out
    ...
  endo/
    DA_endo_12_i.out
    ...

works with the directory names as labels:

cd selectivity_separated
goodvibes */*out --label exo='exo*' --label endo='endo*'

The shell expands */*out to relative paths like exo/DA_exo_12_i.out, and the 'exo*' pattern matches the parent dir exo. The same patterns also keep working on flat layouts (where the species is encoded in the filename), so you don’t need to know in advance which layout your data uses.
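The matching rule described above can be sketched in a few lines. This is an illustration of the stated behavior (pattern tested against the file's basename and against its immediate parent directory's basename), not the library's actual implementation; label_matches and assign are hypothetical names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

def label_matches(path, pattern):
    """True if pattern matches the file's basename or its parent dir's basename."""
    p = PurePosixPath(path)
    return fnmatchcase(p.name, pattern) or fnmatchcase(p.parent.name, pattern)

def assign(paths, labels):
    """Map each label name to the paths its pattern matches (hypothetical helper)."""
    return {name: [p for p in paths if label_matches(p, pattern)]
            for name, pattern in labels.items()}

# Per-species subdirectories: the directory name carries the label.
nested = ["exo/DA_exo_12_i.out", "endo/DA_endo_12_i.out"]
print(assign(nested, {"exo": "exo*", "endo": "endo*"}))

# Flat layout: the species is encoded in the filename.
flat = ["DA_exo_12_i.out", "DA_endo_12_i.out"]
print(assign(flat, {"exo": "*_exo_*", "endo": "*_endo_*"}))
```

No filesystem access is needed for either case, which is consistent with the patterns being applied to the path strings the shell hands over.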

JSON output

Add --json results.json and the file gets two top-level blocks, selectivity and selectivity_lowest, each with the per-species populations, ΔΔG, ee (when N=2), and the source files for every species. Schema is 0.4.

Strip plot

To visualize where the selectivity comes from — lowest-TS gap vs conformer mixing — write a per-species ΔG strip plot:

goodvibes DA_*.out \
  --label exo='*_exo_*' --label endo='*_endo_*' \
  --strip-plot selectivity.png

The image shows one column per species with scattered conformer ΔG values (relative to the global lowest). A tight cluster near the bottom of a column means that species is dominated by its lowest conformer; a wide spread means conformer mixing is contributing.

In Python:

import glob

import matplotlib.pyplot as plt
from goodvibes import compute_batch
from goodvibes.selectivity import (
    compute_selectivity, parse_label_args, assign_files_to_labels,
)
from goodvibes.plot import plot_selectivity_strip

results = compute_batch(glob.glob("DA_*.out"))
thermo = {r.file: r.qh_gibbs_free_energy for r in results}
labels = parse_label_args(["exo=*_exo_*", "endo=*_endo_*"])
files_per_label = assign_files_to_labels(list(thermo), labels)
sel = compute_selectivity(thermo, files_per_label, 298.15)

ax = plot_selectivity_strip(sel, thermo)
plt.savefig("selectivity.png", dpi=200, bbox_inches="tight")

matplotlib is in the optional [plot] extras (or [full]) — install with pip install goodvibes[plot].

Migration from --ee

# v3.x
goodvibes *.log --ee 'P_R_*:P_S_*'

# v4.x equivalent
goodvibes *.log --label R='P_R_*' --label S='P_S_*'

--ee still works in v4.x with a DeprecationWarning; it’s slated for removal in v5.0.

4. PES with the new YAML format

The legacy line-based PES file (--- # PES markers) is auto-detected and still works, but it isn’t real YAML and has no stoichiometry support. v4.2 adds a proper YAML schema with pathways: / species: / format: top-level keys and a coeff*name syntax for stoichiometric sums.

# azabor_PES_v2.yaml
pathways:
  Ph:
    - "R1-An + Aza-Phos"
    - "R1-Comp + THF"
    - "AmTS + THF"
    - "Azir-Comp + THF"
    - "OpenTS + THF"
    - "Syn-P + THF"

species:
  R1-An:      {files: "r1-li-3thf-c1*"}
  Aza-Phos:   {files: "azaoxy-phosphine-full*"}
  THF:        {files: "thf*"}
  R1-Comp:    {files: "r1-phosphine-2thf-full*"}
  Azir-Comp:  {files: "aziridinium-phos-full*"}
  Syn-P:      {files: "syn-product-phos-full*"}
  OpenTS:     {files: "openTS-phos-full*"}
  AmTS:       {files: "aminationTS-full-unfrz-c1*"}

format:
  units: kcal/mol
  decimals: 1

Stoichiometric example: a step that consumes two equivalents of A and one of B writes its point as "2*A + B". Each species’ files: is a glob (single string) or an explicit list ([a.log, b.log]).
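As a rough illustration of the coeff*name syntax, the sketch below splits a point string into (coefficient, species) pairs and sums per-species energies. It assumes species names never contain "+" or "*"; parse_point and point_energy are hypothetical helpers, and the real loader may handle more cases.

```python
def parse_point(expr):
    """Split a point such as "2*A + B" into (coefficient, species) pairs."""
    terms = []
    for raw in expr.split("+"):
        term = raw.strip()
        if not term:
            raise ValueError(f"empty term in {expr!r}")
        if "*" in term:
            coeff_text, name = term.split("*", 1)
            coeff = float(coeff_text)
        else:
            coeff, name = 1.0, term  # a bare name means one equivalent
        terms.append((coeff, name.strip()))
    return terms

def point_energy(expr, species_energy):
    """Stoichiometric sum of per-species energies for one point."""
    return sum(coeff * species_energy[name] for coeff, name in parse_point(expr))

print(parse_point("2*A + B"))
print(parse_point("R1-An + Aza-Phos"))   # hyphens in names are left alone
print(point_energy("2*A + B", {"A": -1.0, "B": -2.5}))
```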

Assigning species by directory. When each species lives in its own subdirectory, use dir: (single) or dirs: (list) instead of file globs:

species:
  R1-An:      {dir: "R1-An"}
  Aza-Phos:   {dir: "Aza-Phos"}
  AmTS:       {dir: "AmTS"}
  # combine if a species has both subdir conformers and a separate
  # explicit file:
  THF:        {files: "thf_extra.log", dir: "THF"}

dir: matches files whose immediate parent directory’s basename equals the value (or matches it as an fnmatch glob — dir: "TS_*" catches every TS_R/, TS_S/, …). Trailing /, /* or /** on the dir name is ignored.
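The dir: rule can be approximated as follows. This is a sketch of the behavior described above (strip one trailing "/", "/*" or "/**", then compare against the parent directory's basename as an fnmatch glob); normalize_dir and dir_rule_matches are hypothetical names rather than GoodVibes functions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

def normalize_dir(value):
    """Drop one trailing "/**", "/*" or "/" from a dir: value."""
    for suffix in ("/**", "/*", "/"):
        if value.endswith(suffix):
            return value[: -len(suffix)]
    return value

def dir_rule_matches(path, dir_value):
    """True if the file's immediate parent directory matches the dir: value."""
    parent = PurePosixPath(path).parent.name
    return fnmatchcase(parent, normalize_dir(dir_value))

print(dir_rule_matches("R1-An/r1-li-3thf-c1.log", "R1-An/"))  # exact name, trailing slash
print(dir_rule_matches("TS_R/ts_a.log", "TS_*/**"))            # glob over directory names
print(dir_rule_matches("THF/thf.log", "TS_*"))                 # different directory
```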

Run it from the directory above the per-species subdirectories:

cd goodvibes/examples/pes_separated
goodvibes */*log --spc tzpop --pes azabor_PES.yaml

The shell */*log glob hands GoodVibes relative paths like R1-An/r1-li-3thf-c1.log — the dir: "R1-An" rule sees R1-An as the parent dir basename and assigns the file there.

Run the flat-layout example (the azabor_PES_v2.yaml shown earlier):

cd goodvibes/examples/pes
goodvibes *.log --pes azabor_PES_v2.yaml --spc sp_tzpop

By default each species’ contribution is gconf-corrected: lowest qh-G conformer + Boltzmann adjustment + the −R Σ pᵢ ln pᵢ mixing entropy. Two flags change that:

Mode	Flag	What it does
gconf (default)	—	lowest + adjustment + mixing entropy
pure Boltzmann	`--nogconf`	Boltzmann-weighted average, no mixing entropy
lowest only	`--lowest-only`	use each species’ single lowest qh-G conformer

The mode tag appears in the table title:

RXN: Ph  (kcal/mol)  at T = 298.15 K, p = 1 atm — lowest conformer per species
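The three modes can be compared numerically with a short sketch. It follows the description above literally: the Boltzmann-weighted average is the lowest conformer plus the Boltzmann adjustment, and gconf additionally subtracts T times the mixing entropy −R Σ pᵢ ln pᵢ. The function name and mode strings are hypothetical, and the exact formulas GoodVibes applies internally may differ in detail.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def species_free_energy(gs, temperature=298.15, mode="gconf"):
    """Collapse conformer free energies (kcal/mol) into one value per species."""
    rt = R_KCAL * temperature
    g_min = min(gs)
    if mode == "lowest":
        return g_min
    weights = [math.exp(-(g - g_min) / rt) for g in gs]
    total = sum(weights)
    pops = [w / total for w in weights]
    # Boltzmann-weighted average = lowest conformer + Boltzmann adjustment.
    averaged = sum(p * g for p, g in zip(pops, gs))
    if mode == "boltzmann":
        return averaged
    if mode == "gconf":
        # Mixing entropy -R * sum(p ln p); skip populations that underflowed to 0.
        s_mix = -R_KCAL * sum(p * math.log(p) for p in pops if p > 0.0)
        return averaged - temperature * s_mix
    raise ValueError(f"unknown mode: {mode!r}")

conformers = [0.0, 0.5, 1.2]  # hypothetical relative qh-G values, kcal/mol
for mode in ("gconf", "boltzmann", "lowest"):
    print(f"{mode:10s} {species_free_energy(conformers, mode=mode):+.3f}")
```

Under these assumptions the gconf value is never higher than the lowest conformer, while the pure Boltzmann average is never lower than it, so the three modes bracket each other in a predictable order.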

Reaction-profile diagram

goodvibes *.log --pes azabor_PES_v2.yaml --spc sp_tzpop \
                --pes-plot pes.png

Saves a step plot of the pathway’s qh-G profile (one column per point, a horizontal bar at each level, smooth Bézier connectors). Requires matplotlib: pip install goodvibes[plot].

If your PES YAML defines multiple pathways (e.g. an R-side and an S-side TS sharing reactants and products), --pes-plot overlays them on the same axes by default — different colors from the matplotlib cycle, with a legend and shared x-axis. Pathways must have the same number of points to be comparable.

For full control over colors, single-pathway selection, point annotations, or per-conformer scatter, drop down to the API:

from goodvibes.pes_loader import load_pes
from goodvibes.plot import plot_pes
import matplotlib.pyplot as plt

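# thermo_data is assumed to be an existing dict of file name -> thermo
# object exposing qh_gibbs_free_energy (see how it is used further down).
pes = load_pes("R_vs_S.yaml", thermo_data)        # 2-pathway YAML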
ax = plot_pes(
    pes,
    colors=["#26a6a4", "#e76f51"],     # custom per-pathway palette
    connector_style="bezier",          # or "linear"
    label_points=True,                 # annotate ΔqhG at each level
)
plt.savefig("R_vs_S.png", dpi=200, bbox_inches="tight")

# Or pick one pathway and overlay individual conformer dots:
ax = plot_pes(
    pes, pathway_index=0,
    show_conformers=True,
    thermo_lookup={f: bbe.qh_gibbs_free_energy
                   for f, bbe in thermo_data.items()},
)

The legacy --graph FILE.yaml flag is still supported and reads styling (dpi, color, title, legend, gridlines, ylim, …) from a YAML’s --- # FORMAT block. It will be deprecated in v5.1 once --pes-plot covers the remaining gaps.

5. PES + JSON for downstream analysis

goodvibes *.log --pes azabor_PES_v2.yaml --spc sp_tzpop --json pes.json

The JSON gets a pes block (schema v1.0):

import json

with open("pes.json") as f:
    payload = json.load(f)

for path in payload["pes"]["pathways"]:
    print(f"\n=== {path['name']} ({path['units']}) ===")
    for pt in path["points"]:
        print(f"  {pt['label']:25s}  ΔqhG = {pt['relative']['qh_g']:+7.2f}")

Each point carries label, species (name + coefficient + resolved files), and relative (Δ-values for E, ZPE, H, qh-H, T·S, T·qh-S, G, qh-G, plus SPC variants when --spc was set). Plug straight into plotting libraries or downstream pipelines.
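As one example of downstream use, the sketch below finds the highest point on each pathway. It relies only on the keys used in the loop above (pes, pathways, name, points, label, relative, qh_g); the sample payload is made up for illustration and highest_points is a hypothetical helper.

```python
def highest_points(payload):
    """Return {pathway name: (label, relative qh-G)} for each pathway's highest point."""
    summary = {}
    for path in payload["pes"]["pathways"]:
        top = max(path["points"], key=lambda pt: pt["relative"]["qh_g"])
        summary[path["name"]] = (top["label"], top["relative"]["qh_g"])
    return summary

# Illustrative payload with invented numbers, shaped like the block described above.
sample = {"pes": {"pathways": [{
    "name": "Ph",
    "units": "kcal/mol",
    "points": [
        {"label": "R1-An + Aza-Phos", "relative": {"qh_g": 0.0}},
        {"label": "AmTS + THF", "relative": {"qh_g": 12.3}},
        {"label": "Syn-P + THF", "relative": {"qh_g": -20.1}},
    ],
}]}}

print(highest_points(sample))
```

With a real pes.json, replace sample with the payload loaded via json.load.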

6. Parse once, re-analyze many times

QC outputs are slow to parse, especially for large conformer ensembles or composite-method SPCs. The unified v1.0 JSON payload (--export) captures every parsed field once; subsequent runs read it back via --import and skip the QC files entirely. Useful for re-running at a different temperature, concentration, frequency cutoff, or quasi-RRHO scheme without touching the original .log/.out files.

# First pass — parse + apply SPC + export the structured payload.
goodvibes conformers/*.log --spc TZ --export thermo.json

# Re-run at 350 K with the same files but no QC parsing. Re-pass --spc
# to keep the cached SPC numbers driving G(T)_SPC; drop it for plain G.
goodvibes --import thermo.json --spc TZ -t 350

# Re-run with a stricter Truhlar low-frequency cutoff. Still no parsing.
goodvibes --import thermo.json --spc TZ -f 150 --QH

# Combine cached --spc with selectivity at a new temperature.
goodvibes --import thermo.json --spc TZ -t 313.15 \
          --label R='cat_R*' --label S='cat_S*'

--export writes the same payload as --json, so a single file covers both downstream pipelines and re-import. Once exported, the original .log/.out files can be archived, moved, or deleted — --import works with just the JSON. The --spc energies are cached on the QCData record, so re-passing --spc <suffix> reuses them without ever re-reading the SPC files.
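A temperature scan over a cached payload can be scripted by generating one command line per temperature. The sketch below only builds and prints the commands, using the --import, --spc and -t flags shown above; scan_commands is a hypothetical helper, and the subprocess call is left commented out so the snippet runs without GoodVibes installed.

```python
import shlex

def scan_commands(payload, temperatures, spc=None):
    """Build one goodvibes argv list per temperature for a cached JSON payload."""
    commands = []
    for temperature in temperatures:
        argv = ["goodvibes", "--import", payload]
        if spc:
            argv += ["--spc", spc]  # re-pass --spc to reuse the cached SPC energies
        argv += ["-t", str(temperature)]
        commands.append(argv)
    return commands

for argv in scan_commands("thermo.json", [298.15, 313.15, 350], spc="TZ"):
    print(shlex.join(argv))
    # import subprocess; subprocess.run(argv, check=True)  # uncomment to execute
```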