# OPI - ORCA Python Interface

OPI (ORCA Python Interface) is a Python library designed to create input files and parse output from the ORCA quantum chemistry software package. It provides a comprehensive interface for setting up quantum chemical calculations, executing them, and analyzing results through structured Python objects. The library is officially supported by FACCTs, the co-developers of ORCA, and requires ORCA version 6.1 or later.

The library follows a modular design that separates structure definition, input parameter configuration, calculation execution, and output parsing. This lets users work at different levels of abstraction, from simple single-point calculations to complex multi-step workflows. OPI relies on ORCA's JSON output format for robust data extraction and uses Pydantic models for type-safe result handling. The Calculator class is the main convenience interface: it orchestrates input creation, job execution via the Runner class, and result parsing through the Output class.

## APIs and Functions

### Calculator - Main Interface for ORCA Calculations

The Calculator class combines job setup, execution, and result parsing into a unified interface. It manages the basename, working directory, molecular structure, and all ORCA input parameters through an Input object.

```python
from pathlib import Path
import shutil
from opi.core import Calculator
from opi.input.simple_keywords import Method, BasisSet, Task, Scf
from opi.input.structures import Structure

# Setup calculation directory
wd = Path("RUN")
shutil.rmtree(wd, ignore_errors=True)
wd.mkdir()

# Create calculator with basename and working directory
calc = Calculator(basename="job", working_dir=wd)

# Load molecular structure from XYZ file
calc.structure = Structure.from_xyz("inp.xyz")

# Add calculation parameters via simple keywords
calc.input.add_simple_keywords(
    Scf.NOAUTOSTART,
    Method.HF,
    BasisSet.DEF2_SVP,
    Task.SP
)

# Write input file, execute ORCA, and parse results
calc.write_input()
calc.run()

output = calc.get_output()
if not output.terminated_normally():
    print(f"Calculation failed: {output.get_outfile()}")
    raise SystemExit(1)  # works without relying on the interactive-only exit() helper

output.parse()
print(f"Final energy: {output.results_properties.geometries[0].single_point_data.finalenergy}")
```
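
The example above reads its geometry from `inp.xyz`. As a minimal sketch, assuming the standard XYZ layout (atom count, a comment line, then one `symbol x y z` line per atom in Angstrom), such a file could be produced as follows. The helper name `to_xyz` and the approximate water geometry are illustrative assumptions and are not part of OPI.

```python
# Approximate water geometry in Angstrom (illustrative values, not optimized).
WATER = [
    ("O", 0.000, 0.000, 0.117),
    ("H", 0.000, 0.757, -0.469),
    ("H", 0.000, -0.757, -0.469),
]


def to_xyz(atoms, comment=""):
    """Render atoms as XYZ text: atom count, comment line, one atom per line."""
    lines = [str(len(atoms)), comment]
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    return "\n".join(lines) + "\n"


xyz_text = to_xyz(WATER, comment="water, approximate geometry")
print(xyz_text)
```

Writing the text to disk with `Path("inp.xyz").write_text(xyz_text)` would then give the Calculator example a file to load.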

### Input - Configuration of ORCA Parameters

The Input class manages simple keywords, block options, and special parameters like core count and memory. Keywords and blocks are added dynamically and can be retrieved, removed, or cleared.

```python
from opi.core import Calculator
from opi.input.simple_keywords import Dft, BasisSet, DispersionCorrection
from opi.input.blocks import BlockMethod
from opi.input.structures import Structure

calc = Calculator(basename="dft_job", working_dir="RUN")
calc.structure = Structure.from_xyz("molecule.xyz")

# Add simple keywords for DFT calculation
calc.input.add_simple_keywords(
    Dft.B3LYP,
    BasisSet.DEF2_SVP,
    DispersionCorrection.D3
)

# Configure computational resources
calc.input.ncores = 4
calc.input.memory = 4000  # MB per core

# Add block options for dispersion parameters
calc.input.add_blocks(
    BlockMethod(
        d3s6=0.64,
        d3a1=0.3065,
        d3s8=0.9147,
        d3a2=5.0570
    )
)

# Check if keywords exist
has_dft = calc.input.has_simple_keywords(Dft.B3LYP)
print(f"B3LYP added: {has_dft}")

calc.write_input()
calc.run()
```
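
For orientation, the settings above correspond roughly to an ORCA input file like the one rendered below. This is a hand-written sketch of ORCA's input syntax, not OPI's own writer: `render_input` and `total_memory_mb` are hypothetical helpers, the keyword strings are assumed spellings, and the exact text OPI emits may differ.

```python
def render_input(keywords, ncores, maxcore_mb, method_opts, xyz_file, charge=0, mult=1):
    """Build an ORCA-style input string (illustrative, not OPI's implementation)."""
    lines = ["! " + " ".join(keywords)]          # simple keyword line
    lines.append(f"%pal nprocs {ncores} end")    # number of parallel processes
    lines.append(f"%maxcore {maxcore_mb}")       # memory per core in MB
    if method_opts:
        lines.append("%method")
        lines += [f"  {name} {value}" for name, value in method_opts.items()]
        lines.append("end")
    lines.append(f"* xyzfile {charge} {mult} {xyz_file}")
    return "\n".join(lines) + "\n"


def total_memory_mb(ncores, maxcore_mb):
    """Memory is specified per core, so the job may use up to ncores * maxcore."""
    return ncores * maxcore_mb


text = render_input(
    keywords=["B3LYP", "def2-SVP", "D3"],
    ncores=4,
    maxcore_mb=4000,
    method_opts={"D3S6": 0.64, "D3A1": 0.3065, "D3S8": 0.9147, "D3A2": 5.0570},
    xyz_file="molecule.xyz",
)
print(text)
print(f"Upper bound on memory: {total_memory_mb(4, 4000)} MB")
```

Because the memory setting is per core, raising `ncores` without lowering the per-core value increases the total footprint proportionally.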

### Structure - Molecular Geometry Representation

The Structure class represents molecular geometries with atoms, charge, and multiplicity. It supports creation from XYZ files, SMILES strings, and RDKit molecules, enabling flexible input preparation.

```python
from opi.input.structures import Structure, Atom
from opi.utils.element import Element

# Create structure from XYZ file
structure_xyz = Structure.from_xyz("molecule.xyz")

# Create structure from SMILES string with 3D coordinates
structure_smiles = Structure.from_smiles("CCO", charge=0, multiplicity=1)

# Create structure programmatically (methane, C-H distance about 1.09 Angstrom)
atoms = [
    Atom(Element.CARBON, x=0.0, y=0.0, z=0.0),
    Atom(Element.HYDROGEN, x=0.629, y=0.629, z=0.629),
    Atom(Element.HYDROGEN, x=-0.629, y=-0.629, z=0.629),
    Atom(Element.HYDROGEN, x=-0.629, y=0.629, z=-0.629),
    Atom(Element.HYDROGEN, x=0.629, y=-0.629, z=-0.629)
]
structure_manual = Structure(atoms=atoms, charge=0, multiplicity=1)

# Access structure properties
print(f"Number of atoms: {len(structure_manual.atoms)}")
print(f"Charge: {structure_manual.charge}")
print(f"Multiplicity: {structure_manual.multiplicity}")

# Use in calculator
from opi.core import Calculator
calc = Calculator(basename="ch4", working_dir=".")
calc.structure = structure_manual
```
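
Charge and multiplicity have to agree with the number of electrons in the structure: a closed-shell singlet needs an even electron count, a doublet an odd one. The following is a small sanity check in plain Python, independent of OPI. The function names and the short element table are assumptions made for this sketch.

```python
# Atomic numbers for the few elements used in this sketch.
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8}


def electron_count(symbols, charge=0):
    """Total electrons: sum of atomic numbers minus the molecular charge."""
    return sum(ATOMIC_NUMBERS[s] for s in symbols) - charge


def multiplicity_is_consistent(symbols, charge, multiplicity):
    """A multiplicity of 2S+1 implies 2S unpaired electrons; the rest must pair up."""
    n_electrons = electron_count(symbols, charge)
    unpaired = multiplicity - 1
    if multiplicity < 1 or unpaired > n_electrons:
        return False
    return (n_electrons - unpaired) % 2 == 0


methane = ["C", "H", "H", "H", "H"]
methyl = ["C", "H", "H", "H"]

print(multiplicity_is_consistent(methane, charge=0, multiplicity=1))  # 10 electrons
print(multiplicity_is_consistent(methyl, charge=0, multiplicity=1))   # 9 electrons
print(multiplicity_is_consistent(methyl, charge=0, multiplicity=2))   # 9 electrons
```

Running a check like this before building a `Structure` catches mismatches such as a neutral radical declared as a singlet.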

### Runner - ORCA Binary Execution

The Runner class handles execution of ORCA binaries with proper environment setup, including path configuration for ORCA installation and OpenMPI. It automatically manages environment variables and working directories.

```python
from pathlib import Path
from opi.execution.core import Runner, OrcaBinary

# Create runner with working directory
runner = Runner(working_dir=Path("RUN"))

# Execute main ORCA calculation
input_file = Path("RUN/job.inp")
runner.run_orca(input_file, silent=False, timeout=3600)

# Create JSON output files from ORCA results
runner.create_jsons("job", force=False)

# Execute specific ORCA utility
runner.run(
    OrcaBinary.ORCA_2JSON,
    args=["job.gbw"],
    silent=True
)

# Create property JSON with custom configuration
gbw_config = {
    "include_orbitals": True,
    "orbital_indices": [0, 1, 2, 3, 4]
}
runner.create_gbw_json("job", force=True, config=gbw_config)

# Get path to ORCA binary
orca_path = runner.get_orca_binary(OrcaBinary.ORCA)
print(f"Using ORCA at: {orca_path}")
```
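
To illustrate what "environment setup" typically involves when launching an external binary, here is a standard-library sketch that prepends installation directories to `PATH` and `LD_LIBRARY_PATH`. It is not Runner's actual implementation: `build_env`, the directory layout (`bin` and `lib` under the MPI root), and the example paths are assumptions.

```python
import os
from pathlib import Path


def build_env(orca_dir, mpi_dir, base_env=None):
    """Return a copy of the environment with ORCA and MPI directories prepended."""
    env = dict(os.environ if base_env is None else base_env)

    def prepend(var, new_dir):
        old = env.get(var, "")
        env[var] = str(new_dir) + (os.pathsep + old if old else "")

    prepend("PATH", orca_dir)
    prepend("PATH", Path(mpi_dir) / "bin")
    prepend("LD_LIBRARY_PATH", orca_dir)
    prepend("LD_LIBRARY_PATH", Path(mpi_dir) / "lib")
    return env


# Hypothetical installation paths, used only for demonstration.
env = build_env("/opt/orca", "/opt/openmpi", base_env={"PATH": "/usr/bin"})
print(env["PATH"])
print(env["LD_LIBRARY_PATH"])
```

An environment built this way could be passed to `subprocess.run(..., env=env, cwd=working_dir)` so that the calling process itself stays unmodified.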

### Output - Result Parsing and Analysis

The Output class parses ORCA results from JSON files, providing structured access to energies, geometries, orbital data, and molecular properties through Pydantic models.

```python
from pathlib import Path
from opi.output.core import Output

# Create output parser
output = Output(
    basename="job",
    working_dir=Path("RUN"),
    create_property_json=True,
    create_gbw_json=True,
    parse=True
)

# Check calculation success
if not output.terminated_normally():
    print(f"Calculation failed: {output.get_outfile()}")
    raise SystemExit(1)  # works without relying on the interactive-only exit() helper

# Access final energy
final_geometry = output.results_properties.geometries[-1]
energy = final_geometry.single_point_data.finalenergy
print(f"Final single point energy: {energy} Hartree")

# Check SCF convergence
converged = final_geometry.single_point_data.converged
print(f"SCF converged: {converged}")

# Access properties along optimization trajectory
ngeoms = len(output.results_properties.geometries)
print(f"Number of geometries: {ngeoms}")

for idx, geom in enumerate(output.results_properties.geometries):
    energy = geom.single_point_data.finalenergy
    print(f"Geometry {idx}: {energy} Hartree")

    # Mulliken charges if available
    try:
        charges = geom.mulliken_population_analysis[0].atomiccharges
        print(f"  Mulliken charges: {charges}")
    except (TypeError, AttributeError):
        print("  Mulliken charges: not available")

# Access dispersion correction (the attribute may exist but be None)
vdw = getattr(final_geometry, "vdw_correction", None)
if vdw is not None:
    print(f"Dispersion correction: {vdw.vdw} Hartree")
```
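
Result sections that a calculation did not produce may be missing or set to `None`, so chained attribute access can fail partway through. A small helper can walk such a chain defensively. The sketch below uses `SimpleNamespace` objects as stand-ins for parsed results; `get_nested` is a hypothetical helper and the numeric value is a placeholder, not ORCA output.

```python
from types import SimpleNamespace


def get_nested(obj, *names, default=None):
    """Follow a chain of attributes, returning default if any link is missing or None."""
    for name in names:
        obj = getattr(obj, name, None)
        if obj is None:
            return default
    return obj


# Stand-ins for parsed geometries (placeholder data).
with_disp = SimpleNamespace(vdw_correction=SimpleNamespace(vdw=-0.0012))
without_disp = SimpleNamespace(vdw_correction=None)

for label, geom in [("with dispersion", with_disp), ("without dispersion", without_disp)]:
    value = get_nested(geom, "vdw_correction", "vdw")
    print(label, "->", "not available" if value is None else f"{value} Hartree")
```

This keeps the "if available" logic in one place instead of repeating `try`/`except` blocks around every optional property.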

### Geometry Optimization Workflow

A complete workflow for a DFT geometry optimization with the wB97X-3c composite method: run the job, parse the optimization trajectory, and extract molecular properties at each step.

```python
import shutil
from pathlib import Path
from opi.core import Calculator
from opi.input.simple_keywords import Dft, Task, Scf
from opi.input.structures import Structure

# Setup
wd = Path("RUN")
shutil.rmtree(wd, ignore_errors=True)
wd.mkdir()

# Configure optimization
calc = Calculator(basename="opt_job", working_dir=wd)
calc.structure = Structure.from_xyz("inp.xyz")
# wB97X-3c is a composite method that brings its own basis set,
# so no separate BasisSet keyword is added
calc.input.add_simple_keywords(
    Scf.NOAUTOSTART,
    Dft.WB97X3C,
    Task.OPT
)
calc.input.ncores = 4

# Execute
calc.write_input()
calc.run(timeout=7200)

# Parse results
output = calc.get_output()
if not output.terminated_normally():
    print(f"Optimization failed: {output.get_outfile()}")
    raise SystemExit(1)  # works without relying on the interactive-only exit() helper

output.parse()

# Analyze optimization trajectory
ngeoms = len(output.results_properties.geometries)
print(f"Optimization steps: {ngeoms}")
print(f"Final energy: {output.results_properties.geometries[-1].single_point_data.finalenergy}")

# Extract energies along optimization path
print("\nEnergy profile:")
for i, geom in enumerate(output.results_properties.geometries):
    energy = geom.single_point_data.finalenergy
    converged = geom.single_point_data.converged
    print(f"Step {i}: {energy:.8f} Hartree (converged: {converged})")

# Extract Mulliken charges at each step
print("\nCharge evolution:")
for i, geom in enumerate(output.results_properties.geometries):
    try:
        charges = geom.mulliken_population_analysis[0].atomiccharges
        print(f"Step {i}: {charges}")
    except (TypeError, AttributeError):
        print(f"Step {i}: charges not available")
```
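
Once the per-step energies are collected, the change between consecutive steps is often easier to read than the absolute values. Here is a minimal sketch of that post-processing in plain Python. The energy list is made-up placeholder data rather than the result of a real optimization, and `energy_changes` is a hypothetical helper.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996  # 1 Hartree expressed in kJ/mol


def energy_changes(energies_hartree):
    """Energy difference between consecutive steps, converted to kJ/mol."""
    pairs = zip(energies_hartree, energies_hartree[1:])
    return [(later - earlier) * HARTREE_TO_KJ_PER_MOL for earlier, later in pairs]


# Placeholder energies in Hartree, one per optimization step.
energies = [-76.0200, -76.0250, -76.0262, -76.0263]

for step, delta in enumerate(energy_changes(energies), start=1):
    print(f"Step {step - 1} -> {step}: {delta:+.3f} kJ/mol")
```

In a real workflow the list would be filled from `geom.single_point_data.finalenergy` inside the trajectory loop shown above.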

### Solvation Model Configuration

This example sets up an implicit-solvation calculation with the CPCM model and water as the solvent, and passes custom D3 dispersion parameters through the method block.

```python
import shutil
from pathlib import Path
from opi.core import Calculator
from opi.input.simple_keywords import (
    Method, BasisSet, Task, Scf,
    SolvationModel, Solvent, DispersionCorrection
)
from opi.input.blocks import BlockMethod
from opi.input.structures import Structure

# Setup
wd = Path("RUN")
shutil.rmtree(wd, ignore_errors=True)
wd.mkdir()

# Configure solvation calculation
calc = Calculator(basename="solvation", working_dir=wd)
calc.structure = Structure.from_xyz("inp.xyz")

# Add solvation model with solvent
calc.input.add_simple_keywords(
    Scf.NOAUTOSTART,
    Method.HF,
    BasisSet.DEF2_SVP,
    Task.SP,
    SolvationModel.CPCM(Solvent.WATER),
    DispersionCorrection.D3
)

# Fine-tune dispersion parameters
calc.input.add_blocks(
    BlockMethod(
        d3s6=0.64,
        d3a1=0.3065,
        d3s8=0.9147,
        d3a2=5.0570
    )
)

# Execute and analyze
calc.write_input()
calc.run()

output = calc.get_output()
if not output.terminated_normally():
    print(f"Calculation failed: {output.get_outfile()}")
    raise SystemExit(1)

output.parse()

# Extract the total energy in solution and the dispersion correction
geom = output.results_properties.geometries[0]
total_energy = geom.single_point_data.finalenergy
vdw_correction = geom.vdw_correction.vdw

print(f"Total energy in solution: {total_energy} Hartree")
print(f"Dispersion correction: {vdw_correction} Hartree")
print(f"Electronic energy + VdW: {geom.energy[0].totalenergy[0][0] + vdw_correction}")
```
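
A single solvated calculation gives the total energy in solution. To estimate the effect of the solvent, that energy is commonly compared with a gas-phase run at the same level of theory. The sketch below shows only the arithmetic: both energies are placeholder numbers, `solvation_energy_kcal` is a hypothetical helper, and the result is an electronic energy difference at fixed geometry rather than a full solvation free energy.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # 1 Hartree expressed in kcal/mol


def solvation_energy_kcal(e_solution, e_gas):
    """Energy in solution minus gas-phase energy, converted to kcal/mol."""
    return (e_solution - e_gas) * HARTREE_TO_KCAL_PER_MOL


# Placeholder energies in Hartree (not computed results).
e_gas = -76.0200
e_solution = -76.0300

delta = solvation_energy_kcal(e_solution, e_gas)
print(f"Estimated solvation energy: {delta:.2f} kcal/mol")
```

A negative value indicates stabilization by the solvent model. The two energies would come from two separate jobs that differ only in the solvation keyword.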

## Summary

OPI provides a comprehensive Python interface for ORCA quantum chemistry calculations with a clear separation between structure definition, input configuration, execution, and output parsing. The main use cases include single-point energy calculations, geometry optimizations, frequency calculations, and property evaluations across various quantum chemical methods (HF, DFT, MP2, coupled cluster). The library handles molecular structure input from multiple sources (XYZ files, SMILES, programmatic construction), manages computational parameters through simple keywords and block options, and parses results into type-safe Python objects for analysis.

Integration patterns follow a consistent workflow: create a Calculator with basename and working directory, assign a Structure object, configure Input parameters via simple keywords and blocks, execute with run(), and parse results through get_output(). Advanced usage supports custom ORCA binary execution via Runner, manual JSON file generation with configurable parameters, trajectory analysis for optimizations, and direct access to raw JSON data structures. The library integrates with RDKit for molecular structure manipulation and uses Pydantic models for validation, making it suitable for both interactive calculations and automated computational workflows in research and production environments.