OPI
https://github.com/faccts/opi
OPI is a Python library to create input and parse output of ORCA, designed to make quantum chemistry
...
# OPI - ORCA Python Interface

OPI (ORCA Python Interface) is a Python library designed to create input files and parse output from the ORCA quantum chemistry program package. It provides a Pythonic API to set up quantum chemical calculations including DFT, Hartree-Fock, coupled cluster methods, and various spectroscopic calculations. OPI handles molecular structure creation, input file generation, ORCA execution, and comprehensive output parsing through JSON-based results.

The library is structured around three core components: the `Calculator` class for managing complete calculation workflows, the `Input` module for configuring methods and parameters through simple keywords and blocks, and the `Output` class for parsing and accessing calculation results. OPI requires ORCA 6.1.1 or later and is distributed as an open-source package via PyPI (`pip install orca-pi`).

---

## Calculator - Complete Calculation Workflow

The `Calculator` class is the main entry point that combines job setup, execution, and result parsing into a unified workflow. It manages the relationship between molecular structures and calculation parameters.
```python
from pathlib import Path

from opi.core import Calculator
from opi.input.simple_keywords import BasisSet, Method, Scf, Task
from opi.input.structures import Structure

# Create a molecular structure from XYZ file
structure = Structure.from_xyz("water.xyz")

# Initialize calculator with basename and working directory
calc = Calculator(basename="calculation", working_dir=Path("./work"))
calc.structure = structure

# Configure the calculation with simple keywords
calc.input.add_simple_keywords(
    Method.HF,          # Hartree-Fock method
    BasisSet.DEF2_SVP,  # def2-SVP basis set
    Task.SP,            # Single point calculation
    Scf.NOAUTOSTART,    # Don't read previous orbitals
)

# Set computational resources
calc.input.ncores = 4     # Use 4 CPU cores
calc.input.memory = 4000  # 4000 MB per core

# Write input file and execute ORCA
calc.write_input()
success = calc.run(timeout=3600)  # 1 hour timeout

# Access results
output = calc.get_output()
output.parse()
energy = output.get_final_energy()
print(f"Final energy: {energy:.8f} Hartree")
```

---

## Structure - Molecular Geometry Handling

The `Structure` class represents molecular geometries and provides multiple ways to create structures from files, SMILES strings, RDKit molecules, or programmatically from lists.
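
The file-based constructors read standard XYZ files: an atom count, a comment line, then one `symbol x y z` row per atom in Angstrom. The following is a minimal sketch, using only the standard library and no OPI calls, of producing such text so that the examples in this document have an input file to read. The helper name `format_xyz` and the constant `WATER_ATOMS` are illustrative assumptions; the coordinates are the water geometry used elsewhere in this document.

```python
# Water geometry in Angstrom, copied from the XYZ block shown in this document.
WATER_ATOMS = [
    ("O", 0.000000, 0.000000, 0.117349),
    ("H", 0.756950, 0.000000, -0.469397),
    ("H", -0.756950, 0.000000, -0.469397),
]


def format_xyz(atoms, comment=""):
    """Return XYZ text: atom count, comment line, then 'symbol x y z' rows.

    Hypothetical helper for illustration; it is not part of OPI.
    """
    lines = [str(len(atoms)), comment]
    lines += [f"{sym:<2} {x:12.6f} {y:12.6f} {z:12.6f}" for sym, x, y, z in atoms]
    return "\n".join(lines) + "\n"


xyz_text = format_xyz(WATER_ATOMS, comment="Water molecule")
print(xyz_text)
```

Saving the text with `pathlib.Path("water.xyz").write_text(xyz_text)` gives a file that a call such as `Structure.from_xyz("water.xyz")` could then load.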
```python
from opi.input.structures import Structure, Atom, Coordinates
from opi.utils.element import Element

# Create structure from XYZ file
mol = Structure.from_xyz("molecule.xyz")

# Create structure from SMILES string (generates 3D coordinates)
ethanol = Structure.from_smiles("CCO", charge=0, multiplicity=1)

# Create structure from XYZ string block
xyz_block = """3
Water molecule
O 0.000000 0.000000 0.117349
H 0.756950 0.000000 -0.469397
H -0.756950 0.000000 -0.469397
"""
water = Structure.from_xyz_block(xyz_block, charge=0, multiplicity=1)

# Create structure programmatically from element symbols and coordinates
symbols = ["O", "H", "H"]
coords = [
    (0.000000, 0.000000, 0.117349),
    (0.756950, 0.000000, -0.469397),
    (-0.756950, 0.000000, -0.469397),
]
water_programmatic = Structure.from_lists(symbols, coords, charge=0, multiplicity=1)

# Modify structure properties
water_programmatic.charge = -1       # Set charge to -1
water_programmatic.multiplicity = 2  # Set to doublet

# Add atoms to structure
new_atom = Atom(
    element=Element("C"),
    coordinates=Coordinates((1.0, 2.0, 3.0)),
    fragment_id=1  # Optional fragment assignment
)
water_programmatic.add_atom(new_atom)

# Export structure to XYZ format
xyz_output = water_programmatic.to_xyz_block()
```

---

## Input - Simple Keywords Configuration

Simple keywords configure fundamental calculation parameters including method, basis set, task type, and SCF settings. They correspond to ORCA's `!` keywords.
```python
from opi.core import Calculator
from opi.input.simple_keywords import (
    Method, Dft, BasisSet, AuxBasisSet, Task, Scf,
    DispersionCorrection, SolvationModel, Solvent,
    AtomicCharge, Grid, Approximation, Opt
)
from opi.input.structures import Structure

calc = Calculator(basename="dft_job", working_dir="./run")
calc.structure = Structure.from_xyz("molecule.xyz")

# Method selection (HF, DFT functionals, post-HF)
calc.input.add_simple_keywords(
    Dft.B3LYP,      # B3LYP DFT functional
    # Dft.PBE0,     # Alternative: PBE0 functional
    # Dft.WB97X3C,  # Alternative: wB97X-3c composite
    # Method.HF,    # Hartree-Fock
    # Method.CCSD,  # Coupled cluster
    # Method.MP2,   # MP2 perturbation theory
)

# Basis sets and auxiliary basis
calc.input.add_simple_keywords(
    BasisSet.DEF2_TZVP,       # def2-TZVP basis set
    AuxBasisSet.DEF2_J,       # Auxiliary basis for RI-J
    AuxBasisSet.DEF2_TZVP_C,  # Auxiliary basis for correlation
)

# Task type
calc.input.add_simple_keywords(
    Task.SP,         # Single point energy
    # Task.OPT,      # Geometry optimization
    # Task.FREQ,     # Frequency calculation
    # Task.OPTFREQ,  # Optimization + frequencies
    # Task.ENGRAD,   # Energy + gradient
)

# Dispersion correction
calc.input.add_simple_keywords(
    DispersionCorrection.D4,      # DFT-D4 dispersion
    # DispersionCorrection.D3BJ,  # DFT-D3(BJ)
)

# Solvation model
calc.input.add_simple_keywords(
    SolvationModel.CPCM(Solvent.WATER),      # CPCM with water
    # SolvationModel.SMD(Solvent.METHANOL),  # SMD model
)

# Population analysis requests
calc.input.add_simple_keywords(
    AtomicCharge.MULLIKEN,
    AtomicCharge.LOEWDIN,
    AtomicCharge.HIRSHFELD,
    AtomicCharge.CHELPG,
)

# SCF settings
calc.input.add_simple_keywords(
    Scf.NOAUTOSTART,     # Don't read previous orbitals
    Scf.TIGHTSCF,        # Tight SCF convergence
    # Scf.VERYTIGHTSCF,  # Very tight convergence
)

# Integration grid
calc.input.add_simple_keywords(
    Grid.GRID5,       # Fine integration grid
    # Grid.DEFGRID3,  # Default grid level 3
)

# Approximations for efficiency
calc.input.add_simple_keywords(
    Approximation.RIJCOSX,  # RI-J with chain-of-spheres exchange
    # Approximation.RIJK,   # Full RI-JK approximation
)

calc.write_input()
```

---

## Input - Block Options Configuration

Blocks provide detailed control over specific calculation aspects using ORCA's `%block ... end` syntax. They allow fine-tuning of methods, geometry optimization, frequencies, and more.

```python
from opi.core import Calculator
from opi.input.blocks import (
    BlockScf, BlockGeom, BlockFreq, BlockMethod, BlockElprop,
    BlockCpcm, BlockTddft, BlockMdci, Scan, Constraints, Constraint
)
from opi.input.simple_keywords import Dft, BasisSet, Task
from opi.input.structures import Structure

calc = Calculator(basename="advanced_job", working_dir="./run")
calc.structure = Structure.from_xyz("molecule.xyz")
calc.input.add_simple_keywords(Dft.B3LYP, BasisSet.DEF2_SVP, Task.OPT)

# SCF block for convergence settings
scf_block = BlockScf(
    maxiter=200,   # Maximum SCF iterations
    thresh=1e-10,  # Energy convergence threshold
    tcut=1e-14,    # Integral screening threshold
)
calc.input.add_blocks(scf_block)

# Geometry optimization block
geom_block = BlockGeom(
    maxiter=100,  # Max optimization cycles
    trust=0.3,    # Trust radius
    maxstep=0.1,  # Maximum step size
)
calc.input.add_blocks(geom_block)

# Relaxed surface scan
scan_block = BlockGeom(
    scan=Scan(b=[0, 1, 20, 1.0, 2.0])  # Scan bond 0-1, 20 steps, 1.0-2.0 Angstrom
)
calc.input.add_blocks(scan_block, overwrite=True)

# Geometry constraints
constraints = Constraints(
    constraint_list=[
        Constraint(b=[0, 1]),     # Constrain bond 0-1
        Constraint(a=[0, 1, 2]),  # Constrain angle 0-1-2
    ]
)
geom_constrained = BlockGeom(constraints=constraints)
calc.input.add_blocks(geom_constrained, overwrite=True)

# Frequency calculation block
freq_block = BlockFreq(
    temp=298.15,     # Temperature in Kelvin
    pressure=1.0,    # Pressure in atm
    numfreq=False,   # Analytical frequencies
    quasirrho=True,  # Quasi-RRHO correction
)
calc.input.add_blocks(freq_block)

# Electric properties block
elprop_block = BlockElprop(
    dipole=True,       # Calculate dipole moment
    quadrupole=True,   # Calculate quadrupole moment
    polar="analytic",  # Analytic polarizability
)
calc.input.add_blocks(elprop_block)

# Method block for dispersion parameters
method_block = BlockMethod(
    d3s6=0.64,    # D3 s6 parameter
    d3a1=0.3065,  # D3 a1 parameter
    d3s8=0.9147,  # D3 s8 parameter
    d3a2=5.0570,  # D3 a2 parameter
)
calc.input.add_blocks(method_block)

# TD-DFT block for excited states
tddft_block = BlockTddft(
    nroots=10,      # Number of excited states
    maxdim=5,       # Davidson expansion space
    triplets=True,  # Include triplet states
)
calc.input.add_blocks(tddft_block)

calc.write_input()
```

---

## Output - Parsing Calculation Results

The `Output` class provides comprehensive access to ORCA calculation results through parsed JSON property and GBW files, offering methods for energies, structures, populations, and molecular orbitals.

```python
from pathlib import Path

from opi.output.core import Output

# Initialize output parser
output = Output(
    basename="calculation",
    working_dir=Path("./work"),
    version_check=True  # Verify ORCA version compatibility
)

# Parse all JSON result files
output.parse()

# Check calculation status
if output.terminated_normally():
    print("Calculation completed successfully")
if output.scf_converged():
    print("SCF converged")
if output.geometry_optimization_converged():
    print("Geometry optimization converged")

# Access energies
final_energy = output.get_final_energy()  # Final single point energy (Hartree)
energies = output.get_energies()          # Dict of all energy types
for name, energy_obj in (energies or {}).items():
    print(f"{name}: {energy_obj.energy}")

# Access thermochemistry (requires frequency calculation)
zpe = output.get_zpe()                    # Zero-point energy
enthalpy = output.get_enthalpy()          # Enthalpy (H)
entropy = output.get_entropy()            # Entropy (S)
free_energy = output.get_free_energy()    # Gibbs free energy (G)
inner_energy = output.get_inner_energy()  # Inner energy (U)
print(f"G = {free_energy:.6f} Hartree")

# Access optimized structure
structure = output.get_structure(index=-1)  # -1 for final geometry
if structure:
    print(structure.to_xyz_block())

# Access gradient (for geometry before final)
gradient = output.get_gradient(index=-2)

# Access calculation info
charge = output.get_charge()
mult = output.get_mult()
n_electrons = output.get_nelectrons()
n_bf = output.get_nbf()  # Number of basis functions
```

---

## Output - Population Analysis Results

Access various population analysis results including Mulliken, Loewdin, Hirshfeld, CHELPG, Mayer, and MBIS charges and other properties.

```python
from opi.output.core import Output

output = Output(basename="job", working_dir="./work")
output.parse()

# Mulliken population analysis
mulliken = output.get_mulliken()
if mulliken:
    for pop in mulliken:
        print(f"Mulliken charges: {pop.atomiccharges}")
        print(f"Mulliken spin densities: {pop.atomicspinpopulations}")

# Loewdin population analysis
loewdin = output.get_loewdin()
if loewdin:
    print(f"Loewdin charges: {loewdin[0].atomiccharges}")

# CHELPG/RESP charges (electrostatic potential derived)
chelpg = output.get_chelpg()
if chelpg:
    print(f"CHELPG charges: {chelpg[0].atomiccharges}")

# Hirshfeld population analysis
hirshfeld = output.get_hirshfeld()
if hirshfeld:
    print(f"Hirshfeld charges: {hirshfeld[0].atomiccharges}")

# Mayer bond orders and populations
mayer = output.get_mayer()
if mayer:
    print(f"Mayer valences: {mayer[0].valences}")

# MBIS (Minimal Basis Iterative Stockholder) charges
mbis = output.get_mbis()
if mbis:
    print(f"MBIS charges: {mbis[0].atomiccharges}")

# Dipole moment
dipole = output.get_dipole()
if dipole:
    dx, dy, dz = (
        dipole[0].dipoletotal[0][0],
        dipole[0].dipoletotal[1][0],
        dipole[0].dipoletotal[2][0],
    )
    print(f"Dipole moment: ({dx:.4f}, {dy:.4f}, {dz:.4f}) Debye")

# Quadrupole moment
quadrupole = output.get_quadrupole()

# Polarizability tensor
polarizability = output.get_polarizability()
```

---

## Output - Molecular Orbital Analysis

Access molecular orbital information including energies, occupancies, HOMO/LUMO identification, and orbital plotting capabilities.

```python
from opi.output.core import Output

output = Output(basename="job", working_dir="./work")
output.parse()

# Get HF type (RHF, UHF, ROHF)
hftype = output.get_hftype()
print(f"HF type: {hftype}")

# Get electron counts
n_alpha, n_beta = output.get_nelectrons(spin_resolved=True)
print(f"Alpha electrons: {n_alpha}, Beta electrons: {n_beta}")

# Access all molecular orbitals
mos = output.get_mos()
if mos:
    # For RHF/ROHF, key is 'mo'; for UHF, keys are 'alpha' and 'beta'
    for channel, mo_list in mos.items():
        print(f"Channel: {channel}, Number of MOs: {len(mo_list)}")
        for i, mo in enumerate(mo_list[:5]):  # First 5 orbitals
            print(f"  MO {i}: Energy={mo.orbitalenergy:.6f} Eh, Occ={mo.occupancy}")

# Get HOMO information
homo_data = output.get_homo()
if homo_data:
    print(f"HOMO index: {homo_data.index}")
    print(f"HOMO channel: {homo_data.channel}")
    print(f"HOMO energy: {homo_data.orbitalenergy:.6f} Hartree")

# Get LUMO information
lumo_data = output.get_lumo()
if lumo_data:
    print(f"LUMO index: {lumo_data.index}")
    print(f"LUMO energy: {lumo_data.orbitalenergy:.6f} Hartree")

# Calculate HOMO-LUMO gap
gap = output.get_hl_gap()
if gap:
    print(f"HOMO-LUMO gap: {gap:.2f} eV")

# Plot molecular orbital to cube file (requires orca_plot)
cube = output.plot_mo(
    index=homo_data.index,  # MO index to plot
    operator=0,             # 0=alpha, 1=beta
    resolution=40,          # Grid resolution
    timeout=300             # Timeout in seconds
)
if cube:
    print(f"Cube file generated: {cube.filepath}")
```

---

## Output - Integral and Density Matrix Access

Access one-electron integrals, Fock matrix components, and density matrices from GBW JSON files.
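
As background for the matrices in this section: for a closed-shell density matrix P in a non-orthogonal basis with overlap matrix S, the electron count equals Tr(PS). The sketch below checks this identity on a toy two-function model (minimal-basis H2) with NumPy only. It makes no OPI calls, and the overlap value `s = 0.66` is an assumed illustrative number, not a computed result.

```python
import numpy as np

# Toy minimal-basis H2 model: two 1s functions with mutual overlap s.
s = 0.66  # assumed illustrative overlap, not taken from a calculation
S = np.array([[1.0, s], [s, 1.0]])

# Normalized bonding orbital c = (1, 1) / sqrt(2 (1 + s)), so that c^T S c = 1.
c = np.array([1.0, 1.0]) / np.sqrt(2.0 * (1.0 + s))

# Closed-shell density matrix for one doubly occupied orbital: P = 2 c c^T.
P = 2.0 * np.outer(c, c)

# Electron count from the trace of P S.
n_electrons = np.trace(P @ S)
print(f"Tr(PS) = {n_electrons:.6f}")
```

The same trace expression applies to a density matrix and overlap matrix of matching shape, however they were obtained.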
```python
from opi.output.core import Output
import numpy as np

output = Output(basename="job", working_dir="./work")
output.parse()

# Get overlap matrix S (optionally regenerate JSON with integrals)
overlap = output.get_int_overlap(recreate_json=True)
if overlap is not None:
    print(f"Overlap matrix shape: {overlap.shape}")
    print(f"Trace(S): {np.trace(overlap):.4f}")

# Get core Hamiltonian matrix H
hcore = output.get_int_hcore(recreate_json=True)
if hcore is not None:
    print(f"Hcore matrix shape: {hcore.shape}")

# Get Fock matrix F
fock = output.get_int_f(recreate_json=True)
if fock is not None:
    print(f"Fock matrix shape: {fock.shape}")

# Get Coulomb matrix J
coulomb_j = output.get_int_j(recreate_json=True)

# Get Exchange matrix K
exchange_k = output.get_int_k(recreate_json=True)

# Get SCF density matrix P
scf_density = output.get_scf_density(recreate_json=True)
if scf_density is not None:
    print(f"Density matrix shape: {scf_density.shape}")
    n_electrons = np.trace(np.dot(scf_density, overlap))
    print(f"Number of electrons from P*S: {n_electrons:.4f}")

# Get spin density matrix (UHF/ROHF)
spin_density = output.get_scf_spin_density(recreate_json=True)

# Custom configuration for GBW JSON generation
output.config_dict = {
    "1elIntegrals": ["S", "H"],     # Request overlap and Hcore
    "FockMatrix": ["F", "J", "K"],  # Request Fock components
    "Densities": ["scfp", "scfr"],  # Request densities
}
output.recreate_gbw_results(output.config_dict, gbw_index=0)
```

---

## Output - Vibrational Analysis (IR Spectrum)

Access vibrational frequencies and IR intensities from frequency calculations.
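
Frequency and intensity pairs of this kind form a stick spectrum; for plotting, each stick is commonly broadened with a line shape. The sketch below is not part of OPI and uses only the standard library. The helper name `lorentzian_spectrum`, the 20 cm-1 full width at half maximum, and the stick values are illustrative assumptions, not results from any calculation.

```python
import math


def lorentzian_spectrum(sticks, grid, fwhm=20.0):
    """Sum area-normalized Lorentzians, one per (frequency, intensity) stick.

    Hypothetical helper for illustration; frequencies and fwhm in cm^-1.
    """
    half = fwhm / 2.0  # half width at half maximum
    spectrum = []
    for x in grid:
        total = 0.0
        for freq, intensity in sticks:
            total += intensity * (half / math.pi) / ((x - freq) ** 2 + half ** 2)
        spectrum.append(total)
    return spectrum


# Placeholder stick data as (frequency in cm^-1, intensity in km/mol).
sticks = [(1600.0, 70.0), (3700.0, 5.0), (3800.0, 45.0)]

# Evaluation grid from 1000 to 4000 cm^-1 in steps of 5 cm^-1.
grid = [float(x) for x in range(1000, 4001, 5)]

spectrum = lorentzian_spectrum(sticks, grid)
peak_position = grid[spectrum.index(max(spectrum))]
print(f"Strongest band at {peak_position:.0f} cm^-1")
```

With real data, the `sticks` list would be built from the frequency and intensity values retrieved in this section.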
```python
from opi.output.core import Output

output = Output(basename="freq_job", working_dir="./work")
output.parse()

# Get IR spectrum data from output file
ir_spectrum = output.get_ir()
if ir_spectrum:
    print("Mode | Frequency (cm-1) | IR Intensity (km/mol)")
    print("-" * 50)
    for mode_num, ir_mode in ir_spectrum.items():
        freq = ir_mode.frequency
        intensity = ir_mode.ir_intensity
        print(f"{mode_num:4d} | {freq:15.2f} | {intensity:15.4f}")

# Access thermochemistry from frequency calculation
if output.results_properties:
    geom = output.results_properties.geometries[-1]

    # Temperature and pressure used
    thermo = geom.thermochemistry_energies[0]
    print(f"Temperature: {thermo.temperature} K")

    # Thermodynamic properties
    print(f"Zero-point energy: {output.get_zpe():.6f} Hartree")
    print(f"Enthalpy (H): {output.get_enthalpy():.6f} Hartree")
    print(f"Entropy (S): {output.get_entropy():.8f} Hartree/K")
    print(f"Free energy (G): {output.get_free_energy():.6f} Hartree")

    # Thermostatistical correction (G - E_el)
    g_thermo = output.get_free_energy_delta()
    print(f"G_thermo correction: {g_thermo:.6f} Hartree")
```

---

## Runner - Direct ORCA Execution

The `Runner` class provides low-level control over ORCA binary execution for custom workflows.
```python
from pathlib import Path

from opi.execution.core import Runner

# Initialize runner with working directory
runner = Runner(working_dir=Path("./calculation"))

# Check ORCA version compatibility
runner.check_version(ignore_errors=False)

# Run ORCA calculation directly
inp_file = Path("./calculation/job.inp")
runner.run_orca(
    inp_file,
    silent=True,  # Suppress stdout/stderr
    timeout=7200  # 2 hour timeout (-1 for unlimited)
)

# Generate JSON files from ORCA binary outputs
runner.create_jsons("job", force=True)  # Creates both property.json and gbw.json

# Create only property JSON
runner.create_property_json("job", force=True)

# Create GBW JSON with custom configuration
config = {
    "1elIntegrals": ["S", "H"],
    "FockMatrix": ["F"],
    "Densities": ["scfp"],
}
runner.create_gbw_json("job", force=True, config=config)

# Run orca_plot for orbital visualization
gbw_file = Path("./calculation/job.gbw")
stdin_commands = [
    "1",        # Plot type
    "1",        # MO plot
    "2", "5",   # Select MO 5
    "3", "0",   # Alpha orbitals
    "4", "40",  # Resolution
    "5", "7",   # Cube output
    "11",       # Generate plot
    "12",       # Exit
]
runner.run_orca_plot(gbw_file, stdin_commands, timeout=300)
```

---

## Structure - ASE and RDKit Integration

OPI integrates with the Atomic Simulation Environment (ASE) and RDKit for interoperability with other chemistry tools.
```python
from opi.input.structures import Structure

# Convert from ASE Atoms object
from ase import Atoms as AseAtoms
from ase.build import molecule as ase_molecule

# Create ASE water molecule
ase_water = ase_molecule("H2O")
ase_water.set_initial_charges([0, 0, 0])

# Convert to OPI Structure
opi_structure = Structure.from_ase(
    ase_water,
    charge=0,
    multiplicity=1
)

# Convert from RDKit Mol object
from rdkit import Chem
from rdkit.Chem import AllChem

# Create RDKit molecule
rdkit_mol = Chem.MolFromSmiles("CC(=O)O")  # Acetic acid
rdkit_mol = Chem.AddHs(rdkit_mol)
AllChem.EmbedMolecule(rdkit_mol)

# Convert to OPI Structure
opi_from_rdkit = Structure.from_rdkitmol(
    rdkit_mol,
    charge=0,
    multiplicity=1
)

# Also available: direct SMILES conversion (uses RDKit internally)
methanol = Structure.from_smiles("CO", charge=0, multiplicity=1)

# Read trajectory files (multiple structures)
structures = Structure.from_trj_xyz(
    "trajectory.xyz",
    charge=0,
    multiplicity=1,
    n_struc_limit=10  # Only read first 10 structures
)
for i, struct in enumerate(structures):
    print(f"Structure {i}: {len(struct)} atoms")
```

---

## Complete Workflow Examples

### DFT Geometry Optimization with Solvation

```python
from pathlib import Path

from opi.core import Calculator
from opi.input.simple_keywords import (
    Dft, BasisSet, Task, Scf, SolvationModel, Solvent, DispersionCorrection
)
from opi.input.blocks import BlockGeom
from opi.input.structures import Structure

# Setup
calc = Calculator(basename="opt_solv", working_dir=Path("./run"))
calc.structure = Structure.from_smiles("c1ccccc1O")  # Phenol
calc.structure.charge = 0
calc.structure.multiplicity = 1

# DFT optimization in water
calc.input.add_simple_keywords(
    Dft.PBE0,
    BasisSet.DEF2_TZVP,
    DispersionCorrection.D4,
    Task.OPT,
    Scf.TIGHTSCF,
    SolvationModel.CPCM(Solvent.WATER),
)
calc.input.add_blocks(BlockGeom(maxiter=100))
calc.input.ncores = 8
calc.input.memory = 4000

# Execute
calc.write_input()
calc.run()

# Analyze results
output = calc.get_output()
output.parse()
if output.geometry_optimization_converged():
    optimized = output.get_structure()
    print(optimized.to_xyz_block())
    print(f"Final energy: {output.get_final_energy():.8f} Hartree")
```

### Frequency Calculation with Thermochemistry

```python
from pathlib import Path

from opi.core import Calculator
from opi.input.simple_keywords import Dft, BasisSet, Task, Scf
from opi.input.blocks import BlockFreq
from opi.input.structures import Structure

calc = Calculator(basename="freq", working_dir=Path("./run"))
calc.structure = Structure.from_xyz("optimized.xyz")
calc.input.add_simple_keywords(
    Dft.B3LYP,
    BasisSet.DEF2_SVP,
    Task.FREQ,
    Scf.TIGHTSCF,
)
calc.input.add_blocks(BlockFreq(
    temp=298.15,
    pressure=1.0,
    quasirrho=True,  # Quasi-RRHO approximation for low frequencies
))
calc.input.ncores = 4

calc.write_input()
calc.run()

output = calc.get_output()
output.parse()

# Thermochemistry results
print(f"ZPE: {output.get_zpe():.6f} Hartree")
print(f"H: {output.get_enthalpy():.6f} Hartree")
print(f"G: {output.get_free_energy():.6f} Hartree")
```

---

## Summary

OPI provides a comprehensive Python interface for the ORCA quantum chemistry program, enabling automated computational chemistry workflows. The library excels at managing complex calculation setups through its modular architecture of simple keywords and block options, while the output parsing capabilities give easy access to energies, structures, populations, orbitals, and thermochemistry. Common use cases include high-throughput DFT screening, geometry optimizations, frequency calculations for thermodynamic properties, and spectroscopic property predictions. Integration patterns typically involve using the `Calculator` class as the central workflow manager, combining it with `Structure` objects loaded from various sources (XYZ files, SMILES, ASE, RDKit).
For advanced workflows, direct access to the `Runner` class enables custom ORCA execution scenarios, while the `Output` class's comprehensive parsing methods support detailed result analysis and data extraction for machine learning applications or database storage. The library's JSON-based output parsing ensures reliable data access even for large-scale computational campaigns.
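
For the database-storage use case mentioned above, energies reported in Hartree are often converted to relative values in kcal/mol before being written out. The following is a rough sketch using only the standard library. The helper name `relative_energies_kcal`, the record layout, and the two energies are illustrative placeholders, not OPI functionality or computed results.

```python
import json

# 1 Hartree expressed in kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509474


def relative_energies_kcal(energies_hartree):
    """Return energies relative to the lowest entry, converted to kcal/mol.

    Hypothetical helper for illustration; it is not part of OPI.
    """
    reference = min(energies_hartree.values())
    return {
        name: (energy - reference) * HARTREE_TO_KCAL_PER_MOL
        for name, energy in energies_hartree.items()
    }


# Placeholder energies in Hartree, keyed by a job label.
energies = {"conf_a": -76.400000, "conf_b": -76.398000}

relative = relative_energies_kcal(energies)

# Serialize to a JSON string that could be stored in a file or database.
record = json.dumps({"unit": "kcal/mol", "relative_energies": relative}, sort_keys=True)
print(record)
```

In a real campaign, the dictionary values would come from a final-energy lookup per job, such as the `get_final_energy()` calls shown earlier.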