### Install LangStruct

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Installs the LangStruct library together with the extra dependencies needed to run the bundled examples.

```bash
pip install "langstruct[examples]"
```

--------------------------------

### Install and Run vLLM Server

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Installs the vLLM library and starts a local OpenAI-compatible API server for serving local models.

```bash
# Install vLLM
pip install vllm

# Start vLLM server
python -m vllm.entrypoints.openai.api_server \
  --model microsoft/DialoGPT-medium \
  --port 8000
```

--------------------------------

### LangStruct with Multiple Examples (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Demonstrates how to provide multiple examples to LangStruct for better type inference. This allows the extractor to understand a wider range of data structures and types.

```python
# Better type inference from multiple examples
examples = [
    {"name": "Alice", "age": 25, "skills": ["Python"]},
    {"name": "Bob", "age": 35, "skills": ["JavaScript", "React"]}
]

extractor = LangStruct(examples=examples)
# Infers: name=str, age=int, skills=List[str]

```
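
How LangStruct infers types internally is not shown in this excerpt. As a rough illustration of the idea only, the following sketch merges the Python types seen for each field across several example dicts. The helper names `describe_type` and `infer_field_types` are hypothetical and are not part of the LangStruct API.

```python
from typing import Any, Dict, List


def describe_type(value: Any) -> str:
    """Return a readable type name for one example value."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, list):
        inner = sorted({describe_type(item) for item in value})
        return f"List[{' | '.join(inner)}]" if inner else "List[Any]"
    return type(value).__name__


def infer_field_types(examples: List[Dict[str, Any]]) -> Dict[str, str]:
    """Merge the types seen for each field across all examples."""
    seen: Dict[str, set] = {}
    for example in examples:
        for field, value in example.items():
            seen.setdefault(field, set()).add(describe_type(value))
    return {field: " | ".join(sorted(types)) for field, types in seen.items()}


examples = [
    {"name": "Alice", "age": 25, "skills": ["Python"]},
    {"name": "Bob", "age": 35, "skills": ["JavaScript", "React"]},
]
print(infer_field_types(examples))
```

More examples give such a merge more evidence: a field that is an `int` in one example and a `float` in another shows up as a union rather than being fixed by whichever example came first.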

--------------------------------

### LangStruct Source Tracking Example (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

A minimal example demonstrating how to extract source tracking information from text using LangStruct. This helps in understanding where specific extracted entities originated within the source text.

```python
result = extractor.extract(text)

```

--------------------------------

### Install LangStruct with Optional Extras

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Installs LangStruct with specific optional dependencies. Each command installs a different set of extras like 'viz', 'examples', 'parallel', 'dev', or 'all' for comprehensive features.

```bash
pip install "langstruct[viz]"
pip install "langstruct[examples]"
pip install "langstruct[parallel]"
pip install "langstruct[dev]"
pip install "langstruct[all]"
```

--------------------------------

### Development Installation for LangStruct

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Sets up the LangStruct project for development: clone the repository, install the dependencies in development mode with `uv` or `pip`, then run the tests and linters.

```bash
# Clone repository
git clone https://github.com/langstruct-ai/langstruct.git
cd langstruct

# Install in development mode
uv sync --dev

# Or with pip
pip install -e ".[dev,test]"

# Run tests
pytest

# Run linting
ruff check .
mypy src/
```

--------------------------------

### Basic Data Extraction with LangStruct (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Demonstrates how to perform basic data extraction using LangStruct. It involves defining a schema by example, extracting entities from text, and tracking character-level source information. Refinement can be enabled for higher accuracy.

```python
from langstruct import LangStruct

# Define schema by example
extractor = LangStruct(example={
    "company": "Apple Inc.",
    "revenue": 125.3,
    "quarter": "Q3 2024"
})

# Extract from text
text = "Apple reported $125.3B revenue in Q3 2024..."
result = extractor.extract(text)

print(result.entities)
# {'company': 'Apple Inc.', 'revenue': 125.3, 'quarter': 'Q3 2024'}

print(result.sources)  # Character-level source tracking
# {'company': [CharSpan(0, 5, 'Apple')], ...}

# Boost accuracy with refinement
refined_result = extractor.extract(text, refine=True)
print(f"Confidence: {refined_result.confidence:.1%}")  # Higher confidence

```

--------------------------------

### Save and Load LangStruct Extractors

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Demonstrates how to persist an extractor's state to disk and load it back later. A deployed service can then reuse a saved extractor instead of rebuilding it on every start.

```python
# Save an extractor (preserves all state)
extractor.save("./my_extractor")

# Load anywhere (API keys must be available)
loaded_extractor = LangStruct.load("./my_extractor")

# Works exactly like the original
result = loaded_extractor.extract("New text")
```

--------------------------------

### LangStruct Model Configuration (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Shows how to configure the language model used by LangStruct. It covers auto-detection, specifying a particular model, and using local models via Ollama.

```python
# Default: Auto-detects available models
extractor = LangStruct(example=schema)

# Specific model
extractor = LangStruct(
    example=schema,
    model="gpt-5-mini"  # Example latest OpenAI model
)

# Local with Ollama
extractor = LangStruct(
    example=schema,
    model="ollama/llama3.2"
)

```

--------------------------------

### Install LangStruct and Set API Keys

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/quickstart.mdx

Installs the `langstruct` Python package and sets the API key environment variables for hosted model providers such as OpenAI, Google Gemini, and Anthropic Claude.

```bash
pip install langstruct

# Set up any API key:
export OPENAI_API_KEY="sk-your-key"     # OpenAI
export GOOGLE_API_KEY="your-key"        # Gemini
export ANTHROPIC_API_KEY="sk-ant-key"   # Claude
```

--------------------------------

### Verify LangStruct Installation and Basic Functionality

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

A Python script to verify the LangStruct installation by checking the version and performing a basic extraction task using a defined schema.

```python
import langstruct

# Check version
print(f"LangStruct version: {langstruct.__version__}")

# Test basic functionality
from pydantic import BaseModel, Field
from langstruct import LangStruct

class TestSchema(BaseModel):
    message: str = Field(description="A simple message")

# This will test your API connection (uses your default model)
extractor = LangStruct(schema=TestSchema)
result = extractor.extract("Hello, LangStruct!")

print(f"Success! Extracted: {result.entities}")
```
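
The script above makes a real model call, so it fails when no API key is configured even if the package itself is installed correctly. As a lighter first check, this sketch uses only the standard library to see whether a distribution is installed and importable. The helper names `installed_version` and `can_import` are illustrative and not part of LangStruct.

```python
from importlib import metadata, util
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def can_import(module_name: str) -> bool:
    """Report whether a top-level module can be located without importing it."""
    return util.find_spec(module_name) is not None


print("langstruct version:", installed_version("langstruct") or "not installed")
print("langstruct importable:", can_import("langstruct"))
```

If both checks pass but the extraction script still fails, the problem is more likely the API key or model configuration than the installation.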

--------------------------------

### Start Local Documentation Server

Source: https://github.com/langstruct-ai/langstruct/blob/main/CONTRIBUTING.md

Starts a local development server for previewing the documentation site. While the server is running, edits to the docs typically show up in the browser without a manual rebuild.

```bash
cd docs
pnpm install
pnpm dev
```

--------------------------------

### Install LangStruct with uv

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Installs the LangStruct package using the 'uv' package manager, which is recommended for faster installation.

```bash
uv add langstruct
```

--------------------------------

### Install LangStruct Package

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Instructions for installing the LangStruct package using pip. It covers both user-level installation and the recommended approach using a virtual environment.

```bash
# Use user installation if needed
pip install --user langstruct

# Or use virtual environment (recommended)
python -m venv langstruct-env
source langstruct-env/bin/activate  # On Windows: langstruct-env\Scripts\activate
pip install langstruct
```

--------------------------------

### Complete RAG Pipeline with LangStruct (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Illustrates a complete Retrieval Augmented Generation (RAG) pipeline using LangStruct and ChromaDB. It covers indexing documents with extracted metadata and querying the vector database using parsed natural language queries.

```python
from uuid import uuid4

from langstruct import LangStruct
from chromadb import Client

# 1. Single instance for both operations
ls = LangStruct(example={
    "company": "Apple",
    "revenue": 100.0,
    "quarter": "Q3"
})
vector_db = Client().create_collection("docs")

# 2. Index documents with metadata
def index_document(text):
    metadata = ls.extract(text).entities
    # Chroma's Collection.add takes `documents` (not `texts`) and needs one ID per item
    vector_db.add(ids=[str(uuid4())], documents=[text], metadatas=[metadata])

# 3. Query with natural language
def search(query):
    parsed = ls.query(query)
    return vector_db.query(
        query_texts=parsed.semantic_terms,
        where=parsed.structured_filters,
        n_results=5
    )

# Usage
index_document("Apple reported $125B in Q3 2024...")
results = search("Q3 tech companies over $100B")
# Returns only Apple, not other Q3 mentions

```

--------------------------------

### LangStruct with Custom Pydantic Schemas (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Explains how to define custom data schemas using Pydantic models for LangStruct. This provides more control over data validation and structure during extraction.

```python
from pydantic import BaseModel, Field
from typing import List, Optional

class CompanySchema(BaseModel):
    name: str
    revenue: float = Field(gt=0, description="Revenue in billions")
    employees: Optional[int] = None
    products: List[str] = []

extractor = LangStruct(schema=CompanySchema)

```

--------------------------------

### Install and Configure Ollama for Local Models

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Installs Ollama, pulls a model (e.g., llama2), and sets the Ollama base URL as an environment variable for LangStruct to use local models.

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama2

# Use in LangStruct
export OLLAMA_BASE_URL="http://localhost:11434"
```

--------------------------------

### Install Dependencies

Source: https://github.com/langstruct-ai/langstruct/blob/main/CONTRIBUTING.md

Installs project dependencies using either uv (recommended) or pip. The `.[dev]` extra ensures development-specific packages are installed.

```bash
# With uv (recommended)
uv sync --extra dev

# Or with pip
pip install -e ".[dev]"
```

--------------------------------

### Set Google Gemini API Key

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Configures the Google API key as an environment variable for accessing Google Gemini models.

```bash
export GOOGLE_API_KEY="your-google-api-key"
```

--------------------------------

### LangStruct Batch Processing and Retries (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Details how to process multiple documents efficiently using LangStruct's batch capabilities. It includes options for controlling concurrency, showing progress, handling rate limits, and retrying failed extractions.

```python
# Process multiple documents efficiently
documents = [doc1, doc2, doc3, ...]
results = extractor.extract(documents, max_workers=8, show_progress=True)

for result in results:
    print(f"Confidence: {result.confidence:.1%}")
    print(f"Entities: {result.entities}")

# Batch processing with refinement for higher accuracy
results = extractor.extract(
  documents,
  refine=True,
  max_workers=5,
  rate_limit=60,    # calls per minute
  retry_failed=True # raise on failures (False to skip with warnings)
)
# Note: 2-5x higher cost but significantly better accuracy

```
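
How LangStruct implements `max_workers`, `rate_limit`, and retries internally is not shown in this excerpt. As a rough sketch of what such parameters usually control, the following standard-library example runs a worker over a batch with a thread pool, a simple call spacer, and bounded retries. `RateLimiter`, `run_batch`, `calls_per_minute`, and `max_attempts` are hypothetical names chosen for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimiter:
    """Space calls so that at most calls_per_minute start per minute."""

    def __init__(self, calls_per_minute: float) -> None:
        self._interval = 60.0 / calls_per_minute
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self) -> None:
        # Reserve the next free time slot under the lock, then sleep outside it
        with self._lock:
            now = time.monotonic()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        time.sleep(max(0.0, slot - now))


def run_batch(
    items: List[str],
    worker: Callable[[str], T],
    max_workers: int = 4,
    calls_per_minute: float = 60.0,
    max_attempts: int = 3,
) -> List[T]:
    """Apply worker to every item concurrently, retrying failed calls."""
    limiter = RateLimiter(calls_per_minute)

    def call_with_retry(item: str) -> T:
        for attempt in range(1, max_attempts + 1):
            limiter.wait()
            try:
                return worker(item)
            except Exception:
                if attempt == max_attempts:
                    raise  # give up after the final attempt
        raise RuntimeError("max_attempts must be at least 1")

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_with_retry, items))  # preserves input order


results = run_batch(["a", "b", "c"], str.upper, max_workers=2, calls_per_minute=6000)
print(results)
```

In this sketch the rate limit caps how often calls start, while `max_workers` caps how many run at once. With a low rate limit, adding workers does not raise throughput.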

--------------------------------

### Connect to OpenAI API

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

This Python snippet demonstrates how to establish a connection to the OpenAI API using the `openai` library. It assumes the `OPENAI_API_KEY` environment variable is set.

```python
import openai
client = openai.OpenAI()  # Uses OPENAI_API_KEY
response = client.models.list()
print("OpenAI connection successful")
```

--------------------------------

### Complete LangStruct Example: Extract Metadata and Parse Queries

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/quickstart.mdx

Demonstrates a full usage of LangStruct for extracting entities from a document and parsing a natural language query into structured filters. This example also includes comments on how to integrate with a RAG system.

```python
from langstruct import LangStruct

# 1. Single instance for both operations
ls = LangStruct(example={
    "company": "Apple",
    "revenue": 100.0,
    "quarter": "Q3"
})

# 2. Extract metadata from documents
doc = "Apple reported $125B revenue in Q3 2024"
metadata = ls.extract(doc).entities
print(f"Extracted: {metadata}")

# 3. Parse queries into filters
query = "Q3 tech companies over $100B"
filters = ls.query(query)
print(f"Filters: {filters.structured_filters}")

# 4. Use with your RAG system
# vector_db.add(doc, metadata=metadata)
# results = vector_db.search(
#     query=filters.semantic_terms,
#     where=filters.structured_filters
# )
```

--------------------------------

### Configure LangStruct with .env File

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Sets up LangStruct configuration by creating a .env file in the project root, including API keys and default model settings.

```env
# Add your preferred provider's API key
GOOGLE_API_KEY=your-google-api-key
OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
LANGSTRUCT_DEFAULT_MODEL=your-preferred-model
```
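
Whether LangStruct reads a `.env` file automatically is not stated in this excerpt. To make the file format concrete, this minimal sketch parses simple `KEY=value` lines into a dict using only the standard library. The function name `parse_env_text` is hypothetical, and a dedicated library such as `python-dotenv` handles many more edge cases.

```python
from typing import Dict


def parse_env_text(content: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in content.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = """
# Add your preferred provider's API key
GOOGLE_API_KEY=your-google-api-key
LANGSTRUCT_DEFAULT_MODEL=your-preferred-model
"""
config = parse_env_text(sample)
print(config["LANGSTRUCT_DEFAULT_MODEL"])
```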

--------------------------------

### Install LangStruct and Set API Keys (Bash)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/index.mdx

Provides instructions for installing the LangStruct library using pip and setting up necessary API keys for various LLM providers. It covers OpenAI, Google Gemini, and Anthropic, as well as the option to use local models with Ollama.

```bash
pip install langstruct

# Set up any API key (choose one):
export OPENAI_API_KEY="sk-your-key"        # OpenAI
export GOOGLE_API_KEY="your-key"           # Google Gemini
export ANTHROPIC_API_KEY="sk-ant-key"      # Claude models
# Or use local models with Ollama (no API key needed)
```
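
Before creating an extractor, it can help to confirm which of these keys are visible to the Python process. The following sketch checks the three variable names from the commands above. The helper `configured_providers` is illustrative and not part of LangStruct, and it does not validate the keys themselves.

```python
import os
from typing import List, Mapping

# Variable names taken from the export lines above
PROVIDER_KEYS = {
    "OpenAI": "OPENAI_API_KEY",
    "Google Gemini": "GOOGLE_API_KEY",
    "Anthropic Claude": "ANTHROPIC_API_KEY",
}


def configured_providers(env: Mapping[str, str]) -> List[str]:
    """Return the providers whose API key variable is set and non-empty."""
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]


found = configured_providers(os.environ)
print("Configured providers:", ", ".join(found) if found else "none (consider Ollama)")
```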

--------------------------------

### Extract Structured Data with LangStruct (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/quickstart.mdx

Demonstrates how to perform data extraction using LangStruct by defining a schema through an example. It shows how to extract entities and their source spans from a given text.

```python
from langstruct import LangStruct

# Define schema by example
extractor = LangStruct(example={
    "company": "Apple Inc.",
    "revenue": 125.3,
    "quarter": "Q3 2024"
})

# Extract from text
text = "Apple reported $125.3B revenue in Q3 2024, beating estimates."
result = extractor.extract(text)

print(result.entities)
# {'company': 'Apple Inc.', 'revenue': 125.3, 'quarter': 'Q3 2024'}

print(result.sources['revenue'])
# [CharSpan(15, 22, '$125.3B')]
```

--------------------------------

### Set Up Pre-commit Hooks

Source: https://github.com/langstruct-ai/langstruct/blob/main/CONTRIBUTING.md

Installs pre-commit hooks to automate code formatting, linting, and other checks before committing. This helps maintain code quality.

```bash
uv run pre-commit install
```

--------------------------------

### Reinstall LangStruct with All Dependencies

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Uninstalls the current LangStruct package and then reinstalls it with all optional dependencies to resolve potential import errors.

```bash
# If you get import errors, reinstall with dependencies
pip uninstall langstruct
pip install "langstruct[all]"
```

--------------------------------

### Test Google Gemini API Key

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

A Python snippet to test the configured Google API key by attempting to list available models using the google.genai client.

```python
# Test your Google API key
from google import genai
client = genai.Client()  # Uses GOOGLE_API_KEY
response = client.models.list()
print("Google Gemini connection successful")
```

--------------------------------

### LangStruct Initialization and Optimization with MIPROv2

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/why-dspy.mdx

This Python code demonstrates initializing a LangStruct extractor with a schema example and then using its 'optimize' method, powered by MIPROv2. It shows how to provide training texts and expected results to automatically tune the extraction prompts and examples.

```python
from langstruct import LangStruct

# 1. Create extractor with your schema
extractor = LangStruct(example={
    "company": "Apple",
    "revenue": 100.0,
    "quarter": "Q3"
})

# 2. Let MIPROv2 optimize prompts and examples automatically
extractor.optimize(
    texts=["Apple reported $125B in Q3...", "Meta earned $40B..."],
    expected_results=[
        {"company": "Apple", "revenue": 125.0, "quarter": "Q3"},
        {"company": "Meta", "revenue": 40.0, "quarter": "Q3"}
    ]
)

# 3. Now it's optimized for your specific data!
result = extractor.extract("Microsoft announced $65B revenue for Q4")
```
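
To judge whether optimization helped, it is useful to score predictions against the same kind of expected results before and after. The metric the optimizer uses internally is not described in this excerpt. The sketch below is a simple exact-match field accuracy, with the hypothetical helper name `field_accuracy`, that can be computed on plain dicts.

```python
from typing import Any, Dict, List


def field_accuracy(predicted: List[Dict[str, Any]], expected: List[Dict[str, Any]]) -> float:
    """Fraction of expected fields whose predicted value matches exactly."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    total = 0
    correct = 0
    for pred, gold in zip(predicted, expected):
        for field, gold_value in gold.items():
            total += 1
            if pred.get(field) == gold_value:
                correct += 1
    return correct / total if total else 0.0


expected = [
    {"company": "Apple", "revenue": 125.0, "quarter": "Q3"},
    {"company": "Meta", "revenue": 40.0, "quarter": "Q3"},
]
predicted = [
    {"company": "Apple", "revenue": 125.0, "quarter": "Q3"},
    {"company": "Meta", "revenue": 4.0, "quarter": "Q3"},  # one wrong field
]
print(f"Field accuracy: {field_accuracy(predicted, expected):.1%}")
```

Exact matching is strict: `"Apple"` and `"Apple Inc."` count as different values, so a real evaluation may want normalization or a tolerance for numeric fields.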

--------------------------------

### Shell command for running LangStruct example

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/examples/gepa.mdx

These commands set the Google API key as an environment variable and then run the GEPA optimization example script with `uv`.

```bash
export GOOGLE_API_KEY="YOUR_KEY"
uv run python examples/07b_optimization_gepa.py
```

--------------------------------

### Initialize LangStruct with Different Model Configurations

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/quickstart.mdx

Shows how to initialize LangStruct, either by auto-detecting the model from the environment or by explicitly specifying a model, including options for cloud-based and local models.

```python
from langstruct import LangStruct

# No model needed - it auto-detects from your environment!
extractor = LangStruct(example=schema)

# Or specify model explicitly
extractor = LangStruct(
    example=schema,
    model="gemini/gemini-2.5-flash-lite"  # Fast & cheap
)

# Local models
extractor = LangStruct(
    example=schema,
    model="ollama/llama3.2"  # No API needed
)
```

--------------------------------

### Track Data Sources in LangStruct

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Iterates through the sources of a result to identify and print the field, text, and span of each data source. This helps in understanding where specific information originated.

```python
for field, spans in result.sources.items():
    for span in spans:
        print(f"{field}: '{text[span.start:span.end]}' at {span.start}-{span.end}")
```
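
The loop above assumes each span exposes `start` and `end` character offsets into the original text. To make that slicing convention concrete without calling a model, this sketch builds spans by hand with a hypothetical `Span` dataclass (a stand-in, not LangStruct's own span type) and half-open `[start, end)` offsets.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for a source span: half-open [start, end) offsets."""
    start: int
    end: int


def locate(text: str, snippet: str) -> Span:
    """Find the first occurrence of snippet and return its character span."""
    start = text.find(snippet)
    if start == -1:
        raise ValueError(f"{snippet!r} not found in text")
    return Span(start, start + len(snippet))


text = "Apple reported $125.3B revenue in Q3 2024, beating estimates."
sources: Dict[str, List[Span]] = {
    "company": [locate(text, "Apple")],
    "revenue": [locate(text, "$125.3B")],
    "quarter": [locate(text, "Q3 2024")],
}

for field, spans in sources.items():
    for span in spans:
        print(f"{field}: '{text[span.start:span.end]}' at {span.start}-{span.end}")
```

Because the offsets are half-open, `text[span.start:span.end]` returns exactly the matched substring, and `span.end - span.start` is its length.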

--------------------------------

### Production Deployment Workflow for LangStruct

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/persistence.mdx

Illustrates the typical workflow for production deployment, including training and saving an extractor during development, and then loading and using it in a production service or API. This ensures a smooth transition from development to deployment.

```python
# Development: Train and save
extractor = LangStruct(schema=MySchema)
extractor.optimize(training_data, expected_results)
extractor.save("./production_extractor")

# Production: Load and use
def load_extractor():
    return LangStruct.load("./production_extractor")

# Use in API or service
extractor = load_extractor()
result = extractor.extract(incoming_text)
```
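
In a long-running service, loading the extractor on every request would repeat the disk read each time. One common pattern, sketched here with the standard library, is to cache the loaded object per process. This assumes a loaded extractor is safe to reuse across requests, which this excerpt does not confirm. `make_cached_loader` and `fake_load` are hypothetical names, and `fake_load` stands in for `LangStruct.load`.

```python
from functools import lru_cache
from typing import Any, Callable, List


def make_cached_loader(load: Callable[[str], Any]) -> Callable[[str], Any]:
    """Wrap a loader so each path is loaded at most once per process."""

    @lru_cache(maxsize=None)
    def cached(path: str) -> Any:
        return load(path)

    return cached


# Stand-in loader for demonstration; a service might pass LangStruct.load instead
load_calls: List[str] = []


def fake_load(path: str) -> dict:
    load_calls.append(path)
    return {"loaded_from": path}


get_extractor = make_cached_loader(fake_load)
first = get_extractor("./production_extractor")
second = get_extractor("./production_extractor")
print(first is second, len(load_calls))
```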

--------------------------------

### Manual Documentation Deployment (Bash)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/README.md

Provides steps for manually building and deploying the LangStruct documentation site. This involves navigating to the `docs/` directory, building the production site, and then deploying the generated `dist/` directory.

```bash
cd docs
pnpm build
# Deploy dist/ directory to your hosting provider
```

--------------------------------

### Export and Visualize LangStruct Results

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Provides methods for saving individual results to JSON, exporting batch results in various formats (CSV, JSON, Excel, Parquet), and performing a JSONL round-trip for annotation and visualization.

```python
# Save individual result
result.save_json("output.json")

# Export batch results
extractor.export_batch(results, "output.csv")  # csv/json/excel/parquet

# JSONL round-trip
extractor.save_annotated_documents(results, "extractions.jsonl")
loaded = extractor.load_annotated_documents("extractions.jsonl")
extractor.visualize(loaded, "results.html")
```
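
The exact layout of the files written by `save_annotated_documents` is not documented in this excerpt. The general JSONL idea, one JSON object per line, can be sketched with the standard library. The record fields below are illustrative and are not LangStruct's actual schema.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def write_jsonl(records: List[Dict[str, Any]], path: Path) -> None:
    """Write one JSON object per line."""
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path: Path) -> List[Dict[str, Any]]:
    """Read every non-empty line back into a dict."""
    with path.open("r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


records = [
    {"text": "Apple Q3: $125.3B revenue", "entities": {"company": "Apple", "revenue": 125.3}},
    {"text": "Microsoft Q3: $62.9B revenue", "entities": {"company": "Microsoft", "revenue": 62.9}},
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "extractions.jsonl"
    write_jsonl(records, target)
    loaded = read_jsonl(target)

print(loaded == records)
```

Since every line is an independent JSON document, a JSONL file can be appended to or streamed line by line without loading the whole dataset.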

--------------------------------

### Install LangStruct with pip

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Installs the LangStruct package using the standard 'pip' package manager.

```bash
pip install langstruct
```

--------------------------------

### Query Parsing for RAG with LangStruct (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/getting-started.mdx

Shows how to use LangStruct to parse natural language queries for Retrieval Augmented Generation (RAG). It extracts semantic terms and structured filters from a query, which can then be used to query a vector database.

```python
from langstruct import LangStruct

# Same instance for both extraction and parsing
ls = LangStruct(example={
    "company": "Apple Inc.",
    "revenue": 125.3,
    "quarter": "Q3 2024"
})

# Parse natural language query
query = "Q3 2024 tech companies over $100B discussing AI"
result = ls.query(query)

print(result.semantic_terms)
# ['tech companies', 'AI', 'artificial intelligence']

print(result.structured_filters)
# {'quarter': 'Q3 2024', 'revenue': {'$gte': 100.0}}

```
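
The structured filters above use a Mongo-style operator syntax (`$gte`). A vector database would normally evaluate them, but the semantics can be sketched in plain Python against metadata dicts. Only `$gte` appears in the source output. The other operators and the helper name `matches` are assumptions added for illustration.

```python
from typing import Any, Dict

# Comparison operators in the Mongo-style syntax seen in the output above.
# Only "$gte" appears in the source; the others are included as assumptions.
OPERATORS = {
    "$gte": lambda actual, bound: actual >= bound,
    "$gt": lambda actual, bound: actual > bound,
    "$lte": lambda actual, bound: actual <= bound,
    "$lt": lambda actual, bound: actual < bound,
    "$eq": lambda actual, bound: actual == bound,
}


def matches(metadata: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True when metadata satisfies every filter condition."""
    for field, condition in filters.items():
        if field not in metadata:
            return False
        actual = metadata[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gte": 100.0}
            if not all(OPERATORS[op](actual, bound) for op, bound in condition.items()):
                return False
        elif actual != condition:
            # Plain value means equality
            return False
    return True


filters = {"quarter": "Q3 2024", "revenue": {"$gte": 100.0}}
docs = [
    {"company": "Apple Inc.", "revenue": 125.3, "quarter": "Q3 2024"},
    {"company": "Example Corp", "revenue": 12.0, "quarter": "Q3 2024"},
]
print([doc["company"] for doc in docs if matches(doc, filters)])
```

Both documents mention Q3 2024, but only the first also passes the revenue threshold. That is the narrowing that structured filters add on top of semantic search.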

--------------------------------

### DSPy Optimization in LangStruct (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/quickstart.mdx

Explains how LangStruct utilizes DSPy 3.0 for automatic prompt optimization, eliminating the need for manual prompt engineering. It contrasts the traditional approach with LangStruct's self-optimizing method and shows how to initiate optimization.

```python
# LangStruct uses DSPy 3.0 for automatic optimization
# No manual prompt engineering needed!

# Traditional approach (manual prompts):
prompt = "Extract company, revenue, quarter from: {text}"
# Requires iterative tuning, breaks with new data

# LangStruct approach (self-optimizing):
extractor = LangStruct(example=schema)
# Automatically optimizes prompts using MIPROv2
# Improves with your data, no manual tuning

# See optimization in action
extractor.optimize(
    texts=["training texts..."],
    expected_results=[{"field": "expected value..."}]  # Optional - uses confidence if omitted
)
```

--------------------------------

### Build Documentation Locally

Source: https://github.com/langstruct-ai/langstruct/blob/main/CONTRIBUTING.md

Builds the static documentation site for LangStruct. This command is run from the 'docs' directory.

```bash
cd docs
pnpm install
pnpm build
```

--------------------------------

### Process Multiple Documents with LangStruct (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/quickstart.mdx

Demonstrates batch processing of multiple documents for data extraction using LangStruct. It includes parameters for controlling concurrency (`max_workers`), progress display (`show_progress`), and rate limiting (`rate_limit`).

```python
# Batch processing
documents = [
    "Apple Q3: $125.3B revenue",
    "Microsoft Q3: $62.9B revenue",
    "Google Q3: $88.2B revenue"
]

results = extractor.extract(
  documents,
  max_workers=8,
  show_progress=True,
  rate_limit=60
)

for result in results:
    print(f"{result.entities['company']}: ${result.entities['revenue']}B")
    print(f"Confidence: {result.confidence:.1%}\n")
```

--------------------------------

### Set OpenAI API Key

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Configures the OpenAI API key as an environment variable for accessing OpenAI models.

```bash
export OPENAI_API_KEY="your-openai-api-key"
```

--------------------------------

### Set Anthropic Claude API Key

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Configures the Anthropic API key as an environment variable for accessing Anthropic Claude models.

```bash
export ANTHROPIC_API_KEY="your-anthropic-api-key"
```

--------------------------------

### Source Tracking and Visualization with LangStruct (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/quickstart.mdx

Details how to track and visualize extracted data sources at a character-level precision. It covers saving visualizations to HTML, using JSONL for dataset round-trips, and debugging validation warnings.

```python
result = extractor.extract(text)

# Character-level precision
for field, spans in result.sources.items():
    for span in spans:
        print(f"{field}: '{text[span.start:span.end]}' at {span.start}-{span.end}")

# Interactive visualization
from langstruct import HTMLVisualizer
viz = HTMLVisualizer()
viz.save_visualization(text, result, "output.html")

# JSONL round-trip for datasets
results = extractor.extract(documents, validate=False)
extractor.save_annotated_documents(results, "extractions.jsonl")
loaded = extractor.load_annotated_documents("extractions.jsonl")
extractor.visualize(loaded, "results.html")

# Debug mode for detailed validation feedback
result = extractor.extract(text, debug=True)
# Shows detailed validation warnings and suggestions when issues are detected
```

--------------------------------

### Set Azure OpenAI Credentials

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Configures the Azure OpenAI endpoint, API key, and API version as environment variables for accessing Azure OpenAI services.

```bash
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="your-azure-api-key"
export AZURE_OPENAI_API_VERSION="2024-02-01"
```

--------------------------------

### RAG Integration Example with LangStruct

Source: https://context7.com/langstruct-ai/langstruct/llms.txt

Provides a foundational example for integrating LangStruct with a vector database (Chroma) for enhanced RAG capabilities. It sets up an extractor with a defined schema for financial documents, initializes embeddings and text splitters, and demonstrates the initial setup for processing documents and preparing them for retrieval.

```python
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.schema import Document
from langstruct import LangStruct

# Define schema for financial documents
extractor = LangStruct(example={
    "company": "Contoso Corp",
    "quarter": "Q2 2024",
    "revenue_numeric": 61.9,
    "risks": ["Macro", "Competition"]
})

embeddings = OpenAIEmbeddings()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
```

--------------------------------

### Quick Experiment Extractor Initialization - Python

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/optimization/index.mdx

Provides a concise way to initialize a LangStruct extractor for quick experiments, skipping the optimization step entirely.

```python
extractor = LangStruct(example={"name": "John", "age": 30})
```

--------------------------------

### Configure LangStruct Environment Variables

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/installation.mdx

Sets various environment variables for LangStruct, including API keys for different providers, default model, cache directory, and log level.

```bash
# Choose your preferred provider
export GOOGLE_API_KEY="your-google-api-key"        # Google Gemini
export OPENAI_API_KEY="sk-..."                     # OpenAI
export OPENAI_ORG_ID="org-..."                     # Optional
export ANTHROPIC_API_KEY="sk-ant-..."              # Anthropic Claude

# LangStruct configuration
export LANGSTRUCT_DEFAULT_MODEL="your-preferred-model"
export LANGSTRUCT_CACHE_DIR="~/.langstruct"
export LANGSTRUCT_LOG_LEVEL="INFO"
```

--------------------------------

### Model Switching with Traditional Libraries (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/why-langstruct.mdx

Demonstrates the difficulties of switching LLM providers (OpenAI to Claude to Llama) using traditional libraries. It highlights the need for extensive re-tuning and rewriting prompts and examples for each new model, leading to significant time loss.

```python
# Month 1: Carefully tune prompts for OpenAI
extractor = LangExtract(...)
# Spend days crafting examples and prompt engineering

# Month 6: Switch to Claude - everything breaks!
# ❌ Prompts don't work the same way
# ❌ Few-shot examples need rewriting
# ❌ Back to manual tuning for weeks

# Month 12: Move to local Llama - start over again!
# ❌ Different prompt format requirements
# ❌ Re-engineer everything from scratch
```

--------------------------------

### Running Documentation Commands (Bash)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/README.md

Lists essential commands for managing the LangStruct documentation project, including installation, local development, building for production, and previewing. These commands are executed from the `docs/` directory.

```bash
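pnpm install          # Install dependencies
pnpm dev              # Start the local development server
pnpm build            # Build the production site
pnpm preview          # Preview the production build locally
pnpm astro ...        # Run Astro CLI commands
pnpm astro -- --help  # Show help for the Astro CLI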
```

--------------------------------

### LangExtract Data Extraction (Python)

Source: https://github.com/langstruct-ai/langstruct/blob/main/docs/src/content/docs/why-langstruct.mdx

Illustrates the usage of LangExtract for data extraction. This approach requires manual prompt engineering and few-shot examples to define the extraction schema and guide the model. It also provides character-level provenance tracking.

```python
from langextract import LangExtract

# Manual prompt engineering required
extractor = LangExtract(
    model="gemini-1.5-flash",
    schema={
        "company": "string",
        "revenue": "number",
        "quarter": "string"
    },
    examples=[
        {"text": "...", "output": {...}},
        {"text": "...", "output": {...}}
    ]
)

result = extractor.extract(text)
print(result.extractions[0].provenance)  # Character tracking
```