# Rig

Rig is a Rust library (crate: `rig-core`, facade: `rig`) for building scalable, modular, and ergonomic LLM-powered applications. It provides a unified, provider-agnostic interface over 20+ model providers (OpenAI, Anthropic, Gemini, Cohere, Groq, Mistral, Ollama, DeepSeek, xAI, and more), exposing consistent traits for completion, embedding, transcription, image generation, and audio generation. The library is fully async (Tokio), WASM-compatible at the core level, and implements the GenAI Semantic Convention for OpenTelemetry tracing.

The core design centers on a few composable primitives: `Agent` (an LLM with a preamble, static/dynamic context, and tools), `EmbeddingsBuilder` (for generating and batching embedding vectors), `Extractor` (structured data extraction via tool-calling), and `VectorStoreIndex` (a common interface for in-memory or external vector databases). Vector store integrations (LanceDB, MongoDB, Qdrant, PostgreSQL, SQLite, SurrealDB, Milvus, ScyllaDB, Neo4j, S3Vectors, HelixDB, Vectorize) are available as optional companion crates gated behind feature flags. The root `rig` facade re-exports everything from `rig-core` plus companion crates under a single dependency.

---

## Installation

```toml
# Cargo.toml

# Root facade with optional integrations
[dependencies]
rig = { version = "0.37.0", features = ["lancedb", "fastembed", "memory"] }
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
schemars = "1"
anyhow = "1"
# Used by some of the examples below
serde_json = "1"
thiserror = "2"
futures = "0.3"

# Or use rig-core directly for minimal footprint
# rig-core = { version = "0.37.0", features = ["derive"] }
```

---

## Creating a Provider Client

Every workflow starts with a typed provider `Client`. Clients are created from environment variables or explicit API keys and expose builder methods for agents, embedding models, extractors, and more.

```rust
use rig::client::{CompletionClient, EmbeddingsClient, ProviderClient};
use rig::providers::{anthropic, gemini, groq, openai};

fn main() -> Result<(), anyhow::Error> {
    // OpenAI – reads OPENAI_API_KEY from environment
    let openai = openai::Client::from_env()?;

    // Anthropic – reads ANTHROPIC_API_KEY
    let anthropic = anthropic::Client::from_env()?;

    // Gemini – reads GEMINI_API_KEY
    let gemini = gemini::Client::from_env()?;

    // Groq – reads GROQ_API_KEY
    let groq = groq::Client::from_env()?;

    // Explicit API key
    let openai_explicit = openai::Client::new("sk-...")?;

    // Access the raw completion model for low-level use
    let model = openai.completion_model(openai::GPT_4O);
    let embedding_model = openai.embedding_model(openai::TEXT_EMBEDDING_3_SMALL);

    Ok(())
}
```

---

## Agent – Simple Prompt

`Agent` is the primary high-level abstraction. Built via a fluent `AgentBuilder`, it combines a completion model with a system preamble, optional static context documents, tools, and conversation memory. The `Prompt` trait provides a one-shot `.prompt()` method.

```rust
use rig::client::{CompletionClient, ProviderClient};
use rig::completion::Prompt;
use rig::providers::openai;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let client = openai::Client::from_env()?;

    let agent = client
        .agent(openai::GPT_4O)
        .preamble("You are a concise assistant. Answer in one sentence.")
        .temperature(0.5)
        .max_tokens(256)
        .build();

    let response = agent.prompt("What is the capital of France?").await?;
    println!("{response}");
    // Output: The capital of France is Paris.

    Ok(())
}
```

---

## Agent – Chat with History

The `Chat` trait lets you pass an existing message history alongside a new prompt. The agent appends the new user turn, calls the model, and returns only the assistant's text.

```rust
use rig::client::{CompletionClient, ProviderClient};
use rig::completion::{Chat, Message};
use rig::providers::openai;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let agent = openai::Client::from_env()?
        .agent(openai::GPT_4O)
        .preamble("You are a helpful assistant.")
        .build();

    let mut history: Vec<Message> = vec![
        Message::user("My name is Alice."),
        Message::assistant("Nice to meet you, Alice!"),
    ];

    let response = agent.chat("What is my name?", &mut history).await?;
    println!("{response}");
    // Output: Your name is Alice.

    Ok(())
}
```

---

## Agent – Tools (Function Calling)

Implement the `Tool` trait on any struct to expose typed callable functions to the model. Register tools on the builder with `.tool()` or `.tools()`. The agent automatically dispatches tool calls and feeds results back into the model in a multi-turn loop.

```rust
use rig::client::{CompletionClient, ProviderClient};
use rig::completion::{Prompt, ToolDefinition};
use rig::providers::openai;
use rig::tool::Tool;
use serde::{Deserialize, Serialize};
use serde_json::json;

#[derive(Deserialize)]
struct Args { x: f64, y: f64 }

#[derive(Debug, thiserror::Error)]
#[error("math error")]
struct MathError;

#[derive(Deserialize, Serialize)]
struct Add;

impl Tool for Add {
    const NAME: &'static str = "add";
    type Error = MathError;
    type Args = Args;
    type Output = f64;

    async fn definition(&self, _: String) -> ToolDefinition {
        ToolDefinition {
            name: Self::NAME.to_string(), // reuse the constant so the two names cannot drift apart
            description: "Add two numbers together".into(),
            parameters: json!({
                "type": "object",
                "properties": {
                    "x": { "type": "number" },
                    "y": { "type": "number" }
                },
                "required": ["x", "y"]
            }),
        }
    }

    async fn call(&self, args: Self::Args) -> Result<f64, MathError> {
        Ok(args.x + args.y)
    }
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let agent = openai::Client::from_env()?
        .agent(openai::GPT_4O)
        .preamble("You are a calculator. Use tools before answering.")
        .tool(Add)
        .max_tokens(512)
        .build();

    // max_turns(5) caps how many tool-call rounds the agent may run before it must answer
    let result = agent.prompt("What is 47.3 + 18.9?").max_turns(5).await?;
    println!("{result}");
    // Output: 47.3 + 18.9 = 66.2

    Ok(())
}
```
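
To make the multi-turn loop concrete, here is a minimal, dependency-free sketch of the dispatch cycle described above. It is not Rig's implementation: `ModelReply`, `ToolFn`, `run_tool_loop`, and `mock_model` are hypothetical names, the "model" is a hard-coded function, and the exact semantics of Rig's `max_turns` may differ from the cap used here.

```rust
use std::collections::HashMap;

/// What the (mock) model can return on each call.
enum ModelReply {
    /// The model asks for a tool to be run with two numeric arguments.
    ToolCall { name: String, x: f64, y: f64 },
    /// The model produces its final text answer.
    Final(String),
}

/// Hypothetical tool signature: two numbers in, one number out.
type ToolFn = fn(f64, f64) -> f64;

/// Calls the model repeatedly, running at most `max_turns` tool calls.
/// Each tool result is appended to the transcript so the next model call sees it.
fn run_tool_loop(
    model: impl Fn(&[String]) -> ModelReply,
    tools: &HashMap<&str, ToolFn>,
    prompt: &str,
    max_turns: usize,
) -> Result<String, String> {
    let mut transcript = vec![format!("user: {prompt}")];
    let mut turns_left = max_turns;
    loop {
        match model(&transcript) {
            ModelReply::Final(text) => return Ok(text),
            ModelReply::ToolCall { name, x, y } => {
                if turns_left == 0 {
                    return Err(format!("no final answer within {max_turns} tool turns"));
                }
                turns_left -= 1;
                let tool = tools
                    .get(name.as_str())
                    .ok_or_else(|| format!("unknown tool: {name}"))?;
                transcript.push(format!("tool {name}: {}", tool(x, y)));
            }
        }
    }
}

/// Stand-in for an LLM: request `add` once, then answer using the tool result.
fn mock_model(transcript: &[String]) -> ModelReply {
    match transcript.last().and_then(|line| line.strip_prefix("tool add: ")) {
        Some(result) => ModelReply::Final(format!("The sum is {result}")),
        None => ModelReply::ToolCall { name: "add".into(), x: 2.0, y: 3.5 },
    }
}
```

Registering `add` in the map and calling `run_tool_loop(mock_model, &tools, "2 + 3.5?", 3)` runs one tool round and then returns the final text; with a budget of `0` the same call returns an error instead of looping forever.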

---

## Agent – RAG with Dynamic Context

Attach a `VectorStoreIndex` as dynamic context with `.dynamic_context(n, index)`. On each prompt, the `n` most-similar documents are retrieved and injected into the completion request automatically.

```rust
use rig::prelude::*;
use rig::client::{CompletionClient, EmbeddingsClient, ProviderClient};
use rig::completion::Prompt;
use rig::embeddings::EmbeddingsBuilder;
use rig::providers::openai;
use rig::vector_store::in_memory_store::InMemoryVectorStore;
use rig::Embed;
use serde::Serialize;

#[derive(Embed, Serialize, Clone, Debug, Eq, PartialEq, Default)]
struct Article {
    id: String,
    title: String,
    #[embed]
    body: Vec<String>,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let openai = openai::Client::from_env()?;
    let embed_model = openai.embedding_model(openai::TEXT_EMBEDDING_ADA_002);

    let embeddings = EmbeddingsBuilder::new(embed_model.clone())
        .documents(vec![
            Article {
                id: "1".into(),
                title: "Rust ownership".into(),
                body: vec!["Ownership is Rust's most unique feature.".into()],
            },
            Article {
                id: "2".into(),
                title: "Async Rust".into(),
                body: vec!["Tokio is the de-facto async runtime for Rust.".into()],
            },
        ])?
        .build()
        .await?;

    let store = InMemoryVectorStore::from_documents(embeddings);
    let index = store.index(embed_model);

    let agent = openai
        .agent(openai::GPT_4O)
        .preamble("You are a Rust expert. Use the provided context to answer questions.")
        .dynamic_context(1, index)  // retrieve top-1 document per query
        .build();

    let response = agent.prompt("Tell me about async Rust.").await?;
    println!("{response}");

    Ok(())
}
```

---

## Agent – Conversation Memory

Attach an `InMemoryConversationMemory` (or custom `ConversationMemory` backend) to persist history across calls. Identify each conversation with `.conversation("id")` per request.

```rust
use rig::client::{CompletionClient, ProviderClient};
use rig::completion::Prompt;
use rig::memory::InMemoryConversationMemory;
use rig::providers::openai;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let memory = InMemoryConversationMemory::new();

    let agent = openai::Client::from_env()?
        .agent(openai::GPT_4O)
        .preamble("You are a helpful assistant with persistent memory.")
        .memory(memory)
        .build();

    // Turn 1 – establish a fact
    let t1 = agent
        .prompt("My name is Alice and I work at Acme Corp.")
        .conversation("session-42")
        .await?;
    println!("turn 1: {t1}");

    // Turn 2 – test recall across turns
    let t2 = agent
        .prompt("Where do I work?")
        .conversation("session-42")
        .await?;
    println!("turn 2: {t2}");
    // Output: turn 2: You work at Acme Corp.

    Ok(())
}
```
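
Conceptually, a memory backend maps a conversation id to its message list. The sketch below illustrates that idea with the standard library only; `ConversationStore` and its methods are hypothetical and do not mirror Rig's `ConversationMemory` trait. The sliding-window cap is an added assumption, included to show one simple way to bound history growth.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory store: one message list per conversation id.
struct ConversationStore {
    histories: HashMap<String, Vec<String>>,
    /// Keep at most this many most-recent messages per conversation.
    window: usize,
}

impl ConversationStore {
    fn with_window(window: usize) -> Self {
        Self { histories: HashMap::new(), window }
    }

    /// Appends a message, then drops the oldest entries beyond the window.
    fn append(&mut self, conversation_id: &str, message: &str) {
        let history = self.histories.entry(conversation_id.to_string()).or_default();
        history.push(message.to_string());
        if history.len() > self.window {
            let excess = history.len() - self.window;
            history.drain(..excess);
        }
    }

    /// Returns the stored history, or an empty slice for an unknown id.
    fn load(&self, conversation_id: &str) -> &[String] {
        self.histories
            .get(conversation_id)
            .map(Vec::as_slice)
            .unwrap_or_default()
    }
}
```

Because histories are keyed by id, two requests that pass the same id see the same turns, while a different id starts from an empty history.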

---

## Streaming Prompt and Chat

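The `StreamingPrompt` and `StreamingChat` traits expose `.stream_prompt()` and `.stream_chat()`, which return a `StreamingPromptRequest`. Awaiting the request yields a stream of `MultiTurnStreamItem` values: text chunks as they arrive, followed by a final response that carries the updated history.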

```rust
use anyhow::Result;
use futures::StreamExt;
use rig::agent::MultiTurnStreamItem;
use rig::client::{CompletionClient, ProviderClient};
use rig::message::Message;
use rig::providers::openai;
use rig::streaming::StreamingChat;

#[tokio::main]
async fn main() -> Result<()> {
    let agent = openai::Client::from_env()?
        .agent(openai::GPT_4O)
        .preamble("You are a storyteller.")
        .build();

    let history = vec![
        Message::user("Start a short story about a robot."),
        Message::assistant("Once upon a time, a small robot named Byte lived in a server room..."),
    ];

    let mut stream = agent.stream_chat("Continue the story.", &history).await;

    while let Some(item) = stream.next().await {
        match item? {
            MultiTurnStreamItem::Text(chunk) => print!("{chunk}"),
            MultiTurnStreamItem::FinalResponse(fin) => {
                println!("\n--- done ---");
                // fin.history() returns the updated message list
                let _ = fin.history();
            }
            _ => {}
        }
    }

    Ok(())
}
```

---

## Extractor – Structured Data from Text

`Extractor<M, T>` wraps a tool-calling agent to parse arbitrary text into a strongly-typed Rust struct. The target type must derive `serde::Deserialize`, `serde::Serialize`, and `schemars::JsonSchema`.

```rust
use anyhow::Result;
use rig::client::{CompletionClient, ProviderClient};
use rig::providers::openai;
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize, JsonSchema, Serialize)]
struct Invoice {
    #[schemars(required)]
    vendor: Option<String>,
    #[schemars(required)]
    amount_usd: Option<f64>,
    #[schemars(required)]
    due_date: Option<String>,
}

#[tokio::main]
async fn main() -> Result<()> {
    let extractor = openai::Client::from_env()?
        .extractor::<Invoice>(openai::GPT_4O)
        .retries(2)   // retry up to 2 times on extraction failure
        .build();

    let text = "Please pay $1,250.00 to Acme Supplies by 2024-12-31.";
    let response = extractor.extract_with_usage(text).await?;

    println!("{}", serde_json::to_string_pretty(&response.data)?);
    println!("tokens used: {}", response.usage.total_tokens);
    // Output:
    // {
    //   "vendor": "Acme Supplies",
    //   "amount_usd": 1250.0,
    //   "due_date": "2024-12-31"
    // }
    // tokens used: 312

    Ok(())
}
```
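
The retry behaviour can be pictured as a small loop around a fallible "parse the model output" step. The following sketch uses only the standard library and makes several assumptions: `extract_with_retries`, `parse_amount`, and `Amount` are hypothetical, and a retry count of `n` is taken to mean one initial attempt plus up to `n` further attempts, which may not match how Rig counts them.

```rust
/// Hypothetical target type, loosely mirroring one `Invoice` field.
#[derive(Debug, PartialEq)]
struct Amount {
    usd: f64,
}

/// Calls `attempt` once, then up to `retries` more times while it keeps failing.
/// Returns the first success, or the last error if every attempt fails.
fn extract_with_retries<T, E>(
    retries: usize,
    mut attempt: impl FnMut(usize) -> Result<T, E>,
) -> Result<T, E> {
    let mut last = attempt(0);
    for n in 1..=retries {
        if last.is_ok() {
            break;
        }
        last = attempt(n);
    }
    last
}

/// Stand-in for "model output to typed struct": parses strings such as "$1,250.00".
fn parse_amount(raw: &str) -> Result<Amount, String> {
    // Keep digits and the decimal point, dropping currency symbols and separators.
    let cleaned: String = raw
        .chars()
        .filter(|c| c.is_ascii_digit() || *c == '.')
        .collect();
    cleaned
        .parse::<f64>()
        .map(|usd| Amount { usd })
        .map_err(|e| format!("bad amount {raw:?}: {e}"))
}
```

Feeding the loop a sequence of outputs where only the third one parses succeeds with `retries = 2` and fails with `retries = 1`.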

---

## EmbeddingsBuilder – Batch Embedding

`EmbeddingsBuilder` batches documents into embedding requests and associates the resulting vectors with the original data for ingestion into any vector store.

```rust
use rig::client::{EmbeddingsClient, ProviderClient};
use rig::embeddings::EmbeddingsBuilder;
use rig::providers::openai;
use rig::Embed;
use serde::Serialize;

#[derive(Embed, Serialize, Clone, Debug, Default)]
struct Product {
    id: String,
    name: String,
    #[embed]
    description: Vec<String>,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let openai = openai::Client::from_env()?;
    let model = openai.embedding_model(openai::TEXT_EMBEDDING_3_SMALL);

    let embeddings = EmbeddingsBuilder::new(model)
        .documents(vec![
            Product {
                id: "p1".into(),
                name: "Widget Pro".into(),
                description: vec!["A high-performance widget for professionals.".into()],
            },
            Product {
                id: "p2".into(),
                name: "Budget Widget".into(),
                description: vec!["An affordable entry-level widget.".into()],
            },
        ])?
        .build()
        .await?;

    println!("Generated {} embeddings", embeddings.len());
    // Each item pairs a document with its embedding(s): (Product, OneOrMany<Embedding>)

    Ok(())
}
```
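
The batching idea itself is independent of any provider. As a rough illustration (not Rig's code), the sketch below splits documents into fixed-size batches, sends each batch to a caller-supplied embedding function, and pairs every document with its vector. `embed_in_batches` and the `embed_batch` callback are hypothetical, and the callback is assumed to return exactly one vector per input, in order.

```rust
/// Embeds `docs` in batches of at most `batch_size` and pairs each document
/// with its vector. `embed_batch` stands in for one provider request.
fn embed_in_batches(
    docs: &[String],
    batch_size: usize,
    mut embed_batch: impl FnMut(&[String]) -> Vec<Vec<f64>>,
) -> Vec<(String, Vec<f64>)> {
    let mut pairs = Vec::with_capacity(docs.len());
    // `chunks` panics on a size of zero, so treat 0 as 1.
    for batch in docs.chunks(batch_size.max(1)) {
        let vectors = embed_batch(batch);
        // Zip keeps the original document next to its embedding.
        pairs.extend(batch.iter().cloned().zip(vectors));
    }
    pairs
}
```

With three documents and a batch size of two, the callback is invoked twice and the result still contains three `(document, vector)` pairs in input order.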

---

## VectorStoreIndex – Similarity Search

After building embeddings, create an `InMemoryVectorStore` (or any vector store backend) and query its index using `VectorSearchRequest`.

```rust
use rig::prelude::*;
use rig::client::{EmbeddingsClient, ProviderClient};
use rig::embeddings::EmbeddingsBuilder;
use rig::providers::openai;
use rig::vector_store::{VectorStoreIndex, in_memory_store::InMemoryVectorStore, request::VectorSearchRequest};
use rig::Embed;
use serde::{Deserialize, Serialize};

#[derive(Embed, Clone, Deserialize, Serialize, Debug, Default)]
struct Doc {
    id: String,
    #[embed]
    content: Vec<String>,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let openai = openai::Client::from_env()?;
    let model = openai.embedding_model(openai::TEXT_EMBEDDING_ADA_002);

    let embeddings = EmbeddingsBuilder::new(model.clone())
        .documents(vec![
            Doc { id: "d1".into(), content: vec!["Rust is a systems programming language.".into()] },
            Doc { id: "d2".into(), content: vec!["Python is great for data science.".into()] },
            Doc { id: "d3".into(), content: vec!["Go is designed for cloud services.".into()] },
        ])?
        .build()
        .await?;

    let store = InMemoryVectorStore::from_documents_with_id_f(embeddings, |d| d.id.clone());
    let index = store.index(model);

    let req = VectorSearchRequest::builder()
        .query("Which language is used for low-level programming?")
        .samples(2)
        .build();

    // top_n returns (score, id, document)
    let results = index.top_n::<Doc>(req.clone()).await?;
    for (score, id, doc) in &results {
        println!("score={score:.4} id={id} content={:?}", doc.content);
    }

    // top_n_ids returns only (score, id)
    let ids = index.top_n_ids(req).await?;
    println!("top ids: {:?}", ids);

    Ok(())
}
```
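
For intuition about what a top-n query computes, here is a brute-force sketch using only the standard library. It assumes cosine similarity as the metric and already-computed embedding vectors; `cosine_similarity` and this free-standing `top_n_ids` are illustrative helpers, not the methods of Rig's `VectorStoreIndex`.

```rust
/// Cosine similarity between two equal-length vectors; 0.0 if either has zero norm.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Scores every stored vector against the query, sorts by descending score,
/// and keeps the first `n` `(score, id)` pairs.
fn top_n_ids(query: &[f64], store: &[(String, Vec<f64>)], n: usize) -> Vec<(f64, String)> {
    let mut scored: Vec<(f64, String)> = store
        .iter()
        .map(|(id, vector)| (cosine_similarity(query, vector), id.clone()))
        .collect();
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.truncate(n);
    scored
}
```

A linear scan like this is fine for small in-memory collections; external vector databases typically replace it with an approximate index.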

---

## Pipeline – Composable Data Flows

The `pipeline` module provides `Op`-based functional composition for chaining arbitrary transformations, parallel lookups, and agent prompts into a single callable pipeline.

```rust
use rig::prelude::*;
use rig::providers::openai;
use rig::{
    embeddings::EmbeddingsBuilder,
    parallel,
    pipeline::{self, Op, agent_ops::lookup, passthrough},
    providers::openai::Client,
    vector_store::in_memory_store::InMemoryVectorStore,
};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let client = Client::from_env()?;
    let embed_model = client.embedding_model(openai::TEXT_EMBEDDING_ADA_002);

    let mut builder = EmbeddingsBuilder::new(embed_model.clone());
    for def in ["The sky is blue.", "Water is wet.", "Fire is hot."] {
        builder = builder.document(def)?;
    }
    let store = InMemoryVectorStore::from_documents(builder.build().await?);
    let index = store.index(embed_model);

    let agent = client
        .agent(openai::GPT_4O)
        .preamble("You are a helpful assistant. Use the provided context.")
        .build();

    // Build a pipeline: retrieve context in parallel, merge it, then prompt
    let chain = pipeline::new()
        .chain(parallel!(
            passthrough::<&str>(),
            lookup::<_, _, String>(index, 1),
        ))
        .map(|(prompt, docs)| match docs {
            Ok(docs) => {
                // Join the retrieved documents into a single context string.
                let context = docs
                    .into_iter()
                    .map(|(_, _, doc)| doc)
                    .collect::<Vec<_>>()
                    .join(" ");
                format!("Context: {context}\n\nQuestion: {prompt}")
            }
            Err(_) => prompt.to_string(),
        })
        .prompt(agent);

    let response = chain.call("What color is the sky?").await?;
    println!("{response}");

    Ok(())
}
```
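
The composition pattern in that example can be imitated with plain closures. The sketch below is a simplified, synchronous analogue and not Rig's `Op` trait: `chain`, `fan_out`, `lookup`, and `build_pipeline` are hypothetical, the retrieval step is a keyword match, the "model" just echoes its prompt, and `fan_out` runs its branches one after the other rather than concurrently.

```rust
/// Composes two stages: the output of `first` becomes the input of `second`.
fn chain<A, B, C>(first: impl Fn(A) -> B, second: impl Fn(B) -> C) -> impl Fn(A) -> C {
    move |input| second(first(input))
}

/// Runs two stages on the same input and returns both results as a tuple.
fn fan_out<A: Clone, B, C>(
    left: impl Fn(A) -> B,
    right: impl Fn(A) -> C,
) -> impl Fn(A) -> (B, C) {
    move |input: A| (left(input.clone()), right(input))
}

/// Hypothetical retrieval stage: returns facts whose keyword appears in the query.
fn lookup(query: String) -> Vec<String> {
    let facts = [
        ("sky", "The sky is blue."),
        ("water", "Water is wet."),
        ("fire", "Fire is hot."),
    ];
    let query = query.to_lowercase();
    facts
        .iter()
        .filter(|(key, _)| query.contains(*key))
        .map(|(_, fact)| fact.to_string())
        .collect()
}

/// Passthrough + lookup, then merge into a prompt, then a mock "model" call.
fn build_pipeline() -> impl Fn(String) -> String {
    let gather = fan_out(|question: String| question, lookup);
    let merge = |(question, docs): (String, Vec<String>)| {
        format!("Context: {}\n\nQuestion: {}", docs.join(" "), question)
    };
    let answer = |prompt: String| format!("[mock model] {prompt}");
    chain(chain(gather, merge), answer)
}
```

Calling `build_pipeline()("What color is the sky?".to_string())` produces a single string in which the retrieved fact precedes the original question, which is the shape of prompt the real pipeline hands to the agent.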

---

## Transcription Model

Any provider that supports speech-to-text exposes `.transcription_model()`. Build a request with `.transcription_request()`, attach an audio file, and call `.send()`.

```rust
use rig::prelude::*; // client traits, as in the RAG and pipeline examples
use rig::providers::openai;
use rig::transcription::TranscriptionModel;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let openai = openai::Client::from_env()?;
    let whisper = openai.transcription_model(openai::WHISPER_1);

    let response = whisper
        .transcription_request()
        .load_file("audio.mp3")?
        .send()
        .await?;

    println!("Transcript: {}", response.text);
    // Output: Transcript: Hello, this is a test audio file.

    Ok(())
}
```

---

## Prompt Hook – Observability Middleware

Implement `PromptHook<M>` to intercept every completion call and response. Attach a hook to a `PromptRequest` with `.with_hook()` to log, audit, or modify the flow.

```rust
use rig::agent::{HookAction, PromptHook};
use rig::client::{CompletionClient, ProviderClient};
use rig::completion::{CompletionModel, CompletionResponse, Message, Prompt};
use rig::message::UserContent;
use rig::providers::openai;

#[derive(Clone)]
struct AuditHook { session_id: String }

impl<M: CompletionModel> PromptHook<M> for AuditHook {
    async fn on_completion_call(&self, prompt: &Message, _history: &[Message]) -> HookAction {
        if let Message::User { content } = prompt {
            let text: String = content.iter().filter_map(|c| {
                if let UserContent::Text(t) = c { Some(t.text.clone()) } else { None }
            }).collect::<Vec<_>>().join(" ");
            eprintln!("[audit:{}] prompt: {text}", self.session_id);
        }
        HookAction::cont()  // continue processing
    }

    async fn on_completion_response(&self, _prompt: &Message, response: &CompletionResponse<M::Response>) -> HookAction {
        eprintln!("[audit:{}] received {} choice(s)", self.session_id, response.choice.len());
        HookAction::cont()
    }
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let agent = openai::Client::from_env()?
        .agent(openai::GPT_4O)
        .preamble("Be concise.")
        .build();

    let response = agent
        .prompt("Why is the sky blue?")
        .with_hook(AuditHook { session_id: "req-001".into() })
        .await?;

    println!("{response}");
    Ok(())
}
```

---

## MCP Integration (Model Context Protocol)

Rig supports the `rmcp` crate for connecting to MCP servers. Use `McpClientHandler` to automatically refresh tools from an MCP server and attach the resulting `ToolServerHandle` to an agent.

```rust
use rig::{
    client::{CompletionClient, ProviderClient},
    completion::Prompt,
    providers::openai,
    tool::{rmcp::McpClientHandler, server::ToolServer},
};
use rmcp::{ClientCapabilities, ClientInfo, Implementation, model::*};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client_info = ClientInfo::new(
        ClientCapabilities::default(),
        Implementation::new("my-agent", "0.1.0"),
    );

    // Create a shared ToolServer that the MCP handler will populate
    let tool_server_handle = ToolServer::new().run();

    // McpClientHandler auto-refreshes tools on notifications/tools/list_changed
    let handler = McpClientHandler::new(client_info, tool_server_handle.clone());
    let transport = rmcp::transport::StreamableHttpClientTransport::from_uri("http://localhost:8080");
    let _service = handler.connect(transport).await?;

    let agent = openai::Client::from_env()?
        .agent(openai::GPT_4O)
        .preamble("You are a helpful assistant with access to external tools.")
        .tool_server_handle(tool_server_handle)   // attach live MCP tools
        .build();

    let result = agent.prompt("What is 12 + 34?").max_turns(3).await?;
    println!("{result}");

    Ok(())
}
```

---

## FileLoader – Loading Documents from Disk

`FileLoader` reads individual files, directories, or glob patterns into strings. Optional `pdf` and `epub` feature flags unlock `PdfFileLoader` and `EpubFileLoader`.

```rust
use rig::loaders::FileLoader;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Load all Markdown files in a directory
    let docs: Vec<String> = FileLoader::with_glob("docs/**/*.md")?
        .load()
        .collect::<Result<Vec<_>, _>>()?;

    println!("Loaded {} documents", docs.len());

    // Load with path metadata
    let with_paths: Vec<(std::path::PathBuf, String)> = FileLoader::with_glob("src/**/*.rs")?
        .load_with_path()
        .collect::<Result<Vec<_>, _>>()?;

    for (path, content) in &with_paths {
        println!("{}: {} bytes", path.display(), content.len());
    }

    Ok(())
}
```

---

## Custom Vector Store Implementation

Implement `VectorStoreIndex` to integrate any external database. Only `top_n` and `top_n_ids` are required.

```rust
use rig::vector_store::{VectorStoreError, VectorStoreIndex, request::VectorSearchRequest};
use serde::Deserialize;

struct MyVectorStore; // wraps your DB client

impl VectorStoreIndex for MyVectorStore {
    async fn top_n<T: for<'a> Deserialize<'a> + Send>(
        &self,
        _req: VectorSearchRequest, // unused in this stub
    ) -> Result<Vec<(f64, String, T)>, VectorStoreError> {
        // query your database, return (score, id, document) tuples
        Ok(vec![])
    }

    async fn top_n_ids(
        &self,
        _req: VectorSearchRequest, // unused in this stub
    ) -> Result<Vec<(f64, String)>, VectorStoreError> {
        // lighter query returning only scores and IDs
        Ok(vec![])
    }
}
```

---

## Telemetry – OpenTelemetry Integration

Rig emits GenAI Semantic Convention-compliant spans. Configure an OTLP exporter and attach it before creating clients.

```rust
use opentelemetry::global;
use opentelemetry_otlp::WithExportConfig;
use opentelemetry_sdk::runtime;
use rig::client::{CompletionClient, ProviderClient};
use rig::completion::Prompt;
use rig::providers::openai;
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Set up OTLP exporter
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint("http://localhost:4317"),
        )
        .install_batch(runtime::Tokio)?;

    tracing_subscriber::registry()
        .with(tracing_opentelemetry::layer().with_tracer(tracer))
        .with(tracing_subscriber::fmt::layer())
        .init();

    let agent = openai::Client::from_env()?
        .agent(openai::GPT_4O)
        .preamble("You are helpful.")
        .build();

    let response = agent.prompt("Say hello.").await?;
    println!("{response}");

    global::shutdown_tracer_provider();
    Ok(())
}
```

---

## Summary

Rig's primary use cases are LLM-powered chat agents, tool-calling orchestration, retrieval-augmented generation (RAG) pipelines, and structured data extraction. Its trait-based model abstraction reduces provider lock-in: swapping from OpenAI to Anthropic, Gemini, or any other supported provider requires only changing the client constructor and model constant. For production workloads, agents can be extended with persistent conversation memory (swappable backends), streaming responses for low-latency UX, and OpenTelemetry tracing for observability. The `rig-memory` companion crate adds sliding-window and token-budget history policies on top of the core memory trait.
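
The provider-swap property comes from application code depending on a trait rather than a concrete client. A stripped-down illustration of that design, using invented names (`CompletionBackend`, `ProviderA`, `ProviderB`, `Assistant`) rather than Rig's real traits and with string formatting in place of network calls, might look like this:

```rust
/// Hypothetical stand-in for a provider's completion model.
trait CompletionBackend {
    fn complete(&self, preamble: &str, prompt: &str) -> String;
}

struct ProviderA;
struct ProviderB;

impl CompletionBackend for ProviderA {
    fn complete(&self, preamble: &str, prompt: &str) -> String {
        format!("provider-a | {preamble} | {prompt}")
    }
}

impl CompletionBackend for ProviderB {
    fn complete(&self, preamble: &str, prompt: &str) -> String {
        format!("provider-b | {preamble} | {prompt}")
    }
}

/// Application code is generic over the backend, so it never names a provider.
struct Assistant<M: CompletionBackend> {
    model: M,
    preamble: String,
}

impl<M: CompletionBackend> Assistant<M> {
    fn ask(&self, prompt: &str) -> String {
        self.model.complete(&self.preamble, prompt)
    }
}
```

Switching from `ProviderA` to `ProviderB` changes only the value passed as `model` when constructing `Assistant`; `ask` and everything built on it stay the same.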

Integration follows a consistent pattern: create a `Client`, build an `Agent` or `Extractor` via the fluent builder, then call high-level methods (`prompt`, `chat`, `stream_prompt`, `extract`). More complex architectures compose multiple agents into pipelines using the `pipeline` module's `parallel!` macro and `Op` trait, or delegate tool execution to external MCP servers via the `rmcp` feature. Vector store integrations (LanceDB, Qdrant, MongoDB, PostgreSQL, etc.) all satisfy the same `VectorStoreIndex` trait, so the RAG pattern ports across databases without changing agent code.