# Bob - LLM-Powered Coding Agent Framework

Bob is an LLM-powered coding agent framework built in Rust with a hexagonal (ports & adapters) architecture. It provides a deterministic turn-loop scheduler that orchestrates LLM inference, tool execution via MCP (Model Context Protocol), and session management. The framework connects to language models through the `genai` crate and to external tools via MCP servers using `rmcp`.

The core value proposition is the bounded, cancellable turn loop (scheduler FSM) that coordinates LLM calls, tool invocations, and policy enforcement. The hexagonal design enables clean separation between domain logic (`bob-core`), orchestration (`bob-runtime`), and concrete adapters (`bob-adapters`), making it easy to swap LLM providers, add new tool sources, or customize persistence without touching the core scheduling logic.

## AgentRuntime - Execute Agent Turns

The primary API for running agent turns. It accepts user input and returns the agent's final response after potentially multiple LLM calls and tool executions. Both blocking and streaming execution modes are supported.

```rust
use std::collections::HashMap;
use std::sync::Arc;

use bob_core::types::{AgentRequest, AgentRunResult, TurnPolicy};
use bob_runtime::{AgentRuntime, DefaultAgentRuntime};
use bob_adapters::{
    llm_genai::GenAiLlmAdapter,
    store_memory::InMemorySessionStore,
    observe::TracingEventSink,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create adapters
    let llm = Arc::new(GenAiLlmAdapter::new(genai::Client::default()));
    let tools = Arc::new(NoOpToolPort); // or McpToolAdapter
    let store = Arc::new(InMemorySessionStore::new());
    let events = Arc::new(TracingEventSink::new());

    // Build runtime
    let runtime: Arc<dyn AgentRuntime> = Arc::new(DefaultAgentRuntime {
        llm,
        tools,
        store,
        events,
        default_model: "openai:gpt-4o-mini".to_string(),
        policy: TurnPolicy::default(),
    });

    // Execute a turn
    let request = AgentRequest {
        input: "What is 2 + 2?".to_string(),
        session_id: "session-123".to_string(),
        model: None, // uses default_model
        metadata: HashMap::new(),
        cancel_token: None,
    };

    match runtime.run(request).await? {
        AgentRunResult::Finished(response) => {
            println!("Response: {}", response.content);
            println!("Tool calls: {:?}", response.tool_transcript);
            println!("Token usage: {} total", response.usage.total());
        }
    }

    Ok(())
}
```
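Turns that share a `session_id` continue the same conversation: the runtime persists messages through the configured `SessionStore`, so a follow-up request picks up the prior history. Below is a small sketch continuing the example above; the `follow_up` helper and its prompt are illustrative, not part of the framework API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

use bob_core::types::{AgentRequest, AgentRunResult};
use bob_runtime::AgentRuntime;

// Second turn in the same session: "session-123" already holds the first exchange.
async fn follow_up(runtime: Arc<dyn AgentRuntime>) -> Result<(), Box<dyn std::error::Error>> {
    let request = AgentRequest {
        input: "Now multiply that result by 10.".to_string(),
        session_id: "session-123".to_string(), // same id as the first turn
        model: None,
        metadata: HashMap::new(),
        cancel_token: None,
    };

    match runtime.run(request).await? {
        AgentRunResult::Finished(response) => {
            println!("Follow-up: {}", response.content);
            println!("Tokens this turn: {}", response.usage.total());
        }
    }
    Ok(())
}
```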
## LlmPort - LLM Inference Interface

Port trait for LLM inference that adapters must implement. It provides both a blocking completion mode and a streaming mode. The `GenAiLlmAdapter` implements this using the `genai` crate for multi-provider support.

```rust
use bob_core::{
    ports::LlmPort,
    types::{LlmRequest, LlmResponse, Message, Role, ToolDescriptor},
    error::LlmError,
};
use bob_adapters::llm_genai::GenAiLlmAdapter;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), LlmError> {
    // Create the adapter with the default client (reads OPENAI_API_KEY from env)
    let llm: Arc<dyn LlmPort> = Arc::new(
        GenAiLlmAdapter::new(genai::Client::default())
    );

    // Build a request
    let request = LlmRequest {
        model: "openai:gpt-4o-mini".to_string(),
        messages: vec![
            Message { role: Role::System, content: "You are a helpful assistant.".into() },
            Message { role: Role::User, content: "Hello!".into() },
        ],
        tools: vec![], // Tool descriptors for function calling
    };

    // Non-streaming completion
    let response: LlmResponse = llm.complete(request.clone()).await?;
    println!("Content: {}", response.content);
    println!("Tokens: {} prompt + {} completion",
        response.usage.prompt_tokens, response.usage.completion_tokens);

    // Streaming completion
    use futures_util::StreamExt;
    let mut stream = llm.complete_stream(request).await?;
    while let Some(chunk) = stream.next().await {
        match chunk? {
            bob_core::types::LlmStreamChunk::TextDelta(text) => print!("{}", text),
            bob_core::types::LlmStreamChunk::Done { usage } => {
                println!("\n[Done: {} tokens]", usage.total());
            }
        }
    }

    Ok(())
}
```

## ToolPort - Tool Discovery and Execution

Port trait for tool discovery and execution. The `McpToolAdapter` implements this for MCP servers, while `CompositeToolPort` aggregates multiple tool sources under a unified namespace.

```rust
use bob_core::{
    ports::ToolPort,
    types::{ToolCall, ToolDescriptor, ToolResult},
    error::ToolError,
};
use bob_adapters::mcp_rmcp::McpToolAdapter;
use bob_runtime::composite::CompositeToolPort;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), ToolError> {
    // Connect to an MCP server via stdio transport
    let filesystem_adapter = McpToolAdapter::connect_stdio(
        "filesystem", // server_id for namespacing
        "npx",        // command
        &["-y".into(), "@modelcontextprotocol/server-filesystem".into(), ".".into()],
        &[],          // environment variables
    ).await?;
    let tools: Arc<dyn ToolPort> = Arc::new(filesystem_adapter);

    // List available tools
    let descriptors: Vec<ToolDescriptor> = tools.list_tools().await?;
    for tool in &descriptors {
        println!("Tool: {} - {}", tool.id, tool.description);
        // Output: Tool: mcp/filesystem/read_file - Read a file from disk
    }

    // Call a tool
    let result: ToolResult = tools.call_tool(ToolCall {
        name: "mcp/filesystem/read_file".to_string(),
        arguments: serde_json::json!({"path": "./README.md"}),
    }).await?;

    if result.is_error {
        eprintln!("Tool error: {}", result.output);
    } else {
        println!("Result: {}", result.output);
    }

    Ok(())
}

// Aggregating multiple MCP servers
async fn composite_example() -> Result<(), ToolError> {
    let fs = Arc::new(McpToolAdapter::connect_stdio(
        "fs", "npx",
        &["-y".into(), "@modelcontextprotocol/server-filesystem".into(), ".".into()],
        &[],
    ).await?);
    let git = Arc::new(McpToolAdapter::connect_stdio(
        "git", "npx",
        &["-y".into(), "@modelcontextprotocol/server-git".into()],
        &[],
    ).await?);

    let composite = CompositeToolPort::new(vec![
        ("fs".into(), fs as Arc<dyn ToolPort>),
        ("git".into(), git as Arc<dyn ToolPort>),
    ]);

    // Lists tools from all servers
    let all_tools = composite.list_tools().await?;

    // Calls route to the correct server based on namespace prefix
    let result = composite.call_tool(ToolCall {
        name: "mcp/git/status".into(),
        arguments: serde_json::json!({}),
    }).await?;

    Ok(())
}
```
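In a full turn, the descriptors discovered through `ToolPort` are what get advertised to the model via the `tools` field of `LlmRequest` (shown empty in the LlmPort example above). A minimal sketch of that hand-off follows; the `complete_with_tools` helper, the model string, and the assumption that the descriptors can be passed straight into `LlmRequest.tools` are illustrative rather than part of the documented API.

```rust
use bob_core::{
    ports::{LlmPort, ToolPort},
    types::{LlmRequest, Message, Role},
};
use std::sync::Arc;

// Hypothetical helper: advertise every discovered tool to the model so it
// can emit tool_call actions against the namespaced tool ids.
async fn complete_with_tools(
    llm: Arc<dyn LlmPort>,
    tools: Arc<dyn ToolPort>,
    user_input: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    // Discover tools from the configured source (MCP server, composite, ...)
    let descriptors = tools.list_tools().await?;

    let request = LlmRequest {
        model: "openai:gpt-4o-mini".to_string(),
        messages: vec![
            Message { role: Role::User, content: user_input.into() },
        ],
        tools: descriptors, // same ToolDescriptor values returned by list_tools()
    };

    Ok(llm.complete(request).await?.content)
}
```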
## SessionStore - Session Persistence

Port trait for persisting conversation history across turns. The `InMemorySessionStore` provides thread-safe in-memory storage suitable for CLI usage.

```rust
use bob_core::{
    ports::SessionStore,
    types::{SessionId, SessionState, Message, Role, TokenUsage},
    error::StoreError,
};
use bob_adapters::store_memory::InMemorySessionStore;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), StoreError> {
    let store: Arc<dyn SessionStore> = Arc::new(InMemorySessionStore::new());
    let session_id: SessionId = "user-123-session".to_string();

    // Load (returns None if not found)
    let existing = store.load(&session_id).await?;
    assert!(existing.is_none());

    // Create and save session state
    let state = SessionState {
        messages: vec![
            Message { role: Role::User, content: "Hello!".into() },
            Message { role: Role::Assistant, content: "Hi there!".into() },
        ],
        total_usage: TokenUsage { prompt_tokens: 10, completion_tokens: 5 },
    };
    store.save(&session_id, &state).await?;

    // Load persisted state
    let loaded = store.load(&session_id).await?.unwrap();
    assert_eq!(loaded.messages.len(), 2);
    assert_eq!(loaded.total_usage.total(), 15);

    Ok(())
}
```

## AgentAction - Provider-Neutral Action Protocol

The framework uses a provider-neutral JSON action protocol for LLM responses. The scheduler parses these actions to determine the next step: emit a final response, call a tool, or ask the user a question.

```rust
use bob_core::types::AgentAction;
use bob_runtime::action::parse_action;

fn main() {
    // Final response - turn completes
    let final_json = r#"{"type": "final", "content": "The answer is 42."}"#;
    match parse_action(final_json).unwrap() {
        AgentAction::Final { content } => println!("Final: {}", content),
        _ => unreachable!(),
    }

    // Tool call - scheduler executes the tool and loops back
    let tool_json = r#"{
        "type": "tool_call",
        "name": "mcp/filesystem/read_file",
        "arguments": {"path": "./config.toml"}
    }"#;
    match parse_action(tool_json).unwrap() {
        AgentAction::ToolCall { name, arguments } => {
            println!("Call tool: {} with {:?}", name, arguments);
        }
        _ => unreachable!(),
    }

    // Ask user - returns a question to the user
    let ask_json = r#"{"type": "ask_user", "question": "Which file should I modify?"}"#;
    match parse_action(ask_json).unwrap() {
        AgentAction::AskUser { question } => println!("Question: {}", question),
        _ => unreachable!(),
    }

    // Handles markdown code fences from LLM output
    let fenced = "```json\n{\"type\": \"final\", \"content\": \"Done!\"}\n```";
    assert!(parse_action(fenced).is_ok());
}
```
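To make the control flow concrete, here is a simplified sketch of how a caller could act on a parsed action, using the `parse_action`, `AgentAction`, and `ToolPort` APIs shown above. The `dispatch_reply` helper is illustrative only; the real scheduler in `bob-runtime` loops over steps, applies `TurnPolicy` guards and timeouts, and emits events, and the conversion of its errors into `Box<dyn Error>` here is an assumption.

```rust
use bob_core::{
    ports::ToolPort,
    types::{AgentAction, ToolCall},
};
use bob_runtime::action::parse_action;
use std::sync::Arc;

// Illustrative dispatch: parse one model reply and act on it once.
async fn dispatch_reply(
    tools: Arc<dyn ToolPort>,
    raw_reply: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    match parse_action(raw_reply)? {
        // Turn is done: return the final answer.
        AgentAction::Final { content } => Ok(content),
        // Execute the requested tool; the scheduler would feed this back to the LLM.
        AgentAction::ToolCall { name, arguments } => {
            let result = tools.call_tool(ToolCall { name, arguments }).await?;
            Ok(format!("tool output: {}", result.output))
        }
        // Surface the question to the user and end the turn.
        AgentAction::AskUser { question } => Ok(format!("question for user: {}", question)),
    }
}
```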
## TurnPolicy - Execution Limits and Guards

Configures safety bounds for the turn loop, including maximum steps, tool calls, timeouts, and error tolerance. The `LoopGuard` enforces these limits during execution.

```rust
use bob_core::types::{TurnPolicy, GuardReason};
use bob_runtime::scheduler::LoopGuard;
use std::time::Duration;

fn main() {
    // Default policy
    let policy = TurnPolicy::default();
    assert_eq!(policy.max_steps, 12);
    assert_eq!(policy.max_tool_calls, 8);
    assert_eq!(policy.max_consecutive_errors, 2);
    assert_eq!(policy.turn_timeout_ms, 90_000);
    assert_eq!(policy.tool_timeout_ms, 15_000);

    // Custom policy for a resource-constrained environment
    let strict_policy = TurnPolicy {
        max_steps: 5,
        max_tool_calls: 3,
        max_consecutive_errors: 1,
        turn_timeout_ms: 30_000,
        tool_timeout_ms: 5_000,
    };

    // LoopGuard tracks execution and enforces limits
    let mut guard = LoopGuard::new(strict_policy);
    assert!(guard.can_continue());

    guard.record_step();
    guard.record_tool_call();
    guard.record_step();

    // After exceeding limits
    for _ in 0..5 {
        guard.record_step();
    }
    assert!(!guard.can_continue());
    assert_eq!(guard.reason(), GuardReason::MaxSteps);
}
```

## EventSink - Observability Events

Port trait for emitting structured events during agent execution. Events cover the full turn lifecycle for monitoring and debugging.

```rust
use bob_core::{
    ports::EventSink,
    types::{AgentEvent, TokenUsage, FinishReason},
};
use std::sync::Mutex;

// Custom event sink for testing or custom logging
struct CollectingEventSink {
    events: Mutex<Vec<AgentEvent>>,
}

impl EventSink for CollectingEventSink {
    fn emit(&self, event: AgentEvent) {
        self.events.lock().unwrap().push(event);
    }
}

fn main() {
    let sink = CollectingEventSink { events: Mutex::new(vec![]) };

    // Events emitted during a typical turn:
    sink.emit(AgentEvent::TurnStarted { session_id: "s1".into() });
    sink.emit(AgentEvent::SkillsSelected { skill_names: vec!["rust-review".into()] });
    sink.emit(AgentEvent::LlmCallStarted { model: "openai:gpt-4o-mini".into() });
    sink.emit(AgentEvent::LlmCallCompleted {
        usage: TokenUsage { prompt_tokens: 100, completion_tokens: 50 },
    });
    sink.emit(AgentEvent::ToolCallStarted { name: "mcp/fs/read_file".into() });
    sink.emit(AgentEvent::ToolCallCompleted { name: "mcp/fs/read_file".into(), is_error: false });
    sink.emit(AgentEvent::TurnCompleted { finish_reason: FinishReason::Stop });

    let events = sink.events.lock().unwrap();
    assert_eq!(events.len(), 7);
}
```

## Streaming API - Real-time Token Updates

The streaming API provides real-time token updates and tool execution events for responsive UIs.

```rust
use bob_core::types::{AgentRequest, AgentStreamEvent};
use bob_runtime::AgentRuntime;
use futures_util::StreamExt;
use std::collections::HashMap;
use std::sync::Arc;

async fn streaming_example(runtime: Arc<dyn AgentRuntime>) {
    let request = AgentRequest {
        input: "Write a haiku about Rust.".into(),
        session_id: "stream-session".into(),
        model: None,
        metadata: HashMap::new(),
        cancel_token: None,
    };

    let mut stream = runtime.run_stream(request).await.unwrap();

    while let Some(event) = stream.next().await {
        match event {
            AgentStreamEvent::TextDelta { content } => {
                print!("{}", content); // Real-time token output
            }
            AgentStreamEvent::ToolCallStarted { name, call_id } => {
                println!("\n[Tool starting: {} ({})]", name, call_id);
            }
            AgentStreamEvent::ToolCallCompleted { call_id, result } => {
                println!("[Tool completed: {} - error: {}]", call_id, result.is_error);
            }
            AgentStreamEvent::Finished { usage } => {
                println!("\n[Finished: {} tokens used]", usage.total());
            }
            AgentStreamEvent::Error { error } => {
                eprintln!("\n[Error: {}]", error);
            }
        }
    }
}
```
## Configuration - TOML Setup

The CLI agent loads configuration from `agent.toml`, which defines the model, MCP servers, skills, and policies.

```toml
# agent.toml - Bob Agent Configuration

[runtime]
default_model = "openai:gpt-4o-mini"  # Format: "provider:model"
max_steps = 12                        # Maximum scheduler iterations
turn_timeout_ms = 90000               # 90 second turn timeout
# model_context_tokens = 128000       # For ratio-based skill budgets

[llm]
# retry_max = 2
# stream_default = false

[policy]
deny_tools = ["local/shell_exec"]     # Blocked tools
# allow_tools = ["mcp/fs/read_file"]  # Allowlist (optional)

# MCP Server Configuration
[[mcp.servers]]
id = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "."]
tool_timeout_ms = 15000
env = { OPENAI_API_KEY = "${OPENAI_API_KEY}" }

# Multiple servers supported
[[mcp.servers]]
id = "git"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-git"]

# Skills Configuration (optional)
[skills]
max_selected = 3
token_budget_tokens = 1800
# token_budget_ratio = 0.10  # Alternative: percentage of context

[[skills.sources]]
type = "directory"
path = "./skills"
recursive = true
```

## Cancellation - Cooperative Turn Cancellation

Turns can be cancelled mid-execution using a cancellation token. The scheduler checks the token at safe points and terminates gracefully.

```rust
use bob_core::types::{AgentRequest, AgentRunResult, CancelToken, FinishReason};
use bob_runtime::AgentRuntime;
use std::collections::HashMap;
use std::sync::Arc;
use std::time::Duration;

async fn cancellation_example(runtime: Arc<dyn AgentRuntime>) {
    let cancel_token = CancelToken::new();
    let token_clone = cancel_token.clone();

    let request = AgentRequest {
        input: "Perform a long-running analysis...".into(),
        session_id: "cancel-demo".into(),
        model: None,
        metadata: HashMap::new(),
        cancel_token: Some(cancel_token),
    };

    // Cancel after 5 seconds from another task
    tokio::spawn(async move {
        tokio::time::sleep(Duration::from_secs(5)).await;
        token_clone.cancel();
        println!("Cancellation requested");
    });

    match runtime.run(request).await {
        Ok(AgentRunResult::Finished(resp)) => {
            if resp.finish_reason == FinishReason::Cancelled {
                println!("Turn was cancelled: {}", resp.content);
            } else {
                println!("Completed: {}", resp.content);
            }
        }
        Err(e) => eprintln!("Error: {}", e),
    }
}
```
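For interactive use, the same token can be wired to an interrupt signal. A minimal sketch using `tokio::signal::ctrl_c` follows; the `cancel_on_ctrl_c` helper is illustrative, and it assumes tokio's `signal` feature is enabled.

```rust
use bob_core::types::CancelToken;

// Illustrative: cancel the in-flight turn when the user presses Ctrl-C.
// Clone the token into the AgentRequest, then call this alongside runtime.run().
// Requires tokio's "signal" feature.
fn cancel_on_ctrl_c(token: CancelToken) {
    tokio::spawn(async move {
        if tokio::signal::ctrl_c().await.is_ok() {
            token.cancel();
            eprintln!("Ctrl-C received, cancelling turn...");
        }
    });
}
```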
## Error Handling - Structured Error Types

The framework provides a structured error taxonomy with automatic conversion between error layers.

```rust
use bob_core::error::{AgentError, LlmError, ToolError, StoreError};
use bob_core::types::GuardReason;

fn handle_agent_error(err: AgentError) {
    match err {
        AgentError::Llm(llm_err) => match llm_err {
            LlmError::RateLimited => println!("Rate limited, retry later"),
            LlmError::ContextLengthExceeded => println!("Prompt too long"),
            LlmError::Provider(msg) => println!("Provider error: {}", msg),
            LlmError::Stream(msg) => println!("Stream error: {}", msg),
            LlmError::Other(e) => println!("LLM error: {}", e),
        },
        AgentError::Tool(tool_err) => match tool_err {
            ToolError::NotFound { name } => println!("Tool not found: {}", name),
            ToolError::Timeout { name } => println!("Tool timed out: {}", name),
            ToolError::Execution(msg) => println!("Tool execution failed: {}", msg),
            ToolError::Other(e) => println!("Tool error: {}", e),
        },
        AgentError::Store(store_err) => match store_err {
            StoreError::Serialization(msg) => println!("Serialization error: {}", msg),
            StoreError::Backend(msg) => println!("Storage backend error: {}", msg),
            StoreError::Other(e) => println!("Store error: {}", e),
        },
        AgentError::Policy(violation) => println!("Policy violation: {}", violation),
        AgentError::Timeout => println!("Turn timed out"),
        AgentError::GuardExceeded { reason } => match reason {
            GuardReason::MaxSteps => println!("Exceeded maximum steps"),
            GuardReason::MaxToolCalls => println!("Exceeded maximum tool calls"),
            GuardReason::MaxConsecutiveErrors => println!("Too many consecutive errors"),
            GuardReason::TurnTimeout => println!("Turn timeout exceeded"),
            GuardReason::Cancelled => println!("Turn was cancelled"),
        },
        AgentError::Internal(e) => println!("Internal error: {}", e),
    }
}
```

Bob is designed for developers building LLM-powered agents in Rust who need a production-ready framework with deterministic execution, comprehensive observability, and flexible tool integration. The hexagonal architecture ensures that adapters for LLM providers, tool systems, and storage backends can be swapped without modifying the core scheduling logic.

Common integration patterns include embedding the runtime in CLI tools, web services, or desktop applications. The framework supports both synchronous (blocking) and asynchronous (streaming) execution modes, making it suitable for batch processing and real-time interactive use cases. The MCP integration enables connecting to any tool server that implements the Model Context Protocol, while the skill system allows injecting domain-specific instructions into the agent's system prompt based on user input.
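As an illustration of the CLI embedding pattern mentioned above, the sketch below drives the runtime from standard input, reusing the `AgentRequest`/`AgentRunResult` API from the AgentRuntime section. The `repl` helper and the `"cli-session"` id are illustrative; runtime construction is as shown earlier.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead, Write};
use std::sync::Arc;

use bob_core::types::{AgentRequest, AgentRunResult};
use bob_runtime::AgentRuntime;

// Minimal REPL: each stdin line becomes one agent turn in a shared session.
async fn repl(runtime: Arc<dyn AgentRuntime>) -> Result<(), Box<dyn std::error::Error>> {
    let stdin = io::stdin();
    loop {
        print!("> ");
        io::stdout().flush()?;

        let mut line = String::new();
        if stdin.lock().read_line(&mut line)? == 0 {
            break; // EOF ends the session
        }

        let request = AgentRequest {
            input: line.trim().to_string(),
            session_id: "cli-session".to_string(), // same id keeps history across turns
            model: None,
            metadata: HashMap::new(),
            cancel_token: None,
        };

        match runtime.run(request).await? {
            AgentRunResult::Finished(response) => println!("{}", response.content),
        }
    }
    Ok(())
}
```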