# Bob - LLM-Powered Coding Agent Framework

Bob is an LLM-powered coding agent framework built in Rust with a hexagonal (ports & adapters) architecture. It provides a deterministic turn-loop scheduler that orchestrates LLM inference, tool execution via MCP (Model Context Protocol), and session management. The framework connects to language models through the `genai` crate and to external tools via MCP servers using `rmcp`.

The core value proposition is the bounded, cancellable turn loop (scheduler FSM) that coordinates LLM calls, tool invocations, and policy enforcement. The hexagonal design enables clean separation between domain logic (`bob-core`), orchestration (`bob-runtime`), and concrete adapters (`bob-adapters`), making it easy to swap LLM providers, add new tool sources, or customize persistence without touching the core scheduling logic.

## AgentRuntime - Execute Agent Turns

The primary API for running agent turns. It accepts user input and returns the agent's final response after potentially multiple LLM calls and tool executions. Both blocking and streaming execution modes are supported.

```rust
use std::collections::HashMap;
use std::sync::Arc;

use bob_core::types::{AgentRequest, AgentRunResult, TurnPolicy};
use bob_runtime::{AgentRuntime, DefaultAgentRuntime};
use bob_adapters::{
    llm_genai::GenAiLlmAdapter,
    store_memory::InMemorySessionStore,
    observe::TracingEventSink,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create adapters
    let llm = Arc::new(GenAiLlmAdapter::new(genai::Client::default()));
    let tools = Arc::new(NoOpToolPort); // or McpToolAdapter
    let store = Arc::new(InMemorySessionStore::new());
    let events = Arc::new(TracingEventSink::new());

    // Build runtime
    let runtime: Arc<dyn AgentRuntime> = Arc::new(DefaultAgentRuntime {
        llm,
        tools,
        store,
        events,
        default_model: "openai:gpt-4o-mini".to_string(),
        policy: TurnPolicy::default(),
    });

    // Execute a turn
    let request = AgentRequest {
        input: "What is 2 + 2?".to_string(),
        session_id: "session-123".to_string(),
        model: None, // uses default_model
        metadata: HashMap::new(),
        cancel_token: None,
    };

    match runtime.run(request).await? {
        AgentRunResult::Finished(response) => {
            println!("Response: {}", response.content);
            println!("Tool calls: {:?}", response.tool_transcript);
            println!("Token usage: {} total", response.usage.total());
        }
    }

    Ok(())
}
```
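Turns that share a `session_id` continue the same conversation: the runtime persists messages through the configured `SessionStore`, so a follow-up request picks up the prior history. Below is a small sketch continuing the example above; the `follow_up` helper and its prompt are illustrative, not part of the framework API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

use bob_core::types::{AgentRequest, AgentRunResult};
use bob_runtime::AgentRuntime;

// Second turn in the same session: "session-123" already holds the first exchange.
async fn follow_up(runtime: Arc<dyn AgentRuntime>) -> Result<(), Box<dyn std::error::Error>> {
    let request = AgentRequest {
        input: "Now multiply that result by 10.".to_string(),
        session_id: "session-123".to_string(), // same id as the first turn
        model: None,
        metadata: HashMap::new(),
        cancel_token: None,
    };

    match runtime.run(request).await? {
        AgentRunResult::Finished(response) => {
            println!("Follow-up: {}", response.content);
            println!("Tokens this turn: {}", response.usage.total());
        }
    }
    Ok(())
}
```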
## LlmPort - LLM Inference Interface

Port trait for LLM inference that adapters must implement. It provides both a blocking completion mode and a streaming mode. The `GenAiLlmAdapter` implements this using the `genai` crate for multi-provider support.

```rust
use bob_core::{
    ports::LlmPort,
    types::{LlmRequest, LlmResponse, Message, Role, ToolDescriptor},
    error::LlmError,
};
use bob_adapters::llm_genai::GenAiLlmAdapter;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), LlmError> {
    // Create the adapter with the default client (reads OPENAI_API_KEY from env)
    let llm: Arc<dyn LlmPort> = Arc::new(
        GenAiLlmAdapter::new(genai::Client::default())
    );

    // Build a request
    let request = LlmRequest {
        model: "openai:gpt-4o-mini".to_string(),
        messages: vec![
            Message { role: Role::System, content: "You are a helpful assistant.".into() },
            Message { role: Role::User, content: "Hello!".into() },
        ],
        tools: vec![], // Tool descriptors for function calling
    };

    // Non-streaming completion
    let response: LlmResponse = llm.complete(request.clone()).await?;
    println!("Content: {}", response.content);
    println!("Tokens: {} prompt + {} completion",
        response.usage.prompt_tokens, response.usage.completion_tokens);

    // Streaming completion
    use futures_util::StreamExt;
    let mut stream = llm.complete_stream(request).await?;
    while let Some(chunk) = stream.next().await {
        match chunk? {
            bob_core::types::LlmStreamChunk::TextDelta(text) => print!("{}", text),
            bob_core::types::LlmStreamChunk::Done { usage } => {
                println!("\n[Done: {} tokens]", usage.total());
            }
        }
    }

    Ok(())
}
```

## ToolPort - Tool Discovery and Execution

Port trait for tool discovery and execution. The `McpToolAdapter` implements this for MCP servers, while `CompositeToolPort` aggregates multiple tool sources under a unified namespace.

```rust
use bob_core::{
    ports::ToolPort,
    types::{ToolCall, ToolDescriptor, ToolResult},
    error::ToolError,
};
use bob_adapters::mcp_rmcp::McpToolAdapter;
use bob_runtime::composite::CompositeToolPort;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), ToolError> {
    // Connect to an MCP server via stdio transport
    let filesystem_adapter = McpToolAdapter::connect_stdio(
        "filesystem", // server_id for namespacing
        "npx",        // command
        &["-y".into(), "@modelcontextprotocol/server-filesystem".into(), ".".into()],
        &[],          // environment variables
    ).await?;
    let tools: Arc<dyn ToolPort> = Arc::new(filesystem_adapter);

    // List available tools
    let descriptors: Vec<ToolDescriptor> = tools.list_tools().await?;
    for tool in &descriptors {
        println!("Tool: {} - {}", tool.id, tool.description);
        // Output: Tool: mcp/filesystem/read_file - Read a file from disk
    }

    // Call a tool
    let result: ToolResult = tools.call_tool(ToolCall {
        name: "mcp/filesystem/read_file".to_string(),
        arguments: serde_json::json!({"path": "./README.md"}),
    }).await?;

    if result.is_error {
        eprintln!("Tool error: {}", result.output);
    } else {
        println!("Result: {}", result.output);
    }

    Ok(())
}

// Aggregating multiple MCP servers
async fn composite_example() -> Result<(), ToolError> {
    let fs = Arc::new(McpToolAdapter::connect_stdio(
        "fs", "npx",
        &["-y".into(), "@modelcontextprotocol/server-filesystem".into(), ".".into()],
        &[],
    ).await?);
    let git = Arc::new(McpToolAdapter::connect_stdio(
        "git", "npx",
        &["-y".into(), "@modelcontextprotocol/server-git".into()],
        &[],
    ).await?);

    let composite = CompositeToolPort::new(vec![
        ("fs".into(), fs as Arc<dyn ToolPort>),
        ("git".into(), git as Arc<dyn ToolPort>),
    ]);

    // Lists tools from all servers
    let all_tools = composite.list_tools().await?;

    // Calls route to the correct server based on namespace prefix
    let result = composite.call_tool(ToolCall {
        name: "mcp/git/status".into(),
        arguments: serde_json::json!({}),
    }).await?;

    Ok(())
}
```
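In a full turn, the descriptors discovered through `ToolPort` are what get advertised to the model via the `tools` field of `LlmRequest` (shown empty in the LlmPort example above). A minimal sketch of that hand-off follows; the `complete_with_tools` helper, the model string, and the assumption that the descriptors can be passed straight into `LlmRequest.tools` are illustrative rather than part of the documented API.

```rust
use bob_core::{
    ports::{LlmPort, ToolPort},
    types::{LlmRequest, Message, Role},
};
use std::sync::Arc;

// Hypothetical helper: advertise every discovered tool to the model so it
// can emit tool_call actions against the namespaced tool ids.
async fn complete_with_tools(
    llm: Arc<dyn LlmPort>,
    tools: Arc<dyn ToolPort>,
    user_input: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    // Discover tools from the configured source (MCP server, composite, ...)
    let descriptors = tools.list_tools().await?;

    let request = LlmRequest {
        model: "openai:gpt-4o-mini".to_string(),
        messages: vec![
            Message { role: Role::User, content: user_input.into() },
        ],
        tools: descriptors, // same ToolDescriptor values returned by list_tools()
    };

    Ok(llm.complete(request).await?.content)
}
```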
## SessionStore - Session Persistence

Port trait for persisting conversation history across turns. The `InMemorySessionStore` provides thread-safe in-memory storage suitable for CLI usage.

```rust
use bob_core::{
    ports::SessionStore,
    types::{SessionId, SessionState, Message, Role, TokenUsage},
    error::StoreError,
};
use bob_adapters::store_memory::InMemorySessionStore;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), StoreError> {
    let store: Arc<dyn SessionStore> = Arc::new(InMemorySessionStore::new());
    let session_id: SessionId = "user-123-session".to_string();

    // Load (returns None if not found)
    let existing = store.load(&session_id).await?;
    assert!(existing.is_none());

    // Create and save session state
    let state = SessionState {
        messages: vec![
            Message { role: Role::User, content: "Hello!".into() },
            Message { role: Role::Assistant, content: "Hi there!".into() },
        ],
        total_usage: TokenUsage { prompt_tokens: 10, completion_tokens: 5 },
    };
    store.save(&session_id, &state).await?;

    // Load persisted state
    let loaded = store.load(&session_id).await?.unwrap();
    assert_eq!(loaded.messages.len(), 2);
    assert_eq!(loaded.total_usage.total(), 15);

    Ok(())
}
```

## AgentAction - Provider-Neutral Action Protocol

The framework uses a provider-neutral JSON action protocol for LLM responses. The scheduler parses these actions to determine the next step: emit a final response, call a tool, or ask the user a question.

```rust
use bob_core::types::AgentAction;
use bob_runtime::action::parse_action;

fn main() {
    // Final response - turn completes
    let final_json = r#"{"type": "final", "content": "The answer is 42."}"#;
    match parse_action(final_json).unwrap() {
        AgentAction::Final { content } => println!("Final: {}", content),
        _ => unreachable!(),
    }

    // Tool call - scheduler executes the tool and loops back
    let tool_json = r#"{
        "type": "tool_call",
        "name": "mcp/filesystem/read_file",
        "arguments": {"path": "./config.toml"}
    }"#;
    match parse_action(tool_json).unwrap() {
        AgentAction::ToolCall { name, arguments } => {
            println!("Call tool: {} with {:?}", name, arguments);
        }
        _ => unreachable!(),
    }

    // Ask user - returns a question to the user
    let ask_json = r#"{"type": "ask_user", "question": "Which file should I modify?"}"#;
    match parse_action(ask_json).unwrap() {
        AgentAction::AskUser { question } => println!("Question: {}", question),
        _ => unreachable!(),
    }

    // Handles markdown code fences from LLM output
    let fenced = "```json\n{\"type\": \"final\", \"content\": \"Done!\"}\n```";
    assert!(parse_action(fenced).is_ok());
}
```
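To make the control flow concrete, here is a simplified sketch of how a caller could act on a parsed action, using the `parse_action`, `AgentAction`, and `ToolPort` APIs shown above. The `dispatch_reply` helper is illustrative only; the real scheduler in `bob-runtime` loops over steps, applies `TurnPolicy` guards and timeouts, and emits events, and the conversion of its errors into `Box<dyn Error>` here is an assumption.

```rust
use bob_core::{
    ports::ToolPort,
    types::{AgentAction, ToolCall},
};
use bob_runtime::action::parse_action;
use std::sync::Arc;

// Illustrative dispatch: parse one model reply and act on it once.
async fn dispatch_reply(
    tools: Arc<dyn ToolPort>,
    raw_reply: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    match parse_action(raw_reply)? {
        // Turn is done: return the final answer.
        AgentAction::Final { content } => Ok(content),
        // Execute the requested tool; the scheduler would feed this back to the LLM.
        AgentAction::ToolCall { name, arguments } => {
            let result = tools.call_tool(ToolCall { name, arguments }).await?;
            Ok(format!("tool output: {}", result.output))
        }
        // Surface the question to the user and end the turn.
        AgentAction::AskUser { question } => Ok(format!("question for user: {}", question)),
    }
}
```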
## TurnPolicy - Execution Limits and Guards

Configures safety bounds for the turn loop, including maximum steps, tool calls, timeouts, and error tolerance. The `LoopGuard` enforces these limits during execution.

```rust
use bob_core::types::{TurnPolicy, GuardReason};
use bob_runtime::scheduler::LoopGuard;
use std::time::Duration;

fn main() {
    // Default policy
    let policy = TurnPolicy::default();
    assert_eq!(policy.max_steps, 12);
    assert_eq!(policy.max_tool_calls, 8);
    assert_eq!(policy.max_consecutive_errors, 2);
    assert_eq!(policy.turn_timeout_ms, 90_000);
    assert_eq!(policy.tool_timeout_ms, 15_000);

    // Custom policy for a resource-constrained environment
    let strict_policy = TurnPolicy {
        max_steps: 5,
        max_tool_calls: 3,
        max_consecutive_errors: 1,
        turn_timeout_ms: 30_000,
        tool_timeout_ms: 5_000,
    };

    // LoopGuard tracks execution and enforces limits
    let mut guard = LoopGuard::new(strict_policy);
    assert!(guard.can_continue());

    guard.record_step();
    guard.record_tool_call();
    guard.record_step();

    // After exceeding limits
    for _ in 0..5 {
        guard.record_step();
    }
    assert!(!guard.can_continue());
    assert_eq!(guard.reason(), GuardReason::MaxSteps);
}
```

## EventSink - Observability Events

Port trait for emitting structured events during agent execution. Events cover the full turn lifecycle for monitoring and debugging.

```rust
use bob_core::{
    ports::EventSink,
    types::{AgentEvent, TokenUsage, FinishReason},
};
use std::sync::Mutex;

// Custom event sink for testing or custom logging
struct CollectingEventSink {
    events: Mutex<Vec<AgentEvent>>,
}

impl EventSink for CollectingEventSink {
    fn emit(&self, event: AgentEvent) {
        self.events.lock().unwrap().push(event);
    }
}

fn main() {
    let sink = CollectingEventSink { events: Mutex::new(vec![]) };

    // Events emitted during a typical turn:
    sink.emit(AgentEvent::TurnStarted { session_id: "s1".into() });
    sink.emit(AgentEvent::SkillsSelected { skill_names: vec!["rust-review".into()] });
    sink.emit(AgentEvent::LlmCallStarted { model: "openai:gpt-4o-mini".into() });
    sink.emit(AgentEvent::LlmCallCompleted {
        usage: TokenUsage { prompt_tokens: 100, completion_tokens: 50 },
    });
    sink.emit(AgentEvent::ToolCallStarted { name: "mcp/fs/read_file".into() });
    sink.emit(AgentEvent::ToolCallCompleted { name: "mcp/fs/read_file".into(), is_error: false });
    sink.emit(AgentEvent::TurnCompleted { finish_reason: FinishReason::Stop });

    let events = sink.events.lock().unwrap();
    assert_eq!(events.len(), 7);
}
```

## Streaming API - Real-time Token Updates

The streaming API provides real-time token updates and tool execution events for responsive UIs.

```rust
use bob_core::types::{AgentRequest, AgentStreamEvent};
use bob_runtime::AgentRuntime;
use futures_util::StreamExt;
use std::collections::HashMap;
use std::sync::Arc;

async fn streaming_example(runtime: Arc<dyn AgentRuntime>) {
    let request = AgentRequest {
        input: "Write a haiku about Rust.".into(),
        session_id: "stream-session".into(),
        model: None,
        metadata: HashMap::new(),
        cancel_token: None,
    };

    let mut stream = runtime.run_stream(request).await.unwrap();

    while let Some(event) = stream.next().await {
        match event {
            AgentStreamEvent::TextDelta { content } => {
                print!("{}", content); // Real-time token output
            }
            AgentStreamEvent::ToolCallStarted { name, call_id } => {
                println!("\n[Tool starting: {} ({})]", name, call_id);
            }
            AgentStreamEvent::ToolCallCompleted { call_id, result } => {
                println!("[Tool completed: {} - error: {}]", call_id, result.is_error);
            }
            AgentStreamEvent::Finished { usage } => {
                println!("\n[Finished: {} tokens used]", usage.total());
            }
            AgentStreamEvent::Error { error } => {
                eprintln!("\n[Error: {}]", error);
            }
        }
    }
}
```
## Configuration - TOML Setup

The CLI agent loads configuration from `agent.toml`, which defines the model, MCP servers, skills, and policies.

```toml
# agent.toml - Bob Agent Configuration

[runtime]
default_model = "openai:gpt-4o-mini"  # Format: "provider:model"
max_steps = 12                        # Maximum scheduler iterations
turn_timeout_ms = 90000               # 90 second turn timeout
# model_context_tokens = 128000       # For ratio-based skill budgets

[llm]
# retry_max = 2
# stream_default = false

[policy]
deny_tools = ["local/shell_exec"]     # Blocked tools
# allow_tools = ["mcp/fs/read_file"]  # Allowlist (optional)

# MCP Server Configuration
[[mcp.servers]]
id = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "."]
tool_timeout_ms = 15000
env = { OPENAI_API_KEY = "${OPENAI_API_KEY}" }

# Multiple servers supported
[[mcp.servers]]
id = "git"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-git"]

# Skills Configuration (optional)
[skills]
max_selected = 3
token_budget_tokens = 1800
# token_budget_ratio = 0.10  # Alternative: percentage of context

[[skills.sources]]
type = "directory"
path = "./skills"
recursive = true
```

## Cancellation - Cooperative Turn Cancellation

Turns can be cancelled mid-execution using a cancellation token. The scheduler checks the token at safe points and terminates gracefully.

```rust
use bob_core::types::{AgentRequest, AgentRunResult, CancelToken, FinishReason};
use bob_runtime::AgentRuntime;
use std::collections::HashMap;
use std::sync::Arc;
use std::time::Duration;

async fn cancellation_example(runtime: Arc<dyn AgentRuntime>) {
    let cancel_token = CancelToken::new();
    let token_clone = cancel_token.clone();

    let request = AgentRequest {
        input: "Perform a long-running analysis...".into(),
        session_id: "cancel-demo".into(),
        model: None,
        metadata: HashMap::new(),
        cancel_token: Some(cancel_token),
    };

    // Cancel after 5 seconds from another task
    tokio::spawn(async move {
        tokio::time::sleep(Duration::from_secs(5)).await;
        token_clone.cancel();
        println!("Cancellation requested");
    });

    match runtime.run(request).await {
        Ok(AgentRunResult::Finished(resp)) => {
            if resp.finish_reason == FinishReason::Cancelled {
                println!("Turn was cancelled: {}", resp.content);
            } else {
                println!("Completed: {}", resp.content);
            }
        }
        Err(e) => eprintln!("Error: {}", e),
    }
}
```
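For interactive use, the same token can be wired to an interrupt signal. A minimal sketch using `tokio::signal::ctrl_c` follows; the `cancel_on_ctrl_c` helper is illustrative, and it assumes tokio's `signal` feature is enabled.

```rust
use bob_core::types::CancelToken;

// Illustrative: cancel the in-flight turn when the user presses Ctrl-C.
// Clone the token into the AgentRequest, then call this alongside runtime.run().
// Requires tokio's "signal" feature.
fn cancel_on_ctrl_c(token: CancelToken) {
    tokio::spawn(async move {
        if tokio::signal::ctrl_c().await.is_ok() {
            token.cancel();
            eprintln!("Ctrl-C received, cancelling turn...");
        }
    });
}
```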
## Error Handling - Structured Error Types

The framework provides a structured error taxonomy with automatic conversion between error layers.

```rust
use bob_core::error::{AgentError, LlmError, ToolError, StoreError};
use bob_core::types::GuardReason;

fn handle_agent_error(err: AgentError) {
    match err {
        AgentError::Llm(llm_err) => match llm_err {
            LlmError::RateLimited => println!("Rate limited, retry later"),
            LlmError::ContextLengthExceeded => println!("Prompt too long"),
            LlmError::Provider(msg) => println!("Provider error: {}", msg),
            LlmError::Stream(msg) => println!("Stream error: {}", msg),
            LlmError::Other(e) => println!("LLM error: {}", e),
        },
        AgentError::Tool(tool_err) => match tool_err {
            ToolError::NotFound { name } => println!("Tool not found: {}", name),
            ToolError::Timeout { name } => println!("Tool timed out: {}", name),
            ToolError::Execution(msg) => println!("Tool execution failed: {}", msg),
            ToolError::Other(e) => println!("Tool error: {}", e),
        },
        AgentError::Store(store_err) => match store_err {
            StoreError::Serialization(msg) => println!("Serialization error: {}", msg),
            StoreError::Backend(msg) => println!("Storage backend error: {}", msg),
            StoreError::Other(e) => println!("Store error: {}", e),
        },
        AgentError::Policy(violation) => println!("Policy violation: {}", violation),
        AgentError::Timeout => println!("Turn timed out"),
        AgentError::GuardExceeded { reason } => match reason {
            GuardReason::MaxSteps => println!("Exceeded maximum steps"),
            GuardReason::MaxToolCalls => println!("Exceeded maximum tool calls"),
            GuardReason::MaxConsecutiveErrors => println!("Too many consecutive errors"),
            GuardReason::TurnTimeout => println!("Turn timeout exceeded"),
            GuardReason::Cancelled => println!("Turn was cancelled"),
        },
        AgentError::Internal(e) => println!("Internal error: {}", e),
    }
}
```

Bob is designed for developers building LLM-powered agents in Rust who need a production-ready framework with deterministic execution, comprehensive observability, and flexible tool integration. The hexagonal architecture ensures that adapters for LLM providers, tool systems, and storage backends can be swapped without modifying the core scheduling logic.

Common integration patterns include embedding the runtime in CLI tools, web services, or desktop applications. The framework supports both synchronous (blocking) and asynchronous (streaming) execution modes, making it suitable for batch processing and real-time interactive use cases. The MCP integration enables connecting to any tool server that implements the Model Context Protocol, while the skill system allows injecting domain-specific instructions into the agent's system prompt based on user input.
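As an illustration of the CLI embedding pattern mentioned above, the sketch below drives the runtime from standard input, reusing the `AgentRequest`/`AgentRunResult` API from the AgentRuntime section. The `repl` helper and the `"cli-session"` id are illustrative; runtime construction is as shown earlier.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead, Write};
use std::sync::Arc;

use bob_core::types::{AgentRequest, AgentRunResult};
use bob_runtime::AgentRuntime;

// Minimal REPL: each stdin line becomes one agent turn in a shared session.
async fn repl(runtime: Arc<dyn AgentRuntime>) -> Result<(), Box<dyn std::error::Error>> {
    let stdin = io::stdin();
    loop {
        print!("> ");
        io::stdout().flush()?;

        let mut line = String::new();
        if stdin.lock().read_line(&mut line)? == 0 {
            break; // EOF ends the session
        }

        let request = AgentRequest {
            input: line.trim().to_string(),
            session_id: "cli-session".to_string(), // same id keeps history across turns
            model: None,
            metadata: HashMap::new(),
            cancel_token: None,
        };

        match runtime.run(request).await? {
            AgentRunResult::Finished(response) => println!("{}", response.content),
        }
    }
    Ok(())
}
```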