# LLM Tornado

LLM Tornado is a provider-agnostic .NET SDK for building AI agents and workflows. It provides a unified API that abstracts away the differences between major LLM providers, including OpenAI, Anthropic Claude, Google Gemini, Cohere, Mistral, DeepSeek, Groq, and xAI Grok. The SDK lets developers write code once and seamlessly switch between providers, or use multiple providers simultaneously, without changing application logic.

The framework goes beyond simple chat completions to offer a comprehensive suite of AI capabilities, including autonomous agents with tool integration, Model Context Protocol (MCP) support, Agent-to-Agent (A2A) communication, embeddings, image generation, audio transcription, vector database integrations, and conversation management with automatic tool resolution. LLM Tornado emphasizes type-safe structured outputs, streaming responses, prompt caching, and a fluent API that makes building production-ready AI applications straightforward.

---

## TornadoApi Client Initialization

The TornadoApi client is the entry point for all API interactions. It supports single-provider and multi-provider authentication configurations.
```csharp
using LlmTornado;
using LlmTornado.Code;

// Single-provider initialization
TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-openai-api-key");

// Multi-provider initialization for seamless provider switching
TornadoApi multiApi = new TornadoApi(new List<ProviderAuthentication>
{
    new ProviderAuthentication(LLmProviders.OpenAi, "your-openai-key"),
    new ProviderAuthentication(LLmProviders.Anthropic, "your-anthropic-key"),
    new ProviderAuthentication(LLmProviders.Google, "your-google-key"),
    new ProviderAuthentication(LLmProviders.Cohere, "your-cohere-key"),
    new ProviderAuthentication(LLmProviders.Mistral, "your-mistral-key"),
    new ProviderAuthentication(LLmProviders.DeepSeek, "your-deepseek-key"),
    new ProviderAuthentication(LLmProviders.Groq, "your-groq-key")
});

// Azure OpenAI configuration
TornadoApi azureApi = new TornadoApi(new ProviderAuthentication(
    LLmProviders.OpenAi,
    "your-azure-key",
    new AzureOpenAiConfiguration("your-resource-name", "your-deployment-name")
));
```

---

## Chat Completions API

The Chat API provides conversation-based interactions with LLMs. It supports system messages, user inputs, streaming responses, function calling, and multi-turn conversations.

```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Code;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Simple completion request
ChatResult? result = await api.Chat.CreateChatCompletion(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.Turbo,
    ResponseFormat = ChatRequestResponseFormats.Json,
    Messages =
    [
        new ChatMessage(ChatMessageRoles.System, "Solve the math problem given by user, respond in JSON format."),
        new ChatMessage(ChatMessageRoles.User, "2+2=?")
    ]
});

Console.WriteLine(result?.Choices?[0].Message?.Content);
// Output: {"result": 4}

// Conversation-based chat with fluent API
Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.Anthropic.Claude4.Sonnet250514
});

chat.AppendSystemMessage("Pretend you are a dog. Sound authentic.");
chat.AppendUserInput("Who are you?");

string? response = await chat.GetResponse();
Console.WriteLine(response);
// Output: *wags tail excitedly* Woof woof! I'm a good boy! ...

// Streaming response
await chat.StreamResponse(Console.Write);

// Rich response with metadata
ChatRichResponse richResponse = await chat.GetResponseRich();
Console.WriteLine($"Text: {richResponse.Text}");
Console.WriteLine($"Tokens: {richResponse.Result?.Usage?.TotalTokens}");
```

---

## Streaming Chat with Event Handlers

Streaming responses allow real-time token delivery for responsive UIs. The ChatStreamEventHandler provides fine-grained control over different event types.
```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.Google.Gemini.Gemini25Flash
});

chat.AppendSystemMessage("You are a helpful assistant.");
chat.AppendUserInput("Explain how to make coffee in 3 steps.");

await chat.StreamResponseRich(new ChatStreamEventHandler
{
    MessageTokenHandler = (token) =>
    {
        Console.Write(token);
        return ValueTask.CompletedTask;
    },
    BlockFinishedHandler = (block) =>
    {
        Console.WriteLine("\n--- Block finished ---");
        return ValueTask.CompletedTask;
    },
    OnUsageReceived = (usage) =>
    {
        Console.WriteLine($"\nUsage: Input={usage.PromptTokens}, Output={usage.CompletionTokens}");
        return ValueTask.CompletedTask;
    },
    OnFinished = (data) =>
    {
        Console.WriteLine($"Finish reason: {data.FinishReason}");
        return ValueTask.CompletedTask;
    }
});
```

---

## Function Calling (Tool Use)

Function calling enables LLMs to invoke external tools and APIs. LLM Tornado provides automatic tool resolution with streaming and multi-turn support.

```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.ChatFunctions;
using LlmTornado.Common;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.O,
    Tools =
    [
        new Tool(new ToolFunction("get_weather", "Gets the current weather for a location", new
        {
            type = "object",
            properties = new
            {
                location = new
                {
                    type = "string",
                    description = "The city name, e.g. Prague"
                }
            },
            required = new List<string> { "location" }
        }))
    ],
    MaxTokens = 256
});

// Declare the handler before it is captured below
ChatStreamEventHandler handler = new ChatStreamEventHandler
{
    MessageTokenHandler = (token) =>
    {
        Console.Write(token);
        return ValueTask.CompletedTask;
    },
    FunctionCallHandler = (calls) =>
    {
        foreach (FunctionCall fn in calls)
        {
            // Execute your tool logic here
            fn.Result = new FunctionResult(fn, "A mild rain is expected around noon, 18°C.");
        }

        return ValueTask.CompletedTask;
    },
    AfterFunctionCallsResolvedHandler = async (results, h) =>
    {
        await chat.StreamResponseRich(h);
    }
};

chat.OnAfterToolsCall = async (result) =>
{
    // Continue conversation after tool execution
    await chat.StreamResponseRich(handler);
};

chat.AppendSystemMessage("You are a helpful assistant");
chat.AppendUserInput("What is the weather like today in Prague?");

await chat.StreamResponseRich(handler);
// Output: Based on the weather data, Prague is expecting a mild rain around noon with temperatures around 18°C...
```

---

## Structured JSON Output

Structured output forces the model to return responses matching a specific JSON schema, ensuring reliable parsing.
```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.O240806,
    ResponseFormat = ChatRequestResponseFormats.StructuredJson("extract_city", new
    {
        type = "object",
        properties = new
        {
            city = new { type = "string" },
            country = new { type = "string" },
            population = new { type = "integer" }
        },
        required = new List<string> { "city", "country" },
        additionalProperties = false
    })
});

chat.AppendUserInput("Extract info about Tokyo from: Tokyo is the capital of Japan with over 13 million people.");

ChatRichResponse response = await chat.GetResponseRich();
Console.WriteLine(response.Text);
// Output: {"city": "Tokyo", "country": "Japan", "population": 13000000}
```

---

## TornadoAgent - Autonomous AI Agents

TornadoAgent provides a high-level abstraction for building autonomous AI agents that can use tools, manage state, and handle complex multi-step tasks.

```csharp
using LlmTornado;
using LlmTornado.Agents;
using LlmTornado.Chat.Models;
using System.ComponentModel;
using Newtonsoft.Json.Converters;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Define a tool with automatic schema conversion
[JsonConverter(typeof(StringEnumConverter))]
public enum TemperatureUnit
{
    Celsius,
    Fahrenheit
}

[Description("Get the current weather in a given location")]
public static string GetCurrentWeather(
    [Description("The city and state, e.g. Boston, MA")] string location,
    [Description("Unit of temperature")] TemperatureUnit unit = TemperatureUnit.Celsius)
{
    // Call actual weather API here
    return $"72°F, Sunny in {location}";
}

// Create agent with tools
TornadoAgent agent = new TornadoAgent(
    client,
    model: ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You are a helpful weather assistant.",
    tools: [GetCurrentWeather]
);

// Run agent - tool calls are handled automatically
Conversation result = await agent.Run("What's the weather in Boston?");
Console.WriteLine(result.Messages.Last().Content);
// Output: The current weather in Boston is 72°F and sunny.
```

---

## Agent Streaming with Event Handling

Stream agent responses in real-time with comprehensive event handling for tool calls and text output.

```csharp
using LlmTornado.Agents;
using LlmTornado.Agents.DataModels;
using LlmTornado.Chat.Models;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

TornadoAgent agent = new TornadoAgent(
    client,
    ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You are a creative storyteller.",
    streaming: true
);

ValueTask StreamHandler(AgentRunnerEvents runEvent)
{
    switch (runEvent.EventType)
    {
        case AgentRunnerEventTypes.Streaming:
            if (runEvent is AgentRunnerStreamingEvent streamingEvent)
            {
                if (streamingEvent.ModelStreamingEvent is ModelStreamingOutputTextDeltaEvent deltaText)
                {
                    Console.Write(deltaText.DeltaText);
                }
            }
            break;
        case AgentRunnerEventTypes.ToolCalling:
            Console.WriteLine("\n[Tool being called...]");
            break;
    }

    return ValueTask.CompletedTask;
}

Conversation result = await agent.Run(
    "Tell me a short story about a robot.",
    streaming: true,
    onAgentRunnerEvent: StreamHandler
);
// Output streams in real-time: Once upon a time, there was a little robot named Bolt...
```

---

## Structured Output with Agents

Agents can return structured data using automatic C# type to JSON schema conversion.
```csharp
using LlmTornado.Agents;
using LlmTornado.Chat.Models;
using System.ComponentModel;

[Description("Mathematical problem solution with reasoning steps")]
public struct MathReasoning
{
    [Description("Step-by-step solution process")]
    public MathStep[] Steps { get; set; }

    [Description("The final numerical answer")]
    public string FinalAnswer { get; set; }
}

public struct MathStep
{
    [Description("Explanation of what is being done in this step")]
    public string Explanation { get; set; }

    [Description("The intermediate result of this step")]
    public string Output { get; set; }
}

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

TornadoAgent agent = new TornadoAgent(
    client,
    ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "Solve math problems step by step.",
    outputSchema: typeof(MathReasoning)
);

Conversation result = await agent.Run("Solve: 8x + 7 = -23");

MathReasoning solution = result.Messages.Last().Content.JsonDecode<MathReasoning>();
Console.WriteLine($"Final Answer: {solution.FinalAnswer}");

foreach (var step in solution.Steps)
{
    Console.WriteLine($"  Step: {step.Explanation} -> {step.Output}");
}
// Output:
// Final Answer: x = -3.75
//   Step: Subtract 7 from both sides -> 8x = -30
//   Step: Divide both sides by 8 -> x = -3.75
```

---

## Agent as Tool (Agent Composition)

Compose complex agent systems by using one agent as a tool for another agent.

```csharp
using LlmTornado.Agents;
using LlmTornado.Chat.Models;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Specialist translator agent
TornadoAgent translatorAgent = new TornadoAgent(
    client,
    ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You only translate English to Spanish. Do not answer questions, only translate."
);

// Main agent that uses the translator as a tool
TornadoAgent mainAgent = new TornadoAgent(
    client,
    ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You are a helpful assistant. Use the translator tool for any Spanish translations.",
    tools: [translatorAgent.AsTool]
);

Conversation result = await mainAgent.Run(
    "What is 2+2? Also, translate your answer to Spanish."
);

Console.WriteLine(result.Messages.Last().Content);
// Output: 2+2 equals 4. In Spanish: "Dos más dos es igual a cuatro."
```

---

## Input Guardrails

Protect your agents with input guardrails that validate user input before processing.

```csharp
using LlmTornado.Agents;
using LlmTornado.Agents.DataModels;
using LlmTornado.Chat.Models;

public struct IsMathQuestion
{
    public string Reasoning { get; set; }
    public bool IsMathRequest { get; set; }
}

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

async ValueTask<GuardRailFunctionOutput> MathOnlyGuardRail(string? input = "")
{
    TornadoAgent guardrailAgent = new TornadoAgent(
        client,
        ChatModel.OpenAi.Gpt41.V41Mini,
        instructions: "Determine if the user is asking a math-related question.",
        outputSchema: typeof(IsMathQuestion)
    );

    Conversation result = await guardrailAgent.Run(input);
    IsMathQuestion? check = result.Messages.Last().Content.JsonDecode<IsMathQuestion>();

    // Return true to trigger the guardrail (block the request)
    return new GuardRailFunctionOutput(
        check?.Reasoning ?? "",
        !check?.IsMathRequest ?? true
    );
}

TornadoAgent mathAgent = new TornadoAgent(
    client,
    ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You are a math tutor. Only answer math questions."
);

// This will throw an exception - weather is not math
try
{
    await mathAgent.Run("What is the weather?", inputGuardRailFunction: MathOnlyGuardRail);
}
catch (GuardrailException ex)
{
    Console.WriteLine($"Blocked: {ex.Message}");
    // Output: Blocked: Input does not appear to be a math question.
}

// This will succeed
Conversation result = await mathAgent.Run("What is 15% of 200?", inputGuardRailFunction: MathOnlyGuardRail);
Console.WriteLine(result.Messages.Last().Content);
// Output: 15% of 200 is 30.
```

---

## MCP (Model Context Protocol) Integration

Connect to MCP servers to access external tools and capabilities. Supports both local process-based and remote HTTP-based MCP servers.

```csharp
using LlmTornado;
using LlmTornado.Agents;
using LlmTornado.Chat.Models;
using LlmTornado.Mcp;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Local MCP server (spawns a process)
MCPServer mcpServer = new MCPServer(
    serverLabel: "weather-tool",
    command: "dotnet",
    arguments: ["run", "--project", "./WeatherMcpServer"]
);

await mcpServer.InitializeAsync();

// Create agent with MCP tools
TornadoAgent agent = new TornadoAgent(
    client,
    model: ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You are a helpful assistant with weather capabilities."
);

agent.AddTool(mcpServer.AllowedTornadoTools.ToArray());

Conversation result = await agent.Run("What is the weather in Boston?");
Console.WriteLine(result.Messages.Last().Content);

// Remote MCP server (HTTP-based, e.g., GitHub Copilot)
MCPServer remoteMcp = new MCPServer(
    serverLabel: "github",
    url: "https://api.githubcopilot.com/mcp",
    additionalConnectionHeaders: new Dictionary<string, string>
    {
        { "Authorization", $"Bearer {Environment.GetEnvironmentVariable("GITHUB_API_KEY")}" }
    }
);

await remoteMcp.InitializeAsync();
```

---

## A2A (Agent-to-Agent) Protocol

Enable communication between distributed agents using the A2A protocol for multi-agent orchestration.
```csharp
using LlmTornado.A2A;
using LlmTornado.Agents;
using LlmTornado.Chat.Models;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Connect to A2A agent endpoints
A2ATornadoConnector a2aConnector = new A2ATornadoConnector([
    "http://localhost:5125", // Remote agent 1
    "http://localhost:5126"  // Remote agent 2
]);

// List available remote agents
foreach (var card in a2aConnector.A2ACards)
{
    Console.WriteLine($"Agent: {card.Value.Name}, Description: {card.Value.Description}");
}

// Create an orchestrator agent that can communicate with remote agents
TornadoAgent orchestrator = new TornadoAgent(
    client,
    model: ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You coordinate tasks between available agents.",
    tools:
    [
        a2aConnector.GetAvailableAgentsTool,
        a2aConnector.SendMessageTool
    ]
);

Conversation result = await orchestrator.Run("Find all my GitHub repositories using the available agents.");
Console.WriteLine(result.Messages.Last().Content);
```

---

## Sequential Agent Runtime

Chain multiple agents in sequence where each agent's output feeds into the next.

```csharp
using LlmTornado.Agents;
using LlmTornado.Agents.Runtime;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Code;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Research agent with web search capability
SequentialRuntimeAgent researchAgent = new SequentialRuntimeAgent(
    client: client,
    name: "Research Agent",
    model: ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: @"You are a research assistant. Given a topic, search the web
and produce a concise 2-3 paragraph summary of key findings.",
    sequentialInstructions: "Research the topic in the next message thoroughly."
);

// Report writer agent
SequentialRuntimeAgent reportAgent = new SequentialRuntimeAgent(
    client: client,
    name: "Report Agent",
    model: ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: @"You are a senior researcher. Take research summaries and write
a cohesive, detailed report in markdown format.",
    sequentialInstructions: "Write a comprehensive report based on the research provided."
);

// Create sequential runtime
SequentialRuntimeConfiguration config = new SequentialRuntimeConfiguration([
    researchAgent,
    reportAgent
]);

ChatRuntime runtime = new ChatRuntime(config);

// Execute the pipeline
ChatMessage report = await runtime.InvokeAsync(
    new ChatMessage(ChatMessageRoles.User, "Write a report about the benefits of AI agents in software development.")
);

Console.WriteLine(report.Content);
// Output: # Benefits of AI Agents in Software Development
// ## Executive Summary...
```

---

## Handoff Agent Runtime

Create agent systems where control can be handed off between specialized agents based on context.

```csharp
using LlmTornado.Agents;
using LlmTornado.Agents.Runtime;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Code;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Spanish-speaking agent
HandoffAgent spanishAgent = new HandoffAgent(
    client: client,
    name: "SpanishAgent",
    model: ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You are a helpful assistant. Always respond in Spanish.",
    description: "Use this agent for Spanish language responses"
);

// English-speaking agent
HandoffAgent englishAgent = new HandoffAgent(
    client: client,
    name: "EnglishAgent",
    model: ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You are a helpful assistant. Always respond in English.",
    description: "Use this agent for English language responses",
    handoffs: [spanishAgent]
);

// Allow bidirectional handoffs
spanishAgent.HandoffAgents = [englishAgent];

// Create a handoff runtime starting with the English agent
HandoffRuntimeConfiguration config = new HandoffRuntimeConfiguration(englishAgent);
ChatRuntime runtime = new ChatRuntime(config);

// Spanish input automatically routes to the Spanish agent
ChatMessage response = await runtime.InvokeAsync(
    new ChatMessage(ChatMessageRoles.User, "¿Cuánto es 2+2?")
);

Console.WriteLine(response.Content);
// Output: Dos más dos es igual a cuatro.
```

---

## Embeddings API

Generate vector embeddings for text using various providers. Supports multiple embedding models and customization options.

```csharp
using LlmTornado;
using LlmTornado.Embedding;
using LlmTornado.Embedding.Models;
using LlmTornado.Embedding.Vendors.Google;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Simple embedding
EmbeddingResult? result = await api.Embeddings.CreateEmbedding(
    EmbeddingModel.OpenAi.Gen3.Small,
    "The quick brown fox jumps over the lazy dog."
);

float[]? embedding = result?.Data.FirstOrDefault()?.Embedding;
Console.WriteLine($"Embedding dimensions: {embedding?.Length}");
// Output: Embedding dimensions: 1536

// Batch embeddings
EmbeddingResult? batchResult = await api.Embeddings.CreateEmbedding(
    EmbeddingModel.OpenAi.Gen3.Small,
    ["Hello world", "How are you?", "Machine learning is fascinating"]
);

foreach (var entry in batchResult?.Data ?? [])
{
    Console.WriteLine($"Vector {entry.Index}: {entry.Embedding.Length} dimensions");
}

// Google Gemini embeddings with task type
EmbeddingResult? googleResult = await api.Embeddings.CreateEmbedding(new EmbeddingRequest(
    EmbeddingModel.Google.Gemini.Embedding4,
    "This is content of a document"
)
{
    VendorExtensions = new EmbeddingRequestVendorExtensions(new EmbeddingRequestVendorGoogleExtensions
    {
        TaskType = EmbeddingRequestVendorGoogleExtensionsTaskTypes.RetrievalDocument,
        Title = "My Document Title"
    })
});

// Mistral embeddings with dimension control
EmbeddingResult? mistralResult = await api.Embeddings.CreateEmbedding(new EmbeddingRequest
{
    Model = EmbeddingModel.Mistral.Premier.CodestralEmbed,
    InputScalar = "hello world",
    Dimensions = 512,
    OutputDType = EmbeddingOutputDtypes.Int8
});
```

---

## Image Generation API

Generate and edit images using DALL-E, Imagen, Grok, and other providers.

```csharp
using LlmTornado;
using LlmTornado.Code;
using LlmTornado.Images;
using LlmTornado.Images.Models;
using LlmTornado.Images.Vendors.Google;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Generate an image with DALL-E 3
ImageGenerationResult? result = await api.ImageGenerations.CreateImage(new ImageGenerationRequest(
    prompt: "A cute orange tabby cat wearing a tiny wizard hat, digital art style",
    quality: TornadoImageQualities.Hd,
    responseFormat: TornadoImageResponseFormats.Base64,
    model: ImageModel.OpenAi.Dalle.V3
));

if (result?.Data?[0].Base64 is string base64)
{
    byte[] imageBytes = Convert.FromBase64String(base64);
    await File.WriteAllBytesAsync("wizard_cat.png", imageBytes);
    Console.WriteLine("Image saved to wizard_cat.png");
}

// GPT Image model with transparency
ImageGenerationResult? gptResult = await api.ImageGenerations.CreateImage(new ImageGenerationRequest(
    "A minimalist logo for a tech startup",
    quality: TornadoImageQualities.Medium,
    model: ImageModel.OpenAi.Gpt.V1
)
{
    Background = ImageBackgroundTypes.Transparent,
    Moderation = ImageModerationTypes.Low
});

// Edit an existing image
byte[] originalImage = await File.ReadAllBytesAsync("original.png");

ImageGenerationResult? edited = await api.ImageEdit.EditImage(new ImageEditRequest(
    "Make this cat look more dangerous with glowing eyes"
)
{
    Quality = TornadoImageQualities.High,
    Model = ImageModel.OpenAi.Gpt.V1,
    Image = new TornadoInputFile(Convert.ToBase64String(originalImage), "image/png")
});

// Google Imagen 3
ImageGenerationResult? imagenResult = await api.ImageGenerations.CreateImage(new ImageGenerationRequest(
    "A serene Japanese garden with cherry blossoms",
    responseFormat: TornadoImageResponseFormats.Base64,
    model: ImageModel.Google.Imagen.V3Generate002
)
{
    VendorExtensions = new ImageGenerationRequestVendorExtensions(new ImageGenerationRequestGoogleExtensions
    {
        MimeType = ImageGenerationRequestGoogleExtensionsMimeTypes.Jpeg,
        CompressionQuality = 90
    })
});
```

---

## Audio Transcription API

Transcribe audio files using Whisper and other speech-to-text models with support for streaming and speaker diarization.

```csharp
using LlmTornado;
using LlmTornado.Audio;
using LlmTornado.Audio.Models;
using LlmTornado.Code;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Basic transcription
byte[] audioData = await File.ReadAllBytesAsync("meeting.wav");

TranscriptionResult? transcription = await api.Audio.CreateTranscription(new TranscriptionRequest
{
    File = new AudioFile(audioData, AudioFileTypes.Wav),
    Model = AudioModel.OpenAi.Whisper.V2,
    ResponseFormat = AudioTranscriptionResponseFormats.Text
});

Console.WriteLine(transcription?.Text);
// Output: Hello everyone, welcome to today's meeting...

// Verbose transcription with timestamps
TranscriptionResult? detailed = await api.Audio.CreateTranscription(new TranscriptionRequest
{
    File = new AudioFile(audioData, AudioFileTypes.Wav),
    Model = AudioModel.OpenAi.Whisper.V2,
    ResponseFormat = AudioTranscriptionResponseFormats.VerboseJson,
    TimestampGranularities = [TimestampGranularities.Segment, TimestampGranularities.Word]
});

foreach (var segment in detailed?.Segments ?? [])
{
    Console.WriteLine($"[{segment.Start:F2}s - {segment.End:F2}s]: {segment.Text}");
}

// Streaming transcription with GPT-4
await api.Audio.StreamTranscriptionRich(new TranscriptionRequest
{
    File = new AudioFile(audioData, AudioFileTypes.Wav),
    Model = AudioModel.OpenAi.Gpt4.Gpt4OTranscribe,
    ResponseFormat = AudioTranscriptionResponseFormats.Text
}, new TranscriptionStreamEventHandler
{
    ChunkHandler = (chunk) =>
    {
        Console.Write(chunk);
        return ValueTask.CompletedTask;
    }
});

// Speaker diarization (who said what)
TranscriptionResult? diarized = await api.Audio.CreateTranscription(new TranscriptionRequest
{
    File = new AudioFile(audioData, AudioFileTypes.Wav),
    Model = AudioModel.OpenAi.Gpt4.Gpt4OTranscribeDiarize,
    ResponseFormat = AudioTranscriptionResponseFormats.DiarizedJson
});

foreach (var segment in diarized?.Segments ?? [])
{
    Console.WriteLine($"[{segment.Start:F2}s] Speaker {segment.Speaker}: {segment.Text}");
}
// Output:
// [0.00s] Speaker 1: Good morning team.
// [1.50s] Speaker 2: Morning! Ready to start?
```

---

## Vector Database Integration

Store and query vector embeddings with built-in support for Pinecone, Qdrant, ChromaDB, pgVector, and FAISS.

```csharp
using LlmTornado;
using LlmTornado.Embedding;
using LlmTornado.Embedding.Models;
using LlmTornado.VectorDatabases;
using LlmTornado.VectorDatabases.Pinecone;
using LlmTornado.VectorDatabases.Qdrant;
using LlmTornado.VectorDatabases.Faiss.Integrations;

// Pinecone integration
TornadoPinecone pinecone = new TornadoPinecone(new PineconeConfigurationOptions("your-pinecone-key")
{
    IndexName = "my-index",
    Dimension = 1536,
    Cloud = PineconeCloud.Aws,
    Region = "us-east-1"
});

// Add documents with automatic embedding
await pinecone.AddDocumentsAsync([
    new VectorDocument(id: "doc1", content: "Apple is a fruit known for its sweetness."),
    new VectorDocument(id: "doc2", content: "Apple Inc. makes iPhones and MacBooks."),
    new VectorDocument(id: "doc3", content: "Many people enjoy apples as a healthy snack.")
]);

// Search by text query
string query = "Which company makes iPhones?";
float[] queryEmbedding = await pinecone.EmbedAsync(query);
VectorDocument[] results = await pinecone.QueryByEmbeddingAsync(queryEmbedding, topK: 3);

foreach (var doc in results)
{
    Console.WriteLine($"[{doc.Score:F4}] {doc.Content}");
}
// Output: [0.9234] Apple Inc. makes iPhones and MacBooks.

// Qdrant integration with metadata filtering
QdrantVectorDatabase qdrant = new QdrantVectorDatabase(
    host: "localhost",
    port: 6334,
    vectorDimension: 1536
);

await qdrant.InitializeCollectionAsync("products");

await qdrant.AddDocumentsAsync([
    new VectorDocument(
        id: "prod1",
        content: "Wireless Bluetooth headphones with noise cancellation",
        embedding: await GetEmbedding("Wireless Bluetooth headphones"), // your own embedding helper
        metadata: new Dictionary<string, object>
        {
            { "category", "electronics" },
            { "price", 199.99 }
        }
    )
]);

// Query with a metadata filter
var filtered = await qdrant.QueryByEmbeddingAsync(
    embedding: queryEmbedding,
    where: TornadoWhereOperator.Equal("category", "electronics"),
    topK: 5,
    includeScore: true
);

// In-memory FAISS for local development
FaissVectorDatabase faiss = new FaissVectorDatabase(
    indexDirectory: "./faiss_indexes",
    vectorDimension: 1536
);

await faiss.InitializeCollection("dev_collection");
await faiss.AddDocumentsAsync(documents); // 'documents' is a previously prepared collection
```

---

## Conversation Management and Persistence

Save and restore conversation state for multi-session interactions.
```csharp
using LlmTornado.Agents;
using LlmTornado.Agents.Utility;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

TornadoAgent agent = new TornadoAgent(
    client,
    ChatModel.OpenAi.Gpt41.V41Mini,
    instructions: "You are a helpful assistant with memory.",
    streaming: true
);

// First session
Conversation session1 = await agent.Run("My name is Alice and I love hiking.");
Console.WriteLine(session1.Messages.Last().Content);

// Save the conversation to a file
session1.Messages.ToList().SaveConversation("conversation.json");
Console.WriteLine("Conversation saved.");

// Later... restore the conversation
List<ChatMessage> loadedMessages = [];
await loadedMessages.LoadMessagesAsync("conversation.json");

// Continue the conversation with restored context
Conversation session2 = await agent.Run(
    "What is my name and what do I enjoy?",
    appendMessages: loadedMessages
);

Console.WriteLine(session2.Messages.Last().Content);
// Output: Your name is Alice and you love hiking!
```

---

## Extended Thinking (Reasoning Models)

Access chain-of-thought reasoning with models like Claude 3.7 Sonnet and OpenAI o3.
```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Chat.Vendors.Anthropic;

TornadoApi api = new TornadoApi(LLmProviders.Anthropic, "your-api-key");

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.Anthropic.Claude37.Sonnet,
    VendorExtensions = new ChatRequestVendorExtensions(new ChatRequestVendorAnthropicExtensions
    {
        Thinking = new AnthropicThinkingSettings
        {
            Enabled = true,
            BudgetTokens = 4000 // Allocate tokens for reasoning
        }
    })
});

chat.AppendUserInput("Explain how to solve the differential equation dy/dx = 2xy");

// Stream both reasoning and the final response
await chat.StreamResponseRich(new ChatStreamEventHandler
{
    ReasoningTokenHandler = (token) =>
    {
        Console.ForegroundColor = ConsoleColor.DarkGray;
        Console.Write(token.Content); // Internal reasoning
        Console.ResetColor();
        return ValueTask.CompletedTask;
    },
    MessageTokenHandler = (token) =>
    {
        Console.Write(token); // Final response
        return ValueTask.CompletedTask;
    },
    OnFinished = (data) =>
    {
        Console.WriteLine($"\nTotal tokens: {data.Usage?.TotalTokens}");
        return ValueTask.CompletedTask;
    }
});
```

---

## Prompt Caching (Anthropic)

Reduce costs and latency by caching frequently used prompts and context.
```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Chat.Vendors.Anthropic;

TornadoApi api = new TornadoApi(LLmProviders.Anthropic, "your-api-key");

// Load a large document to cache
string longDocument = await File.ReadAllTextAsync("book.txt");

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.Anthropic.Claude45.Sonnet250929
});

// Mark the large context for caching
chat.AppendSystemMessage([
    new ChatMessagePart("You answer questions about the following text:"),
    new ChatMessagePart(longDocument, new ChatMessagePartAnthropicExtensions
    {
        Cache = AnthropicCacheSettings.EphemeralWithTtl(AnthropicCacheTtlOptions.OneHour)
    })
]);

// First question - populates the cache
chat.AppendUserInput("Who is the main character?");
ChatRichResponse response1 = await chat.GetResponseRich();
Console.WriteLine($"Response 1 (cache miss): {response1.Text}");

// Second question - uses the cached context
chat.AppendUserInput("What happens in chapter 2?");
ChatRichResponse response2 = await chat.GetResponseRich();
Console.WriteLine($"Response 2 (cache hit): {response2.Text}");

// Note: Cache hits significantly reduce token costs for the cached portion
```

---

## Multimodal Input (Vision, Audio, Documents)

Process images, audio, and documents alongside text in conversations.

```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Code;
using LlmTornado.Images;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Image analysis from a URL
Conversation imageChat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.O
});

imageChat.AppendUserInput([
    new ChatMessagePart(new Uri("https://example.com/diagram.png")),
    new ChatMessagePart("Explain what this diagram shows.")
]);

string? imageAnalysis = await imageChat.GetResponse();
Console.WriteLine(imageAnalysis);

// Image from base64
byte[] localImage = await File.ReadAllBytesAsync("chart.png");
string base64Image = $"data:image/png;base64,{Convert.ToBase64String(localImage)}";

imageChat.AppendUserInput([
    new ChatMessagePart(base64Image, ImageDetail.High, "image/png"),
    new ChatMessagePart("What trends do you see in this chart?")
]);

// Audio input
Conversation audioChat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.AudioPreview241001,
    Modalities = [ChatModelModalities.Text],
    MaxTokens = 2000
});

byte[] audioData = await File.ReadAllBytesAsync("question.wav");

audioChat.AppendUserInput([
    new ChatMessagePart(audioData, ChatAudioFormats.Wav)
]);

string? audioResponse = await audioChat.GetResponse();
Console.WriteLine($"Audio transcription and response: {audioResponse}");

// PDF document analysis (Anthropic)
Conversation pdfChat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.Anthropic.Claude37.Sonnet
});

byte[] pdfBytes = await File.ReadAllBytesAsync("report.pdf");

pdfChat.AppendUserInput([
    new ChatMessagePart(Convert.ToBase64String(pdfBytes), DocumentLinkTypes.Base64),
    new ChatMessagePart("Summarize the key findings in this report.")
]);

ChatRichResponse pdfResponse = await pdfChat.GetResponseRich();
Console.WriteLine(pdfResponse.Text);
```

---

## ToolkitChat - Production Chat Wrapper

ToolkitChat provides a production-ready wrapper for single-shot LLM requests with automatic error handling and retry logic.
```csharp
using LlmTornado;
using LlmTornado.Chat.Models;
using LlmTornado.ChatFunctions;
using LlmTornado.Code;
using LlmTornado.Toolkit.Memory;

TornadoApi api = new TornadoApi(LLmProviders.OpenAi, "your-api-key");

// Simple single response
LlmResponseParsed response = await ToolkitChat.GetSingleResponse(
    api,
    model: ChatModel.OpenAi.Gpt4.O,
    backupModel: ChatModel.OpenAi.Gpt4.Turbo, // Fallback on failure
    sysPrompt: "You are a helpful assistant.",
    userInput: "What is the capital of France?",
    temp: 0.7,
    maxTokens: 500
);

if (response.Ok)
{
    Console.WriteLine(response.Text);
}
else
{
    Console.WriteLine($"Error: {response.Error}");
}

// With function calling
ChatFunction extractFunction = new ChatFunction("extract_info", "Extracts structured info", new
{
    type = "object",
    properties = new
    {
        name = new { type = "string" },
        age = new { type = "integer" }
    },
    required = new[] { "name" }
});

LlmResponseParsed fnResponse = await ToolkitChat.GetSingleResponse(
    api,
    ChatModel.OpenAi.Gpt4.O,
    backupModel: null,
    sysPrompt: "Extract person information from text.",
    function: extractFunction,
    userInput: "John is 30 years old and lives in New York.",
    strict: true
);

// Rich response with all blocks
LlmResponseParsedRich richResponse = await ToolkitChat.GetSingleResponseRich(
    api,
    ChatModel.OpenAi.Gpt4.O,
    backupModel: ChatModel.Anthropic.Claude35.SonnetLatest,
    sysPrompt: "You are a coding assistant.",
    userInput: "Write a Python function to reverse a string."
);

Console.WriteLine($"Response text: {richResponse.Text}");
Console.WriteLine($"Token usage: {richResponse.Usage?.TotalTokens}");
```

---

## Web Search Integration (Cohere)

Enable real-time web search capabilities with Cohere's connected search.
```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Chat.Vendors.Cohere;
using LlmTornado.Code;

TornadoApi api = new TornadoApi(LLmProviders.Cohere, "your-cohere-key");

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.Cohere.Command.Default,
    VendorExtensions = new ChatRequestVendorExtensions(new ChatRequestVendorCohereExtensions([
        ChatVendorCohereExtensionConnector.WebConnector
    ]))
});

chat.AppendSystemMessage("You are an assistant that searches the internet for current information.");
chat.AppendUserInput("What is the latest version of .NET and when was it released?");

ChatRichResponse response = await chat.GetResponseRich();

// Parse citations from web sources
var citations = response.VendorExtensions?.Cohere?.ParseCitations();

Console.WriteLine("Answer:");
Console.WriteLine(response.Text);

if (citations != null)
{
    Console.WriteLine("\nSources:");

    foreach (var block in citations.Where(b => b.Citation != null))
    {
        Console.WriteLine($" - {block.Citation?.Title}: {block.Citation?.Url}");
    }
}
```

---

## Conversation Compression

Automatically compress long conversations to stay within token limits while preserving context.

```csharp
using LlmTornado;
using LlmTornado.Agents;
using LlmTornado.Agents.Utility;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Code;

TornadoApi client = new TornadoApi(LLmProviders.OpenAi, "your-api-key");
List<ChatMessage> conversation = new List<ChatMessage>();

// Configure the compressor with a token limit
ConversationCompressor compressor = new ConversationCompressor(
    client,
    maxTokens: 20000, // Compress when exceeding this limit
    new ConversationCompressionOptions
    {
        CompressToolCallMessages = true,
        SummaryModel = ChatModel.OpenAi.Gpt41.V41Mini
    }
);

// Add messages to the conversation during chat...
conversation.Add(new ChatMessage(ChatMessageRoles.User, "Tell me about quantum computing."));
conversation.Add(new ChatMessage(ChatMessageRoles.Assistant, "Quantum computing uses quantum mechanics..."));
// ... many more messages ...

// Check and compress if needed
if (compressor.ShouldCompress(conversation))
{
    Console.WriteLine($"Compressing {conversation.Sum(m => m.GetMessageTokens())} tokens...");
    conversation = await compressor.Compress(conversation);
    Console.WriteLine($"Compressed to {conversation.Sum(m => m.GetMessageTokens())} tokens");

    Console.WriteLine("Summary:");
    Console.WriteLine(conversation.First(m => m.Role == ChatMessageRoles.System).Content);
}
```

---

LLM Tornado serves as a comprehensive foundation for building production AI applications in .NET. Its primary use cases include conversational AI assistants with memory and tool access, autonomous agents that can research, analyze, and act on information, multi-agent systems for complex workflows, RAG (Retrieval Augmented Generation) applications with vector database integration, and multimodal applications that process text, images, audio, and documents together.

The SDK's provider-agnostic design enables graceful degradation across providers, cost optimization by routing to different models based on task complexity, and A/B testing of different models without code changes. Integration patterns commonly combine TornadoAgent for high-level agent logic, MCP for external tool access, vector databases for knowledge retrieval, and ToolkitChat for reliable single-shot requests. The streaming-first architecture ensures responsive user experiences, while comprehensive event handlers provide full visibility into agent reasoning and tool execution.
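As a minimal sketch of the model-routing idea above: because every provider is reached through the same `ChatRequest`, only the `Model` value needs to change per task. The `Ask` helper and the particular model choices below are illustrative assumptions, not part of the SDK.

```csharp
using LlmTornado;
using LlmTornado.Chat;
using LlmTornado.Chat.Models;
using LlmTornado.Code;

static class ModelRouter
{
    // Hypothetical routing helper: send complex tasks to a stronger model
    // and simple ones to a cheaper model. The calling code is identical
    // either way - the provider-agnostic API absorbs the difference.
    public static async Task<string?> Ask(TornadoApi api, string prompt, bool complex)
    {
        Conversation chat = api.Chat.CreateConversation(new ChatRequest
        {
            Model = complex
                ? ChatModel.Anthropic.Claude37.Sonnet // stronger, pricier
                : ChatModel.OpenAi.Gpt41.V41Mini      // cheaper, faster
        });

        chat.AppendUserInput(prompt);
        return await chat.GetResponse();
    }
}
```

Given a multi-provider `TornadoApi` like the one initialized at the top of this document, swapping the two model constants is all an A/B test requires.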