# LLM Gateway LLM Gateway is an open-source API gateway for Large Language Models that acts as middleware between applications and various LLM providers. It provides a unified OpenAI-compatible API interface for routing requests to multiple providers (OpenAI, Anthropic, Google Vertex AI, AWS Bedrock, and 20+ others), allowing seamless provider switching and multi-provider orchestration. The gateway handles request routing, load balancing, response caching with Redis, usage tracking, cost analytics, and provider key management through a centralized system. The platform consists of a monorepo architecture built with Hono framework for APIs, Next.js for frontend applications, PostgreSQL with Drizzle ORM for data persistence, and Redis for caching. It includes a Gateway service for LLM request routing, an API service for user/organization/project management, a UI dashboard, an interactive Playground, comprehensive documentation, and an admin panel. The system supports multiple authentication methods including passkeys, tracks detailed usage metrics and costs, provides IAM-based API key management with granular permissions, and offers both free and pro subscription tiers with Stripe integration. ## Gateway API - Chat Completions OpenAI-compatible chat completions endpoint that routes requests to multiple LLM providers with automatic failover, caching, and cost optimization. ```bash # Basic chat completion request curl -X POST https://api.llmgateway.io/v1/chat/completions \ -H "Content-Type: application/json" \ -H "Authorization: Bearer llmgtwy_your_api_key_here" \ -d '{ "model": "gpt-4o", "messages": [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is the capital of France?"} ], "temperature": 0.7, "max_tokens": 100, "stream": false }' # Expected response { "id": "chatcmpl-abc123", "object": "chat.completion", "created": 1677652288, "model": "gpt-4o", "choices": [{ "index": 0, "message": { "role": "assistant", "content": "The capital of France is Paris." }, "finish_reason": "stop" }], "usage": { "prompt_tokens": 20, "completion_tokens": 8, "total_tokens": 28 } } # Streaming response with Server-Sent Events curl -X POST https://api.llmgateway.io/v1/chat/completions \ -H "Content-Type: application/json" \ -H "Authorization: Bearer llmgtwy_your_api_key_here" \ -d '{ "model": "claude-3-5-sonnet", "messages": [{"role": "user", "content": "Write a haiku about coding"}], "stream": true }' # Vision model with image input curl -X POST https://api.llmgateway.io/v1/chat/completions \ -H "Content-Type: application/json" \ -H "Authorization: Bearer llmgtwy_your_api_key_here" \ -d '{ "model": "gpt-4o", "messages": [{ "role": "user", "content": [ {"type": "text", "text": "What is in this image?"}, { "type": "image_url", "image_url": {"url": "https://example.com/image.jpg", "detail": "high"} } ] }] }' # Function calling with tools curl -X POST https://api.llmgateway.io/v1/chat/completions \ -H "Content-Type: application/json" \ -H "Authorization: Bearer llmgtwy_your_api_key_here" \ -d '{ "model": "gpt-4o", "messages": [{"role": "user", "content": "What is the weather in Boston?"}], "tools": [{ "type": "function", "function": { "name": "get_weather", "description": "Get current weather for a location", "parameters": { "type": "object", "properties": { "location": {"type": "string", "description": "City name"} }, "required": ["location"] } } }], "tool_choice": "auto" }' ``` ## Gateway API - List Models Retrieve available LLM models with pricing, capabilities, and provider information. ```bash # List all available models curl -X GET https://api.llmgateway.io/v1/models \ -H "Authorization: Bearer llmgtwy_your_api_key_here" # Expected response structure { "data": [ { "id": "gpt-4o", "name": "GPT-4o", "aliases": ["gpt-4o-2024-11-20"], "created": 1677649963, "family": "openai", "architecture": { "input_modalities": ["text", "image"], "output_modalities": ["text"], "tokenizer": "cl100k_base" }, "providers": [ { "providerId": "openai", "modelName": "gpt-4o-2024-11-20", "pricing": { "prompt": "0.0000025", "completion": "0.00001" }, "streaming": true, "vision": true, "cancellation": true, "tools": true, "parallelToolCalls": true, "reasoning": false } ], "pricing": { "prompt": "0.0000025", "completion": "0.00001", "input_cache_read": "0.00000125" }, "context_length": 128000, "json_output": true, "structured_outputs": true, "free": false } ] } # Filter models - exclude deprecated curl -X GET "https://api.llmgateway.io/v1/models?exclude_deprecated=true" \ -H "Authorization: Bearer llmgtwy_your_api_key_here" # Include deactivated models curl -X GET "https://api.llmgateway.io/v1/models?include_deactivated=true" \ -H "Authorization: Bearer llmgtwy_your_api_key_here" ``` ## Management API - User Operations Manage user accounts, profiles, and authentication. ```bash # Get current user information curl -X GET https://api.llmgateway.io/user/me \ -H "Authorization: Bearer session_token" \ -H "Cookie: better-auth.session_token=abc123" # Response { "user": { "id": "usr_abc123", "email": "user@example.com", "name": "John Doe", "onboardingCompleted": true, "emailVerified": true, "isAdmin": false } } # Update user profile curl -X PATCH https://api.llmgateway.io/user/me \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{ "name": "Jane Smith", "email": "jane@example.com" }' # Complete onboarding curl -X POST https://api.llmgateway.io/user/me/complete-onboarding \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{}' # Update password curl -X PUT https://api.llmgateway.io/user/password \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{ "currentPassword": "oldpass123", "newPassword": "newpass456" }' # Delete user account curl -X DELETE https://api.llmgateway.io/user/me \ -H "Authorization: Bearer session_token" # Delete a passkey curl -X DELETE https://api.llmgateway.io/user/me/passkeys/passkey_id_123 \ -H "Authorization: Bearer session_token" ``` ## Management API - API Key Management Create and manage API keys with usage limits and IAM rules for fine-grained access control. ```bash # Create a new API key curl -X POST https://api.llmgateway.io/keys/api \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{ "description": "Production API Key", "projectId": "proj_abc123", "usageLimit": "100.00" }' # Response includes full token (only shown once) { "apiKey": { "id": "key_xyz789", "token": "llmgtwy_0123456789abcdef", "description": "Production API Key", "status": "active", "usageLimit": "100.00", "usage": "0", "projectId": "proj_abc123", "createdBy": "usr_abc123", "createdAt": "2024-01-15T10:30:00Z", "updatedAt": "2024-01-15T10:30:00Z" } } # List API keys for a project curl -X GET "https://api.llmgateway.io/keys/api?projectId=proj_abc123&filter=all" \ -H "Authorization: Bearer session_token" # Response { "apiKeys": [ { "id": "key_xyz789", "maskedToken": "llmgtwy_...def", "description": "Production API Key", "status": "active", "usageLimit": "100.00", "usage": "23.45", "projectId": "proj_abc123", "createdBy": "usr_abc123", "creator": { "id": "usr_abc123", "name": "John Doe", "email": "john@example.com" }, "iamRules": [] } ], "planLimits": { "currentCount": 3, "maxKeys": 20, "plan": "pro" }, "userRole": "owner" } # Update API key status curl -X PATCH https://api.llmgateway.io/keys/api/key_xyz789 \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{"status": "inactive"}' # Update usage limit curl -X PATCH https://api.llmgateway.io/keys/api/limit/key_xyz789 \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{"usageLimit": "200.00"}' # Delete API key (soft delete) curl -X DELETE https://api.llmgateway.io/keys/api/key_xyz789 \ -H "Authorization: Bearer session_token" # Create IAM rule to restrict models curl -X POST https://api.llmgateway.io/keys/api/key_xyz789/iam \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{ "ruleType": "allow_models", "ruleValue": { "models": ["gpt-4o", "claude-3-5-sonnet"] }, "status": "active" }' # Create IAM rule for pricing restrictions curl -X POST https://api.llmgateway.io/keys/api/key_xyz789/iam \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{ "ruleType": "allow_pricing", "ruleValue": { "pricingType": "free", "maxInputPrice": 0.000001, "maxOutputPrice": 0.000002 }, "status": "active" }' # List IAM rules for API key curl -X GET https://api.llmgateway.io/keys/api/key_xyz789/iam \ -H "Authorization: Bearer session_token" # Update IAM rule curl -X PATCH https://api.llmgateway.io/keys/api/key_xyz789/iam/rule_abc123 \ -H "Content-Type: application/json" \ -H "Authorization: Bearer session_token" \ -d '{ "ruleValue": { "models": ["gpt-4o", "claude-3-5-sonnet", "gemini-2.0-flash"] } }' # Delete IAM rule curl -X DELETE https://api.llmgateway.io/keys/api/key_xyz789/iam/rule_abc123 \ -H "Authorization: Bearer session_token" ``` ## Database Schema - Core Tables PostgreSQL schema using Drizzle ORM with comprehensive tracking and multi-tenancy support. ```typescript import { db, tables, eq } from "@llmgateway/db"; // Create a new user const [newUser] = await db .insert(tables.user) .values({ email: "user@example.com", name: "John Doe", emailVerified: false, onboardingCompleted: false, }) .returning(); // Create an organization const [organization] = await db .insert(tables.organization) .values({ name: "Acme Corp", billingEmail: "billing@acme.com", plan: "pro", credits: "100.00", autoTopUpEnabled: true, autoTopUpThreshold: "10.00", autoTopUpAmount: "50.00", }) .returning(); // Link user to organization await db.insert(tables.userOrganization).values({ userId: newUser.id, organizationId: organization.id, role: "owner", }); // Create a project const [project] = await db .insert(tables.project) .values({ name: "Production API", organizationId: organization.id, cachingEnabled: true, cacheDurationSeconds: 300, mode: "hybrid", status: "active", }) .returning(); // Create an API key const [apiKey] = await db .insert(tables.apiKey) .values({ token: "llmgtwy_" + generateToken(), description: "Main production key", projectId: project.id, createdBy: newUser.id, status: "active", usageLimit: "1000.00", }) .returning(); // Add IAM rule to API key await db.insert(tables.apiKeyIamRule).values({ apiKeyId: apiKey.id, ruleType: "allow_models", ruleValue: { models: ["gpt-4o", "claude-3-5-sonnet"] }, status: "active", }); // Query user's projects with organization info const userProjects = await db.query.userOrganization.findMany({ where: eq(tables.userOrganization.userId, newUser.id), with: { organization: { with: { projects: { where: eq(tables.project.status, "active"), }, }, }, }, }); // Insert a log entry for LLM request await db.insert(tables.log).values({ requestId: "req_abc123", organizationId: organization.id, projectId: project.id, apiKeyId: apiKey.id, requestedModel: "gpt-4o", usedModel: "gpt-4o-2024-11-20", usedProvider: "openai", promptTokens: "150", completionTokens: "50", totalTokens: "200", cost: 0.00055, inputCost: 0.000375, outputCost: 0.0005, duration: 1250, timeToFirstToken: 320, finishReason: "stop", unifiedFinishReason: "completed", hasError: false, streamed: false, cached: false, mode: "hybrid", usedMode: "credits", }); // Query logs with filters const logs = await db.query.log.findMany({ where: eq(tables.log.projectId, project.id), limit: 100, orderBy: (log, { desc }) => [desc(log.createdAt)], }); // Add provider key for Pro plan const [providerKey] = await db .insert(tables.providerKey) .values({ provider: "openai", token: "sk-proj-...", organizationId: organization.id, status: "active", }) .returning(); // Record a transaction await db.insert(tables.transaction).values({ organizationId: organization.id, type: "credit_topup", amount: "50.00", creditAmount: "50.00", currency: "USD", status: "completed", stripePaymentIntentId: "pi_abc123", description: "Credit top-up via Stripe", }); ``` ## Provider Configuration Define and configure LLM providers with environment variables, capabilities, and routing priorities. ```typescript import { providers, getProviderDefinition } from "@llmgateway/models"; // Access provider configuration const openaiProvider = getProviderDefinition("openai"); console.log(openaiProvider); // { // id: "openai", // name: "OpenAI", // description: "OpenAI is an AI research and deployment company...", // env: { // required: { apiKey: "LLM_OPENAI_API_KEY" } // }, // streaming: true, // cancellation: true, // color: "#0ea5e9", // website: "https://openai.com" // } // Configure environment variables for providers const envVars = { LLM_OPENAI_API_KEY: "sk-proj-...", LLM_ANTHROPIC_API_KEY: "sk-ant-...", LLM_GOOGLE_AI_STUDIO_API_KEY: "AIza...", LLM_GOOGLE_VERTEX_API_KEY: "ya29...", LLM_GOOGLE_CLOUD_PROJECT: "my-project-id", LLM_GOOGLE_VERTEX_REGION: "us-central1", LLM_AWS_BEDROCK_API_KEY: "AKIA...", LLM_AWS_BEDROCK_REGION: "us-east-1", LLM_AZURE_API_KEY: "abc123...", LLM_AZURE_RESOURCE: "my-resource-name", LLM_AZURE_DEPLOYMENT_TYPE: "openai", LLM_AZURE_API_VERSION: "2024-10-21", }; // Custom provider configuration const customProvider = { id: "custom", name: "Custom OpenAI-Compatible", description: "Custom provider with base URL", env: { required: {} }, streaming: true, cancellation: true, }; // Provider with routing priority const awsBedrockProvider = getProviderDefinition("aws-bedrock"); console.log(awsBedrockProvider.priority); // 0.9 (lower priority) const googleVertexProvider = getProviderDefinition("google-vertex"); console.log(googleVertexProvider.priority); // 0.5 (lowest priority) // Iterate through all providers providers.forEach((provider) => { console.log(`${provider.name} (${provider.id}):`, { streaming: provider.streaming, cancellation: provider.cancellation, color: provider.color, requiredEnv: provider.env.required, optionalEnv: provider.env.optional || {}, }); }); // Provider key options for database storage import type { ProviderKeyOptions } from "@llmgateway/db"; const providerKeyOptions: ProviderKeyOptions = { aws_bedrock_region_prefix: "us.", azure_resource: "my-azure-resource", azure_api_version: "2024-10-21", azure_deployment_type: "openai", azure_validation_model: "gpt-4o", }; // Store provider key with options await db.insert(tables.providerKey).values({ provider: "azure", token: "abc123...", name: "azure-prod", options: providerKeyOptions, organizationId: "org_abc123", status: "active", }); ``` ## Model Definitions and Pricing Access comprehensive model information including pricing, capabilities, and provider mappings. ```typescript import { models, type ModelDefinition, type ProviderModelMapping } from "@llmgateway/models"; // Access model definitions const gpt4oModel = models.find((m) => m.id === "gpt-4o"); console.log(gpt4oModel); // { // id: "gpt-4o", // name: "GPT-4o", // aliases: ["gpt-4o-2024-11-20"], // family: "openai", // architecture: { // inputModalities: ["text", "image"], // outputModalities: ["text"], // tokenizer: "cl100k_base" // }, // jsonOutput: true, // structuredOutputs: true, // free: false, // providers: [...] // } // Access provider-specific pricing const openaiMapping: ProviderModelMapping = gpt4oModel.providers.find( (p) => p.providerId === "openai" ); console.log(openaiMapping); // { // providerId: "openai", // modelName: "gpt-4o-2024-11-20", // inputPrice: 0.0000025, // outputPrice: 0.00001, // cachedInputPrice: 0.00000125, // contextSize: 128000, // maxOutput: 16384, // streaming: true, // vision: true, // tools: true, // parallelToolCalls: true, // reasoning: false // } // Pricing tiers for context-based pricing const claudeSonnetModel = models.find((m) => m.id === "claude-3-5-sonnet"); const anthropicMapping = claudeSonnetModel.providers.find((p) => p.providerId === "anthropic"); console.log(anthropicMapping.pricingTiers); // [ // { // name: "200K", // upToTokens: 200000, // inputPrice: 0.000003, // outputPrice: 0.000015, // cachedInputPrice: 0.0000003 // }, // { // name: "1M+", // upToTokens: Infinity, // inputPrice: 0.0000015, // outputPrice: 0.0000075, // cachedInputPrice: 0.00000015 // } // ] // Calculate costs for a request const promptTokens = 1000; const completionTokens = 500; const cachedTokens = 200; const cost = promptTokens * openaiMapping.inputPrice + completionTokens * openaiMapping.outputPrice + cachedTokens * (openaiMapping.cachedInputPrice || 0); console.log(`Total cost: $${cost.toFixed(6)}`); // Filter models by capability const visionModels = models.filter((model) => model.providers.some((p) => p.vision === true) ); const freeModels = models.filter((model) => model.free === true); const reasoningModels = models.filter((model) => model.providers.some((p) => p.reasoning === true) ); // Find cheapest provider for a model import { getCheapestFromAvailableProviders } from "@llmgateway/models"; const availableProviders = ["openai", "anthropic", "google-ai-studio"]; const cheapestProvider = getCheapestFromAvailableProviders( "claude-3-5-sonnet", availableProviders ); console.log(`Cheapest provider: ${cheapestProvider.providerId}`); ``` ## Health Check and Monitoring Built-in health check endpoint for monitoring service availability and dependencies. ```bash # Check API service health curl -X GET https://api.llmgateway.io/ # Expected response (healthy) { "message": "LLM Gateway API is running", "version": "1.0.0", "health": { "status": "healthy", "database": { "connected": true }, "redis": { "connected": true } } } # Unhealthy response (503 status) { "message": "LLM Gateway API is running", "version": "1.0.0", "health": { "status": "unhealthy", "database": { "connected": true }, "redis": { "connected": false, "error": "Connection timeout" } } } ``` ## TypeScript SDK Integration Use the gateway with OpenAI SDK or any OpenAI-compatible client library. ```typescript import OpenAI from "openai"; // Initialize client with LLM Gateway const client = new OpenAI({ apiKey: "llmgtwy_your_api_key_here", baseURL: "https://api.llmgateway.io/v1", }); // Basic chat completion const completion = await client.chat.completions.create({ model: "gpt-4o", messages: [ { role: "system", content: "You are a helpful assistant." }, { role: "user", content: "What is TypeScript?" }, ], temperature: 0.7, max_tokens: 500, }); console.log(completion.choices[0].message.content); // Streaming completion const stream = await client.chat.completions.create({ model: "claude-3-5-sonnet", messages: [{ role: "user", content: "Write a poem about code" }], stream: true, }); for await (const chunk of stream) { const content = chunk.choices[0]?.delta?.content || ""; process.stdout.write(content); } // Function calling const weatherCompletion = await client.chat.completions.create({ model: "gpt-4o", messages: [{ role: "user", content: "What's the weather in Tokyo?" }], tools: [ { type: "function", function: { name: "get_weather", description: "Get current weather", parameters: { type: "object", properties: { location: { type: "string" }, unit: { type: "string", enum: ["celsius", "fahrenheit"] }, }, required: ["location"], }, }, }, ], tool_choice: "auto", }); // Vision with image input const visionCompletion = await client.chat.completions.create({ model: "gpt-4o", messages: [ { role: "user", content: [ { type: "text", text: "Describe this image in detail" }, { type: "image_url", image_url: { url: "https://example.com/image.jpg", detail: "high", }, }, ], }, ], }); // JSON mode const jsonCompletion = await client.chat.completions.create({ model: "gpt-4o", messages: [{ role: "user", content: "List 3 colors in JSON" }], response_format: { type: "json_object" }, }); // Structured outputs with schema const structuredCompletion = await client.chat.completions.create({ model: "gpt-4o", messages: [{ role: "user", content: "Extract person info: John Doe, age 30, from NYC" }], response_format: { type: "json_schema", json_schema: { name: "person_info", strict: true, schema: { type: "object", properties: { name: { type: "string" }, age: { type: "number" }, city: { type: "string" }, }, required: ["name", "age", "city"], additionalProperties: false, }, }, }, }); ``` ## Summary LLM Gateway serves as a comprehensive middleware solution for applications requiring LLM integration across multiple providers. The primary use cases include cost optimization through automatic provider routing based on pricing and performance metrics, unified API access eliminating the need to maintain provider-specific code, detailed usage analytics and cost tracking for budget management, and high availability through automatic failover to alternative providers when primary providers are unavailable. Organizations use it to centralize API key management, enforce IAM policies on model access, implement request caching to reduce costs and latency, and gain visibility into LLM usage patterns across teams and projects. Integration patterns include direct API usage via REST endpoints with OpenAI-compatible format, SDK integration using existing OpenAI client libraries by changing the base URL, self-hosted deployment for complete data control and custom provider configurations, and hybrid architectures combining gateway credits for paid models with organization provider keys for cost control. The platform supports project-based isolation for multi-tenant applications, role-based access control with owner/admin/developer roles, webhook integrations for billing events via Stripe, and comprehensive logging with configurable data retention periods. The modular architecture allows extending provider support, customizing routing algorithms, and integrating with existing authentication systems through Better Auth.