# LLM Gateway

LLM Gateway is an open-source API gateway for Large Language Models that acts as middleware between applications and various LLM providers. It provides a unified OpenAI-compatible API interface for routing requests to multiple providers (OpenAI, Anthropic, Google Vertex AI, AWS Bedrock, and 20+ others), allowing seamless provider switching and multi-provider orchestration. The gateway handles request routing, load balancing, response caching with Redis, usage tracking, cost analytics, and provider key management through a centralized system.

The platform consists of a monorepo architecture built with Hono framework for APIs, Next.js for frontend applications, PostgreSQL with Drizzle ORM for data persistence, and Redis for caching. It includes a Gateway service for LLM request routing, an API service for user/organization/project management, a UI dashboard, an interactive Playground, comprehensive documentation, and an admin panel. The system supports multiple authentication methods including passkeys, tracks detailed usage metrics and costs, provides IAM-based API key management with granular permissions, and offers both free and pro subscription tiers with Stripe integration.

## Gateway API - Chat Completions

OpenAI-compatible chat completions endpoint that routes requests to multiple LLM providers with automatic failover, caching, and cost optimization.

```bash
# Basic chat completion request
curl -X POST https://api.llmgateway.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer llmgtwy_your_api_key_here" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100,
    "stream": false
  }'

# Expected response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of France is Paris."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 8,
    "total_tokens": 28
  }
}

# Streaming response with Server-Sent Events
curl -X POST https://api.llmgateway.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer llmgtwy_your_api_key_here" \
  -d '{
    "model": "claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Write a haiku about coding"}],
    "stream": true
  }'

# Vision model with image input
curl -X POST https://api.llmgateway.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer llmgtwy_your_api_key_here" \
  -d '{
    "model": "gpt-4o",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {
          "type": "image_url",
          "image_url": {"url": "https://example.com/image.jpg", "detail": "high"}
        }
      ]
    }]
  }'

# Function calling with tools
curl -X POST https://api.llmgateway.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer llmgtwy_your_api_key_here" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name"}
          },
          "required": ["location"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
```

## Gateway API - List Models

Retrieve available LLM models with pricing, capabilities, and provider information.

```bash
# List all available models
curl -X GET https://api.llmgateway.io/v1/models \
  -H "Authorization: Bearer llmgtwy_your_api_key_here"

# Expected response structure
{
  "data": [
    {
      "id": "gpt-4o",
      "name": "GPT-4o",
      "aliases": ["gpt-4o-2024-11-20"],
      "created": 1677649963,
      "family": "openai",
      "architecture": {
        "input_modalities": ["text", "image"],
        "output_modalities": ["text"],
        "tokenizer": "cl100k_base"
      },
      "providers": [
        {
          "providerId": "openai",
          "modelName": "gpt-4o-2024-11-20",
          "pricing": {
            "prompt": "0.0000025",
            "completion": "0.00001"
          },
          "streaming": true,
          "vision": true,
          "cancellation": true,
          "tools": true,
          "parallelToolCalls": true,
          "reasoning": false
        }
      ],
      "pricing": {
        "prompt": "0.0000025",
        "completion": "0.00001",
        "input_cache_read": "0.00000125"
      },
      "context_length": 128000,
      "json_output": true,
      "structured_outputs": true,
      "free": false
    }
  ]
}

# Filter models - exclude deprecated
curl -X GET "https://api.llmgateway.io/v1/models?exclude_deprecated=true" \
  -H "Authorization: Bearer llmgtwy_your_api_key_here"

# Include deactivated models
curl -X GET "https://api.llmgateway.io/v1/models?include_deactivated=true" \
  -H "Authorization: Bearer llmgtwy_your_api_key_here"
```

## Management API - User Operations

Manage user accounts, profiles, and authentication.

```bash
# Get current user information
curl -X GET https://api.llmgateway.io/user/me \
  -H "Authorization: Bearer session_token" \
  -H "Cookie: better-auth.session_token=abc123"

# Response
{
  "user": {
    "id": "usr_abc123",
    "email": "user@example.com",
    "name": "John Doe",
    "onboardingCompleted": true,
    "emailVerified": true,
    "isAdmin": false
  }
}

# Update user profile
curl -X PATCH https://api.llmgateway.io/user/me \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{
    "name": "Jane Smith",
    "email": "jane@example.com"
  }'

# Complete onboarding
curl -X POST https://api.llmgateway.io/user/me/complete-onboarding \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{}'

# Update password
curl -X PUT https://api.llmgateway.io/user/password \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{
    "currentPassword": "oldpass123",
    "newPassword": "newpass456"
  }'

# Delete user account
curl -X DELETE https://api.llmgateway.io/user/me \
  -H "Authorization: Bearer session_token"

# Delete a passkey
curl -X DELETE https://api.llmgateway.io/user/me/passkeys/passkey_id_123 \
  -H "Authorization: Bearer session_token"
```

## Management API - API Key Management

Create and manage API keys with usage limits and IAM rules for fine-grained access control.

```bash
# Create a new API key
curl -X POST https://api.llmgateway.io/keys/api \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{
    "description": "Production API Key",
    "projectId": "proj_abc123",
    "usageLimit": "100.00"
  }'

# Response includes full token (only shown once)
{
  "apiKey": {
    "id": "key_xyz789",
    "token": "llmgtwy_0123456789abcdef",
    "description": "Production API Key",
    "status": "active",
    "usageLimit": "100.00",
    "usage": "0",
    "projectId": "proj_abc123",
    "createdBy": "usr_abc123",
    "createdAt": "2024-01-15T10:30:00Z",
    "updatedAt": "2024-01-15T10:30:00Z"
  }
}

# List API keys for a project
curl -X GET "https://api.llmgateway.io/keys/api?projectId=proj_abc123&filter=all" \
  -H "Authorization: Bearer session_token"

# Response
{
  "apiKeys": [
    {
      "id": "key_xyz789",
      "maskedToken": "llmgtwy_...def",
      "description": "Production API Key",
      "status": "active",
      "usageLimit": "100.00",
      "usage": "23.45",
      "projectId": "proj_abc123",
      "createdBy": "usr_abc123",
      "creator": {
        "id": "usr_abc123",
        "name": "John Doe",
        "email": "john@example.com"
      },
      "iamRules": []
    }
  ],
  "planLimits": {
    "currentCount": 3,
    "maxKeys": 20,
    "plan": "pro"
  },
  "userRole": "owner"
}

# Update API key status
curl -X PATCH https://api.llmgateway.io/keys/api/key_xyz789 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{"status": "inactive"}'

# Update usage limit
curl -X PATCH https://api.llmgateway.io/keys/api/limit/key_xyz789 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{"usageLimit": "200.00"}'

# Delete API key (soft delete)
curl -X DELETE https://api.llmgateway.io/keys/api/key_xyz789 \
  -H "Authorization: Bearer session_token"

# Create IAM rule to restrict models
curl -X POST https://api.llmgateway.io/keys/api/key_xyz789/iam \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{
    "ruleType": "allow_models",
    "ruleValue": {
      "models": ["gpt-4o", "claude-3-5-sonnet"]
    },
    "status": "active"
  }'

# Create IAM rule for pricing restrictions
curl -X POST https://api.llmgateway.io/keys/api/key_xyz789/iam \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{
    "ruleType": "allow_pricing",
    "ruleValue": {
      "pricingType": "free",
      "maxInputPrice": 0.000001,
      "maxOutputPrice": 0.000002
    },
    "status": "active"
  }'

# List IAM rules for API key
curl -X GET https://api.llmgateway.io/keys/api/key_xyz789/iam \
  -H "Authorization: Bearer session_token"

# Update IAM rule
curl -X PATCH https://api.llmgateway.io/keys/api/key_xyz789/iam/rule_abc123 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer session_token" \
  -d '{
    "ruleValue": {
      "models": ["gpt-4o", "claude-3-5-sonnet", "gemini-2.0-flash"]
    }
  }'

# Delete IAM rule
curl -X DELETE https://api.llmgateway.io/keys/api/key_xyz789/iam/rule_abc123 \
  -H "Authorization: Bearer session_token"
```

## Database Schema - Core Tables

PostgreSQL schema using Drizzle ORM with comprehensive tracking and multi-tenancy support.

```typescript
import { db, tables, eq } from "@llmgateway/db";

// Create a new user
const [newUser] = await db
	.insert(tables.user)
	.values({
		email: "user@example.com",
		name: "John Doe",
		emailVerified: false,
		onboardingCompleted: false,
	})
	.returning();

// Create an organization
const [organization] = await db
	.insert(tables.organization)
	.values({
		name: "Acme Corp",
		billingEmail: "billing@acme.com",
		plan: "pro",
		credits: "100.00",
		autoTopUpEnabled: true,
		autoTopUpThreshold: "10.00",
		autoTopUpAmount: "50.00",
	})
	.returning();

// Link user to organization
await db.insert(tables.userOrganization).values({
	userId: newUser.id,
	organizationId: organization.id,
	role: "owner",
});

// Create a project
const [project] = await db
	.insert(tables.project)
	.values({
		name: "Production API",
		organizationId: organization.id,
		cachingEnabled: true,
		cacheDurationSeconds: 300,
		mode: "hybrid",
		status: "active",
	})
	.returning();

// Create an API key
const [apiKey] = await db
	.insert(tables.apiKey)
	.values({
		token: "llmgtwy_" + generateToken(),
		description: "Main production key",
		projectId: project.id,
		createdBy: newUser.id,
		status: "active",
		usageLimit: "1000.00",
	})
	.returning();

// Add IAM rule to API key
await db.insert(tables.apiKeyIamRule).values({
	apiKeyId: apiKey.id,
	ruleType: "allow_models",
	ruleValue: { models: ["gpt-4o", "claude-3-5-sonnet"] },
	status: "active",
});

// Query user's projects with organization info
const userProjects = await db.query.userOrganization.findMany({
	where: eq(tables.userOrganization.userId, newUser.id),
	with: {
		organization: {
			with: {
				projects: {
					where: eq(tables.project.status, "active"),
				},
			},
		},
	},
});

// Insert a log entry for LLM request
await db.insert(tables.log).values({
	requestId: "req_abc123",
	organizationId: organization.id,
	projectId: project.id,
	apiKeyId: apiKey.id,
	requestedModel: "gpt-4o",
	usedModel: "gpt-4o-2024-11-20",
	usedProvider: "openai",
	promptTokens: "150",
	completionTokens: "50",
	totalTokens: "200",
	cost: 0.00055,
	inputCost: 0.000375,
	outputCost: 0.0005,
	duration: 1250,
	timeToFirstToken: 320,
	finishReason: "stop",
	unifiedFinishReason: "completed",
	hasError: false,
	streamed: false,
	cached: false,
	mode: "hybrid",
	usedMode: "credits",
});

// Query logs with filters
const logs = await db.query.log.findMany({
	where: eq(tables.log.projectId, project.id),
	limit: 100,
	orderBy: (log, { desc }) => [desc(log.createdAt)],
});

// Add provider key for Pro plan
const [providerKey] = await db
	.insert(tables.providerKey)
	.values({
		provider: "openai",
		token: "sk-proj-...",
		organizationId: organization.id,
		status: "active",
	})
	.returning();

// Record a transaction
await db.insert(tables.transaction).values({
	organizationId: organization.id,
	type: "credit_topup",
	amount: "50.00",
	creditAmount: "50.00",
	currency: "USD",
	status: "completed",
	stripePaymentIntentId: "pi_abc123",
	description: "Credit top-up via Stripe",
});
```

## Provider Configuration

Define and configure LLM providers with environment variables, capabilities, and routing priorities.

```typescript
import { providers, getProviderDefinition } from "@llmgateway/models";

// Access provider configuration
const openaiProvider = getProviderDefinition("openai");
console.log(openaiProvider);
// {
//   id: "openai",
//   name: "OpenAI",
//   description: "OpenAI is an AI research and deployment company...",
//   env: {
//     required: { apiKey: "LLM_OPENAI_API_KEY" }
//   },
//   streaming: true,
//   cancellation: true,
//   color: "#0ea5e9",
//   website: "https://openai.com"
// }

// Configure environment variables for providers
const envVars = {
	LLM_OPENAI_API_KEY: "sk-proj-...",
	LLM_ANTHROPIC_API_KEY: "sk-ant-...",
	LLM_GOOGLE_AI_STUDIO_API_KEY: "AIza...",
	LLM_GOOGLE_VERTEX_API_KEY: "ya29...",
	LLM_GOOGLE_CLOUD_PROJECT: "my-project-id",
	LLM_GOOGLE_VERTEX_REGION: "us-central1",
	LLM_AWS_BEDROCK_API_KEY: "AKIA...",
	LLM_AWS_BEDROCK_REGION: "us-east-1",
	LLM_AZURE_API_KEY: "abc123...",
	LLM_AZURE_RESOURCE: "my-resource-name",
	LLM_AZURE_DEPLOYMENT_TYPE: "openai",
	LLM_AZURE_API_VERSION: "2024-10-21",
};

// Custom provider configuration
const customProvider = {
	id: "custom",
	name: "Custom OpenAI-Compatible",
	description: "Custom provider with base URL",
	env: { required: {} },
	streaming: true,
	cancellation: true,
};

// Provider with routing priority
const awsBedrockProvider = getProviderDefinition("aws-bedrock");
console.log(awsBedrockProvider.priority); // 0.9 (lower priority)

const googleVertexProvider = getProviderDefinition("google-vertex");
console.log(googleVertexProvider.priority); // 0.5 (lowest priority)

// Iterate through all providers
providers.forEach((provider) => {
	console.log(`${provider.name} (${provider.id}):`, {
		streaming: provider.streaming,
		cancellation: provider.cancellation,
		color: provider.color,
		requiredEnv: provider.env.required,
		optionalEnv: provider.env.optional || {},
	});
});

// Provider key options for database storage
import type { ProviderKeyOptions } from "@llmgateway/db";

const providerKeyOptions: ProviderKeyOptions = {
	aws_bedrock_region_prefix: "us.",
	azure_resource: "my-azure-resource",
	azure_api_version: "2024-10-21",
	azure_deployment_type: "openai",
	azure_validation_model: "gpt-4o",
};

// Store provider key with options
await db.insert(tables.providerKey).values({
	provider: "azure",
	token: "abc123...",
	name: "azure-prod",
	options: providerKeyOptions,
	organizationId: "org_abc123",
	status: "active",
});
```

## Model Definitions and Pricing

Access comprehensive model information including pricing, capabilities, and provider mappings.

```typescript
import { models, type ModelDefinition, type ProviderModelMapping } from "@llmgateway/models";

// Access model definitions
const gpt4oModel = models.find((m) => m.id === "gpt-4o");
console.log(gpt4oModel);
// {
//   id: "gpt-4o",
//   name: "GPT-4o",
//   aliases: ["gpt-4o-2024-11-20"],
//   family: "openai",
//   architecture: {
//     inputModalities: ["text", "image"],
//     outputModalities: ["text"],
//     tokenizer: "cl100k_base"
//   },
//   jsonOutput: true,
//   structuredOutputs: true,
//   free: false,
//   providers: [...]
// }

// Access provider-specific pricing
const openaiMapping: ProviderModelMapping = gpt4oModel.providers.find(
	(p) => p.providerId === "openai"
);
console.log(openaiMapping);
// {
//   providerId: "openai",
//   modelName: "gpt-4o-2024-11-20",
//   inputPrice: 0.0000025,
//   outputPrice: 0.00001,
//   cachedInputPrice: 0.00000125,
//   contextSize: 128000,
//   maxOutput: 16384,
//   streaming: true,
//   vision: true,
//   tools: true,
//   parallelToolCalls: true,
//   reasoning: false
// }

// Pricing tiers for context-based pricing
const claudeSonnetModel = models.find((m) => m.id === "claude-3-5-sonnet");
const anthropicMapping = claudeSonnetModel.providers.find((p) => p.providerId === "anthropic");
console.log(anthropicMapping.pricingTiers);
// [
//   {
//     name: "200K",
//     upToTokens: 200000,
//     inputPrice: 0.000003,
//     outputPrice: 0.000015,
//     cachedInputPrice: 0.0000003
//   },
//   {
//     name: "1M+",
//     upToTokens: Infinity,
//     inputPrice: 0.0000015,
//     outputPrice: 0.0000075,
//     cachedInputPrice: 0.00000015
//   }
// ]

// Calculate costs for a request
const promptTokens = 1000;
const completionTokens = 500;
const cachedTokens = 200;

const cost =
	promptTokens * openaiMapping.inputPrice +
	completionTokens * openaiMapping.outputPrice +
	cachedTokens * (openaiMapping.cachedInputPrice || 0);

console.log(`Total cost: $${cost.toFixed(6)}`);

// Filter models by capability
const visionModels = models.filter((model) =>
	model.providers.some((p) => p.vision === true)
);

const freeModels = models.filter((model) => model.free === true);

const reasoningModels = models.filter((model) =>
	model.providers.some((p) => p.reasoning === true)
);

// Find cheapest provider for a model
import { getCheapestFromAvailableProviders } from "@llmgateway/models";

const availableProviders = ["openai", "anthropic", "google-ai-studio"];
const cheapestProvider = getCheapestFromAvailableProviders(
	"claude-3-5-sonnet",
	availableProviders
);
console.log(`Cheapest provider: ${cheapestProvider.providerId}`);
```

## Health Check and Monitoring

Built-in health check endpoint for monitoring service availability and dependencies.

```bash
# Check API service health
curl -X GET https://api.llmgateway.io/

# Expected response (healthy)
{
  "message": "LLM Gateway API is running",
  "version": "1.0.0",
  "health": {
    "status": "healthy",
    "database": {
      "connected": true
    },
    "redis": {
      "connected": true
    }
  }
}

# Unhealthy response (503 status)
{
  "message": "LLM Gateway API is running",
  "version": "1.0.0",
  "health": {
    "status": "unhealthy",
    "database": {
      "connected": true
    },
    "redis": {
      "connected": false,
      "error": "Connection timeout"
    }
  }
}
```

## TypeScript SDK Integration

Use the gateway with OpenAI SDK or any OpenAI-compatible client library.

```typescript
import OpenAI from "openai";

// Initialize client with LLM Gateway
const client = new OpenAI({
	apiKey: "llmgtwy_your_api_key_here",
	baseURL: "https://api.llmgateway.io/v1",
});

// Basic chat completion
const completion = await client.chat.completions.create({
	model: "gpt-4o",
	messages: [
		{ role: "system", content: "You are a helpful assistant." },
		{ role: "user", content: "What is TypeScript?" },
	],
	temperature: 0.7,
	max_tokens: 500,
});

console.log(completion.choices[0].message.content);

// Streaming completion
const stream = await client.chat.completions.create({
	model: "claude-3-5-sonnet",
	messages: [{ role: "user", content: "Write a poem about code" }],
	stream: true,
});

for await (const chunk of stream) {
	const content = chunk.choices[0]?.delta?.content || "";
	process.stdout.write(content);
}

// Function calling
const weatherCompletion = await client.chat.completions.create({
	model: "gpt-4o",
	messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
	tools: [
		{
			type: "function",
			function: {
				name: "get_weather",
				description: "Get current weather",
				parameters: {
					type: "object",
					properties: {
						location: { type: "string" },
						unit: { type: "string", enum: ["celsius", "fahrenheit"] },
					},
					required: ["location"],
				},
			},
		},
	],
	tool_choice: "auto",
});

// Vision with image input
const visionCompletion = await client.chat.completions.create({
	model: "gpt-4o",
	messages: [
		{
			role: "user",
			content: [
				{ type: "text", text: "Describe this image in detail" },
				{
					type: "image_url",
					image_url: {
						url: "https://example.com/image.jpg",
						detail: "high",
					},
				},
			],
		},
	],
});

// JSON mode
const jsonCompletion = await client.chat.completions.create({
	model: "gpt-4o",
	messages: [{ role: "user", content: "List 3 colors in JSON" }],
	response_format: { type: "json_object" },
});

// Structured outputs with schema
const structuredCompletion = await client.chat.completions.create({
	model: "gpt-4o",
	messages: [{ role: "user", content: "Extract person info: John Doe, age 30, from NYC" }],
	response_format: {
		type: "json_schema",
		json_schema: {
			name: "person_info",
			strict: true,
			schema: {
				type: "object",
				properties: {
					name: { type: "string" },
					age: { type: "number" },
					city: { type: "string" },
				},
				required: ["name", "age", "city"],
				additionalProperties: false,
			},
		},
	},
});
```

## Summary

LLM Gateway serves as a comprehensive middleware solution for applications requiring LLM integration across multiple providers. The primary use cases include cost optimization through automatic provider routing based on pricing and performance metrics, unified API access eliminating the need to maintain provider-specific code, detailed usage analytics and cost tracking for budget management, and high availability through automatic failover to alternative providers when primary providers are unavailable. Organizations use it to centralize API key management, enforce IAM policies on model access, implement request caching to reduce costs and latency, and gain visibility into LLM usage patterns across teams and projects.

Integration patterns include direct API usage via REST endpoints with OpenAI-compatible format, SDK integration using existing OpenAI client libraries by changing the base URL, self-hosted deployment for complete data control and custom provider configurations, and hybrid architectures combining gateway credits for paid models with organization provider keys for cost control. The platform supports project-based isolation for multi-tenant applications, role-based access control with owner/admin/developer roles, webhook integrations for billing events via Stripe, and comprehensive logging with configurable data retention periods. The modular architecture allows extending provider support, customizing routing algorithms, and integrating with existing authentication systems through Better Auth.