Vercel AI SDK
https://github.com/vercel/ai
The AI Toolkit for TypeScript. From the creators of Next.js, the AI SDK is a free open-source
...
Tokens: 1,012,167 · Snippets: 4,046 · Trust Score: 10 · Updated: 15 hours ago
Context Summary (auto-generated)
# AI SDK by Vercel

The AI SDK is a TypeScript toolkit designed to help developers build AI-powered applications and agents with React, Next.js, Vue, Svelte, Node.js, and more. It standardizes integrating artificial intelligence models across supported providers, enabling developers to focus on building great AI applications rather than dealing with provider-specific technical details.

The SDK has two main components: **AI SDK Core** provides a unified API for generating text, structured objects, tool calls, embeddings, and building agents with LLMs; **AI SDK UI** offers framework-agnostic hooks for quickly building chat and generative user interfaces. The SDK supports multiple model providers, including OpenAI, Anthropic, Google, Amazon Bedrock, Mistral, Cohere, and many more, through a consistent interface.

## generateText

Generates text for a given prompt and model. Ideal for non-interactive use cases like drafting emails, summarizing documents, or agents that use tools. The function returns a result object containing the generated text, tool calls, usage information, and more.

```typescript
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const { text, usage, finishReason } = await generateText({
  model: openai('gpt-4o'),
  system:
    'You are a professional writer. You write simple, clear, and concise content.',
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});

console.log(text);
console.log('Tokens used:', usage.totalTokens);
console.log('Finish reason:', finishReason);
```

## streamText

Streams text from a given prompt and model for real-time interactive applications like chatbots. Returns a stream that can be consumed as an async iterable or converted to HTTP responses for API routes.
```typescript
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = streamText({
  model: openai('gpt-4o'),
  prompt: 'Invent a new holiday and describe its traditions.',
  onError({ error }) {
    console.error(error);
  },
  onFinish({ text, finishReason, usage }) {
    console.log('Generation complete:', { finishReason, usage });
  },
});

// Consume as an async iterable
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}

// Or create an HTTP response for API routes:
// return result.toUIMessageStreamResponse();
```

## Structured Output with Output.object

Generates structured data conforming to a Zod schema. The AI SDK validates the generated data, ensuring type safety and correctness. Supports streaming partial objects during generation.

```typescript
import { generateText, Output } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const { output } = await generateText({
  model: openai('gpt-4o'),
  output: Output.object({
    schema: z.object({
      recipe: z.object({
        name: z.string(),
        ingredients: z.array(
          z.object({ name: z.string(), amount: z.string() })
        ),
        steps: z.array(z.string()),
      }),
    }),
  }),
  prompt: 'Generate a lasagna recipe.',
});

console.log('Recipe:', output.recipe.name);
console.log('Ingredients:', output.recipe.ingredients);
console.log('Steps:', output.recipe.steps);
```

## Tool Calling with tool()

Defines tools that models can call to perform specific tasks. Tools contain a description, an input schema, and an execute function. Supports multi-step calls where the model can use multiple tools in sequence.
```typescript
import { generateText, tool, isStepCount } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const { text, steps } = await generateText({
  model: openai('gpt-4o'),
  tools: {
    weather: tool({
      description: 'Get the weather in a location',
      inputSchema: z.object({
        location: z.string().describe('The location to get the weather for'),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
        condition: 'sunny',
      }),
    }),
    cityAttractions: tool({
      description: 'Get tourist attractions in a city',
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => ({
        attractions: [
          'Golden Gate Bridge',
          'Alcatraz Island',
          "Fisherman's Wharf",
        ],
      }),
    }),
  },
  stopWhen: isStepCount(5),
  prompt: 'What is the weather in San Francisco and what are the top attractions?',
});

console.log(text);
console.log('Steps taken:', steps.length);
```

## ToolLoopAgent

Encapsulates LLM configuration, tools, and behavior into reusable agent components. Handles the agent loop automatically, allowing the LLM to call tools multiple times to accomplish complex tasks.

```typescript
import { ToolLoopAgent, tool, isStepCount } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const researchAgent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  instructions: `You are a research assistant. When researching:
1. Always start with a broad search
2. Cross-reference multiple sources
3. Cite your sources when presenting information`,
  tools: {
    webSearch: tool({
      description: 'Search the web for information',
      inputSchema: z.object({ query: z.string() }),
      execute: async ({ query }) => ({
        results: [`Result for: ${query}`],
      }),
    }),
  },
  stopWhen: isStepCount(10),
});

// Generate text
const { text, steps } = await researchAgent.generate({
  prompt: 'Research the latest AI trends',
});

// Or stream the response
const stream = await researchAgent.stream({
  prompt: 'Tell me about quantum computing',
});
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
```

## useChat Hook

A React hook for building conversational chat interfaces with real-time message streaming, managed state, and seamless integration with any design.

```typescript
// Client Component (app/page.tsx)
'use client';

import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport } from 'ai';
import { useState } from 'react';

export default function Chat() {
  const { messages, sendMessage, status, stop, error } = useChat({
    transport: new DefaultChatTransport({ api: '/api/chat' }),
  });
  const [input, setInput] = useState('');

  return (
    <div>
      {messages.map(message => (
        <div key={message.id}>
          {message.role === 'user' ? 'User: ' : 'AI: '}
          {message.parts.map((part, i) =>
            part.type === 'text' ? <span key={i}>{part.text}</span> : null
          )}
        </div>
      ))}
      {status === 'streaming' && <button onClick={stop}>Stop</button>}
      {error && (
        <div>
          Error occurred. <button onClick={() => {}}>Retry</button>
        </div>
      )}
      <form
        onSubmit={e => {
          e.preventDefault();
          if (input.trim()) {
            sendMessage({ text: input });
            setInput('');
          }
        }}
      >
        <input
          value={input}
          onChange={e => setInput(e.target.value)}
          disabled={status !== 'ready'}
        />
        <button type="submit" disabled={status !== 'ready'}>
          Send
        </button>
      </form>
    </div>
  );
}
```

```typescript
// API Route (app/api/chat/route.ts)
import { convertToModelMessages, streamText, UIMessage } from 'ai';
import { openai } from '@ai-sdk/openai';

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages: await convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse();
}
```

## embed and embedMany

Generates vector embeddings for text, useful for semantic search, clustering, and retrieval-augmented generation (RAG). Supports single values or batch embedding.

```typescript
import { embed, embedMany, cosineSimilarity } from 'ai';
import { openai } from '@ai-sdk/openai';

// Single embedding
const { embedding, usage } = await embed({
  model: openai.embedding('text-embedding-3-small'),
  value: 'sunny day at the beach',
});
console.log('Embedding dimensions:', embedding.length);
console.log('Tokens used:', usage.tokens);

// Batch embeddings
const { embeddings } = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: [
    'sunny day at the beach',
    'rainy afternoon in the city',
    'snowy night in the mountains',
  ],
});

// Calculate similarity
const similarity = cosineSimilarity(embeddings[0], embeddings[1]);
console.log('Cosine similarity:', similarity);
```

## generateImage

Generates images based on text prompts using various image models. Supports multiple sizes, aspect ratios, and provider-specific options.
```typescript
import { generateImage } from 'ai';
import { openai } from '@ai-sdk/openai';

const { image, warnings } = await generateImage({
  model: openai.image('dall-e-3'),
  prompt: 'A futuristic city skyline at sunset with flying cars',
  size: '1024x1024',
  providerOptions: {
    openai: { style: 'vivid', quality: 'hd' },
  },
});

// Access image data
const base64Data = image.base64;
const binaryData = image.uint8Array;

// Generate multiple images
const { images } = await generateImage({
  model: openai.image('dall-e-2'),
  prompt: 'Abstract art with vibrant colors',
  n: 4,
  size: '512x512',
});
console.log('Generated', images.length, 'images');
```

## Output Types for Structured Generation

Multiple output types cover different structured generation scenarios, including objects, arrays, choices, and unstructured JSON.

```typescript
import { generateText, streamText, Output } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// Object output
const { output: recipe } = await generateText({
  model: openai('gpt-4o'),
  output: Output.object({
    schema: z.object({
      name: z.string(),
      cookTime: z.number(),
    }),
  }),
  prompt: 'Generate a quick pasta recipe',
});

// Array output with streaming
const { elementStream } = streamText({
  model: openai('gpt-4o'),
  output: Output.array({
    element: z.object({
      name: z.string(),
      class: z.string(),
      description: z.string(),
    }),
  }),
  prompt: 'Generate 3 hero descriptions for a fantasy game.',
});

for await (const hero of elementStream) {
  console.log('Hero:', hero.name); // Each hero is complete and validated
}

// Choice output for classification
const { output: sentiment } = await generateText({
  model: openai('gpt-4o'),
  output: Output.choice({
    options: ['positive', 'neutral', 'negative'],
  }),
  prompt: 'Classify the sentiment: "I love this product!"',
});
console.log('Sentiment:', sentiment); // 'positive'
```

## Tool Execution Approval

Configure approval workflows for sensitive tool operations like executing commands, processing payments, or modifying data.

```typescript
import { generateText, tool, type ModelMessage } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const paymentTool = tool({
  description: 'Process a payment',
  inputSchema: z.object({
    amount: z.number(),
    recipient: z.string(),
  }),
  execute: async ({ amount, recipient }) => {
    return { success: true, transactionId: '12345' };
  },
});

const messages: ModelMessage[] = [
  { role: 'user', content: 'Send $1500 to the contractor' },
];

const result = await generateText({
  model: openai('gpt-4o'),
  tools: { processPayment: paymentTool },
  toolApproval: {
    processPayment: async ({ amount }) =>
      amount > 1000 ? 'user-approval' : undefined,
  },
  messages,
});

// Check for approval requests
for (const part of result.content) {
  if (part.type === 'tool-approval-request' && !part.isAutomatic) {
    console.log('Approval needed for:', part.toolCall.toolName);
    console.log('Input:', part.toolCall.input);
    // Handle user approval flow...
  }
}
```

## createAgentUIStreamResponse

Creates API responses for client applications using agents. Handles message conversion and streaming automatically.

```typescript
// API Route (app/api/chat/route.ts)
import { createAgentUIStreamResponse, ToolLoopAgent, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const supportAgent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  instructions: 'You are a helpful customer support agent.',
  tools: {
    checkOrderStatus: tool({
      description: 'Check the status of an order',
      inputSchema: z.object({ orderId: z.string() }),
      execute: async ({ orderId }) => ({
        status: 'shipped',
        estimatedDelivery: '2024-01-15',
      }),
    }),
  },
});

export async function POST(request: Request) {
  const { messages } = await request.json();

  return createAgentUIStreamResponse({
    agent: supportAgent,
    uiMessages: messages,
  });
}
```

## Middleware for Models

Enhance language, embedding, and image models with middleware for logging, default values, and custom transformations.
```typescript
import {
  generateText,
  wrapLanguageModel,
  wrapEmbeddingModel,
  defaultEmbeddingSettingsMiddleware,
} from 'ai';
import { openai } from '@ai-sdk/openai';

// Language model middleware for logging
const modelWithLogging = wrapLanguageModel({
  model: openai('gpt-4o'),
  middleware: {
    specificationVersion: 'v2',
    transformParams: async ({ params }) => {
      console.log('Request params:', params);
      return params;
    },
  },
});

// Embedding model middleware with defaults
const embeddingModelWithDefaults = wrapEmbeddingModel({
  model: openai.embedding('text-embedding-3-small'),
  middleware: defaultEmbeddingSettingsMiddleware({
    settings: {
      providerOptions: {
        openai: { dimensions: 512 },
      },
    },
  }),
});

const { text } = await generateText({
  model: modelWithLogging,
  prompt: 'Hello, world!',
});
```

## Stream Transformations

Transform and smooth text streams with built-in or custom transformations for an enhanced user experience.

```typescript
import {
  smoothStream,
  streamText,
  type TextStreamPart,
  type ToolSet,
} from 'ai';
import { openai } from '@ai-sdk/openai';

// Use built-in smooth streaming
const result = streamText({
  model: openai('gpt-4o'),
  prompt: 'Tell me a story',
  experimental_transform: smoothStream(),
});

// Custom transformation (uppercase all text)
const upperCaseTransform =
  <TOOLS extends ToolSet>() =>
  (options: { tools: TOOLS; stopStream: () => void }) =>
    new TransformStream<TextStreamPart<TOOLS>, TextStreamPart<TOOLS>>({
      transform(chunk, controller) {
        controller.enqueue(
          chunk.type === 'text-delta'
            ? { ...chunk, text: chunk.text.toUpperCase() }
            : chunk,
        );
      },
    });

const transformedResult = streamText({
  model: openai('gpt-4o'),
  prompt: 'Say hello',
  experimental_transform: upperCaseTransform(),
});

for await (const text of transformedResult.textStream) {
  console.log(text); // HELLO...
}
```

## Summary

The AI SDK provides a comprehensive, provider-agnostic toolkit for building AI-powered applications in TypeScript.
Key use cases include building conversational chatbots with streaming responses, creating autonomous agents that can use tools to accomplish complex tasks, generating structured data from natural language, semantic search and RAG applications using embeddings, and image generation workflows. The SDK's unified API means developers can easily switch between providers like OpenAI, Anthropic, Google, and others without changing application code.

Integration patterns typically involve using AI SDK Core functions (`generateText`, `streamText`, `embed`, `generateImage`) on the server side, combined with AI SDK UI hooks (`useChat`, `useCompletion`, `useObject`) on the client side for React applications. The `ToolLoopAgent` class provides a higher-level abstraction for building reusable agents with encapsulated configuration.

For production applications, the SDK supports middleware for logging and observability, tool approval workflows for sensitive operations, and type-safe structured outputs using Zod schemas. The streaming-first architecture ensures responsive user experiences even for long-running AI operations.
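The provider-swap guarantee can be illustrated with a toy sketch of the underlying pattern: application code depends on a single interface, and each provider plugs in behind it. Note that `LanguageModel`, `toyOpenAI`, `toyAnthropic`, and `summarize` below are illustrative stand-ins invented for this sketch, not the SDK's actual types.

```typescript
// Toy sketch of the provider-agnostic pattern (not the SDK's real API):
// application code is written once against an interface, and switching
// providers changes only the model value that is passed in.
interface LanguageModel {
  generate(prompt: string): string;
}

// Stub "providers" that satisfy the same interface.
const toyOpenAI: LanguageModel = {
  generate: prompt => `[openai] ${prompt}`,
};

const toyAnthropic: LanguageModel = {
  generate: prompt => `[anthropic] ${prompt}`,
};

// Application logic depends only on the interface...
function summarize(model: LanguageModel, text: string): string {
  return model.generate(`Summarize: ${text}`);
}

// ...so swapping providers is a one-argument change.
console.log(summarize(toyOpenAI, 'hello'));    // [openai] Summarize: hello
console.log(summarize(toyAnthropic, 'hello')); // [anthropic] Summarize: hello
```

In the real SDK, the interface role is played by the model objects returned by provider packages such as `@ai-sdk/openai`, which all conform to the same model specification consumed by `generateText`, `streamText`, and friends.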