### Install OpenAI Node SDK

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Install the OpenAI Node SDK using npm. This package is required to interact with the TensorZero Gateway from Node.js.

```bash
npm i openai
```

--------------------------------

### Example Postgres Connection URL

Source: https://www.tensorzero.com/docs/evaluations/inference-evaluations/cli-reference

This example shows the format for the `TENSORZERO_POSTGRES_URL` environment variable, used for connecting to a Postgres database. Either ClickHouse or Postgres must be available.

```bash
TENSORZERO_POSTGRES_URL="postgres://myuser:mypass@localhost:5432/mydatabase"
```

--------------------------------

### Example ClickHouse Connection URL

Source: https://www.tensorzero.com/docs/evaluations/inference-evaluations/cli-reference

This example shows the format for the `TENSORZERO_CLICKHOUSE_URL` environment variable, used for connecting to a ClickHouse database. Either ClickHouse or Postgres must be available.

```bash
TENSORZERO_CLICKHOUSE_URL=http://chuser:chpassword@localhost:8123/database_name
```

--------------------------------

### Example OpenAI API Key Environment Variable

Source: https://www.tensorzero.com/docs/evaluations/inference-evaluations/cli-reference

This example shows how to set the `OPENAI_API_KEY` environment variable. It is required when using external model providers without a TensorZero Gateway.

```bash
OPENAI_API_KEY=sk-...
```

--------------------------------

### Set Number of Examples for Inference

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Configure the `k` parameter to specify the number of examples to retrieve for inference. This is a required field.

```toml
[functions.draft-email.variants.dicl]
# ...
k = 10
# ...
```

--------------------------------

### Install OpenAI Python SDK

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Install the OpenAI Python SDK using pip. This package is required to interact with the TensorZero Gateway using Python.

```bash
pip install openai
```

--------------------------------

### Model Calling Examples

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

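Three ways to target an inference call. `function_name` calls a function defined in your configuration. `model_name` with a bare name (`gpt-4o`) calls a model defined in your configuration, while a provider-prefixed name (`openai::gpt-4o`) calls that provider's API directly and bypasses your local configuration.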

```text
function_name="extract-data"
```

```text
model_name="gpt-4o"
```

```text
model_name="openai::gpt-4o"
```

--------------------------------

### Example System Prompt Template

Source: https://www.tensorzero.com/docs/optimization/dynamic-in-context-learning-dicl

An example system prompt template for a named entity recognition task. It defines the assistant's role, the task, and the expected JSON output format.

```text
You are an assistant that is performing a named entity recognition task.
Your job is to extract entities from a given text.

The entities you are extracting are:

- people
- organizations
- locations
- miscellaneous other entities

Please return the entities in the following JSON format:

{
"person": ["person1", "person2", ...],
"organization": ["organization1", "organization2", ...],
"location": ["location1", "location2", ...],
"miscellaneous": ["miscellaneous1", "miscellaneous2", ...]
}
```

--------------------------------

### TensorZero Configuration Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

This TOML configuration shows how models and functions are defined, influencing how they are called via `model_name` or `function_name`.

```toml
[models.gpt-4o]
routing = ["openai", "azure"]

[models.gpt-4o.providers.openai]
# ...

[models.gpt-4o.providers.azure]
# ...

[functions.extract-data]
# ...
```

--------------------------------

### Example TensorZero Configuration

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

This TOML configuration shows how to define models and functions, including routing for model providers and specific provider configurations.

```toml
[models.gpt-4o]
routing = ["openai", "azure"]

[models.gpt-4o.providers.openai]
# ...

[models.gpt-4o.providers.azure]
# ...

[functions.extract-data]
# ...
```

--------------------------------

### Sample Response (JSON)

Source: https://www.tensorzero.com/docs/gateway/generate-structured-outputs

This is an example of a structured JSON response received from the TensorZero Gateway after a successful generation request.

```json
{
  "id": "019a78de-97d4-79d3-8b61-bcab4c697281",
  "episode_id": "019a78de-97d4-79d3-8b61-bcb10a8c02f4",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": "{\"name\":\"Sarah Johnson\",\"email\":\"sarah.j@example.com\"}",
        "tool_calls": null,
        "role": "assistant"
      }
    }
  ],
  "created": 1762964446,
  "model": "tensorzero::function_name::extract_data::variant_name::baseline",
  "system_fingerprint": "",
  "service_tier": null,
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 252,
    "completion_tokens": 26,
    "total_tokens": 278
  }
}
```
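The `content` field in the response above is itself a JSON-encoded string, so a client has to decode it a second time before reading `name` or `email`. A minimal sketch using only the standard library; `extract_fields` is a hypothetical helper name, and the response dict is trimmed to the fields the sketch reads.

```python
import json

# Trimmed copy of the sample response above (only the fields this sketch reads).
response = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "content": "{\"name\":\"Sarah Johnson\",\"email\":\"sarah.j@example.com\"}",
                "role": "assistant",
            },
        }
    ]
}


def extract_fields(response: dict) -> dict:
    """Decode the JSON string carried in the first choice's message content."""
    content = response["choices"][0]["message"]["content"]
    return json.loads(content)


fields = extract_fields(response)
print(fields["name"], fields["email"])
```

In real use, `json.loads` can raise `json.JSONDecodeError` if the model emits malformed output, so callers may want to catch it.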

--------------------------------

### Example of dynamic credentials configuration

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

Demonstrates how to configure dynamic credentials for a model provider in TOML and the corresponding JSON structure for the request body.

```toml
[models.my_model_name.providers.my_provider_name]
# ...
# Note: the name of the credential field (e.g. `api_key_location`) depends on the provider type
api_key_location = "dynamic::my_dynamic_api_key_name"
# ...
```

```json
{
  // ...
  "tensorzero::credentials": {
    // ...
    "my_dynamic_api_key_name": "sk-..."
    // ...
  }
  // ...
}
```
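As a rough illustration, the request-body fragment above could be assembled in Python like this. `build_request_body` is a hypothetical helper, and the `model` string is an assumed value chosen to match the TOML table name; only the `tensorzero::credentials` key and the credential name come from the example above.

```python
import json


def build_request_body(messages: list, model: str, api_key: str) -> dict:
    """Attach a dynamic credential under the `tensorzero::credentials` key."""
    return {
        "model": model,
        "messages": messages,
        "tensorzero::credentials": {
            # Must match the name after `dynamic::` in the TOML above.
            "my_dynamic_api_key_name": api_key,
        },
    }


body = build_request_body(
    messages=[{"role": "user", "content": "Hello"}],
    model="tensorzero::model_name::my_model_name",  # assumed model string
    api_key="sk-...",  # placeholder, as in the example above
)
print(json.dumps(body, indent=2))
```

When using the OpenAI Python SDK rather than raw HTTP, non-standard fields such as this one are usually passed through the SDK's `extra_body` argument.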

--------------------------------

### Launch TensorZero Services

Source: https://www.tensorzero.com/docs/quickstart

After downloading the docker-compose.yml and placing your tensorzero.toml configuration file, use this command to start all the services defined in the Docker Compose file.

```bash
docker compose up
```

--------------------------------

### Chat Function with Structured System Prompt

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

This example demonstrates how to use the chat completion API with a structured system prompt for a specific function, 'draft_email'. It shows the HTTP request body, along with regular and streaming response formats.

````APIDOC
## POST /openai/v1/chat/completions

### Description
This endpoint allows for chat-based inference, enabling structured system prompts for specific functions.

### Method
POST

### Endpoint
/openai/v1/chat/completions

### Parameters
#### Request Body
- **messages** (array) - Required - The conversation messages, including a system message with structured content.
- **model** (string) - Required - The model to use, typically in the format `tensorzero::function_name::function_identifier`.
- **temperature** (number) - Optional - Controls randomness.
- **stream** (boolean) - Optional - Whether to stream the response.

### Request Example
```json
{
  "messages": [
    {
      "role": "system",
      "content": [{"assistant_name": "Alfred Pennyworth"}]
    },
    {
      "role": "user",
      "content": "I need to write an email to Gabriel explaining..."
    }
  ],
  "model": "tensorzero::function_name::draft_email",
  "temperature": 0.4
}
```

### Response
#### Success Response (200)
- **id** (string) - Unique identifier for the completion.
- **episode_id** (string) - Identifier for the inference episode.
- **model** (string) - The model used for the completion.
- **choices** (array) - An array of completion choices.
  - **index** (integer) - Index of the choice.
  - **finish_reason** (string) - The reason the model stopped generating tokens.
  - **message** (object) - The generated message.
    - **content** (string) - The assistant's reply.
    - **role** (string) - The role of the message sender (e.g., 'assistant').
- **usage** (object) - Token usage statistics.
  - **prompt_tokens** (integer) - Number of tokens in the prompt.
  - **completion_tokens** (integer) - Number of tokens in the completion.
  - **total_tokens** (integer) - Total tokens used.
  - **tensorzero_cost** (number) - Cost associated with the inference.

#### Response Example (Regular)
```json
{
  "id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "model": "email_draft_variant",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": "Hi Gabriel,\n\nI noticed...",
        "role": "assistant"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 100,
    "total_tokens": 200,
    "tensorzero_cost": 0.0003
  }
}
```

#### Response Example (Streaming)
In streaming mode, the response is an SSE stream of JSON messages, followed by a final `[DONE]` message. Each JSON message has the following fields:
```json
{
  "id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "model": "email_draft_variant",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "delta": {
        "content": "Hi Gabriel,\n\nI noticed..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 100,
    "total_tokens": 200,
    "tensorzero_cost": 0.0003
  }
}
```
````
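To make the streaming format concrete, here is a minimal sketch of consuming such an SSE stream: it reads `data:` lines, stops at `[DONE]`, and concatenates the `delta.content` pieces. The sample lines and the helper names (`iter_chunks`, `collect_text`) are hypothetical, and the sketch assumes each event fits on a single `data:` line.

```python
import json
from typing import Iterable, Iterator

# Hypothetical raw SSE lines, shaped like the streaming example above.
SAMPLE_STREAM = [
    'data: {"id": "00000000-0000-0000-0000-000000000000", "choices": [{"index": 0, "delta": {"content": "Hi Gabriel,"}}]}',
    "",
    'data: {"id": "00000000-0000-0000-0000-000000000000", "choices": [{"index": 0, "finish_reason": "stop", "delta": {"content": "\\n\\nI noticed..."}}]}',
    "",
    "data: [DONE]",
]


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks until the `[DONE]` sentinel is seen."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the `delta.content` pieces of the first choice."""
    parts = []
    for chunk in iter_chunks(lines):
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)


text = collect_text(SAMPLE_STREAM)
print(text)
```

Client SDKs normally handle this parsing for you; the sketch is only meant to show what travels over the wire.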

--------------------------------

### Start Workflow Evaluation Run (Python)

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/tutorial

Starts a workflow evaluation run, allowing you to specify variants for specific functions. If a variant is not specified, the TensorZero Gateway will sample one.

```python
run_info = t0.workflow_evaluation_run(
    # Assume we have these variants defined in our `tensorzero.toml` configuration file
    variants={
        "generate_database_query": "o4_mini_prompt_baseline",
        "generate_final_answer": "gpt_4o_updated_prompt",
    },
    project_name="simple_rag_project",
    display_name="generate_database_query::o4_mini_prompt_baseline;generate_final_answer::gpt_4o_updated_prompt",
)
```

--------------------------------

### Sample LLM Response (Python)

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Example of a successful chat completion response from the TensorZero Gateway, including message content, model used, and token usage.

```python
ChatCompletion(
    id='0198d33f-24f6-7cc3-9dd0-62ba627b27db',
    choices=[
        Choice(
            finish_reason='stop',
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content='Sure! Did you know that octopuses have three hearts? Two pump blood to the gills, while the third pumps it to the rest of the body. And, when an octopus swims, the heart that delivers blood to the body actually **stops beating**—which is why they prefer to crawl rather than swim!',
                refusal=None,
                role='assistant',
                annotations=None,
                audio=None,
                function_call=None,
                tool_calls=[]
            )
        )
    ],
    created=1755890789,
    model='tensorzero::model_name::openai::gpt-5-mini',
    object='chat.completion',
    service_tier=None,
    system_fingerprint='',
    usage=CompletionUsage(
        completion_tokens=67,
        prompt_tokens=13,
        total_tokens=80,
        completion_tokens_details=None,
        prompt_tokens_details=None
    ),
    episode_id='0198d33f-24f6-7cc3-9dd0-62cd7028c3d7'
)
```

--------------------------------

### Deploy TensorZero Gateway with Docker

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Deploy the TensorZero Gateway using Docker. This command starts the gateway, exposing it on port 3000 and using default configurations.

```bash
docker run \
  -e OPENAI_API_KEY \
  -p 3000:3000 \
  tensorzero/gateway \
  --default-config
```

--------------------------------

### Create Prompt Template for Data Extraction

Source: https://www.tensorzero.com/docs/gateway/generate-structured-outputs

Design a Jinja template (`system_template.minijinja`) to instruct the AI on extracting specific information and formatting it as JSON, including examples.

```txt
You are a helpful AI assistant that extracts customer information from messages.

Extract the customer's name and email address if present. Use null for any fields that are not found.

Your output should be a JSON object with the following schema:

{
  "name": string or null,
  "email": string or null
}

---

Examples:

User: Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com
Assistant: {"name": "Sarah Johnson", "email": "sarah.j@example.com"}

User: My email is contact@company.com
Assistant: {"name": null, "email": "contact@company.com"}

User: This is John Doe reaching out
Assistant: {"name": "John Doe", "email": null}
```

--------------------------------

### Configure Function Experimentation in TOML

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Set up experimentation for a function to manage A/B testing of variants. This example shows a placeholder for general experimentation settings.

```toml
[functions.draft-email.experimentation]
# fieldA = ...
# fieldB = ...
# ...
```

--------------------------------

### Generate Structured Output (Python)

Source: https://www.tensorzero.com/docs/gateway/generate-structured-outputs

Use the OpenAI Python SDK to interact with TensorZero Gateway for structured output generation. Ensure the SDK is installed and configured.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/openai/v1",
    api_key="unused",
)

response = client.chat.completions.create(
    model="tensorzero::function_name::extract_data",
    messages=[
        {
            "role": "user",
            "content": "Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com",
        }
    ],
    # The response format can be specified as a JSON object
    response_format={"type": "json_object"},
    # Optional: specify token details for debugging
    completion_tokens_details=None,
    prompt_tokens_details=None
),
episode_id='019a78dd-8e77-7c21-ab70-5ddb585eb35e'
)
```

--------------------------------

### Chat Function with Tool Use (Python)

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

Use the Python client to invoke a chat function that uses tools. This example calls the 'weather_bot' function to get the weather in Tokyo.

```python
import asyncio

from tensorzero import AsyncTensorZeroGateway


async def main():
    # `async with` must run inside a coroutine, so the call is wrapped in `main`.
    async with await AsyncTensorZeroGateway.build_http(
        gateway_url="http://localhost:3000"
    ) as client:
        result = await client.inference(
            function_name="weather_bot",
            input={
                "messages": [
                    {
                        "role": "user",
                        "content": "What is the weather like in Tokyo?",
                    }
                ]
            },
            # optional: stream=True,
        )
        print(result)


asyncio.run(main())
```

--------------------------------

### Configure ClickHouse Connection URL

Source: https://www.tensorzero.com/docs/deployment/clickhouse

Set the TENSORZERO_CLICKHOUSE_URL environment variable to connect the TensorZero Gateway to your ClickHouse instance. Examples are provided for local, ClickHouse Cloud, and containerized setups.

```bash
TENSORZERO_CLICKHOUSE_URL="http[s]://[username]:[password]@[hostname]:[port]/[database]"
```

```bash
# Example: ClickHouse running locally
TENSORZERO_CLICKHOUSE_URL="http://chuser:chpassword@localhost:8123/tensorzero"
```

```bash
# Example: ClickHouse Cloud
TENSORZERO_CLICKHOUSE_URL="https://USERNAME:PASSWORD@XXXXX.clickhouse.cloud:8443/tensorzero"
```

```bash
# Example: TensorZero Gateway running in a container, ClickHouse running on host machine
TENSORZERO_CLICKHOUSE_URL="http://host.docker.internal:8123/tensorzero"
```
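When the URL is assembled programmatically, credentials containing characters such as `@` or `/` need percent-encoding to keep the URL unambiguous. The sketch below assumes the gateway applies standard URL parsing to this variable; `clickhouse_url` is a hypothetical helper, not part of TensorZero.

```python
from urllib.parse import quote, urlsplit


def clickhouse_url(username: str, password: str, hostname: str,
                   port: int, database: str, secure: bool = False) -> str:
    """Assemble a URL in the http[s]://user:pass@host:port/db format."""
    scheme = "https" if secure else "http"
    # Percent-encode credentials so characters like '@' or '/' stay unambiguous.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{hostname}:{port}/{database}"


url = clickhouse_url("chuser", "chpassword", "localhost", 8123, "tensorzero")
print(url)

# Round-trip check: the standard parser recovers the individual components.
parts = urlsplit(url)
print(parts.hostname, parts.port, parts.path.lstrip("/"))
```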

--------------------------------

### Download Docker Compose File

Source: https://www.tensorzero.com/docs/quickstart

Use this command to download the sample docker-compose.yml file. This file configures the TensorZero Gateway, UI, and a development Postgres database.

```bash
curl -LO "https://raw.githubusercontent.com/tensorzero/tensorzero/refs/heads/main/examples/docs/guides/quickstart/docker-compose.yml"
```

--------------------------------

### Function Call Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

This example demonstrates how to call a function through the OpenAI-compatible inference endpoint.

````APIDOC
## Function Call Example

### Description
This example shows how to use the `get_temperature` function.

### Method
POST

### Endpoint
/openai/v1/chat/completions

### Request Body
```json
{
  "model": "tensorzero::function_name::get_temperature",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI assistant..."
    },
    {
      "role": "user",
      "content": "What is the temperature in San Francisco?"
    }
  ]
}
```

### Response
#### Success Response (200)
```json
{
  "id": "chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxx",
  "episode_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "model": "get_temperature_variant",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": "{\"location\": \"San Francisco\", \"temperature\": \"15°C\"}"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 100,
    "total_tokens": 200,
    "tensorzero_cost": 0.0003
  }
}
```
````

--------------------------------

### Calling Configured vs. Direct Provider Models

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

Shows the three forms of the `model` string: a configured function, a configured model (which can define fallback providers), and a specific provider's model called directly, bypassing local configuration.

```text
tensorzero::function_name::extract-data
```

```text
tensorzero::model_name::gpt-4o
```

```text
tensorzero::model_name::openai::gpt-4o
```
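A small sketch of how a client might build and inspect these `model` strings. The helper names (`function_model`, `model_model`, `describe`) are hypothetical; only the `tensorzero::function_name::` and `tensorzero::model_name::` prefixes come from the examples above.

```python
def function_model(function_name: str) -> str:
    """Target a function defined in the configuration file."""
    return f"tensorzero::function_name::{function_name}"


def model_model(model_name: str) -> str:
    """Target a configured model (`gpt-4o`) or a provider shorthand (`openai::gpt-4o`)."""
    return f"tensorzero::model_name::{model_name}"


def describe(model: str) -> tuple:
    """Split a `model` string into (kind, target).

    `kind` is 'function_name' or 'model_name'; `target` keeps any provider prefix.
    """
    prefix, kind, target = model.split("::", 2)
    if prefix != "tensorzero" or kind not in ("function_name", "model_name"):
        raise ValueError(f"unrecognized model string: {model!r}")
    return kind, target


print(function_model("extract-data"))
print(model_model("gpt-4o"))
print(describe(model_model("openai::gpt-4o")))
```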

--------------------------------

### OpenAI Node SDK Example

Source: https://www.tensorzero.com/docs/gateway/generate-structured-outputs

Demonstrates how to use the OpenAI Node SDK to interact with the TensorZero Gateway for structured output generation. The response content is a JSON string.

````APIDOC
## OpenAI Node SDK

### Description
You can point the OpenAI Node SDK to a TensorZero Gateway to generate structured outputs. The response `content` is the JSON string generated by the model.

### Method Signature
```ts
client.chat.completions.create({
  model: "tensorzero::function_name::extract_data",
  messages: [
    {
      role: "user",
      content: "Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com"
    }
  ]
});
```

### Request Example
```json
{
  "model": "tensorzero::function_name::extract_data",
  "messages": [
    {
      "role": "user",
      "content": "Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com"
    }
  ]
}
```

### Response Example
```json
{
  "id": "019a78de-97d4-79d3-8b61-bcab4c697281",
  "episode_id": "019a78de-97d4-79d3-8b61-bcb10a8c02f4",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": "{\"name\":\"Sarah Johnson\",\"email\":\"sarah.j@example.com\"}",
        "tool_calls": null,
        "role": "assistant"
      }
    }
  ],
  "created": 1762964446,
  "model": "tensorzero::function_name::extract_data::variant_name::baseline",
  "system_fingerprint": "",
  "service_tier": null,
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 252,
    "completion_tokens": 26,
    "total_tokens": 278
  }
}
```
````

--------------------------------

### AWS SageMaker Hosted Provider Example (Ollama)

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Example of configuring the AWS SageMaker provider to use an OpenAI-compatible server like Ollama.

```toml
[models.claude-haiku-4-5.providers.aws_sagemaker]
# ...
type = "aws_sagemaker"
hosted_provider = "openai"
# ...
```

--------------------------------

### Example: OpenAI Custom Tool with Text Output

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

Demonstrates how to use a custom tool with a text output format for code generation. Ensure you are using an OpenAI model for this functionality.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/openai/v1",
    api_key="your_api_key",
)

response = client.chat.completions.create(
    model="tensorzero::model_name::openai::gpt-5-mini",
    messages=[
        {"role": "user", "content": "Generate Python code to print 'Hello, World!'"}
    ],
    tools=[
        {
            "type": "custom",
            "custom": {
                "name": "code_generator",
                "description": "Generates Python code snippets",
                "format": {"type": "text"}
            }
        }
    ],
)
```

--------------------------------

### Starting an episode in a workflow evaluation run

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/api-reference

Starts a new episode within an existing workflow evaluation run. Episodes help segment different parts of a workflow.

```APIDOC
## Starting an episode in a workflow evaluation run

### Description
Starts a new episode within an existing workflow evaluation run. Episodes help segment different parts of a workflow.

### Method
POST

### Endpoint
/workflow_evaluation_run/{run_id}/episode

### Parameters
#### Path Parameters
- **run_id** (UUID) - Required - The ID of the run generated by the `workflow_evaluation_run` method.

#### Request Body
- **task_name** (string) - Optional - The name of the task to associate the episode with.
- **tags** (dictionary) - Optional - A dictionary of key-value pairs to tag the episode's inferences with.

### Response
#### Success Response (200)
- **episode_id** (UUID) - The ID of the episode.
```

--------------------------------

### Streaming Inference Response Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

In streaming mode, responses are Server-Sent Events (SSE) JSON messages. This example shows the structure of one such message, which is followed by a `[DONE]` message.

```json
{
  "id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "model": "weather_bot_variant",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": null,
        "tool_calls": [
          {
            "id": "123456789",
            "type": "function"
          }
        ],
        "role": "assistant"
      }
    }
  ]
}
```

--------------------------------

### Docker Compose Configuration for TensorZero Gateway with Gemini API

Source: https://www.tensorzero.com/docs/integrations/model-providers/google-ai-studio-gemini

Set up a Docker Compose file to run the TensorZero Gateway, mounting the configuration directory and setting the necessary Google AI Studio API key environment variable. This is a simplified example for learning.

```yaml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      GOOGLE_AI_STUDIO_API_KEY: ${GOOGLE_AI_STUDIO_API_KEY:?Environment variable GOOGLE_AI_STUDIO_API_KEY must be set.}
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
```

--------------------------------

### Start Workflow Evaluation Run (Async Python)

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/tutorial

Starts an asynchronous workflow evaluation run, enabling you to pin specific variants. Unspecified variants are sampled by the TensorZero Gateway.

```python
run_info = await t0.workflow_evaluation_run(
    # Assume we have these variants defined in our `tensorzero.toml` configuration file
    variants={
        "generate_database_query": "o4_mini_prompt_baseline",
        "generate_final_answer": "gpt_4o_updated_prompt",
    },
    project_name="simple_rag_project",
    display_name="generate_database_query::o4_mini_prompt_baseline;generate_final_answer::gpt_4o_updated_prompt",
)
```

--------------------------------

### Run Valkey with Docker

Source: https://www.tensorzero.com/docs/deployment/valkey-redis

Execute this command to run Valkey as a Docker container.

```bash
docker run -d --name valkey -p 6379:6379 valkey/valkey:8
```

--------------------------------

### Configure Model Fallback and Direct API Calls

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Example TOML configuration showing how to define a model with fallback providers and how to directly call a provider's API using short-hand notation.

```toml
[models.gpt-4o]
routing = ["openai", "azure"]

[models.gpt-4o.providers.openai]
# ...

[models.gpt-4o.providers.azure]
# ...
```

--------------------------------

### Deploy TensorZero Gateway with Docker Compose

Source: https://www.tensorzero.com/docs/integrations/model-providers/vllm

Set up a minimal Docker Compose file to run the TensorZero Gateway. Mount the configuration directory and expose the gateway port. Ensure `host.docker.internal` is correctly mapped for local vLLM access.

```yaml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    # environment:
    #   VLLM_API_KEY: ${VLLM_API_KEY:?Environment variable VLLM_API_KEY must be set.}
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
```

--------------------------------

### Chat Function: Unknown Content Block Type Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

This example shows how an unknown content block type from a model provider is represented in the response. It includes the original data and optional provider/model information.

```json
{
  "type": "unknown",
  "data": {
    "type": "daydreaming",
    "dream": "..."
  },
  "model_name": "your_model_name",
  "provider_name": "your_provider_name"
}
```
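A client that walks the response content can treat `unknown` blocks as a pass-through case rather than failing. The sketch below is one way to do that; `render_blocks` is a hypothetical helper, and the shape of the `text` block is an assumption made for illustration.

```python
def render_blocks(blocks: list) -> str:
    """Join text blocks and flag anything the client does not recognize."""
    lines = []
    for block in blocks:
        kind = block.get("type")
        if kind == "text":
            lines.append(block["text"])
        elif kind == "unknown":
            # The provider-specific payload is carried untouched in `data`.
            inner = block["data"].get("type", "?")
            origin = block.get("provider_name", "unknown provider")
            lines.append(f"[unhandled block {inner!r} from {origin}]")
        else:
            lines.append(f"[unsupported block type {kind!r}]")
    return "\n".join(lines)


blocks = [
    {"type": "text", "text": "Hello"},  # assumed shape of a text block
    {
        "type": "unknown",
        "data": {"type": "daydreaming", "dream": "..."},
        "model_name": "your_model_name",
        "provider_name": "your_provider_name",
    },
]
print(render_blocks(blocks))
```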

--------------------------------

### GET /health

Source: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

Checks the health of the TensorZero Gateway and its dependencies.

```APIDOC
## GET /health

### Description
This endpoint checks that the gateway is running successfully and can communicate with its dependencies, such as Postgres or ClickHouse (if enabled).

### Method
GET

### Endpoint
/health

### Response
#### Success Response (200)
- gateway (string) - Indicates the status of the gateway.
- postgres (string) - Indicates the status of the Postgres dependency (if enabled).

### Response Example
{
  "gateway": "ok",
  "postgres": "ok"
}
```
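A readiness probe built on this endpoint could look like the sketch below. It assumes the gateway listens on `http://localhost:3000` (the port used elsewhere in these examples) and that every component in the response reports `"ok"` when healthy; `all_ok` and `check_health` are hypothetical helper names.

```python
import json
import urllib.request


def all_ok(payload: dict) -> bool:
    """Return True when every reported component has status "ok"."""
    return bool(payload) and all(status == "ok" for status in payload.values())


def check_health(base_url: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """GET /health and report whether the gateway and its dependencies are up."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.status == 200 and all_ok(json.load(resp))
```

Calling `check_health()` requires a running gateway. `urlopen` raises `urllib.error.URLError` when the gateway is unreachable and `urllib.error.HTTPError` on error statuses, so callers may want to treat those exceptions as "unhealthy".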

--------------------------------

### Docker Compose for TensorZero Gateway Deployment

Source: https://www.tensorzero.com/docs/gateway/call-llms-with-image-and-file-inputs

Deploy the TensorZero Gateway, Postgres, and MinIO using Docker Compose. This example is for learning purposes and requires setting the OPENAI_API_KEY environment variable. It configures MinIO with specific access keys and connects to the Postgres database.

```yaml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      # Mount our tensorzero.toml file into the container
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY:?Environment variable OPENAI_API_KEY must be set.}
      S3_ACCESS_KEY_ID: miniouser
      S3_SECRET_ACCESS_KEY: miniopassword
      TENSORZERO_POSTGRES_URL: postgres://postgres:postgres@postgres:5432/tensorzero
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    healthcheck:
      test:
        [
          "CMD",
          "wget",
          "--no-verbose",
          "--tries=1",
          "--spider",
          "http://localhost:3000/health",
        ]
      start_period: 1s
      start_interval: 1s
      timeout: 1s
    depends_on:
      postgres:
        condition: service_healthy
      gateway-run-postgres-migrations:
        condition: service_completed_successfully
      minio:
        condition: service_healthy

  # For a production deployment, you can use AWS S3, GCP Cloud Storage, Cloudflare R2, etc.
  minio:
    image: bitnamilegacy/minio:2025.7.23
    ports:
      - "9000:9000" # API port
      - "9001:9001" # Console port
```

--------------------------------

### GET /status

Source: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

Checks if the TensorZero Gateway is running successfully.

```APIDOC
## GET /status

### Description
This endpoint checks that the gateway is running successfully.

### Method
GET

### Endpoint
/status

### Response
#### Success Response (200)
- status (string) - Indicates the status of the gateway.

### Response Example
{
  "status": "ok"
}
```

--------------------------------

### Build and Run TensorZero Gateway from Source

Source: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

Compile and run the TensorZero Gateway directly on your host machine using Cargo. This command builds the gateway with a performance profile and requires a custom configuration file.

```bash
cargo run --profile performance --bin gateway -- --config-file path/to/your/tensorzero.toml
```

--------------------------------

### Start Workflow Evaluation Run with Curl

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/tutorial

Use this command to start a workflow evaluation run. You can specify variants for functions, and optionally provide a project name for comparison and a display name for identification in the TensorZero UI. If no variant is specified, the TensorZero Gateway will sample one.

```bash
curl -X POST http://localhost:3000/workflow_evaluation_run \
  -H "Content-Type: application/json" \
  -d '{
    "variants": {
      "generate_database_query": "o4_mini_prompt_baseline",
      "generate_final_answer": "gpt_4o_updated_prompt"
    },
    "project_name": "simple_rag_project",
    "display_name": "generate_database_query::o4_mini_prompt_baseline;generate_final_answer::gpt_4o_updated_prompt"
  }'
```

--------------------------------

### Sample Inference Output

Source: https://www.tensorzero.com/docs/operations/centralize-auth-rate-limits-and-more

Example JSON output for a chat completion request.

```json
{
  "id": "01940627-935f-7fa1-a398-e1f57f18064a",
  "object": "chat.completion",
  "created": 1738000000,
  "model": "tensorzero::model_name::openai::gpt-5-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Wires hum with pure thought,  \nDreams of codes in twilight's glow,  \nBeyond human touch."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 23,
    "total_tokens": 38
  }
}
```
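
To show how the fields above are typically consumed, here is a small sketch that pulls the assistant text and token counts out of a response shaped like the sample. The helper name `summarize_completion` is hypothetical, and the sketch assumes the response has already been parsed into a dictionary.

```python
from typing import Any


def summarize_completion(response: dict[str, Any]) -> dict[str, Any]:
    """Extract the first choice's text and the token usage from a chat completion."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# The sample output from above, as a Python dictionary.
sample = {
    "id": "01940627-935f-7fa1-a398-e1f57f18064a",
    "object": "chat.completion",
    "created": 1738000000,
    "model": "tensorzero::model_name::openai::gpt-5-mini",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Wires hum with pure thought,  \nDreams of codes in twilight's glow,  \nBeyond human touch.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 15, "completion_tokens": 23, "total_tokens": 38},
}

summary = summarize_completion(sample)
print(summary["finish_reason"], summary["total_tokens"])
```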

--------------------------------

### Set Azure Provider Type

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

The `type` field specifies the model provider. This example sets it to `azure`.

```toml
[models.gpt-4o.providers.azure]
type = "azure"
```

--------------------------------

### Deploy TensorZero Gateway

Source: https://www.tensorzero.com/docs/gateway/call-the-openai-responses-api

Run the TensorZero Gateway using Docker, mounting the configuration file and exposing port 3000. Ensure the `OPENAI_API_KEY` environment variable is set in your shell, since `-e OPENAI_API_KEY` forwards its value into the container.

```bash
docker run \
  -e OPENAI_API_KEY \
  -v $(pwd)/tensorzero.toml:/app/config/tensorzero.toml:ro \
  -p 3000:3000 \
  tensorzero/gateway \
  --config-file /app/config/tensorzero.toml
```

--------------------------------

### Specify OpenAI provider type

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

The `type` field is mandatory and specifies the model provider. This example sets the type to `openai`.

```toml
[embedding_models.model-name.providers.openai]
# ...
type = "openai"
# ...
```

--------------------------------

### Draft Email

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

This example demonstrates how to use the inference API to draft an email. It documents the HTTP request body and both the regular and streaming response formats.

```APIDOC
## POST /inference (Draft Email)

### Description
This endpoint is used to draft an email based on provided recipient and purpose.

### Method
POST

### Endpoint
/inference

### Request Body
- **function_name** (string) - Required - The name of the function to call, e.g., "draft_email".
- **input** (object) - Required - The input parameters for the function.
  - **system** (object) - Optional - System-level configurations like tone.
    - **tone** (string) - Optional - The desired tone for the email (e.g., "casual").
  - **messages** (array) - Required - An array of message objects.
    - **role** (string) - Required - The role of the message sender (e.g., "user").
    - **content** (array) - Required - The content of the message.
      - **type** (string) - Required - The type of content (e.g., "text").
      - **arguments** (object) - Required - The arguments for the content.
        - **recipient** (string) - Required - The recipient of the email.
        - **email_purpose** (string) - Required - The purpose of the email.

### Request Example
```json
{
  "function_name": "draft_email",
  "input": {
    "system": {"tone": "casual"},
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "arguments": {
              "recipient": "Gabriel",
              "email_purpose": "Request a meeting to..."
            }
          }
        ]
      }
    ]
  }
}
```

### Response
#### Success Response (200)
- **inference_id** (string) - Unique identifier for the inference.
- **episode_id** (string) - Identifier for the conversation episode.
- **variant_name** (string) - The name of the model variant used.
- **content** (array) - The generated content.
  - **type** (string) - The type of content (e.g., "text").
  - **text** (string) - The generated text.
- **usage** (object) - Token usage information.
  - **input_tokens** (integer) - Number of input tokens.
  - **output_tokens** (integer) - Number of output tokens.
  - **cost** (number) - The cost of the inference.

#### Response Example
```json
{
  "inference_id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "variant_name": "prompt_v1",
  "content": [
    {
      "type": "text",
      "text": "Hi Gabriel,\n\nI noticed..."
    }
  ],
  "usage": {
    "input_tokens": 100,
    "output_tokens": 100,
    "cost": 0.0003
  }
}
```

#### Streaming Response
In streaming mode, the response is an SSE stream of JSON messages, followed by a final `[DONE]` message. Each JSON message has the following fields:
- **inference_id** (string) - Unique identifier for the inference.
- **episode_id** (string) - Identifier for the conversation episode.
- **variant_name** (string) - The name of the model variant used.
- **content** (array) - The generated content delta.
  - **type** (string) - The type of content (e.g., "text").
  - **id** (string) - Identifier for the content chunk.
  - **text** (string) - The generated text delta.
- **usage** (object) - Token usage information.
  - **input_tokens** (integer) - Number of input tokens.
  - **output_tokens** (integer) - Number of output tokens.
  - **cost** (number) - The cost of the inference.

#### Streaming Response Example
```json
{
  "inference_id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "variant_name": "prompt_v1",
  "content": [
    {
      "type": "text",
      "id": "0",
      "text": "Hi Gabriel,"
    }
  ],
  "usage": {
    "input_tokens": 100,
    "output_tokens": 100,
    "cost": 0.0003
  }
}
```
```
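
The streaming format described above can be consumed by concatenating the text deltas until the final `[DONE]` message. The sketch below assumes standard SSE framing, where each message arrives on a line prefixed with `data:`; the helper name `collect_streamed_text` and the second sample chunk are hypothetical.

```python
import json
from typing import Iterable


def collect_streamed_text(sse_lines: Iterable[str]) -> str:
    """Concatenate text deltas from `data:` lines until the `[DONE]` sentinel."""
    parts: list[str] = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for block in chunk.get("content", []):
            if block.get("type") == "text":
                parts.append(block.get("text", ""))
    return "".join(parts)


# Hypothetical stream: two chunks followed by the terminating message.
lines = [
    'data: {"variant_name": "prompt_v1", "content": [{"type": "text", "id": "0", "text": "Hi Gabriel,"}]}',
    "",
    'data: {"variant_name": "prompt_v1", "content": [{"type": "text", "id": "0", "text": " I noticed"}]}',
    "",
    "data: [DONE]",
]
text = collect_streamed_text(lines)
```

In a real client the lines would come from the HTTP response body rather than a list.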

--------------------------------

### Docker Compose for TensorZero Gateway

Source: https://www.tensorzero.com/docs/integrations/model-providers/aws-bedrock

Set up a minimal Docker Compose file to run the TensorZero Gateway. Configure volumes for the configuration file and environment variables for AWS credentials. This setup is for learning purposes and not recommended for production.

```yaml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/gateway/deployment

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      # AWS_BEARER_TOKEN_BEDROCK: ${AWS_BEARER_TOKEN_BEDROCK:?Environment variable AWS_BEARER_TOKEN_BEDROCK must be set.}
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID:?Environment variable AWS_ACCESS_KEY_ID must be set.}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY:?Environment variable AWS_SECRET_ACCESS_KEY must be set.}
      # AWS_SESSION_TOKEN: ${AWS_SESSION_TOKEN:?Environment variable AWS_SESSION_TOKEN must be set.}
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
```

--------------------------------

### Migrate Legacy Prompt Template Configuration

Source: https://www.tensorzero.com/docs/gateway/create-a-prompt-template

Update your configuration to use the new `templates.your_template_name.path` and `schemas.your_schema_name.path` format for prompt templates and schemas. This ensures forward compatibility for historical observability data.

```text
| Legacy Configuration | Updated Configuration      |
| -------------------- | -------------------------- |
| `system_template`    | `templates.system.path`    |
| `system_schema`      | `schemas.system.path`      |
| `user_template`      | `templates.user.path`      |
| `user_schema`        | `schemas.user.path`        |
| `assistant_template` | `templates.assistant.path` |
| `assistant_schema`   | `schemas.assistant.path`   |
```
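
The mapping in the table can be applied mechanically. As a rough sketch, the function below rewrites legacy keys in a configuration table that has already been parsed into a dictionary. The name `migrate_table` and the example file paths are hypothetical, and the sketch assumes that a dotted key such as `templates.system.path` corresponds to nested tables.

```python
import copy
from typing import Any

# Legacy key -> (new section, role), following the table above.
LEGACY_TO_UPDATED = {
    "system_template": ("templates", "system"),
    "system_schema": ("schemas", "system"),
    "user_template": ("templates", "user"),
    "user_schema": ("schemas", "user"),
    "assistant_template": ("templates", "assistant"),
    "assistant_schema": ("schemas", "assistant"),
}


def migrate_table(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` with legacy keys moved to `<section>.<role>.path`."""
    # Copy every non-legacy key unchanged.
    migrated = {
        key: copy.deepcopy(value)
        for key, value in config.items()
        if key not in LEGACY_TO_UPDATED
    }
    # Move each legacy key into its nested location.
    for legacy_key, (section, role) in LEGACY_TO_UPDATED.items():
        if legacy_key in config:
            migrated.setdefault(section, {}).setdefault(role, {})["path"] = config[legacy_key]
    return migrated


legacy = {
    "type": "chat_completion",
    "system_template": "functions/draft-email/system.minijinja",  # hypothetical path
    "user_schema": "functions/draft-email/user_schema.json",  # hypothetical path
}
updated = migrate_table(legacy)
```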

--------------------------------

### Sample Response for Custom Embedding Model

Source: https://www.tensorzero.com/docs/gateway/generate-embeddings

Example response structure for an embedding request made using a custom model.

```python
CreateEmbeddingResponse(
    data=[
        Embedding(
            embedding=[
                -0.019143931567668915,
                # ...
            ],
            index=0,
            object='embedding'
        )
    ],
    model='tensorzero::embedding_model_name::nomic-embed-text',
    object='list',
    usage=Usage(prompt_tokens=4, total_tokens=4)
)
```

--------------------------------

### Initialize OpenAI Client for TensorZero Gateway (Python)

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Initialize the OpenAI Python client, pointing it to the TensorZero Gateway's base URL. The API key is not used for authentication when connecting to the gateway.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")
```

--------------------------------

### Starting a workflow evaluation run

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/api-reference

Initializes a new workflow evaluation run. This is the first step in evaluating a complex workflow.

```APIDOC
## Starting a workflow evaluation run

### Description
Initializes a new workflow evaluation run. This is the first step in evaluating a complex workflow.

### Method
POST

### Endpoint
/workflow_evaluation_run

### Parameters
#### Request Body
- **variants** (object) - Required - A dictionary mapping function names to variant names.
- **project_name** (string) - Optional - The name of the project to associate the run with.
- **display_name** (string) - Optional - The display (human-readable) name of the run.
- **tags** (dictionary) - Optional - A dictionary of key-value pairs to tag the run's inferences with.

### Response
#### Success Response (200)
- **run_id** (UUID) - The ID of the run.
```
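
As an illustration of the request body described above, the following sketch builds the POST request with the Python standard library. The helper names `build_run_request` and `start_run` are hypothetical, and the default base URL assumes a gateway on `localhost:3000`.

```python
import json
import urllib.request
from typing import Optional


def build_run_request(
    variants: dict[str, str],
    project_name: Optional[str] = None,
    display_name: Optional[str] = None,
    tags: Optional[dict[str, str]] = None,
    base_url: str = "http://localhost:3000",
) -> urllib.request.Request:
    """Build a POST /workflow_evaluation_run request, omitting unset optional fields."""
    body: dict = {"variants": variants}
    if project_name is not None:
        body["project_name"] = project_name
    if display_name is not None:
        body["display_name"] = display_name
    if tags is not None:
        body["tags"] = tags
    return urllib.request.Request(
        f"{base_url}/workflow_evaluation_run",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_run(request: urllib.request.Request) -> str:
    """Send the request and return the `run_id` from the response (needs a running gateway)."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())["run_id"]


request = build_run_request(
    {"generate_database_query": "o4_mini_prompt_baseline"},
    project_name="simple_rag_project",
)
```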

--------------------------------

### Create TensorZero API Key via CLI

Source: https://www.tensorzero.com/docs/operations/set-up-auth-for-tensorzero

Create a TensorZero API key by running the gateway binary with `--create-api-key`. Optionally, pass `--expiration` with a date and time in UTC.

```bash
docker compose run --rm gateway --create-api-key
```

```bash
docker compose run --rm gateway --create-api-key --expiration "2025-12-20 23:00:00.000000 UTC"
```
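
When the expiration is computed in a script, the timestamp has to match the format shown above. This is a minimal sketch that formats a timezone-aware `datetime` that way and assembles the same command as an argument list; the helper names `expiration_arg` and `create_key_command` are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def expiration_arg(expires_at: datetime) -> str:
    """Format an aware datetime like "2025-12-20 23:00:00.000000 UTC"."""
    if expires_at.tzinfo is None:
        raise ValueError("expires_at must be timezone-aware")
    return expires_at.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S.%f UTC")


def create_key_command(expires_at: Optional[datetime] = None) -> list[str]:
    """Build the argument list for the commands shown above."""
    cmd = ["docker", "compose", "run", "--rm", "gateway", "--create-api-key"]
    if expires_at is not None:
        cmd += ["--expiration", expiration_arg(expires_at)]
    return cmd


command = create_key_command(datetime(2025, 12, 20, 23, 0, 0, tzinfo=timezone.utc))
```

The resulting list can be passed to `subprocess.run` without shell quoting concerns.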

--------------------------------

### Get datapoints by ID

Source: https://www.tensorzero.com/docs/gateway/api-reference/datasets-datapoints

Retrieves specific datapoints from a dataset using their unique IDs. Stale datapoints are included in the response.

```APIDOC
## POST /v1/datasets/{dataset_name}/get_datapoints

### Description
Retrieves specific datapoints by their IDs.

### Method
POST

### Endpoint
/v1/datasets/{dataset_name}/get_datapoints

### Parameters
#### Path Parameters
- **dataset_name** (string) - Required - The name of the dataset.

#### Request Body
- **ids** (list of UUIDs) - Required - A list of datapoint IDs to retrieve.

### Response
#### Success Response (200)
- **datapoints** (list of objects) - A list of datapoint objects.

### Request Example
```json
{
  "ids": [
    "a1b2c3d4-e5f6-7890-1234-567890abcdef",
    "f0e9d8c7-b6a5-4321-fedc-ba9876543210"
  ]
}
```

### Response Example
```json
{
  "datapoints": [
    {
      "id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
      "function_name": "my_function",
      "input": {
        "text": "Hello"
      },
      "output": [
        {
          "type": "text",
          "content": "Hi there!"
        }
      ],
      "tags": {
        "source": "user"
      },
      "created_at": "2023-10-27T10:00:00Z",
      "updated_at": "2023-10-27T10:00:00Z"
    }
  ]
}
```
```
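
One possible way to call this endpoint from Python is sketched below with the standard library. The helper name `build_get_datapoints_request` and the dataset name `my_dataset` are hypothetical, and the default base URL assumes a gateway on `localhost:3000`. The sketch validates each ID as a UUID before building the request.

```python
import json
import urllib.parse
import urllib.request
import uuid
from typing import Iterable


def build_get_datapoints_request(
    dataset_name: str,
    ids: Iterable[str],
    base_url: str = "http://localhost:3000",
) -> urllib.request.Request:
    """Build a POST /v1/datasets/{dataset_name}/get_datapoints request."""
    # uuid.UUID raises ValueError for malformed IDs, so bad input fails early.
    normalized = [str(uuid.UUID(str(datapoint_id))) for datapoint_id in ids]
    # Percent-encode the dataset name so it is safe to place in the URL path.
    path = f"/v1/datasets/{urllib.parse.quote(dataset_name, safe='')}/get_datapoints"
    return urllib.request.Request(
        base_url + path,
        data=json.dumps({"ids": normalized}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_get_datapoints_request(
    "my_dataset",
    ["a1b2c3d4-e5f6-7890-1234-567890abcdef", "f0e9d8c7-b6a5-4321-fedc-ba9876543210"],
)
```

Sending the request with `urllib.request.urlopen(request)` and parsing the body with `json.loads` would yield the `datapoints` list described above.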

--------------------------------

### Python SDK Example for Inference

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

Use the OpenAI Python SDK, pointed at the TensorZero Gateway's OpenAI-compatible endpoint, to send an inference request with tool definitions. Ensure the client is initialized with the correct API key and base URL.

```python
from openai import OpenAI

# Point the OpenAI client at the gateway's OpenAI-compatible endpoint.
client = OpenAI(api_key="YOUR_API_KEY", base_url="http://localhost:3000/openai/v1")

response = client.chat.completions.create(
    model="tensorzero::function_name::weather_bot",
    messages=[
        {
            "role": "user",
            "content": "What is the weather like in Tokyo?",
        }
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_temperature",
                "description": "Get the current temperature in a given location",
                "parameters": {
                    "$schema": "http://json-schema.org/draft-07/schema#",
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": 'The location to get the temperature for (e.g. "New York")',
                        },
                        "units": {
                            "type": "string",
                            "description": 'The units to get the temperature in (must be "fahrenheit" or "celsius")',
                            "enum": ["fahrenheit", "celsius"],
                        },
                    },
                    "required": ["location"],
                    "additionalProperties": False,  # Python boolean, not JSON `false`
                },
            },
        }
    ],
    # optional: stream=True,
)
```

--------------------------------

### Example: extra_headers for Request Modification

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Illustrates how `extra_headers` can override default request headers, changing values for `Safety-Checks` and adding `Intelligence-Level`.

```text
Safety-Checks: on

```

--------------------------------

### Configure 'experimental_best_of_n' Strategy

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Sets up the 'experimental_best_of_n' inference strategy, which generates N candidate responses and uses an evaluator model to select the best one. The 'candidates' parameter lists variant names for generating responses.

```toml
[functions.draft-email.variants.promptA]
type = "chat_completion"
# ...

[functions.draft-email.variants.promptB]
type = "chat_completion"
# ...

[functions.draft-email.variants.best-of-n]
type = "experimental_best_of_n"
candidates = ["promptA", "promptA", "promptB"] # 3 candidate generations
# ...

```

```toml
[functions.draft-email.variants.best-of-n]
type = "experimental_best_of_n"
# ...

[functions.draft-email.variants.best-of-n.evaluator]
# Same fields as a `chat_completion` variant (excluding `type`), e.g.:
# user_template = "functions/draft-email/best-of-n/user.minijinja"
# ...

```
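
To make the selection flow concrete, here is a toy sketch of the best-of-n idea: generate one response per listed candidate, then let an evaluator pick the winner. This is not the gateway's implementation. The function `best_of_n` and the stand-ins `fake_generate` and `longest` are hypothetical, and in practice the evaluator is a model rather than a length heuristic.

```python
from typing import Callable, Sequence


def best_of_n(
    candidates: Sequence[str],
    generate: Callable[[str], str],
    pick_best: Callable[[Sequence[str]], int],
) -> str:
    """Generate one response per candidate variant and return the evaluator's choice."""
    if not candidates:
        raise ValueError("at least one candidate variant is required")
    # A variant listed twice is sampled twice, as with ["promptA", "promptA", "promptB"].
    responses = [generate(variant) for variant in candidates]
    return responses[pick_best(responses)]


def fake_generate(variant: str) -> str:
    """Stand-in for running a variant; returns a canned response."""
    return {"promptA": "Hi,", "promptB": "Hi Gabriel, could we meet?"}[variant]


def longest(responses: Sequence[str]) -> int:
    """Stand-in evaluator: return the index of the longest response."""
    return max(range(len(responses)), key=lambda i: len(responses[i]))


winner = best_of_n(["promptA", "promptA", "promptB"], fake_generate, longest)
```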