### Install OpenAI Node SDK

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Install the OpenAI SDK using npm. This is required to interact with the gateway.

```bash
npm i openai
```

--------------------------------

### Example Postgres Connection URL

Source: https://www.tensorzero.com/docs/evaluations/inference-evaluations/cli-reference

This example shows the format for the `TENSORZERO_POSTGRES_URL` environment variable, used for connecting to a Postgres database. Either ClickHouse or Postgres must be available.

```bash
TENSORZERO_POSTGRES_URL="postgres://myuser:mypass@localhost:5432/mydatabase"
```

--------------------------------

### Example ClickHouse Connection URL

Source: https://www.tensorzero.com/docs/evaluations/inference-evaluations/cli-reference

This example demonstrates the format for the `TENSORZERO_CLICKHOUSE_URL` environment variable, which is required for connecting to a ClickHouse database. Either ClickHouse or Postgres must be available.

```bash
TENSORZERO_CLICKHOUSE_URL=http://chuser:chpassword@localhost:8123/database_name
```

--------------------------------

### Example OpenAI API Key Environment Variable

Source: https://www.tensorzero.com/docs/evaluations/inference-evaluations/cli-reference

This example shows how to set the `OPENAI_API_KEY` environment variable. It is required when using external model providers without a TensorZero Gateway.

```bash
OPENAI_API_KEY=sk-...
```

--------------------------------

### Set Number of Examples for Inference

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Configure the `k` parameter to specify the number of examples to retrieve for inference. This is a required field.

```toml
[functions.draft-email.variants.dicl]
# ...
k = 10
# ...
```

--------------------------------

### Install OpenAI Python SDK

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Install the OpenAI Python SDK using pip.
This package is required to interact with the TensorZero Gateway using Python.

```bash
pip install openai
```

--------------------------------

### Model Calling Examples

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

Illustrates direct calls to provider APIs versus calls through local configuration, highlighting the importance of prefixes.

```text
function_name="extract-data"
```

```text
model_name="gpt-4o"
```

```text
model_name="openai::gpt-4o"
```

--------------------------------

### Example System Prompt Template

Source: https://www.tensorzero.com/docs/optimization/dynamic-in-context-learning-dicl

An example system prompt template for a named entity recognition task. It defines the assistant's role, the task, and the expected JSON output format.

```text
You are an assistant that is performing a named entity recognition task.
Your job is to extract entities from a given text.

The entities you are extracting are:

- people
- organizations
- locations
- miscellaneous other entities

Please return the entities in the following JSON format:

{
  "person": ["person1", "person2", ...],
  "organization": ["organization1", "organization2", ...],
  "location": ["location1", "location2", ...],
  "miscellaneous": ["miscellaneous1", "miscellaneous2", ...]
}
```

--------------------------------

### TensorZero Configuration Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

This TOML configuration (`tensorzero.toml`) shows how models and functions are defined, influencing how they are called via `model_name` or `function_name`.

```toml
[models.gpt-4o]
routing = ["openai", "azure"]

[models.gpt-4o.providers.openai]
# ...

[models.gpt-4o.providers.azure]
# ...

[functions.extract-data]
# ...
```

--------------------------------

### Example TensorZero Configuration

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

This TOML configuration shows how to define models and functions, including routing for model providers and specific provider configurations.

```toml
[models.gpt-4o]
routing = ["openai", "azure"]

[models.gpt-4o.providers.openai]
# ...

[models.gpt-4o.providers.azure]
# ...

[functions.extract-data]
# ...
```

--------------------------------

### Sample Response (JSON)

Source: https://www.tensorzero.com/docs/gateway/generate-structured-outputs

This is an example of a structured JSON response received from the TensorZero Gateway after a successful generation request.

```json
{
  "id": "019a78de-97d4-79d3-8b61-bcab4c697281",
  "episode_id": "019a78de-97d4-79d3-8b61-bcb10a8c02f4",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": "{\"name\":\"Sarah Johnson\",\"email\":\"sarah.j@example.com\"}",
        "tool_calls": null,
        "role": "assistant"
      }
    }
  ],
  "created": 1762964446,
  "model": "tensorzero::function_name::extract_data::variant_name::baseline",
  "system_fingerprint": "",
  "service_tier": null,
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 252,
    "completion_tokens": 26,
    "total_tokens": 278
  }
}
```

--------------------------------

### Example of dynamic credentials configuration

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

Demonstrates how to configure dynamic credentials for a model provider in TOML and the corresponding JSON structure for the request body.

```toml
[models.my_model_name.providers.my_provider_name]
# ...
# Note: the name of the credential field (e.g. `api_key_location`) depends on the provider type
api_key_location = "dynamic::my_dynamic_api_key_name"
# ...
```

```json
{
  // ...
  "tensorzero::credentials": {
    // ...
    "my_dynamic_api_key_name": "sk-..."
    // ...
  }
  // ...
}
```

--------------------------------

### Launch TensorZero Services

Source: https://www.tensorzero.com/docs/quickstart

After downloading the `docker-compose.yml` and placing your `tensorzero.toml` configuration file, use this command to start all the services defined in the Docker Compose file.

```bash
docker compose up
```

--------------------------------

### Chat Function with Structured System Prompt

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

This example demonstrates how to use the chat completion API with a structured system prompt for a specific function, `draft_email`. It includes a JSON request body, along with regular and streaming response formats.

```APIDOC
## POST /openai/v1/chat/completions

### Description
This endpoint allows for chat-based inference, enabling structured system prompts for specific functions.

### Method
POST

### Endpoint
/openai/v1/chat/completions

### Parameters
#### Request Body
- **messages** (array) - Required - The conversation messages, including a system message with structured content.
- **model** (string) - Required - The model to use. To call a function, use the format `tensorzero::function_name::<function_name>` (e.g. `tensorzero::function_name::draft_email`).
- **temperature** (number) - Optional - Controls randomness.
- **stream** (boolean) - Optional - Whether to stream the response.

### Request Example
```json
{
  "messages": [
    {
      "role": "system",
      "content": [{"assistant_name": "Alfred Pennyworth"}]
    },
    {
      "role": "user",
      "content": "I need to write an email to Gabriel explaining..."
    }
  ],
  "model": "tensorzero::function_name::draft_email",
  "temperature": 0.4
}
```

### Response
#### Success Response (200)
- **id** (string) - Unique identifier for the completion.
- **episode_id** (string) - Identifier for the inference episode.
- **model** (string) - The model used for the completion.
- **choices** (array) - An array of completion choices.
  - **index** (integer) - Index of the choice.
  - **finish_reason** (string) - The reason the model stopped generating tokens.
  - **message** (object) - The generated message.
    - **content** (string) - The assistant's reply.
    - **role** (string) - The role of the message sender (e.g., 'assistant').
- **usage** (object) - Token usage statistics.
  - **prompt_tokens** (integer) - Number of tokens in the prompt.
  - **completion_tokens** (integer) - Number of tokens in the completion.
  - **total_tokens** (integer) - Total tokens used.
  - **tensorzero_cost** (number) - Cost associated with the inference.

#### Response Example (Regular)
```json
{
  "id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "model": "email_draft_variant",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": "Hi Gabriel,\n\nI noticed...",
        "role": "assistant"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 100,
    "total_tokens": 200,
    "tensorzero_cost": 0.0003
  }
}
```

#### Response Example (Streaming)
In streaming mode, the response is an SSE stream of JSON messages, followed by a final `[DONE]` message. Each JSON message has the following fields:

```json
{
  "id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "model": "email_draft_variant",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "delta": {
        "content": "Hi Gabriel,\n\nI noticed..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 100,
    "total_tokens": 200,
    "tensorzero_cost": 0.0003
  }
}
```
```

--------------------------------

### Start Workflow Evaluation Run (Python)

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/tutorial

Starts a workflow evaluation run, allowing you to specify variants for specific functions. If a variant is not specified, the TensorZero Gateway will sample one.
```python
run_info = t0.workflow_evaluation_run(
    # Assume we have these variants defined in our `tensorzero.toml` configuration file
    variants={
        "generate_database_query": "o4_mini_prompt_baseline",
        "generate_final_answer": "gpt_4o_updated_prompt",
    },
    project_name="simple_rag_project",
    display_name="generate_database_query::o4_mini_prompt_baseline;generate_final_answer::gpt_4o_updated_prompt",
)
```

--------------------------------

### Sample LLM Response (Python)

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Example of a successful chat completion response from the TensorZero Gateway, including message content, model used, and token usage.

```python
ChatCompletion(
    id='0198d33f-24f6-7cc3-9dd0-62ba627b27db',
    choices=[
        Choice(
            finish_reason='stop',
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content='Sure! Did you know that octopuses have three hearts? Two pump blood to the gills, while the third pumps it to the rest of the body. And, when an octopus swims, the heart that delivers blood to the body actually **stops beating**—which is why they prefer to crawl rather than swim!',
                refusal=None,
                role='assistant',
                annotations=None,
                audio=None,
                function_call=None,
                tool_calls=[]
            )
        )
    ],
    created=1755890789,
    model='tensorzero::model_name::openai::gpt-5-mini',
    object='chat.completion',
    service_tier=None,
    system_fingerprint='',
    usage=CompletionUsage(
        completion_tokens=67,
        prompt_tokens=13,
        total_tokens=80,
        completion_tokens_details=None,
        prompt_tokens_details=None
    ),
    episode_id='0198d33f-24f6-7cc3-9dd0-62cd7028c3d7'
)
```

--------------------------------

### Deploy TensorZero Gateway with Docker

Source: https://www.tensorzero.com/docs/gateway/call-any-llm

Deploy the TensorZero Gateway using Docker. This command starts the gateway, exposing it on port 3000 and using default configurations.
```bash
docker run \
  -e OPENAI_API_KEY \
  -p 3000:3000 \
  tensorzero/gateway \
  --default-config
```

--------------------------------

### Create Prompt Template for Data Extraction

Source: https://www.tensorzero.com/docs/gateway/generate-structured-outputs

Design a Jinja template (`system_template.minijinja`) to instruct the AI on extracting specific information and formatting it as JSON, including examples.

```txt
You are a helpful AI assistant that extracts customer information from messages.

Extract the customer's name and email address if present. Use null for any fields that are not found.

Your output should be a JSON object with the following schema:

{
  "name": string or null,
  "email": string or null
}

---

Examples:

User: Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com
Assistant: {"name": "Sarah Johnson", "email": "sarah.j@example.com"}

User: My email is contact@company.com
Assistant: {"name": null, "email": "contact@company.com"}

User: This is John Doe reaching out
Assistant: {"name": "John Doe", "email": null}
```

--------------------------------

### Configure Function Experimentation in TOML

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Set up experimentation for a function to manage A/B testing of variants. This example shows a placeholder for general experimentation settings.

```toml
[functions.draft-email.experimentation]
# fieldA = ...
# fieldB = ...
# ...
```

--------------------------------

### Generate Structured Output (Python)

Source: https://www.tensorzero.com/docs/gateway/generate-structured-outputs

Use the OpenAI Python SDK to interact with TensorZero Gateway for structured output generation. Ensure the SDK is installed and configured.
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/openai/v1",
    api_key="unused",
)

response = client.chat.completions.create(
    model="tensorzero::function_name::extract_data",
    messages=[
        {
            "role": "user",
            "content": "Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com",
        }
    ],
    # The response format can be specified as a JSON object
    response_format={"type": "json_object"},
)

# The message content is the JSON string generated by the model
print(response.choices[0].message.content)
```

--------------------------------

### Chat Function with Tool Use (Python)

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

Use the Python client to invoke a chat function that uses tools. This example calls the `weather_bot` function to get the weather in Tokyo.

```python
import asyncio

from tensorzero import AsyncTensorZeroGateway


async def main():
    # Connect to a gateway running locally on port 3000
    async with await AsyncTensorZeroGateway.build_http(gateway_url="http://localhost:3000") as client:
        result = await client.inference(
            function_name="weather_bot",
            input={
                "messages": [
                    {
                        "role": "user",
                        "content": "What is the weather like in Tokyo?"
                    }
                ]
            },
            # optional: stream=True,
        )
        print(result)


asyncio.run(main())
```

--------------------------------

### Configure ClickHouse Connection URL

Source: https://www.tensorzero.com/docs/deployment/clickhouse

Set the `TENSORZERO_CLICKHOUSE_URL` environment variable to connect the TensorZero Gateway to your ClickHouse instance. Examples are provided for local, ClickHouse Cloud, and containerized setups.
```bash
TENSORZERO_CLICKHOUSE_URL="http[s]://[username]:[password]@[hostname]:[port]/[database]"
```

```bash
# Example: ClickHouse running locally
TENSORZERO_CLICKHOUSE_URL="http://chuser:chpassword@localhost:8123/tensorzero"
```

```bash
# Example: ClickHouse Cloud
TENSORZERO_CLICKHOUSE_URL="https://USERNAME:PASSWORD@XXXXX.clickhouse.cloud:8443/tensorzero"
```

```bash
# Example: TensorZero Gateway running in a container, ClickHouse running on host machine
TENSORZERO_CLICKHOUSE_URL="http://host.docker.internal:8123/tensorzero"
```

--------------------------------

### Download Docker Compose File

Source: https://www.tensorzero.com/docs/quickstart

Use this command to download the sample `docker-compose.yml` file. This file configures the TensorZero Gateway, UI, and a development Postgres database.

```bash
curl -LO "https://raw.githubusercontent.com/tensorzero/tensorzero/refs/heads/main/examples/docs/guides/quickstart/docker-compose.yml"
```

--------------------------------

### Function Call Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

This example demonstrates how to call a function through the OpenAI-compatible inference API.

```APIDOC
## Function Call Example

### Description
This example shows how to use the `get_temperature` function.

### Method
POST

### Endpoint
/openai/v1/chat/completions

### Request Body
```json
{
  "model": "tensorzero::function_name::get_temperature",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI assistant..."
    },
    {
      "role": "user",
      "content": "What is the temperature in San Francisco?"
    }
  ]
}
```

### Response
#### Success Response (200)
```json
{
  "id": "chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxx",
  "episode_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "model": "get_temperature_variant",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": "{\"location\": \"San Francisco\", \"temperature\": \"15°C\"}"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 100,
    "total_tokens": 200,
    "tensorzero_cost": 0.0003
  }
}
```
```

--------------------------------

### Calling Configured vs. Direct Provider Models

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

Demonstrates how to call a configured function, a configured model with fallback options, and a specific provider's model directly, bypassing local configuration.

```text
tensorzero::function_name::extract-data
```

```text
tensorzero::model_name::gpt-4o
```

```text
tensorzero::model_name::openai::gpt-4o
```

--------------------------------

### OpenAI Node SDK Example

Source: https://www.tensorzero.com/docs/gateway/generate-structured-outputs

Demonstrates how to use the OpenAI Node SDK to interact with the TensorZero Gateway for structured output generation. The response content is a JSON string.

```APIDOC
## OpenAI Node SDK

### Description
You can point the OpenAI Node SDK to a TensorZero Gateway to generate structured outputs. The response `content` is the JSON string generated by the model.

### Method Signature
```ts
client.chat.completions.create({
  model: "tensorzero::function_name::extract_data",
  messages: [
    {
      role: "user",
      content: "Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com"
    }
  ]
});
```

### Request Example
```json
{
  "model": "tensorzero::function_name::extract_data",
  "messages": [
    {
      "role": "user",
      "content": "Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com"
    }
  ]
}
```

### Response Example
```json
{
  "id": "019a78de-97d4-79d3-8b61-bcab4c697281",
  "episode_id": "019a78de-97d4-79d3-8b61-bcb10a8c02f4",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": "{\"name\":\"Sarah Johnson\",\"email\":\"sarah.j@example.com\"}",
        "tool_calls": null,
        "role": "assistant"
      }
    }
  ],
  "created": 1762964446,
  "model": "tensorzero::function_name::extract_data::variant_name::baseline",
  "system_fingerprint": "",
  "service_tier": null,
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 252,
    "completion_tokens": 26,
    "total_tokens": 278
  }
}
```
```

--------------------------------

### AWS SageMaker Hosted Provider Example (Ollama)

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Example of configuring the AWS SageMaker provider to use an OpenAI-compatible server like Ollama.

```toml
[models.claude-haiku-4-5.providers.aws_sagemaker]
# ...
type = "aws_sagemaker"
hosted_provider = "openai"
# ...
```

--------------------------------

### OpenAI Custom Tool Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

This example demonstrates how to use an OpenAI custom tool with the `code_generator` name and freeform text output format.
```APIDOC
## Example: OpenAI Custom Tool

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/openai/v1",
    api_key="your_api_key",
)

response = client.chat.completions.create(
    model="tensorzero::model_name::openai::gpt-5-mini",
    messages=[
        {"role": "user", "content": "Generate Python code to print 'Hello, World!'"}
    ],
    tools=[
        {
            "type": "custom",
            "custom": {
                "name": "code_generator",
                "description": "Generates Python code snippets",
                "format": {"type": "text"}
            }
        }
    ],
)
```
```

--------------------------------

### Example: OpenAI Custom Tool with Text Output

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

Demonstrates how to use a custom tool with a text output format for code generation. Ensure you are using an OpenAI model for this functionality.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/openai/v1",
    api_key="your_api_key",
)

response = client.chat.completions.create(
    model="tensorzero::model_name::openai::gpt-5-mini",
    messages=[
        {"role": "user", "content": "Generate Python code to print 'Hello, World!'"}
    ],
    tools=[
        {
            "type": "custom",
            "custom": {
                "name": "code_generator",
                "description": "Generates Python code snippets",
                "format": {"type": "text"}
            }
        }
    ],
)
```

--------------------------------

### Starting an episode in a workflow evaluation run

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/api-reference

Starts a new episode within an existing workflow evaluation run. Episodes help segment different parts of a workflow.

```APIDOC
## Starting an episode in a workflow evaluation run

### Description
Starts a new episode within an existing workflow evaluation run. Episodes help segment different parts of a workflow.

### Method
POST

### Endpoint
/workflow_evaluation_run/{run_id}/episode

### Parameters
#### Path Parameters
- **run_id** (UUID) - Required - The ID of the run generated by the `workflow_evaluation_run` method.

#### Request Body
- **task_name** (string) - Optional - The name of the task to associate the episode with.
- **tags** (dictionary) - Optional - A dictionary of key-value pairs to tag the episode's inferences with.

### Response
#### Success Response (200)
- **episode_id** (UUID) - The ID of the episode.
```

--------------------------------

### Streaming Inference Response Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible

In streaming mode, responses are Server-Sent Events (SSE) JSON messages. This example shows the structure of one such message, which is followed by a `[DONE]` message.

```json
{
  "id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "model": "weather_bot_variant",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "content": null,
        "tool_calls": [
          {
            "id": "123456789",
            "type": "function"
          }
        ],
        "role": "assistant"
      }
    }
  ]
}
```

--------------------------------

### Docker Compose Configuration for TensorZero Gateway with Gemini API

Source: https://www.tensorzero.com/docs/integrations/model-providers/google-ai-studio-gemini

Set up a Docker Compose file to run the TensorZero Gateway, mounting the configuration directory and setting the necessary Google AI Studio API key environment variable. This is a simplified example for learning.

```yaml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      GOOGLE_AI_STUDIO_API_KEY: ${GOOGLE_AI_STUDIO_API_KEY:?Environment variable GOOGLE_AI_STUDIO_API_KEY must be set.}
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
```

--------------------------------

### Start Workflow Evaluation Run (Async Python)

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/tutorial

Starts an asynchronous workflow evaluation run, enabling you to pin specific variants. Unspecified variants are sampled by the TensorZero Gateway.

```python
run_info = await t0.workflow_evaluation_run(
    # Assume we have these variants defined in our `tensorzero.toml` configuration file
    variants={
        "generate_database_query": "o4_mini_prompt_baseline",
        "generate_final_answer": "gpt_4o_updated_prompt",
    },
    project_name="simple_rag_project",
    display_name="generate_database_query::o4_mini_prompt_baseline;generate_final_answer::gpt_4o_updated_prompt",
)
```

--------------------------------

### Run Valkey with Docker

Source: https://www.tensorzero.com/docs/deployment/valkey-redis

Execute this command to run Valkey as a Docker container.

```bash
docker run -d --name valkey -p 6379:6379 valkey/valkey:8
```

--------------------------------

### Configure Model Fallback and Direct API Calls

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Example TOML configuration showing how to define a model with fallback providers and how to directly call a provider's API using short-hand notation.

```toml
[models.gpt-4o]
routing = ["openai", "azure"]

[models.gpt-4o.providers.openai]
# ...

[models.gpt-4o.providers.azure]
# ...
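
# Sketch of how this configuration might be addressed at inference time.
# The two `model_name` values come from the "Model Calling Examples" snippet of
# the inference API reference; the fallback order is an assumption based on the
# order of the `routing` list.
#
#   model_name = "gpt-4o"          # configured model: `openai` first, `azure` as fallback
#   model_name = "openai::gpt-4o"  # short-hand: calls the OpenAI API directly, bypassing [models.gpt-4o]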
```

--------------------------------

### Deploy TensorZero Gateway with Docker Compose

Source: https://www.tensorzero.com/docs/integrations/model-providers/vllm

Set up a minimal Docker Compose file to run the TensorZero Gateway. Mount the configuration directory and expose the gateway port. Ensure `host.docker.internal` is correctly mapped for local vLLM access.

```yaml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    # environment:
    #   VLLM_API_KEY: ${VLLM_API_KEY:?Environment variable VLLM_API_KEY must be set.}
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
```

--------------------------------

### Chat Function: Unknown Content Block Type Example

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

This example shows how an unknown content block type from a model provider is represented in the response. It includes the original data and optional provider/model information.

```json
{
  "type": "unknown",
  "data": {
    "type": "daydreaming",
    "dream": "..."
  },
  "model_name": "your_model_name",
  "provider_name": "your_provider_name"
}
```

--------------------------------

### GET /health

Source: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

Checks the health of the TensorZero Gateway and its dependencies.

```APIDOC
## GET /health

### Description
This endpoint checks that the gateway is running successfully and can communicate with its dependencies, such as Postgres or ClickHouse (if enabled).

### Method
GET

### Endpoint
/health

### Response
#### Success Response (200)
- gateway (string) - Indicates the status of the gateway.
- postgres (string) - Indicates the status of the Postgres dependency (if enabled).

### Response Example
{
  "gateway": "ok",
  "postgres": "ok"
}
```

--------------------------------

### Docker Compose for TensorZero Gateway Deployment

Source: https://www.tensorzero.com/docs/gateway/call-llms-with-image-and-file-inputs

Deploy the TensorZero Gateway, Postgres, and MinIO using Docker Compose. This example is for learning purposes and requires setting the `OPENAI_API_KEY` environment variable. It configures MinIO with specific access keys and connects to the Postgres database.

```yaml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      # Mount our tensorzero.toml file into the container
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY:?Environment variable OPENAI_API_KEY must be set.}
      S3_ACCESS_KEY_ID: miniouser
      S3_SECRET_ACCESS_KEY: miniopassword
      TENSORZERO_POSTGRES_URL: postgres://postgres:postgres@postgres:5432/tensorzero
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    healthcheck:
      test:
        [
          "CMD",
          "wget",
          "--no-verbose",
          "--tries=1",
          "--spider",
          "http://localhost:3000/health",
        ]
      start_period: 1s
      start_interval: 1s
      timeout: 1s
    depends_on:
      postgres:
        condition: service_healthy
      gateway-run-postgres-migrations:
        condition: service_completed_successfully
      minio:
        condition: service_healthy

  # For a production deployment, you can use AWS S3, GCP Cloud Storage, Cloudflare R2, etc.
  minio:
    image: bitnamilegacy/minio:2025.7.23
    ports:
      - "9000:9000" # API port
      - "9001:9001" # Console port
```

--------------------------------

### GET /status

Source: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

Checks if the TensorZero Gateway is running successfully.

```APIDOC
## GET /status

### Description
This endpoint checks that the gateway is running successfully.

### Method
GET

### Endpoint
/status

### Response
#### Success Response (200)
- status (string) - Indicates the status of the gateway.

### Response Example
{
  "status": "ok"
}
```

--------------------------------

### Build and Run TensorZero Gateway from Source

Source: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

Compile and run the TensorZero Gateway directly on your host machine using Cargo. This command builds the gateway with a performance profile and requires a custom configuration file.

```bash
cargo run --profile performance --bin gateway -- --config-file path/to/your/tensorzero.toml
```

--------------------------------

### Start Workflow Evaluation Run with Curl

Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/tutorial

Use this command to start a workflow evaluation run. You can specify variants for functions, and optionally provide a project name for comparison and a display name for identification in the TensorZero UI. If no variant is specified, the TensorZero Gateway will sample one.

```bash
curl -X POST http://localhost:3000/workflow_evaluation_run \
  -H "Content-Type: application/json" \
  -d '{
    "variants": {
      "generate_database_query": "o4_mini_prompt_baseline",
      "generate_final_answer": "gpt_4o_updated_prompt"
    },
    "project_name": "simple_rag_project",
    "display_name": "generate_database_query::o4_mini_prompt_baseline;generate_final_answer::gpt_4o_updated_prompt"
  }'
```

--------------------------------

### Sample Inference Output

Source: https://www.tensorzero.com/docs/operations/centralize-auth-rate-limits-and-more

Example JSON output for a chat completion request.

```json
{
  "id": "01940627-935f-7fa1-a398-e1f57f18064a",
  "object": "chat.completion",
  "created": 1738000000,
  "model": "tensorzero::model_name::openai::gpt-5-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Wires hum with pure thought, \nDreams of codes in twilight's glow, \nBeyond human touch."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 23,
    "total_tokens": 38
  }
}
```

--------------------------------

### Set Azure Provider Type

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

Defines the type of the model provider. This example sets the type to `azure`.

```toml
[models.gpt-4o.providers.azure]
type = "azure"
```

--------------------------------

### Deploy TensorZero Gateway

Source: https://www.tensorzero.com/docs/gateway/call-the-openai-responses-api

Run the TensorZero Gateway using Docker, mounting the configuration file and exposing port 3000. Ensure the `OPENAI_API_KEY` is set as an environment variable.

```bash
docker run \
  -e OPENAI_API_KEY \
  -v $(pwd)/tensorzero.toml:/app/config/tensorzero.toml:ro \
  -p 3000:3000 \
  tensorzero/gateway \
  --config-file /app/config/tensorzero.toml
```

--------------------------------

### Specify OpenAI provider type

Source: https://www.tensorzero.com/docs/gateway/configuration-reference

The `type` field is mandatory and specifies the model provider. This example sets the type to `openai`.

```toml
[embedding_models.model-name.providers.openai]
# ...
type = "openai"
# ...
```

--------------------------------

### Draft Email

Source: https://www.tensorzero.com/docs/gateway/api-reference/inference

This example demonstrates how to use the inference API to draft an email. It shows the JSON request body along with the regular and streaming response formats.

```APIDOC
## POST /inference (Draft Email)

### Description
This endpoint is used to draft an email based on the provided recipient and purpose.

### Method
POST

### Endpoint
/inference

### Request Body
- **function_name** (string) - Required - The name of the function to call, e.g., "draft_email".
- **input** (object) - Required - The input parameters for the function.
  - **system** (object) - Optional - System-level configurations like tone.
    - **tone** (string) - Optional - The desired tone for the email (e.g., "casual").
  - **messages** (array) - Required - An array of message objects.
    - **role** (string) - Required - The role of the message sender (e.g., "user").
    - **content** (array) - Required - The content of the message.
      - **type** (string) - Required - The type of content (e.g., "text").
      - **arguments** (object) - Required - The arguments for the content.
        - **recipient** (string) - Required - The recipient of the email.
        - **email_purpose** (string) - Required - The purpose of the email.

### Request Example
```json
{
  "function_name": "draft_email",
  "input": {
    "system": {"tone": "casual"},
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "arguments": {
              "recipient": "Gabriel",
              "email_purpose": "Request a meeting to..."
            }
          }
        ]
      }
    ]
  }
}
```

### Response
#### Success Response (200)
- **inference_id** (string) - Unique identifier for the inference.
- **episode_id** (string) - Identifier for the conversation episode.
- **variant_name** (string) - The name of the model variant used.
- **content** (array) - The generated content.
  - **type** (string) - The type of content (e.g., "text").
  - **text** (string) - The generated text.
- **usage** (object) - Token usage information.
  - **input_tokens** (integer) - Number of input tokens.
  - **output_tokens** (integer) - Number of output tokens.
  - **cost** (number) - The cost of the inference.

#### Response Example
```json
{
  "inference_id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "11111111-1111-1111-1111-111111111111",
  "variant_name": "prompt_v1",
  "content": [
    {
      "type": "text",
      "text": "Hi Gabriel,\n\nI noticed..."
    }
  ],
  "usage": {
    "input_tokens": 100,
    "output_tokens": 100,
    "cost": 0.0003
  }
}
```

#### Streaming Response
In streaming mode, the response is an SSE stream of JSON messages, followed by a final `[DONE]` message. Each JSON message has the following fields:
- **inference_id** (string) - Unique identifier for the inference.
- **episode_id** (string) - Identifier for the conversation episode.
- **variant_name** (string) - The name of the model variant used. - **content** (array) - The generated content delta. - **type** (string) - The type of content (e.g., "text"). - **id** (string) - Identifier for the content chunk. - **text** (string) - The generated text delta. - **usage** (object) - Token usage information. - **input_tokens** (integer) - Number of input tokens. - **output_tokens** (integer) - Number of output tokens. - **cost** (number) - The cost of the inference. #### Streaming Response Example ```json { "inference_id": "00000000-0000-0000-0000-000000000000", "episode_id": "11111111-1111-1111-1111-111111111111", "variant_name": "prompt_v1", "content": [ { "type": "text", "id": "0", "text": "Hi Gabriel," } ], "usage": { "input_tokens": 100, "output_tokens": 100, "cost": 0.0003 } } ``` ``` -------------------------------- ### Docker Compose for TensorZero Gateway Source: https://www.tensorzero.com/docs/integrations/model-providers/aws-bedrock Set up a minimal Docker Compose file to run the TensorZero Gateway. Configure volumes for the configuration file and environment variables for AWS credentials. This setup is for learning purposes and not recommended for production. ```yaml # This is a simplified example for learning purposes. Do not use this in production. 
# For production-ready deployments, see: https://www.tensorzero.com/docs/gateway/deployment services: gateway: image: tensorzero/gateway volumes: - ./config:/app/config:ro command: --config-file /app/config/tensorzero.toml environment: # AWS_BEARER_TOKEN_BEDROCK: ${AWS_BEARER_TOKEN_BEDROCK:?Environment variable AWS_BEARER_TOKEN_BEDROCK must be set.} AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID:?Environment variable AWS_ACCESS_KEY_ID must be set.} AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY:?Environment variable AWS_SECRET_ACCESS_KEY must be set.} # AWS_SESSION_TOKEN: ${AWS_SESSION_TOKEN:?Environment variable AWS_SESSION_TOKEN must be set.} ports: - "3000:3000" extra_hosts: - "host.docker.internal:host-gateway" ``` -------------------------------- ### Migrate Legacy Prompt Template Configuration Source: https://www.tensorzero.com/docs/gateway/create-a-prompt-template Update your configuration to use the new `templates.your_template_name.path` and `schemas.your_schema_name.path` format for prompt templates and schemas. This ensures forward compatibility for historical observability data. ```text | Legacy Configuration | Updated Configuration | | -------------------- | -------------------------- | | `system_template` | `templates.system.path` | | `system_schema` | `schemas.system.path` | | `user_template` | `templates.user.path` | | `user_schema` | `schemas.user.path` | | `assistant_template` | `templates.assistant.path` | | `assistant_schema` | `schemas.assistant.path` | ``` -------------------------------- ### Sample Response for Custom Embedding Model Source: https://www.tensorzero.com/docs/gateway/generate-embeddings Example response structure for an embedding request made using a custom model. ```python CreateEmbeddingResponse( data=[ Embedding( embedding=[ -0.019143931567668915, # ... 
], index=0, object='embedding' ) ], model='tensorzero::embedding_model_name::nomic-embed-text', object='list', usage=Usage(prompt_tokens=4, total_tokens=4) ) ``` -------------------------------- ### Initialize OpenAI Client for TensorZero Gateway (Python) Source: https://www.tensorzero.com/docs/gateway/call-any-llm Initialize the OpenAI Python client, pointing it to the TensorZero Gateway's base URL. The API key is not used for authentication when connecting to the gateway. ```python from openai import OpenAI client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used") ``` -------------------------------- ### Starting a workflow evaluation run Source: https://www.tensorzero.com/docs/evaluations/workflow-evaluations/api-reference Initializes a new workflow evaluation run. This is the first step in evaluating a complex workflow. ```APIDOC ## Starting a workflow evaluation run ### Description Initializes a new workflow evaluation run. This is the first step in evaluating a complex workflow. ### Method POST ### Endpoint /workflow_evaluation_run ### Parameters #### Request Body - **variants** (object) - Required - A dictionary mapping function names to variant names. - **project_name** (string) - Optional - The name of the project to associate the run with. - **display_name** (string) - Optional - The display (human-readable) name of the run. - **tags** (dictionary) - Optional - A dictionary of key-value pairs to tag the run's inferences with. ### Response #### Success Response (200) - **run_id** (UUID) - The ID of the run. ``` -------------------------------- ### Create TensorZero API Key via CLI Source: https://www.tensorzero.com/docs/operations/set-up-auth-for-tensorzero Create a TensorZero API key using the gateway binary in the CLI. Optionally, specify an expiration date and time in UTC. 
```bash docker compose run --rm gateway --create-api-key ``` ```bash docker compose run --rm gateway --create-api-key --expiration "2025-12-20 23:00:00.000000 UTC" ``` -------------------------------- ### Get datapoints by ID Source: https://www.tensorzero.com/docs/gateway/api-reference/datasets-datapoints Retrieves specific datapoints from a dataset using their unique IDs. Stale datapoints are included in the response. ```APIDOC ## POST /v1/datasets/{dataset_name}/get_datapoints ### Description Retrieves specific datapoints by their IDs. ### Method POST ### Endpoint /v1/datasets/{dataset_name}/get_datapoints ### Parameters #### Path Parameters - **dataset_name** (string) - Required - The name of the dataset. #### Request Body - **ids** (list of UUIDs) - Required - A list of datapoint IDs to retrieve. ### Response #### Success Response (200) - **datapoints** (list of objects) - A list of datapoint objects. ### Request Example ```json { "ids": [ "a1b2c3d4-e5f6-7890-1234-567890abcdef", "f0e9d8c7-b6a5-4321-fedc-ba9876543210" ] } ``` ### Response Example ```json { "datapoints": [ { "id": "a1b2c3d4-e5f6-7890-1234-567890abcdef", "function_name": "my_function", "input": { "text": "Hello" }, "output": [ { "type": "text", "content": "Hi there!" } ], "tags": { "source": "user" }, "created_at": "2023-10-27T10:00:00Z", "updated_at": "2023-10-27T10:00:00Z" } ] } ``` ``` -------------------------------- ### Python SDK Example for Inference Source: https://www.tensorzero.com/docs/gateway/api-reference/inference-openai-compatible Use the OpenAI Python SDK, pointed at the TensorZero Gateway's OpenAI-compatible endpoint, to send an inference request with tool definitions. Ensure the client is initialized with the correct API key and base URL. ```python from openai import OpenAI client = OpenAI(api_key="YOUR_API_KEY", base_url="http://localhost:3000/openai/v1") response = client.chat.completions.create( model="tensorzero::function_name::weather_bot", messages=[ { "role": "user", "content": "What is the weather like in Tokyo?"
} ], tools=[ { "type": "function", "function": { "name": "get_temperature", "description": "Get the current temperature in a given location", "parameters": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "location": { "type": "string", "description": "The location to get the temperature for (e.g. \"New York\")" }, "units": { "type": "string", "description": "The units to get the temperature in (must be \"fahrenheit\" or \"celsius\")", "enum": ["fahrenheit", "celsius"] } }, "required": ["location"], "additionalProperties": False } } } ], # optional: stream=True, ) ``` -------------------------------- ### Example: extra_headers for Request Modification Source: https://www.tensorzero.com/docs/gateway/configuration-reference Illustrates how `extra_headers` can override default request headers, changing values for `Safety-Checks` and adding `Intelligence-Level`. ```text Safety-Checks: on ``` -------------------------------- ### Configure 'experimental_best_of_n' Strategy Source: https://www.tensorzero.com/docs/gateway/configuration-reference Sets up the 'experimental_best_of_n' inference strategy, which generates N candidate responses and uses an evaluator model to select the best one. The 'candidates' parameter lists variant names for generating responses. ```toml [functions.draft-email.variants.promptA] type = "chat_completion" # ... [functions.draft-email.variants.promptB] type = "chat_completion" # ... [functions.draft-email.variants.best-of-n] type = "experimental_best_of_n" candidates = ["promptA", "promptA", "promptB"] # 3 candidate generations # ... ``` ```toml [functions.draft-email.variants.best-of-n] type = "experimental_best_of_n" # ... [functions.draft-email.variants.best-of-n.evaluator] # Same fields as a `chat_completion` variant (excl. `type`), e.g.: # user_template = "functions/draft-email/best-of-n/user.minijinja" # ... ```
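The best-of-N idea described above (generate one response per entry in `candidates`, then let an evaluator choose) can be illustrated with a small conceptual sketch. This is not TensorZero's implementation and makes no gateway calls: the `best_of_n` helper, the stub variants, and the toy `longest_response` evaluator are all hypothetical names invented for illustration, and a real evaluator would be a model call rather than a length comparison.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type aliases for this sketch; they are not part of TensorZero.
Variant = Callable[[str], str]
Evaluator = Callable[[str, Sequence[str]], int]


def best_of_n(
    prompt: str,
    candidates: Sequence[str],
    variants: Dict[str, Variant],
    evaluator: Evaluator,
) -> str:
    """Generate one response per candidate name, then return the evaluator's pick."""
    # A variant name may appear more than once, which yields several generations from it.
    responses = [variants[name](prompt) for name in candidates]
    chosen = evaluator(prompt, responses)
    if not 0 <= chosen < len(responses):
        raise ValueError(f"evaluator returned an invalid index: {chosen}")
    return responses[chosen]


# Deterministic stand-ins for the `promptA` and `promptB` variants (assumption: stubs only).
variants: Dict[str, Variant] = {
    "promptA": lambda prompt: f"Short reply to: {prompt}",
    "promptB": lambda prompt: f"A longer, more detailed reply to: {prompt}",
}


def longest_response(prompt: str, responses: Sequence[str]) -> int:
    """Toy evaluator: return the index of the longest response."""
    return max(range(len(responses)), key=lambda i: len(responses[i]))


result = best_of_n(
    "schedule a meeting",
    ["promptA", "promptA", "promptB"],  # mirrors the 3 candidate generations in the config
    variants,
    longest_response,
)
print(result)
```

Listing `promptA` twice mirrors the `candidates` array in the configuration: each entry produces its own generation, so repeating a name gives that variant more chances to be selected.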