### Install Llama Stack Client

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/examples/README.md

Install the client library from the repository root before running examples.

```bash
cd /Users/ashwin/local/new-stainless/llama-stack-client-python
uv sync
```

--------------------------------

### Add and Run Examples

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Create new example files in the `examples/` directory and make them executable. These files are not modified by the generator.

```py
#!/usr/bin/env -S uv run python
# add an example to examples/<your-example>.py (the shebang must stay on the first line)
…
```

```sh
chmod +x examples/<your-example>.py
# run the example against your api
./examples/<your-example>.py
```

--------------------------------

### Install from Git

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Install the ogx-client-python package directly from its GitHub repository using pip.

```sh
pip install git+ssh://git@github.com/ogx-ai/ogx-client-python.git
```

--------------------------------

### Install Ogx Client

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Install the Ogx Client library from PyPI. Use '--pre' for pre-release versions.

```sh
pip install --pre ogx_client
```

--------------------------------

### Interactive Agent CLI Session Example

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/examples/README.md

Demonstrates a typical interaction with the agent CLI, showing configuration, connection, knowledge base setup, agent creation, and a sample user query with assistant response including turn and step events.

```text
╔══════════════════════════════════════════════════════════════╗
║                                                              ║
║        🤖  Interactive Agent Explorer  🔍                    ║
║                                                              ║
║  Explore agent turn/step events with server-side tools      ║
║                                                              ║
╚══════════════════════════════════════════════════════════════╝

🔧 Configuration:
  Model: openai/gpt-4o
  Server: http://localhost:8321

🔌 Connecting to server...
  ✓ Connected

📚 Setting up knowledge base...
  Indexing documents....... ✓
  Vector store ID: vs_abc123

🤖 Creating agent with tools...
  ✓ Agent ready

💬 Type your questions (or 'quit' to exit, 'help' for suggestions)
──────────────────────────────────────────────────────────────

🧑 You: What is Project Phoenix?

🤖 Assistant:

  ┌─── Turn turn_abc123 started ───┐
  │                                 │
  │  🧠 Inference Step 0 started    │
  │  🔍 Tool Execution Step 1       │
  │     Tool: knowledge_search      │
  │     Status: server_side         │
  │  🧠 Inference Step 2            │
  │  ✓ Response: Project Phoenix... │
  │                                 │
  └─── Turn completed ──────────────┘

Project Phoenix is a next-generation distributed systems platform launched in 2024...
```

--------------------------------

### Build and Install from Source

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Build a distributable wheel file for the library and then install it using pip. This is useful for local development or distribution.

```sh
uv build
# or
python -m build
```

```sh
pip install ./path-to-wheel-file.whl
```

--------------------------------

### Install Ogx Client with aiohttp

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Install the Ogx Client library with aiohttp support for improved concurrency performance in async operations.

```sh
# install from PyPI
pip install --pre 'ogx_client[aiohttp]'
```

--------------------------------

### Start Mock Server

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Run the mock server script to simulate the API for testing purposes. This is often a prerequisite for running tests.

```sh
./scripts/mock
```

--------------------------------

### Sync Dependencies with uv

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

After installing uv manually, use this command to synchronize all project dependencies, including extras.

```sh
uv sync --all-extras
```

--------------------------------

### Start Llama Stack Server

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/examples/README.md

Prerequisite for running the interactive agent CLI. Ensure the server is running on the specified port.

```bash
cd ~/local/llama-stack
source ../stack-venv/bin/activate
export OPENAI_API_KEY=<your-key>
llama stack run ci-tests --port 8321
```

--------------------------------

### Create Response with Nested Parameters

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

This example demonstrates how to create a response using nested parameters, which are typed using `TypedDict`. Ensure `OgxClient` is imported.

```python
from ogx_client import OgxClient

client = OgxClient()

response_object = client.responses.create(
    input="string",
    model="model",
    prompt={"id": "id"},
)
print(response_object.prompt)
```
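Because nested parameters are typed with `TypedDict`, a type checker can validate their keys while the runtime value stays an ordinary `dict`. The sketch below illustrates that idea without calling the library. `PromptParam`, its `version` field, and `build_request` are hypothetical names chosen for illustration and are not taken from `ogx_client`.

```python
from typing import Dict, TypedDict


class PromptParam(TypedDict, total=False):
    """Hypothetical stand-in for a nested request parameter type."""

    id: str
    version: str  # assumed optional field, for illustration only


def build_request(input_text: str, model: str, prompt: PromptParam) -> Dict[str, object]:
    """Assemble keyword arguments shaped like the `responses.create` call above."""
    return {"input": input_text, "model": model, "prompt": prompt}


prompt: PromptParam = {"id": "id"}
kwargs = build_request("string", "model", prompt)

# At runtime a TypedDict value is just a dict, so it serializes like any other mapping.
print(type(prompt).__name__)
print(kwargs["prompt"])
```

The annotation only helps static analysis: passing a misspelled key would be flagged by a type checker such as mypy or pyright, but not by Python itself.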

--------------------------------

### Install Dependencies without uv

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

If not using uv, ensure the correct Python version is set and install development dependencies using pip.

```sh
pip install -r requirements-dev.lock
```

--------------------------------

### Use Alpha APIs for Administration and Advanced Inference

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Provides examples for using experimental 'alpha' endpoints, including an admin health check and reranking documents by relevance to a query. Requires importing OgxClient.

```python
from ogx_client import OgxClient

client = OgxClient()

# Admin health check (v1alpha)
health = client.alpha.admin.health()
print(health.status)

# Rerank documents by relevance to a query
reranked = client.alpha.inference.rerank(
    query="What is the capital of France?",
    documents=[
        "Paris is the capital of France.",
        "London is the capital of the UK.",
        "Berlin is the capital of Germany.",
    ],
    model="cross-encoder/ms-marco-MiniLM-L-6-v2",
    top_n=2,
)
for result in reranked.results:
    print(result.index, result.relevance_score, result.document)
```

--------------------------------

### Configure Default Timeout

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Set the default timeout for all requests. The default is 1 minute. This example sets it to 20 seconds.

```python
from ogx_client import OgxClient

# Configure the default for all requests:
client = OgxClient(
    # 20 seconds (default is 1 minute)
    timeout=20.0,
)
```

--------------------------------

### Paginate List Endpoints with Auto-Pagination and Cursors

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Demonstrates how to handle paginated API responses. It covers auto-paginating iterators for transparent fetching of all items and manual cursor-based pagination using `.has_next_page()` and `.get_next_page()`. Includes an asynchronous example.

```python
import asyncio
from ogx_client import OgxClient, AsyncOgxClient

client = OgxClient()

# Auto-paginate — fetches additional pages transparently
all_files = [f for f in client.files.list()]
print(f"Total files: {len(all_files)}")

# Manual cursor-based pagination
page = client.responses.list()
print(f"First page last_id: {page.last_id}")
if page.has_next_page():
    next_page = page.get_next_page()
    print(f"Second page items: {len(next_page.data)}")

# Async auto-pagination
async def collect_all_models():
    async_client = AsyncOgxClient()
    models = []
    async for m in async_client.models.list():
        models.append(m)
    return models

print(asyncio.run(collect_all_models()))
```
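To make the relationship between the two styles concrete, here is a minimal offline sketch of a manual page walk. `FakePage` is a hypothetical in-memory stand-in that only mimics the `data`, `last_id`, `.has_next_page()`, and `.get_next_page()` names used above; it is not the library's page class.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class FakePage:
    """In-memory stand-in for a cursor page; not the library's page class."""

    data: List[str]
    last_id: Optional[str]
    _remaining: List[List[str]]

    def has_next_page(self) -> bool:
        return bool(self._remaining)

    def get_next_page(self) -> "FakePage":
        nxt, rest = self._remaining[0], self._remaining[1:]
        return FakePage(data=nxt, last_id=nxt[-1] if nxt else None, _remaining=rest)


def iter_all(page: FakePage) -> Iterator[str]:
    """Walk every page explicitly, yielding items in order."""
    while True:
        yield from page.data
        if not page.has_next_page():
            break
        page = page.get_next_page()


first = FakePage(
    data=["resp_1", "resp_2"],
    last_id="resp_2",
    _remaining=[["resp_3"], ["resp_4", "resp_5"]],
)
all_items = list(iter_all(first))
print(all_items)
```

An auto-paginating iterator presumably performs a loop of this shape internally, so manual control is mainly useful when you want to stop early or inspect each page's cursor.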

--------------------------------

### Configure Per-Request Max Retries

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Override the default retry settings for a specific request using `with_options`. This example sets `max_retries` to 5 for a single call.

```python
from ogx_client import OgxClient

client = OgxClient()

# Configure retries for a single request:
client.with_options(max_retries=5).chat.completions.create(
    messages=[
        {
            "content": "string",
            "role": "user",
        }
    ],
    model="model",
)
```
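As a rough mental model of what a `max_retries` setting means, the sketch below retries a failing callable with exponential backoff. It is a generic illustration, not the library's retry policy: which errors are retried and how long the client waits are not described in this document, and `call_with_retries` and `flaky` are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(fn: Callable[[], T], max_retries: int = 2, base_delay: float = 0.0) -> T:
    """Call `fn`, retrying up to `max_retries` times on ConnectionError."""
    attempt = 0
    while True:
        try:
            return fn()
        except ConnectionError:
            if attempt >= max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
            attempt += 1


calls = {"count": 0}


def flaky() -> str:
    """Fail three times, then succeed."""
    calls["count"] += 1
    if calls["count"] <= 3:
        raise ConnectionError("simulated network failure")
    return "ok"


result = call_with_retries(flaky, max_retries=5)
print(result, calls["count"])
```

In this model `max_retries=5` allows up to six attempts in total, so a call that fails three times still succeeds on the fourth attempt.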

--------------------------------

### Determine Installed ogx-client Version

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Use this snippet to check the version of the ogx-client library that is currently active in your Python environment. This is useful for debugging or verifying upgrades.

```python
import ogx_client
print(ogx_client.__version__)
```
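If importing the package is undesirable (for example in a diagnostic script), the standard library can report the installed version from package metadata instead. This is a sketch that assumes the distribution is named `ogx_client`, matching the `pip install` command in this document; `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "ogx_client" is an assumed distribution name; adjust it if your install differs.
print(installed_version("ogx_client"))
missing = installed_version("definitely-not-a-real-distribution-xyz")
print(missing)
```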

--------------------------------

### Client Initialization

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Demonstrates how to initialize the synchronous OgxClient and asynchronous AsyncOgxClient, including options for explicit configuration and environment variable defaults.

````APIDOC
## Client Initialization

`OgxClient` and `AsyncOgxClient` are the entry points to the SDK. They auto-read `OGX_CLIENT_API_KEY` and `OGX_CLIENT_BASE_URL` from environment variables. `provider_data` injects an `X-OGX-Provider-Data` header on every request.

```python
import os
import httpx
from ogx_client import OgxClient, AsyncOgxClient, DefaultHttpxClient

# Minimal – reads OGX_CLIENT_API_KEY and OGX_CLIENT_BASE_URL from env
client = OgxClient()

# Explicit configuration with custom HTTP transport
client = OgxClient(
    api_key="my-secret-key",
    base_url="http://localhost:8321",
    timeout=30.0,
    max_retries=3,
    provider_data={"ollama": {"model": "llama3"}},
    http_client=DefaultHttpxClient(
        proxy="http://my.proxy.example.com",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)

# Async client (identical surface, just use `await`)
async_client = AsyncOgxClient(
    api_key=os.environ["OGX_CLIENT_API_KEY"],
    base_url="http://localhost:8321",
    max_retries=2,
)

# Context manager for explicit connection lifecycle
with OgxClient() as client:
    models = client.models.list()
    print([m.identifier for m in models])
# HTTP connection is closed here
```
````
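The environment-variable defaults can be pictured as a simple fallback rule. The sketch below assumes that explicit arguments take precedence over `OGX_CLIENT_API_KEY` and `OGX_CLIENT_BASE_URL`, which is the usual convention but is not confirmed by this document; `resolve_settings` is a hypothetical helper and does not call the library.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_settings(
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Optional[str]]:
    """Prefer explicit arguments, then fall back to the environment variables."""
    return {
        "api_key": api_key if api_key is not None else env.get("OGX_CLIENT_API_KEY"),
        "base_url": base_url if base_url is not None else env.get("OGX_CLIENT_BASE_URL"),
    }


# A plain dict stands in for the process environment so the sketch is reproducible.
fake_env = {"OGX_CLIENT_API_KEY": "env-key", "OGX_CLIENT_BASE_URL": "http://localhost:8321"}
from_env = resolve_settings(env=fake_env)
overridden = resolve_settings(base_url="http://example.test:9000", env=fake_env)
print(from_env)
print(overridden)
```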

--------------------------------

### Get Conversation Item

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Retrieves a specific item from a conversation. Accepts optional parameters for customization.

```APIDOC
## GET /v1/conversations/{conversation_id}/items/{item_id}

### Description
Retrieves a specific item from a conversation.

### Method
GET

### Endpoint
/v1/conversations/{conversation_id}/items/{item_id}

### Parameters
#### Path Parameters
- **conversation_id** (string) - Required - The ID of the conversation the item belongs to.
- **item_id** (string) - Required - The ID of the item to retrieve.
#### Query Parameters
- **params** (object) - Optional - Parameters for retrieving the item.

### Response
#### Success Response (200)
- **ItemGetResponse** - Details of the retrieved item.
```

--------------------------------

### Bootstrap Environment with uv

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Run this script to set up the development environment using uv for dependency management. It automatically provisions the correct Python version.

```sh
./scripts/bootstrap
```

--------------------------------

### Override Per-Request Timeout

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Override the default timeout settings for a specific request using `with_options`. This example sets the timeout to 5.0 seconds.

```python
from ogx_client import OgxClient

client = OgxClient()

# Override the timeout for a single request:
client.with_options(timeout=5.0).chat.completions.create(
    messages=[
        {
            "content": "string",
            "role": "user",
        }
    ],
    model="model",
)
```

--------------------------------

### List OpenAI-compatible models

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Demonstrates how to list models available in an OpenAI-compatible format using the client.models.openai.list() method.

````APIDOC
## List OpenAI-compatible models

```python
openai_models = client.models.openai.list()
for m in openai_models.data:
    print(m.id, m.created)
```
````

--------------------------------

### Async Client with aiohttp Backend

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Instantiate the AsyncOgxClient with DefaultAioHttpClient to use aiohttp as the HTTP backend. Use 'async with' for proper resource management.

```python
import asyncio
from ogx_client import DefaultAioHttpClient
from ogx_client import AsyncOgxClient


async def main() -> None:
    async with AsyncOgxClient(
        http_client=DefaultAioHttpClient(),
    ) as client:
        models = await client.models.list()


asyncio.run(main())
```

--------------------------------

### Run Tests

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Execute the project's test suite. Ensure the mock server is running if required by the tests.

```sh
./scripts/test
```

--------------------------------

### Build Agent with Client and Server Tools

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Constructs an `Agent` instance, configuring it with both client-side Python functions (like `get_weather`) and server-side tools (like `file_search`). This enables the agent to utilize a diverse set of capabilities.

```python
# Build agent with both client-side and server-side tools
agent = Agent(
    client=client,
    model="meta-llama/Llama-3.2-3B-Instruct",
    instructions="You are a helpful assistant. Use tools when appropriate.",
    tools=[
        get_weather,                              # client-side Python function
        {
            "type": "file_search",                # server-side tool
            "vector_store_ids": ["vs_abc123"],
        },
    ],
)
```
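The snippet above passes `get_weather` as a client-side tool without defining it. A minimal, hypothetical definition might look like the following. The type hints and `:param` docstring style are an assumption about what helps the agent describe the tool to the model, and the weather data is hard-coded so the sketch runs offline.

```python
def get_weather(city: str, unit: str = "celsius") -> str:
    """Return a short weather summary for a city.

    :param city: Name of the city to look up.
    :param unit: Temperature unit, either "celsius" or "fahrenheit".
    :returns: A human-readable weather summary.
    """
    # Hard-coded data keeps the sketch offline; a real tool would call a weather service.
    fake_celsius = {"Paris": 18.0, "Tokyo": 24.0}
    if city not in fake_celsius:
        return f"No weather data for {city}."
    temp = fake_celsius[city]
    if unit == "fahrenheit":
        temp = temp * 9 / 5 + 32
    return f"{city}: {temp:.1f} degrees {unit}"


print(get_weather("Paris"))
print(get_weather("Tokyo", unit="fahrenheit"))
```

Returning a plain string keeps the tool result easy to pass back to the model; structured return types may also work, but that is not covered here.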

--------------------------------

### Upload File using PathLike

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Demonstrates uploading a file by passing a `PathLike` object, so the client reads the file contents for you. The async client uses the same interface and reads the file contents asynchronously. Ensure `Path` and `OgxClient` are imported.

```python
from pathlib import Path
from ogx_client import OgxClient

client = OgxClient()

client.files.create(
    file=Path("/path/to/file"),
    purpose="assistants",
)
```

--------------------------------

### Manage Versioned Prompt Templates

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Create, retrieve, update, and manage versions of prompt templates. Supports variable interpolation for dynamic content. Ensure the OgxClient is initialized before use.

```python
from ogx_client import OgxClient

client = OgxClient()

# Create a versioned prompt template
prompt = client.prompts.create(
    prompt="You are a helpful assistant. Today's date is {{date}}. User query: {{query}}",
    variables=["date", "query"],
)
print(prompt.id, prompt.version)

# Retrieve with variable interpolation
rendered = client.prompts.retrieve(
    prompt.id,
    variables={"date": "2026-05-12", "query": "What's new in AI?"},
)
print(rendered.content)   # fully rendered prompt string

# Update the prompt text (creates a new version)
updated = client.prompts.update(
    prompt.id,
    prompt="You are an expert assistant. Date: {{date}}. Question: {{query}}",
)
print(updated.version)   # incremented

# Set a specific version as default
client.prompts.set_default_version(prompt.id, version=1)

# List all versions
versions = client.prompts.versions.list(prompt.id)
for v in versions.data:
    print(v.version, v.created_at)

# List all prompts and delete one
for p in client.prompts.list().data:
    print(p.id)
client.prompts.delete(prompt.id)
```
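To show what `{{variable}}` interpolation amounts to, here is a small local sketch that finds placeholders and substitutes values. Rendering in the example above happens on the server; this code only approximates the assumed semantics, and `find_variables` and `render` are hypothetical helpers.

```python
import re
from typing import Dict, List

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_variables(template: str) -> List[str]:
    """Return placeholder names in order of first appearance."""
    seen: List[str] = []
    for name in PLACEHOLDER.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


def render(template: str, values: Dict[str, str]) -> str:
    """Substitute each {{name}} placeholder; raise KeyError if a value is missing."""
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


template = "You are a helpful assistant. Today's date is {{date}}. User query: {{query}}"
variables = find_variables(template)
rendered = render(template, {"date": "2026-05-12", "query": "What's new in AI?"})
print(variables)
print(rendered)
```

A check like `find_variables` can be handy before calling `client.prompts.create`, to confirm that the `variables` list matches the placeholders actually present in the prompt text.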

--------------------------------

### client.files.create

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Creates a new file with the specified parameters.

```APIDOC
## POST /v1/files

### Description
Creates a new file.

### Method
POST

### Endpoint
/v1/files

### Parameters
#### Request Body
- **params** (object) - Required - Parameters for creating the file.
```

--------------------------------

### Configure HTTP client with proxies and transports

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Customize the underlying `httpx` client by providing a `DefaultHttpxClient` instance during `OgxClient` initialization. This allows configuration of proxies, transports, and other advanced settings.

```python
import httpx
from ogx_client import OgxClient, DefaultHttpxClient

client = OgxClient(
    # Or use the `OGX_CLIENT_BASE_URL` env var
    base_url="http://my.test.server.example.com:8083",
    http_client=DefaultHttpxClient(
        proxy="http://my.test.proxy.example.com",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)
```

--------------------------------

### Create a Prompt

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Creates a new prompt using parameters defined in `prompt_create_params.py`. Returns the created `Prompt` object.

```python
client.prompts.create(**params)
```

--------------------------------

### Run Interactive Agent CLI (Default)

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/examples/README.md

Execute the interactive agent CLI using default configurations for model and server URL.

```bash
cd examples
uv run python interactive_agent_cli.py
```

--------------------------------

### Vector Stores - client.vector_stores.file_batches.create

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Create a batch for uploading multiple files to a vector store.

````APIDOC
## Vector Stores - client.vector_stores.file_batches.create

### Description
Create a batch for uploading multiple files to a vector store.

### Method
POST

### Endpoint
/v1/vector_stores/{vector_store_id}/file_batches

### Parameters
#### Path Parameters
- **vector_store_id** (string) - Required - The ID of the vector store to add the batch to.

#### Request Body
- **file_ids** (list[string]) - Required - A list of file IDs to include in the batch.

### Request Example
```python
from ogx_client import OgxClient

client = OgxClient()

# Assuming 'store' is an existing vector store object and 'uploaded' is a file object
# store = client.vector_stores.create(...)
# uploaded = client.files.create(file=Path("/path/to/manual.pdf"), purpose="assistants")

batch = client.vector_stores.file_batches.create(
    store.id,
    file_ids=[uploaded.id],
)
print(batch.status)
```

### Response
#### Success Response (200)
- **id** (string) - The ID of the file batch.
- **vector_store_id** (string) - The ID of the vector store.
- **file_ids** (list[string]) - The list of file IDs in the batch.
- **status** (string) - The status of the file batch ingestion.
````

--------------------------------

### Manage Files for OGX APIs

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Upload files from disk or raw bytes, retrieve their metadata and content, and paginate through a list of files. Files can be used with Assistants, Batch, and Vector Store APIs. Ensure the file path is correct.

```python
from pathlib import Path
from ogx_client import OgxClient

client = OgxClient()

# Upload a file from disk
file = client.files.create(
    file=Path("/data/training_set.jsonl"),
    purpose="batch",
)
print(file.id, file.filename, file.bytes)

# Upload raw bytes
file_bytes = b'{"prompt": "Hello", "completion": " World"}\n'
file = client.files.create(
    file=("data.jsonl", file_bytes, "application/jsonl"),
    purpose="batch",
)

# Retrieve metadata
file = client.files.retrieve(file.id)
print(file.status)   # "uploaded", "processed", etc.

# Download file content
content: str = client.files.content(file.id)
print(content[:200])

# Paginate through all files
for f in client.files.list(purpose="assistants"):
    print(f.id, f.filename)

# Delete a file
client.files.delete(file.id)
```

--------------------------------

### Create Conversation with Initial Items

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Initiates a new conversation, optionally with initial messages or system prompts. Metadata can be added for tracking purposes. The `items` parameter allows pre-populating the conversation.

```python
from ogx_client import OgxClient

client = OgxClient()

# Create a conversation with initial items
conv = client.conversations.create(
    metadata={"name": "support-session-001", "user_id": "u_42"},
    items=[
        {
            "type": "message",
            "role": "system",
            "content": [{"type": "input_text", "text": "You are a support agent."}]
        }
    ],
)
print(conv.id)   # e.g., "conv_abc123"
```

--------------------------------

### Vector Stores - client.vector_stores.files.create

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Upload a file and attach it to an existing vector store for ingestion.

````APIDOC
## Vector Stores - client.vector_stores.files.create

### Description
Upload a file and attach it to an existing vector store for ingestion.

### Method
POST

### Endpoint
/v1/vector_stores/{vector_store_id}/files

### Parameters
#### Path Parameters
- **vector_store_id** (string) - Required - The ID of the vector store to attach the file to.

#### Request Body
- **file_id** (string) - Required - The ID of the file to attach.

### Request Example
```python
from ogx_client import OgxClient
from pathlib import Path

client = OgxClient()

# Assuming 'store' is an existing vector store object and 'uploaded' is a file object
# store = client.vector_stores.create(...)
# uploaded = client.files.create(file=Path("/path/to/manual.pdf"), purpose="assistants")

vs_file = client.vector_stores.files.create(
    vector_store_id=store.id,
    file_id=uploaded.id,
)
print(vs_file.id)
```

### Response
#### Success Response (200)
- **id** (string) - The ID of the vector store file association.
- **vector_store_id** (string) - The ID of the vector store.
- **file_id** (string) - The ID of the attached file.
- **status** (string) - The ingestion status of the file (e.g., `pending`, `processing`, `completed`, `failed`).
````

--------------------------------

### Discover and Retrieve Available Models

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

List all available models on the OGX server, including their identifiers and providers. You can also retrieve detailed descriptions for specific models.

```python
from ogx_client import OgxClient

client = OgxClient()

# List all models
models = client.models.list()
for model in models.data:
    print(model.identifier, model.provider_id)

# Retrieve a specific model
model = client.models.retrieve("meta-llama/Llama-3.2-3B-Instruct")
print(model.description)
```

--------------------------------

### client.batches.create

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Creates a new batch with the specified parameters.

```APIDOC
## POST /v1/batches

### Description
Creates a new batch.

### Method
POST

### Endpoint
/v1/batches

### Parameters
#### Request Body
- **params** (object) - Required - Parameters for creating the batch.
```

--------------------------------

### Run Interactive Agent CLI (Custom)

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/examples/README.md

Run the interactive agent CLI with custom model and base URL configurations.

```bash
uv run python interactive_agent_cli.py --model openai/gpt-4o-mini --base-url http://localhost:8321
```

--------------------------------

### Chat Completions

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Shows how to use the `client.chat.completions` resource for non-streaming, streaming, and asynchronous streaming chat completions, with basic error handling around the non-streaming call.

````APIDOC
## Chat Completions: `client.chat.completions`

The chat completions resource targets the `/v1/chat/completions` endpoint and supports streaming, tool calls, structured output, and reasoning effort controls. It also exposes `retrieve` and `list` for stored completions.

```python
import ogx_client
from ogx_client import OgxClient

client = OgxClient()

# Non-streaming chat completion with JSON output and error handling
try:
    completion = client.chat.completions.create(
        model="meta-llama/Llama-3.2-3B-Instruct",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is 2 + 2?"},
        ],
        max_completion_tokens=256,
        temperature=0.7,
        reasoning_effort="medium",
        response_format={"type": "json_object"},
    )
    print(completion.id)
    print(completion.choices[0].message.content)
except ogx_client.RateLimitError:
    print("Rate limited – back off and retry")
except ogx_client.APIStatusError as e:
    print(f"API error {e.status_code}: {e.response}")

# Streaming chat completion (SSE)
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Explain quantum entanglement briefly."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
print()

# Async streaming
import asyncio
from ogx_client import AsyncOgxClient

async def stream_async():
    async_client = AsyncOgxClient()
    stream = await async_client.chat.completions.create(
        model="meta-llama/Llama-3.2-3B-Instruct",
        messages=[{"role": "user", "content": "What is the speed of light?"}],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(stream_async())
```
````
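When the full text is needed after streaming (for logging or post-processing), the deltas can be accumulated instead of printed. The sketch below runs offline against fake chunks that only mimic the `choices[0].delta.content` shape used above; `make_chunk` and `collect_text` are hypothetical helpers, not part of the library.

```python
from types import SimpleNamespace
from typing import Iterable, List, Optional


def make_chunk(text: Optional[str]) -> SimpleNamespace:
    """Build a fake chunk with the `choices[0].delta.content` shape used above."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


def collect_text(stream: Iterable[SimpleNamespace]) -> str:
    """Join streamed deltas, treating a missing `content` as an empty string."""
    parts: List[str] = []
    for chunk in stream:
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)


# A chunk with content=None stands in for deltas that carry no text.
fake_stream = [make_chunk("Quantum "), make_chunk(None), make_chunk("entanglement")]
full_text = collect_text(fake_stream)
print(full_text)
```

Collecting parts in a list and joining once avoids repeated string concatenation on long responses.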

--------------------------------

### Asynchronous Client Usage

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Import and instantiate AsyncOgxClient for asynchronous operations. API calls require the 'await' keyword.

```python
import asyncio
from ogx_client import AsyncOgxClient

client = AsyncOgxClient()


async def main() -> None:
    models = await client.models.list()


asyncio.run(main())
```

--------------------------------

### Create Vector Store

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Creates a new vector store.

```APIDOC
## POST /v1/vector_stores

### Description
Creates a new vector store.

### Method
POST

### Endpoint
/v1/vector_stores

### Parameters
#### Request Body
- **params** (object) - Required - Parameters for creating the vector store.
```

--------------------------------

### Synchronous Client Usage

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Instantiate the synchronous OgxClient and make a call to list models. API keys can be provided directly or managed via environment variables using python-dotenv.

```python
from ogx_client import OgxClient

client = OgxClient()

models = client.models.list()
```

--------------------------------

### Create Response with File Search and Conversation Threading

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Use this to create a response that utilizes a file search tool and attaches to an existing conversation thread. Ensure the vector store ID and conversation ID are valid.

```python
from ogx_client import OgxClient

client = OgxClient()

response = client.responses.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    instructions="You are a research assistant. Search documents before answering.",
    input="What are the key authentication methods described in the docs?",
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": ["vs_abc123"],
        }
    ],
    conversation="conv_xyz789",   # attach to an existing conversation
    max_infer_iters=5,
)
print(response.id)
print(response.output)
```

--------------------------------

### List Prompt Versions

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Retrieves a list of all versions for a specific prompt ID. Returns a `PromptListResponse`.

```python
client.prompts.versions.list(prompt_id)
```

--------------------------------

### Submit and Manage Asynchronous Batches

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Prepare and upload a JSONL input file for batch processing, submit the batch, poll for completion status, and download the output. Batches are processed asynchronously against specified endpoints with a defined completion window. Ensure the input file is correctly formatted.

```python
from pathlib import Path
from ogx_client import OgxClient
import time

client = OgxClient()

# 1. Prepare and upload the batch input file
batch_input = "\n".join([
    '{"custom_id":"req-1","method":"POST","url":"/v1/chat/completions","body":{"model":"meta-llama/Llama-3.2-3B-Instruct","messages":[{"role":"user","content":"Hello!"}]}}',
    '{"custom_id":"req-2","method":"POST","url":"/v1/chat/completions","body":{"model":"meta-llama/Llama-3.2-3B-Instruct","messages":[{"role":"user","content":"What is 2+2?"}]}}',
])
input_file = client.files.create(
    file=("batch_input.jsonl", batch_input.encode(), "application/jsonl"),
    purpose="batch",
)

# 2. Submit the batch
batch = client.batches.create(
    input_file_id=input_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={"description": "nightly completions run"},
    idempotency_key="run-2026-05-12",   # safe to retry
)
print(batch.id, batch.status)

# 3. Poll until complete
while batch.status not in {"completed", "failed", "cancelled", "expired"}:
    time.sleep(10)
    batch = client.batches.retrieve(batch.id)
    print(f"Status: {batch.status}  completed={batch.request_counts.completed}")

# 4. Download output
if batch.output_file_id:
    output = client.files.content(batch.output_file_id)
    print(output[:500])

# To cancel a batch that is still running, call this before it reaches a terminal status:
# client.batches.cancel(batch.id)
```
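
Hand-written JSON strings are easy to get wrong. The sketch below, using only the standard library, builds the same input with `json.dumps` and checks that every line parses and carries the four keys used in the example. The helper names `build_batch_line` and `validate_batch_input` are hypothetical and not part of `ogx_client`, and the required-key set is an assumption taken from the example rather than from a server specification.

```python
import json

# Keys used by the example above (assumed to be what the server expects)
REQUIRED_KEYS = {"custom_id", "method", "url", "body"}


def build_batch_line(custom_id: str, model: str, prompt: str) -> str:
    """Serialize one chat-completion request as a single JSONL line."""
    request = {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    # With default settings json.dumps escapes newlines, so each request stays on one line
    return json.dumps(request)


def validate_batch_input(text: str) -> int:
    """Parse every line and check for the expected keys; return the line count."""
    lines = text.splitlines()
    for number, line in enumerate(lines, start=1):
        record = json.loads(line)  # raises json.JSONDecodeError on malformed JSON
        missing = REQUIRED_KEYS.difference(record)
        if missing:
            raise ValueError(f"line {number}: missing keys {sorted(missing)}")
    return len(lines)


model = "meta-llama/Llama-3.2-3B-Instruct"
batch_input = "\n".join([
    build_batch_line("req-1", model, "Hello!"),
    build_batch_line("req-2", model, "What is 2+2?"),
])
line_count = validate_batch_input(batch_input)
```

The resulting `batch_input` string can be encoded and uploaded in the same way as the hand-written one.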

--------------------------------

### Create a Response

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Use this method to create a new response. It accepts parameters defined in `response_create_params.py` and returns a `ResponseObject`.

```python
client.responses.create(**params)
```

--------------------------------

### client.providers.list

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Lists all available providers.

```APIDOC
## GET /v1/providers

### Description
Lists all available providers.

### Method
GET

### Endpoint
/v1/providers
```

--------------------------------

### Manage HTTP resources with a context manager

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Ensure proper closure of HTTP connections by using `OgxClient` as a context manager. The client and its underlying resources will be automatically closed upon exiting the `with` block.

```python
from ogx_client import OgxClient

with OgxClient() as client:
    # make requests here
    ...

# HTTP client is now closed
```
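
For readers unfamiliar with the protocol, the sketch below shows what a `with` block guarantees. It uses a hypothetical stand-in class, `DemoClient`, rather than `OgxClient` itself, so it makes no claim about the library's internals: `close()` runs on exit even when the body raises.

```python
class DemoClient:
    """Hypothetical stand-in for a client that owns a closable resource."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "DemoClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and on exceptions; returning None lets errors propagate
        self.close()


with DemoClient() as demo:
    open_inside_block = not demo.closed  # the resource is still open here

try:
    with DemoClient() as failing:
        raise RuntimeError("request failed")
except RuntimeError:
    pass  # the exception propagated, but close() had already run
```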

--------------------------------

### Create Custom ClientTool for Agents

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Subclass `ClientTool` for complex tools requiring state or async execution. Define `get_name`, `get_description`, `get_input_schema`, `run_impl`, and `async_run_impl` methods. Register the tool with an `Agent` instance.

```python
import httpx
from ogx_client.lib.agents.client_tool import ClientTool, JSONSchema
from ogx_client.lib.agents.types import ToolResponse

class WebSearchTool(ClientTool):
    def get_name(self) -> str:
        return "web_search"

    def get_description(self) -> str:
        return "Search the web and return the top result snippet."

    def get_input_schema(self) -> JSONSchema:
        return {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query string."},
            },
            "required": ["query"],
        }

    def run_impl(self, query: str):
        # Replace with a real search API call
        return {"title": f"Result for '{query}'", "snippet": "...example snippet..."}

    async def async_run_impl(self, query: str):
        async with httpx.AsyncClient() as http:
            # Replace with actual async search
            return {"title": f"Async result for '{query}'", "snippet": "..."}

# Register the tool with an Agent
from ogx_client import OgxClient, Agent

agent = Agent(
    client=OgxClient(),
    model="meta-llama/Llama-3.2-3B-Instruct",
    instructions="Search the web to answer user questions.",
    tools=[WebSearchTool()],
)

session_id = agent.create_session("research-session")
for chunk in agent.create_turn(
    messages=[{"type": "message", "role": "user",
               "content": [{"type": "input_text", "text": "Latest news on AI?"}]}],
    session_id=session_id,
):
    pass  # process chunk events
```
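
Because the model supplies the tool arguments, it can help to check them against the input schema before running the tool. The following is a minimal sketch under stated assumptions: `check_tool_args` is a hypothetical helper, not part of `ogx_client`, and it covers only required keys and a few primitive types rather than full JSON Schema validation.

```python
from typing import Any, Dict, List

# Same shape as the schema returned by WebSearchTool.get_input_schema() above
SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "The search query string."},
    },
    "required": ["query"],
}

# Only the primitive JSON Schema types needed for this sketch
PYTHON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_tool_args(schema: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look usable."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = PYTHON_TYPES.get(properties[name].get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


ok = check_tool_args(SCHEMA, {"query": "latest AI news"})
bad = check_tool_args(SCHEMA, {"q": 42})
```

A tool could call such a check at the top of `run_impl` and return the problem list to the model instead of raising.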

--------------------------------

### Define Client-Side Tool for Agent

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Illustrates how to define a client-side tool using the `@client_tool` decorator. This allows the agent to call local Python functions. The function signature and docstring are used to define the tool's interface.

```python
from ogx_client import OgxClient, Agent, AgentEventLogger
from ogx_client.lib.agents.client_tool import client_tool

client = OgxClient()

# Define a client-side tool using the @client_tool decorator
@client_tool
def get_weather(city: str) -> dict:
    """Get current weather for a city.

    :param city: The name of the city to get weather for.
    :returns: A dict with temperature and conditions.
    """
    # In production, call a real weather API here
    return {"city": city, "temp_c": 22, "conditions": "sunny"}
```
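
To illustrate how a signature and docstring can define a tool's interface, here is a standard-library sketch that derives a schema from the same function. It is not the decorator's actual implementation: `describe_function`, the `TYPE_NAMES` mapping, and the output layout are assumptions made for this example.

```python
import inspect
import re

# Assumed mapping from Python annotations to JSON Schema type names
TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_function(fn) -> dict:
    """Build a tool description from a function's signature and docstring."""
    doc = inspect.getdoc(fn) or ""
    # Map ":param name: text" lines to {name: text}
    param_docs = dict(re.findall(r":param (\w+):\s*(.+)", doc))
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {
            "type": TYPE_NAMES.get(param.annotation, "string"),
            "description": param_docs.get(name, ""),
        }
        # Parameters without a default value are treated as required
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "input_schema": {"type": "object", "properties": properties, "required": required},
    }


def get_weather(city: str) -> dict:
    """Get current weather for a city.

    :param city: The name of the city to get weather for.
    :returns: A dict with temperature and conditions.
    """
    return {"city": city, "temp_c": 22, "conditions": "sunny"}


spec = describe_function(get_weather)
```

The practical takeaway is that type annotations and `:param` lines are worth writing carefully, since they are what the model sees when deciding how to call the tool.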

--------------------------------

### Manual Pagination Control

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Control pagination manually with the `.has_next_page()`, `.next_page_info()`, and `.get_next_page()` methods on a paginated response object. The example uses `await`, so it must run inside a coroutine with an async client.

```python
first_page = await client.responses.list()
if first_page.has_next_page():
    print(f"will fetch next page using these details: {first_page.next_page_info()}")
    next_page = await first_page.get_next_page()
    print(f"number of items we just fetched: {len(next_page.data)}")
```
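
The same two methods can drive a loop that walks every page. The sketch below is synchronous and uses a hypothetical stand-in class, `DemoPage`, so that it runs without a server; only `has_next_page()`, `get_next_page()`, and `.data` mirror the names in the snippet above, and the item values are placeholders.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class DemoPage:
    """Hypothetical stand-in for a paginated response object."""

    data: List[str]
    following: Optional["DemoPage"] = None

    def has_next_page(self) -> bool:
        return self.following is not None

    def get_next_page(self) -> "DemoPage":
        if self.following is None:
            raise RuntimeError("no more pages")
        return self.following


def iter_all_items(first_page: DemoPage) -> Iterator[str]:
    """Yield every item, following pages until has_next_page() is False."""
    page = first_page
    while True:
        yield from page.data
        if not page.has_next_page():
            break
        page = page.get_next_page()


# Three linked pages with placeholder item IDs
pages = DemoPage(["resp_1", "resp_2"], DemoPage(["resp_3"], DemoPage(["resp_4", "resp_5"])))
all_items = list(iter_all_items(pages))
```

With an async client the loop has the same shape, with `await` in front of `get_next_page()`.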

--------------------------------

### List Models in OpenAI-compatible Format

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Retrieves a list of available models in an OpenAI-compatible format. Useful for understanding model availability.

```python
openai_models = client.models.openai.list()
for m in openai_models.data:
    print(m.id, m.created)
```

--------------------------------

### Create Completion

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Creates a new completion based on the provided parameters.

```APIDOC
## POST /v1/completions

### Description
Creates a new completion.

### Method
POST

### Endpoint
/v1/completions

### Parameters
#### Request Body
- **params** (object) - Required - Parameters for creating the completion.
```

--------------------------------

### client.routes.list

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Lists available routes with optional filtering parameters.

```APIDOC
## GET /v1/inspect/routes

### Description
Lists available routes with optional filtering parameters.

### Method
GET

### Endpoint
/v1/inspect/routes

### Parameters
#### Query Parameters
- **params** (object) - Optional - Parameters for filtering routes.
```

--------------------------------

### List Registered Routes

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Use this snippet to list all registered routes and their associated provider IDs. Ensure the client is initialized before calling this method.

```python
routes = client.alpha.admin.list_routes()
for r in routes.data:
    print(r.route, r.provider_id)
```

--------------------------------

### Prompt Management

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Create, retrieve, update, and manage versions of prompt templates with variable interpolation.

```python
# Create a versioned prompt template
prompt = client.prompts.create(
    prompt="You are a helpful assistant. Today's date is {{date}}. User query: {{query}}",
    variables=["date", "query"],
)
print(prompt.id, prompt.version)

# Retrieve with variable interpolation
rendered = client.prompts.retrieve(
    prompt.id,
    variables={"date": "2026-05-12", "query": "What's new in AI?"},
)
print(rendered.content)   # fully rendered prompt string

# Update the prompt text (creates a new version)
updated = client.prompts.update(
    prompt.id,
    prompt="You are an expert assistant. Date: {{date}}. Question: {{query}}",
)
print(updated.version)   # incremented

# Set a specific version as default
client.prompts.set_default_version(prompt.id, version=1)

# List all versions
versions = client.prompts.versions.list(prompt.id)
for v in versions.data:
    print(v.version, v.created_at)

# List all prompts and delete one
for p in client.prompts.list().data:
    print(p.id)
client.prompts.delete(prompt.id)
```
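
To make the `{{name}}` placeholder convention concrete, here is a local sketch of extracting and substituting variables with the standard library. It only illustrates the idea: the server's actual rendering rules (whitespace inside braces, handling of missing variables) are not documented here, and `find_variables` and `render` are hypothetical helpers.

```python
import re
from typing import Dict, List

# Matches {{name}}; tolerating inner whitespace is an assumption of this sketch
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_variables(template: str) -> List[str]:
    """Return placeholder names in order of first appearance, without repeats."""
    return list(dict.fromkeys(PLACEHOLDER.findall(template)))


def render(template: str, values: Dict[str, str]) -> str:
    """Substitute every placeholder; raise KeyError if a value is missing."""
    missing = [name for name in find_variables(template) if name not in values]
    if missing:
        raise KeyError(f"missing variables: {missing}")
    # A function replacement avoids backslash escapes in values being interpreted
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


template = "You are a helpful assistant. Today's date is {{date}}. User query: {{query}}"
variables = find_variables(template)
rendered_text = render(template, {"date": "2026-05-12", "query": "What's new in AI?"})
```

Running `find_variables` on a template before calling `client.prompts.create` is one way to keep the `variables` list in sync with the prompt text.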

--------------------------------

### Format Code

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Automatically format the code according to project standards using ruff and black. This command fixes most linting issues.

```sh
./scripts/format
```

--------------------------------

### Publish to PyPI with GitHub Workflow

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Utilize the 'Publish PyPI' GitHub action to automate the release process to PyPI. This requires setting up organization or repository secrets.

```text
https://www.github.com/ogx-ai/ogx-client-python/actions/workflows/publish-pypi.yml
```

--------------------------------

### client.files.list

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Lists files with optional filtering parameters.

```APIDOC
## GET /v1/files

### Description
Lists files with optional filtering parameters.

### Method
GET

### Endpoint
/v1/files

### Parameters
#### Query Parameters
- **params** (object) - Optional - Parameters for filtering files.
```

--------------------------------

### Activate Virtual Environment

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Manually activate the virtual environment created by uv. This allows you to run Python scripts without the `uv run` prefix.

```sh
source .venv/bin/activate
```

--------------------------------

### Publish Manually

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/CONTRIBUTING.md

Release the package to PyPI manually by running the provided script with the `PYPI_TOKEN` environment variable set.

```sh
bin/publish-pypi
```

--------------------------------

### client.alpha.admin.version

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Retrieves the version information of the alpha admin service.

```APIDOC
## GET /v1alpha/admin/version

### Description
Retrieves the version information.

### Method
GET

### Endpoint
/v1alpha/admin/version
```

--------------------------------

### Inspect OGX Service Health, Version, and Routes

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Checks the health and version of the OGX service and lists all registered API routes with their implementing providers. Requires importing OgxClient.

```python
from ogx_client import OgxClient

client = OgxClient()

# Health check
health = client.inspect.health()
print(health.status)   # e.g., "OK"

# Version info
version = client.inspect.version()
print(version.version)

# List all available routes
routes = client.routes.list()
for route in routes.data:
    print(route.route, route.method, route.provider_id)

# List providers
providers = client.providers.list()
for p in providers.data:
    print(p.provider_id, p.provider_type)
```

--------------------------------

### Vector Stores - client.vector_stores.create

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Create a new persistent vector store for file-based RAG. Allows specifying a custom embedding provider.

````APIDOC
## Vector Stores - client.vector_stores.create

### Description
Create a new persistent vector store for file-based RAG. Allows specifying a custom embedding provider.

### Method
POST

### Endpoint
/v1/vector_stores

### Parameters
#### Request Body
- **name** (string) - Required - The name of the vector store.
- **extra_body** (object) - Optional - Additional configuration for the vector store.
  - **provider_id** (string) - Required - The ID of the vector store provider (e.g., `faiss`).
  - **embedding_model** (string) - Required - The embedding model to use for this store (e.g., `nomic-ai/nomic-embed-text-v1.5`).

### Request Example
```python
from ogx_client import OgxClient

client = OgxClient()

# Create a vector store with a specific embedding provider
store = client.vector_stores.create(
    name="product-docs-store",
    extra_body={
        "provider_id": "faiss",
        "embedding_model": "nomic-ai/nomic-embed-text-v1.5",
    },
)
print(store.id)
```

### Response
#### Success Response (200)
- **id** (string) - The unique identifier of the created vector store.
- **name** (string) - The name of the vector store.
- **created_at** (string) - The timestamp when the vector store was created.
````

--------------------------------

### Create Chat Completion

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Creates a chat completion. Accepts optional parameters for customization.

```APIDOC
## POST /v1/chat/completions

### Description
Creates a chat completion.

### Method
POST

### Endpoint
/v1/chat/completions

### Parameters
#### Request Body
- **params** (object) - Optional - Parameters for creating the chat completion.

### Response
#### Success Response (200)
- **CompletionCreateResponse** - The created chat completion.
```

--------------------------------

### Configure Granular Timeout

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/README.md

Use an `httpx.Timeout` object for more granular control over connection, read, write, and overall request timeouts.

```python
import httpx
from ogx_client import OgxClient

# More granular control:
client = OgxClient(
    timeout=httpx.Timeout(60.0, read=5.0, write=10.0, connect=2.0),
)
```

--------------------------------

### Prompt Versions API

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Methods for listing versions of a specific prompt.

```APIDOC
## GET /v1/prompts/{prompt_id}/versions

### Description
Lists all versions of a specific prompt.

### Method
GET

### Endpoint
/v1/prompts/{prompt_id}/versions

### Parameters
#### Path Parameters
- **prompt_id** (string) - Required - The ID of the prompt whose versions are to be listed.

### Response
#### Success Response (200)
- **PromptListResponse** (object) - A list of prompt version objects.
```

--------------------------------

### Version Information

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Retrieves the version information of the OGX service.

```APIDOC
## GET /v1/version

### Description
Retrieves the version information of the OGX service.

### Method
GET

### Endpoint
/v1/version

### Response
#### Success Response (200)
- **VersionInfo** - Information about the service's version.
```

--------------------------------

### List Registered Providers

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Use this snippet to list all registered provider IDs. Ensure the client is initialized before calling this method.

```python
providers = client.alpha.admin.list_providers()
for p in providers.data:
    print(p.provider_id)
```

--------------------------------

### client.batches.list

Source: https://github.com/ogx-ai/ogx-client-python/blob/main/api.md

Lists batches with optional filtering parameters.

```APIDOC
## GET /v1/batches

### Description
Lists batches with optional filtering parameters.

### Method
GET

### Endpoint
/v1/batches

### Parameters
#### Query Parameters
- **params** (object) - Optional - Parameters for filtering batches.
```

--------------------------------

### Access Raw HTTP Headers and Stream Response Bodies

Source: https://context7.com/ogx-ai/ogx-client-python/llms.txt

Shows how to access raw HTTP headers before parsing the response body using `.with_raw_response` and how to stream large response bodies lazily with `.with_streaming_response`. Useful for inspecting metadata or handling large payloads efficiently.

```python
from ogx_client import OgxClient

client = OgxClient()

# Access response headers before parsing
raw = client.chat.completions.with_raw_response.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Hi!"}],
)
print(raw.headers.get("X-Request-Id"))
print(raw.headers.get("X-RateLimit-Remaining"))
completion = raw.parse()   # => CompletionCreateResponse
print(completion.choices[0].message.content)

# Stream response body lazily (useful for large payloads)
with client.chat.completions.with_streaming_response.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Write a long essay."}],
) as response:
    print(response.headers.get("Content-Type"))
    for line in response.iter_lines():
        print(line)
```