### Command-Line Interface - Starting the Proxy

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Instructions and examples for launching the proxy server using the command-line interface with various configuration options.

```APIDOC
## Command-Line Interface - Starting the Proxy

### Description
Launch the proxy with comprehensive configuration options for backends, authentication, and features using the `python -m src.core.cli` command.

### Usage
`python -m src.core.cli [OPTIONS]`

### Options
- `--default-backend <backend>`: Specifies the default backend to use (e.g., `openai`, `gemini`).
- `--host <host>`: The host address to bind the server to (e.g., `0.0.0.0`).
- `--port <port>`: The port number to listen on (e.g., `8000`).
- `--force-model <model>`: Forces all requests to a specific model.
- `--disable-auth`: Disables authentication.
- `--enable-planning-phase`: Enables the planning phase for complex requests.
  - `--planning-phase-strong-model <model>`: Specifies the model for the planning phase.
  - `--planning-phase-max-turns <number>`: Sets the maximum number of turns for the planning phase.
  - `--planning-phase-temperature <float>`: Sets the temperature for the planning phase model.
  - `--planning-phase-reasoning-effort <level>`: Sets the reasoning effort level for the planning phase (e.g., `high`).
- `--model-alias "<pattern>=<replacement>"`: Defines a pattern for rewriting model names.
- `--enable-edit-precision`: Enables edit-precision tuning.
  - `--edit-precision-temperature <float>`: Sets the temperature for edit-precision tuning.
  - `--edit-precision-min-top-p <float>`: Sets the minimum top-p value for edit-precision tuning.
  - `--edit-precision-override-top-p`: Overrides the top-p value for edit-precision tuning.
- `--static-route <backend>:<model>`: Bypasses backend selection and routes directly to a specific model on a backend.
- `--config <path>`: Path to a configuration file.
- `--log <path>`: Path to a log file.
- `--capture-file <path>`: Path to a file for capturing network traffic.
- `--log-level <level>`: Sets the logging level (e.g., `DEBUG`).
- `--trusted-ip <ip_address/cidr>`: Specifies trusted IP addresses or CIDR ranges.
- `--enable-brute-force-protection`: Enables protection against brute-force attacks.
- `--auth-max-failed-attempts <number>`: Sets the maximum number of failed authentication attempts.

### Examples

**Basic startup with OpenAI backend:**
```bash
python -m src.core.cli --default-backend openai
```

**Startup with custom host/port and Gemini backend:**
```bash
python -m src.core.cli \
  --default-backend gemini-cli-oauth-personal \
  --host 0.0.0.0 \
  --port 8000
```

**Force all requests to a specific model and disable authentication:**
```bash
python -m src.core.cli \
  --default-backend gemini-cli-oauth-personal \
  --force-model gemini-2.5-pro \
  --disable-auth
```

**Enable wire capture for debugging:**
```bash
python -m src.core.cli \
  --default-backend openai \
  --capture-file logs/wire_capture.log \
  --log-level DEBUG
```

**Configure planning phase:**
```bash
python -m src.core.cli \
  --default-backend openai \
  --enable-planning-phase \
  --planning-phase-strong-model openai:gpt-4o \
  --planning-phase-max-turns 8 \
  --planning-phase-temperature 0.2 \
  --planning-phase-reasoning-effort high
```

**Model name rewrites for routing:**
```bash
python -m src.core.cli \
  --default-backend openrouter \
  --model-alias '^gpt-(.*)=openrouter:openai/gpt-\1' \
  --model-alias '^claude-(.*)=anthropic:claude-\1'
```

**Edit-precision tuning configuration:**
```bash
python -m src.core.cli \
  --enable-edit-precision \
  --edit-precision-temperature 0.08 \
  --edit-precision-min-top-p 0.25 \
  --edit-precision-override-top-p
```

**Static route configuration:**
```bash
python -m src.core.cli \
  --static-route gemini-cli-oauth-personal:gemini-2.5-pro
```

**Complete production setup:**
```bash
python -m src.core.cli \
  --config config/production.yaml \
  --default-backend openai \
  --host 0.0.0.0 \
  --port 8000 \
  --log logs/proxy.log \
  --capture-file logs/wire.log \
  --trusted-ip 192.168.1.0/24 \
  --enable-brute-force-protection \
  --auth-max-failed-attempts 5
```
```
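
The invocations above can also be assembled programmatically, for example from a deployment script. The following is a minimal sketch: the helper name `build_proxy_command` and its parameters are hypothetical, and only the flags themselves (`--default-backend`, `--host`, `--port`, `--disable-auth`, `--model-alias`) come from the option list above.

```python
import sys
from typing import List, Optional, Sequence


def build_proxy_command(
    default_backend: str,
    host: Optional[str] = None,
    port: Optional[int] = None,
    disable_auth: bool = False,
    model_aliases: Sequence[str] = (),
) -> List[str]:
    """Assemble an argv list for `python -m src.core.cli` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "src.core.cli", "--default-backend", default_backend]
    if host is not None:
        cmd += ["--host", host]
    if port is not None:
        cmd += ["--port", str(port)]
    if disable_auth:
        cmd.append("--disable-auth")
    for alias in model_aliases:
        # Each alias is one "<pattern>=<replacement>" string. Passing it as a
        # single argv item avoids shell quoting problems with regex characters.
        cmd += ["--model-alias", alias]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd)` from the repository root, which avoids building a shell string by hand.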

--------------------------------

### Installing and Logging in Gemini CLI (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Commands to install the Gemini CLI globally and log in to a Google account. This is a prerequisite for using Gemini-related backends with the proxy.

```bash
# Install gemini-cli (one-time)
npm install -g @google/gemini-cli
gemini login
```

--------------------------------

### Start Proxy with OpenAI Backend

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This command starts the proxy service using OpenAI as the default backend. Ensure the OPENAI_API_KEY environment variable is set.

```bash
python -m src.core.cli --default-backend openai
```

--------------------------------

### Example Usage of Planning Phase CLI Flags

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This is a full command-line example demonstrating how to run the llm-proxy with the planning phase enabled. It specifies the default backend, enables the planning phase, sets the strong model to openai:gpt-4o, and configures maximum turns, file writes, temperature, top_p, reasoning effort, and thinking budget.

```bash
python -m src.core.cli \
  --default-backend openai \
  --enable-planning-phase \
  --planning-phase-strong-model openai:gpt-4o \
  --planning-phase-max-turns 8 \
  --planning-phase-max-file-writes 1 \
  --planning-phase-temperature 0.2 \
  --planning-phase-top-p 0.9 \
  --planning-phase-reasoning-effort high \
  --planning-phase-thinking-budget 8000
```

--------------------------------

### Setup Gemini CLI Agent with ACP (Node.js/Python)

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Installs the Gemini CLI globally, logs in, and sets a workspace directory for project-aware agent configurations. This enables the Gemini CLI to operate within a specific project context.

```bash
npm install -g @google/gemini-cli
gemini login
export GEMINI_CLI_WORKSPACE="/path/to/project"
python -m src.core.cli --default-backend gemini-cli-acp
```

--------------------------------

### Start Proxy with Gemini CLI Agent Control Protocol (ACP)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This command enables the proxy to run with the Gemini CLI using the Agent Control Protocol. It involves installing and logging into the Gemini CLI, optionally setting a workspace, and then starting the proxy.

```bash
# Install and authenticate with Google Gemini CLI (one-time)
npm install -g @google/gemini-cli
gemini login

# Set project directory (optional - defaults to current directory)
export GEMINI_CLI_WORKSPACE="/path/to/your/project"

# Start the proxy using gemini-cli as an agent
python -m src.core.cli --default-backend gemini-cli-acp

# Change the project directory mid-conversation by sending this in-chat command
# as a message (it is not a shell command):
#   !/project-dir(/path/to/another/project)
```

--------------------------------

### Example Conversation with Backend/Model Switching (Bash)

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Demonstrates a conversational flow where users switch between AI backends and models using in-chat commands, including a one-off request example. This illustrates dynamic runtime control.

```text
# Example conversation with switching
User: !/backend(openai)
Assistant: [Switched to openai backend]

User: !/model(gpt-4)
Assistant: [Using model: gpt-4]

User: What is 2+2?
Assistant: 4

User: !/oneoff(gemini-cli-oauth-personal:gemini-2.5-pro)
User: Explain quantum physics
Assistant: [Uses Gemini 2.5 Pro for this request only]

User: What is 3+3?
Assistant: [Back to gpt-4] 6
```
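
To make the command syntax in this transcript concrete, here is a rough sketch of how a client-side tool might recognize such messages. It is not the proxy's actual parser: the regular expression, `parse_command`, and `split_target` are assumptions inferred only from the `!/name(argument)` shape shown above.

```python
import re
from typing import Optional, Tuple

# Assumed shape, inferred from the transcript: "!/name" or "!/name(argument)".
COMMAND_RE = re.compile(r"^!/(?P<name>[a-z][a-z0-9-]*)(?:\((?P<arg>[^()]*)\))?$")


def parse_command(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (name, argument) for an in-chat command, or None for normal text."""
    match = COMMAND_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("arg")


def split_target(arg: str) -> Tuple[str, str]:
    """Split a "backend:model" argument at the first colon."""
    backend, _, model = arg.partition(":")
    return backend, model
```

Splitting at the first colon keeps model names that themselves contain colons intact, which is a design choice of this sketch rather than documented behavior.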

--------------------------------

### Start Proxy with Custom Host/Port and Gemini Backend

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This command starts the proxy with a specified backend ('gemini-cli-oauth-personal') and custom host and port settings ('0.0.0.0' and '8000'). This is useful for network accessibility and specific service binding.

```bash
python -m src.core.cli \
  --default-backend gemini-cli-oauth-personal \
  --host 0.0.0.0 \
  --port 8000
```

--------------------------------

### Main Configuration File Example (YAML)

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Provides a sample of the main configuration file for the LLM Interactive Proxy. This file typically defines all features, backends, and global settings for the application.

```yaml
...
```

--------------------------------

### Example Project Directory Change (Bash)

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Demonstrates changing the project directory using the in-chat command, showing the assistant's confirmation and an example of an agent interaction within the specified project context.

```text
# Example with Gemini CLI ACP
User: !/project-dir(/home/user/webapp)
Assistant: [Project directory changed to /home/user/webapp]

User: Show me all Python files in this project
Assistant: [gemini-cli agent lists files from /home/user/webapp]
```

--------------------------------

### Command-Line Interface - Backend Configuration

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Instructions and examples for configuring different LLM backends using environment variables and command-line arguments.

```APIDOC
## Command-Line Interface - Backend Configuration

### Description
Configure multiple backends with API keys, OAuth, and provider-specific settings using environment variables and the CLI.

### Configuration Methods
Backend configurations are typically set using environment variables for sensitive information like API keys, and then specified via the `--default-backend` or other CLI arguments.

### Examples

**OpenAI with API key:**
1. Set the environment variable:
   ```bash
   export OPENAI_API_KEY="sk-..."
   ```
2. Launch the proxy with OpenAI as the default backend:
   ```bash
   python -m src.core.cli --default-backend openai
   ```

**Anthropic with API key:**
1. Set the environment variable:
   ```bash
   export ANTHROPIC_API_KEY="sk-ant-..."
   ```
2. Launch the proxy with Anthropic as the default backend:
   ```bash
   python -m src.core.cli --default-backend anthropic
   ```

**Gemini with API key (metered):**
1. Set the environment variable:
   ```bash
   export GEMINI_API_KEY="AIza..."
   ```
2. Launch the proxy with Gemini as the default backend:
   ```bash
   python -m src.core.cli --default-backend gemini
   ```
```

--------------------------------

### Starting the Proxy with a Configuration File (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Command to run the LLM Interactive Proxy using a specified configuration file. This is the standard method for launching the proxy with custom settings.

```bash
python -m src.core.cli --config config.yaml
```

--------------------------------

### Start Proxy with Gemini CLI Cloud Project Backend (GCP)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This command starts the proxy using the Gemini CLI with a GCP-billed cloud project backend. It requires setting the GOOGLE_CLOUD_PROJECT environment variable and authenticating with Google Cloud.

```bash
export GOOGLE_CLOUD_PROJECT="your-project-id"

# Provide Application Default Credentials via one of the following:
# Option A: User credentials (interactive)
gcloud auth application-default login

# Option B: Service account file
export GOOGLE_APPLICATION_CREDENTIALS="/absolute/path/to/service-account.json"

python -m src.core.cli --default-backend gemini-cli-cloud-project
```

--------------------------------

### Example Reasoning Mode Usage (Bash)

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Illustrates how to use the in-chat commands to adjust reasoning levels for AI models, showing examples for maximum reasoning and disabling reasoning for faster responses.

```text
# Example usage
User: !/max
User: Solve this complex math problem: ...
Assistant: [Uses high reasoning effort]

User: !/no-think
User: What's 2+2?
Assistant: [Fast response without reasoning] 4
```

--------------------------------

### Starting Proxy with Gemini CLI ACP Backend (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Command to launch the LLM Interactive Proxy with the `gemini-cli-acp` backend, enabling it to function as a Gemini CLI agent.

```bash
python -m src.core.cli --default-backend gemini-cli-acp
```

--------------------------------

### Install Development Dependencies

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/CONTRIBUTING.md

Installs the project's development dependencies using pip, including optional dependencies denoted by '[dev]'.

```bash
# Quoting keeps shells such as zsh from treating the brackets as a glob pattern
pip install -e ".[dev]"
```

--------------------------------

### Install Project with Development Extras (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/docs/testing.md

Installs the project in editable mode along with the 'dev' optional dependencies, which include necessary pytest plugins. This command ensures that all testing tools are available.

```bash
python -m pip install -e ".[dev]"
```

--------------------------------

### Install Development Dependencies and Run Tests (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This command installs the necessary development dependencies, including pytest plugins for async and parallel execution, and then runs the project's test suite. It assumes you are executing these commands within the project's virtual environment.

```bash
python -m pip install -e ".[dev]"
python -m pytest
```

--------------------------------

### Start Proxy with Gemini CLI OAuth Personal Backend

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This command initiates the proxy service using the Gemini CLI with personal OAuth authentication. Ensure you have the necessary Gemini CLI and OAuth configurations in place.

```bash
python -m src.core.cli --default-backend gemini-cli-oauth-personal
```

--------------------------------

### Install Pre-Commit Hooks (Windows)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/CONTRIBUTING.md

Installs the necessary pre-commit hooks for the repository on a Windows environment, typically within a virtual environment. This ensures that code quality and security checks are performed before each commit.

```shell
./.venv/Scripts/python.exe scripts/install-hooks.py
```

--------------------------------

### Complete Production Setup Command

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This comprehensive command sets up the proxy for production use. It loads configuration from a file, specifies the default backend, host, port, log file locations, enables brute-force protection with a defined number of failed attempts, and allows trusted IP addresses.

```bash
python -m src.core.cli \
  --config config/production.yaml \
  --default-backend openai \
  --host 0.0.0.0 \
  --port 8000 \
  --log logs/proxy.log \
  --capture-file logs/wire.log \
  --trusted-ip 192.168.1.0/24 \
  --enable-brute-force-protection \
  --auth-max-failed-attempts 5
```
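
The `--trusted-ip` value above is a CIDR range. As an illustration of what membership in such a range means, the sketch below checks a client address against a list of trusted entries using Python's standard `ipaddress` module. How the proxy itself evaluates trusted IPs is not described here; `is_trusted` is a hypothetical helper.

```python
import ipaddress
from typing import Iterable


def is_trusted(client_ip: str, trusted: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any trusted address or CIDR range."""
    address = ipaddress.ip_address(client_ip)
    for entry in trusted:
        # strict=False also accepts a bare address ("10.0.0.5") or a network
        # written with host bits set ("192.168.1.7/24").
        network = ipaddress.ip_network(entry, strict=False)
        if address.version == network.version and address in network:
            return True
    return False
```

With `192.168.1.0/24` as the only entry, any address from `192.168.1.0` through `192.168.1.255` matches and everything else does not.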

--------------------------------

### Process Wire Capture Files with jq

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Provides command-line examples using `jq` to query and analyze wire capture log files. These examples demonstrate filtering by direction, extracting specific data like user messages, identifying errors, and calculating token usage.

```bash
# Count requests by backend
jq -r 'select(.direction=="outbound_request") | .backend' logs/wire_capture.log | sort | uniq -c

# Extract all user messages
jq -r 'select(.direction=="outbound_request") | .payload.messages[]? | select(.role=="user") | .content' logs/wire_capture.log

# Find failed requests (look for error responses)
jq 'select(.direction=="inbound_response" and (.payload.error or .payload.choices == null))' logs/wire_capture.log

# Calculate token usage by model
jq -r 'select(.direction=="inbound_response" and .payload.usage) | "\(.model) \(.payload.usage.total_tokens // (.payload.usage.prompt_tokens + .payload.usage.completion_tokens))"' logs/wire_capture.log
```
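
Where `jq` is not available, the same queries can be approximated in Python. This is a sketch that assumes one JSON object per line and uses only the field names that appear in the `jq` filters above (`direction`, `backend`, `model`, `payload.usage`). Unlike the last `jq` example, which prints one line per response, `tokens_by_model` sums the totals per model.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def requests_by_backend(lines: Iterable[str]) -> Counter:
    """Count outbound requests per backend."""
    counts = Counter()
    for entry in iter_entries(lines):
        if entry.get("direction") == "outbound_request":
            counts[entry.get("backend")] += 1
    return counts


def tokens_by_model(lines: Iterable[str]) -> Counter:
    """Sum token usage per model across inbound responses."""
    totals = Counter()
    for entry in iter_entries(lines):
        payload = entry.get("payload")
        usage = payload.get("usage") if isinstance(payload, dict) else None
        if entry.get("direction") != "inbound_response" or not usage:
            continue
        total = usage.get("total_tokens")
        if total is None:
            # Same fallback as the jq filter: prompt + completion tokens.
            total = usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0)
        totals[entry.get("model")] += total
    return totals
```

Both functions accept any iterable of lines, so an open file handle for `logs/wire_capture.log` can be passed directly.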

--------------------------------

### Authenticate with Google Gemini CLI (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This command is used for the 'gemini-cli-oauth-personal' backend, which utilizes free-tier personal OAuth. It assumes the Google Gemini CLI is installed and guides the user through a one-time authentication process.

```bash
# Authenticate with the Google Gemini CLI (one-time; assumes the CLI is already installed):
gemini auth
```

--------------------------------

### Minimal Proxy Configuration (YAML)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

A basic YAML configuration file for the LLM Interactive Proxy, setting up a default OpenAI backend and proxy host/port. This serves as a starting point for proxy deployment.

```yaml
# config.yaml
backends:
  openai:
    type: openai
default_backend: openai
proxy:
  host: 0.0.0.0
  port: 8000
auth:
  # Set LLM_INTERACTIVE_PROXY_API_KEY env var to enable
  disable_auth: false
```

--------------------------------

### Configure OpenAI Backend with API Key

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This command starts the proxy with the OpenAI backend enabled. An environment variable `OPENAI_API_KEY` must be set to provide the necessary API authentication credentials. This is a standard way to configure the proxy for OpenAI services.

```bash
export OPENAI_API_KEY="sk-..."
python -m src.core.cli --default-backend openai
```

--------------------------------

### Configure Edit-Precision Tuning via Environment Variables

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This example demonstrates configuring the edit-precision tuning feature using environment variables. It covers enabling/disabling the feature and setting parameters like temperature and top_p.

```shell
EDIT_PRECISION_ENABLED=true
EDIT_PRECISION_TEMPERATURE=0.1
EDIT_PRECISION_MIN_TOP_P=0.3
EDIT_PRECISION_OVERRIDE_TOP_P=false
EDIT_PRECISION_EXCLUDE_AGENTS_REGEX="<pattern>"
```
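
For tooling that needs to read the same variables, the sketch below shows one way to turn them into typed values. It is an illustration only: `EditPrecisionSettings`, `load_edit_precision`, and the accepted boolean spellings are assumptions, and the proxy's own parsing and defaults may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class EditPrecisionSettings:
    enabled: bool
    temperature: Optional[float]
    min_top_p: Optional[float]
    override_top_p: bool
    exclude_agents_regex: Optional[str]


def _as_bool(value: Optional[str]) -> bool:
    # Assumption: "true", "1" and "yes" (any case) count as enabled.
    return (value or "").strip().lower() in {"true", "1", "yes"}


def _as_float(value: Optional[str]) -> Optional[float]:
    # Unset or empty variables become None instead of raising.
    return float(value) if value else None


def load_edit_precision(env: Optional[Mapping[str, str]] = None) -> EditPrecisionSettings:
    """Read the EDIT_PRECISION_* variables from env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return EditPrecisionSettings(
        enabled=_as_bool(env.get("EDIT_PRECISION_ENABLED")),
        temperature=_as_float(env.get("EDIT_PRECISION_TEMPERATURE")),
        min_top_p=_as_float(env.get("EDIT_PRECISION_MIN_TOP_P")),
        override_top_p=_as_bool(env.get("EDIT_PRECISION_OVERRIDE_TOP_P")),
        exclude_agents_regex=env.get("EDIT_PRECISION_EXCLUDE_AGENTS_REGEX") or None,
    )
```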

--------------------------------

### Bash: Ensure Starting from 'dev' Branch

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/dev/slash-commands/git-review-merge-dev.md

Checks out the 'dev' branch and verifies that HEAD is now on it, aborting otherwise. It then performs a fast-forward-only pull so that the local 'dev' branch matches the remote 'origin/dev'. This guarantees a consistent starting point.

```bash
git checkout dev 2>/dev/null || git switch dev
git rev-parse --abbrev-ref HEAD | grep -qx "dev" || { echo "Failed to reset to dev; aborting."; exit 1; }
git pull --ff-only origin dev || { echo "Unable to fast-forward local dev"; exit 1; }
```

--------------------------------

### Export Provider API Keys

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This command exports environment variables for various AI service provider API keys. This step is necessary before starting the proxy if you plan to use these backends.

```bash
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export GEMINI_API_KEY=...
export OPENROUTER_API_KEY=...
export ZAI_API_KEY=...
# GCP-based Gemini back-end
export GOOGLE_CLOUD_PROJECT=your-project-id
```

--------------------------------

### Environment-Specific Routing with Environment Variables

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This example illustrates environment-specific model routing using environment variables. It shows how to set the `MODEL_ALIASES` variable differently for development (using free models) and production (using premium models), enabling flexible deployment strategies.

```bash
# Development environment - use free models
export MODEL_ALIASES='[
  {"pattern": "^.*$", "replacement": "gemini-cli-oauth-personal:gemini-1.5-flash"}
]'

# Production environment - use premium models
# Backreferences are written as \\1 because a lone \1 is not a valid JSON escape
export MODEL_ALIASES='[
  {"pattern": "^gpt-(.*)", "replacement": "openai:gpt-\\1"},
  {"pattern": "^claude-(.*)", "replacement": "anthropic:claude-\\1"}
]'
```
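
To see what a list of pattern/replacement rules does to an incoming model name, the following sketch applies them with Python's `re` module. It assumes the first matching rule wins and that replacements use regex backreferences; both are assumptions about the proxy's behavior, and `rewrite_model` is a hypothetical helper. Inside JSON the backreference is written `\\1`, which decodes to the regex replacement `\1`.

```python
import json
import re

# Raw string so the doubled backslashes reach the JSON parser unchanged.
PRODUCTION_ALIASES = r'''[
  {"pattern": "^gpt-(.*)", "replacement": "openai:gpt-\\1"},
  {"pattern": "^claude-(.*)", "replacement": "anthropic:claude-\\1"}
]'''


def rewrite_model(name: str, aliases_json: str) -> str:
    """Apply the first matching alias rule (assumption: first match wins)."""
    for rule in json.loads(aliases_json):
        pattern = re.compile(rule["pattern"])
        if pattern.search(name):
            return pattern.sub(rule["replacement"], name, count=1)
    # No rule matched: the model name passes through unchanged.
    return name
```

A name such as `gpt-4o` is routed to the `openai` backend, while a name that matches no rule is left untouched.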

--------------------------------

### Run Pytest Suite (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/docs/testing.md

Executes the project's test suite using pytest. This command should be run after installing the project with its development dependencies.

```bash
python -m pytest
```

--------------------------------

### Start Proxy with Wire Capture Enabled

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This command initiates the proxy with wire capture enabled, logging all network traffic to a specified file ('logs/wire_capture.log'). It also sets the log level to DEBUG for detailed output, useful for debugging network-related issues.

```bash
python -m src.core.cli \
  --default-backend openai \
  --capture-file logs/wire_capture.log \
  --log-level DEBUG
```

--------------------------------

### Gemini Streaming Generate Content Request

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This example sends a streaming content-generation request in Gemini API format through the proxy. It targets the `:streamGenerateContent` endpoint instead of the non-streaming one and includes a user prompt. The response is delivered in chunks, each prefixed with 'data:'.

```bash
# Streaming Gemini request
curl -X POST http://localhost:8000/v1beta/models/gemini-2.5-pro:streamGenerateContent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-key" \
  -d '{
    "contents": [{
      "parts": [{"text": "Count to 3"}],
      "role": "user"
    }]
  }'
```
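
On the client side, the `data:`-prefixed chunks have to be decoded one by one. The sketch below shows a minimal way to do that for lines that have already been read from the response. The chunk layout (`candidates[].content.parts[].text`) is assumed to follow the Gemini API response format, and both function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Decode `data:`-prefixed lines into JSON objects, skipping everything else."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        body = line[len("data:"):].strip()
        # Some streaming APIs end with a "[DONE]" sentinel; ignoring it is
        # harmless when the server never sends one.
        if body and body != "[DONE]":
            yield json.loads(body)


def collect_text(chunks: Iterable[dict]) -> str:
    """Concatenate the text parts of all chunks (assumed Gemini-style layout)."""
    pieces = []
    for chunk in chunks:
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
    return "".join(pieces)
```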

--------------------------------

### Pytest Output Compression Example

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Demonstrates the transformation of verbose pytest output to a more concise format after compression. The compressed output removes timing details and 'PASSED' test results, focusing on retaining 'FAILED' tests and associated error messages.

```text
# Before compression (verbose):
test_example.py::test_function PASSED                    [ 50%] 0.001s setup 0.002s call 0.001s teardown
test_example.py::test_failure FAILED                     [100%] 0.001s setup 0.003s call 0.001s teardown

# After compression (concise):
test_example.py::test_failure FAILED                     [100%]
```
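
The transformation shown above can be approximated with two regular expressions. This is a sketch of the idea only, not the proxy's implementation: it drops result lines for passing tests and strips the trailing per-phase timings, and the names `compress_pytest_output`, `PASSED_RE`, and `TIMING_RE` are hypothetical.

```python
import re

# A result line for a passing test, e.g. "file.py::test_name PASSED [ 50%] ...".
PASSED_RE = re.compile(r"^\S+::\S+\s+PASSED\b")
# Trailing timings such as " 0.001s setup 0.002s call 0.001s teardown".
TIMING_RE = re.compile(r"(\s+\d+(?:\.\d+)?s (?:setup|call|teardown))+\s*$")


def compress_pytest_output(text: str) -> str:
    """Keep failures and error text; drop PASSED lines and timing details."""
    kept = []
    for line in text.splitlines():
        if PASSED_RE.match(line):
            continue
        kept.append(TIMING_RE.sub("", line))
    return "\n".join(kept)
```

Matching only lines of the form `path::test PASSED` keeps error messages that merely mention the word "PASSED" from being discarded.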

--------------------------------

### Start Proxy to Force a Specific Model

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This command launches the proxy, forcing all requests to use a specific model ('gemini-2.5-pro') regardless of the client's request. Authentication is also disabled for this configuration. This is useful for testing or enforcing a particular model's behavior.

```bash
python -m src.core.cli \
  --default-backend gemini-cli-oauth-personal \
  --force-model gemini-2.5-pro \
  --disable-auth
```

--------------------------------

### Test Stage Integration with ValidatedTestStage

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/src/core/testing/README.md

Provides an example of integrating the testing framework into test stages by inheriting from `ValidatedTestStage`. It reiterates the use of safe mock creation and registration methods.

```python
from src.core.testing.base_stage import ValidatedTestStage

class MyMockStage(ValidatedTestStage):
    """Inherit from ValidatedTestStage instead of InitializationStage."""

    # Use the create_safe_*_mock() methods to build mocks.
    # Use the safe_register_instance() method to register them.
    # Automatic validation happens in execute().
```

--------------------------------

### Buffered JSON Lines Wire Capture Format Example

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Illustrates the structure of a single log entry in the current Buffered JSON Lines format for wire capture. This format is optimized for performance and provides structured data for each HTTP request or response.

```json
{
  "timestamp_iso": "2025-01-10T15:58:41.039145+00:00",
  "timestamp_unix": 1736524721.039145,
  "direction": "outbound_request",
  "source": "127.0.0.1(Cline/1.0)",
  "destination": "qwen-oauth",
  "session_id": "session-123",
  "backend": "qwen-oauth",
  "model": "qwen3-coder-plus",
  "key_name": "primary",
  "content_type": "json",
  "content_length": 1247,
  "payload": {
    "messages": [{"role": "user", "content": "..."}],
    "model": "qwen3-coder-plus",
    "temperature": 0.7
  },
  "metadata": {
    "client_host": "127.0.0.1",
    "user_agent": "Cline/1.0",
    "request_id": "req_abc123"
  }
}
```
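
Because each entry carries both an ISO 8601 and a Unix timestamp, a quick consistency check is easy to write. The sketch below parses one entry, verifies that the two timestamps agree, and renders a one-line summary. The helper names are hypothetical, and only fields shown in the example entry are used.

```python
from datetime import datetime


def timestamps_agree(entry: dict, tolerance: float = 1e-3) -> bool:
    """Check that timestamp_iso and timestamp_unix describe the same instant."""
    iso = datetime.fromisoformat(entry["timestamp_iso"])
    return abs(iso.timestamp() - entry["timestamp_unix"]) <= tolerance


def summarize(entry: dict) -> str:
    """Render one capture entry as a single human-readable line."""
    return (
        f'{entry["direction"]} {entry["source"]} -> {entry["destination"]} '
        f'model={entry["model"]} bytes={entry["content_length"]}'
    )
```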

--------------------------------

### Legacy Human-Readable Wire Capture Format

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Provides an example of the legacy human-readable format for wire capture logs. This format is less structured and intended primarily for quick visual inspection of individual requests and responses.

```text
----- REQUEST 2025-01-10T15:58:41Z -----
client=127.0.0.1 agent=Cline/1.0 session=session-123 -> backend=qwen-oauth model=qwen3-coder-plus
{
  "messages": [...],
  "model": "qwen3-coder-plus"
}

----- REPLY 2025-01-10T15:58:42Z -----
client=127.0.0.1 agent=Cline/1.0 session=session-123 -> backend=qwen-oauth model=qwen3-coder-plus
{
  "choices": [...]
}

```

--------------------------------

### Start Proxy with Strict Command Detection (CLI)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This command enables strict command detection mode via the command-line interface. In this mode, commands are only processed if they appear on the last non-blank line of a message.

```bash
python -m src.core.cli --strict-command-detection
```
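
The rule "only the last non-blank line counts" can be illustrated with a few lines of Python. This sketch is not the proxy's detection code. It assumes that commands start with the `!/` prefix, and `detect_command_strict` is a hypothetical name.

```python
from typing import Optional


def last_non_blank_line(message: str) -> Optional[str]:
    """Return the last line that contains non-whitespace characters, if any."""
    for line in reversed(message.splitlines()):
        if line.strip():
            return line.strip()
    return None


def detect_command_strict(message: str) -> Optional[str]:
    """Return the command only if it is the last non-blank line of the message."""
    line = last_non_blank_line(message)
    # Assumption: in-chat commands start with the "!/" prefix.
    if line is not None and line.startswith("!/"):
        return line
    return None
```

Under this rule a command followed by further text is treated as ordinary message content, which reduces accidental triggering when a command string is merely quoted inside a longer prompt.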

--------------------------------

### ApplyDiff Handler Example Logic in Python

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/CONTRIBUTING.md

Showcases the functionality of the built-in `ApplyDiffHandler`. It monitors for `apply_diff` tool calls and steers the LLM to prefer `patch_file` instead, citing its superiority and QA features. This handler implements per-session rate limiting and allows for a configurable steering message.

```python
# The handler automatically steers LLMs from:
# tool_call: apply_diff(...)

# To a custom response:
# "You tried to use apply_diff tool. Please prefer to use patch_file tool instead,
# as it is superior to apply_diff and provides automated Python QA checks."

```

--------------------------------

### Python Migrating Test Class Mock Creation

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/src/core/testing/README.md

Guides users on migrating existing tests by adding a GuardedMockCreationMixin to test classes and replacing direct AsyncMock creation with the framework's safe mock creation methods, such as 'create_async_mock'.

```python
# Before
mock = AsyncMock(spec=IService)

# After
mock = self.create_async_mock(spec=IService)
```

--------------------------------

### Configure Planning Phase with CLI Flags

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This list presents command-line interface flags for enabling and configuring the planning phase. It allows specifying the strong model, maximum turns and file writes, and fine-tuning parameters like temperature, top_p, reasoning effort, and thinking budget directly from the command line.

```bash
--enable-planning-phase
--planning-phase-strong-model BACKEND:MODEL
--planning-phase-max-turns N
--planning-phase-max-file-writes N
--planning-phase-temperature FLOAT
--planning-phase-top-p FLOAT
--planning-phase-reasoning-effort EFFORT
--planning-phase-thinking-budget TOKENS
```

--------------------------------

### Clone LLM Interactive Proxy Repository

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/CONTRIBUTING.md

Clones the LLM Interactive Proxy project repository and navigates into the project directory. This is the first step in setting up the development environment.

```bash
git clone https://github.com/matdev83/llm-interactive-proxy.git
cd llm-interactive-proxy
```

--------------------------------

### Configure Planning Phase with Environment Variables

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This list shows environment variable settings for configuring the planning phase. Options include enabling the phase, specifying the strong model backend and name, setting limits for turns and file writes, and adjusting parameters like temperature, top_p, reasoning effort, and thinking budget.

```bash
PLANNING_PHASE_ENABLED=true                 # or false
PLANNING_PHASE_STRONG_MODEL=openai:gpt-4o   # format: backend:model
PLANNING_PHASE_MAX_TURNS=10
PLANNING_PHASE_MAX_FILE_WRITES=1
PLANNING_PHASE_TEMPERATURE=0.2
PLANNING_PHASE_TOP_P=0.9
PLANNING_PHASE_REASONING_EFFORT=high
PLANNING_PHASE_THINKING_BUDGET=8000
```

--------------------------------

### Configure Planning Phase with YAML

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This YAML configuration snippet shows how to enable the planning phase, specify a strong model (openai:gpt-4o), set the maximum number of planning turns to 10, and the maximum number of file writes to 1. It also includes overrides for temperature, top_p, reasoning_effort, and thinking_budget.

```yaml
session:
  planning_phase:
    enabled: true
    strong_model: "openai:gpt-4o"
    max_turns: 10
    max_file_writes: 1
    overrides:
      temperature: 0.2
      top_p: 0.9
      reasoning_effort: "high"
      thinking_budget: 8000
```

--------------------------------

### Check Ruff Linting

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/dev/slash-commands/git-review-merge-dev.md

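Bash snippet that checks whether the 'ruff' linter is available through the interpreter named in `$PYTHON_CMD` and, if so, runs `ruff check .` to report linting errors. If 'ruff' is not found, it prints a message and skips the check. `PYTHON_CMD` must be set before the snippet runs.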

```bash
if "$PYTHON_CMD" -m ruff --version >/dev/null 2>&1; then
  "$PYTHON_CMD" -m ruff check .
else
  echo "ruff not installed; skipping"
fi
```

--------------------------------

### Check Black Formatting

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/dev/slash-commands/git-review-merge-dev.md

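Bash snippet that checks whether the 'black' code formatter is available through the interpreter named in `$PYTHON_CMD` and, if so, runs it in check mode, which reports formatting problems without modifying files. If 'black' is not found, it prints a message and skips the check. `PYTHON_CMD` must be set before the snippet runs.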

```bash
if "$PYTHON_CMD" -m black --version >/dev/null 2>&1; then
  "$PYTHON_CMD" -m black --check .
else
  echo "black not installed; skipping"
fi
```

--------------------------------

### Configure Gemini GCP Project Billing (Bash)

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Sets up the Google Cloud Project for Gemini CLI usage by configuring the billing project and logging in with application default credentials. This is necessary for using Gemini with GCP resources.

```bash
export GOOGLE_CLOUD_PROJECT="your-project-id"
gcloud auth application-default login
python -m src.core.cli --default-backend gemini-cli-cloud-project
```

--------------------------------

### Configure ZAI Coding Plan Backend (Bash)

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Sets the ZAI API key for specialized use with the ZAI Coding Plan, optimized for agent-based coding tasks. This backend leverages Zhipu AI's coding-specific capabilities.

```bash
export ZAI_API_KEY="..."
python -m src.core.cli --default-backend zai-coding-plan
```

--------------------------------

### Configure Anthropic Backend with API Key

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This command initializes the proxy to use the Anthropic backend. It requires the `ANTHROPIC_API_KEY` environment variable to be set with your Anthropic API key for authentication.

```bash
export ANTHROPIC_API_KEY="sk-ant-..."
python -m src.core.cli --default-backend anthropic
```

--------------------------------

### Configure Gemini API Key and Default Backend (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This snippet shows how to set the GEMINI_API_KEY environment variable and then run the CLI with 'gemini' as the default backend. This is useful for production environments or high-volume usage requiring a metered API key.

```bash
export GEMINI_API_KEY="AIza..."
python -m src.core.cli --default-backend gemini
```

--------------------------------

### Configure Proxy for Planning Phase

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This command enables the planning phase for the proxy, specifying a strong model ('openai:gpt-4o') for planning, setting the maximum number of turns to 8, and configuring the temperature and reasoning effort. This is for advanced use cases involving multi-step reasoning.

```bash
python -m src.core.cli \
  --default-backend openai \
  --enable-planning-phase \
  --planning-phase-strong-model openai:gpt-4o \
  --planning-phase-max-turns 8 \
  --planning-phase-temperature 0.2 \
  --planning-phase-reasoning-effort high
```

--------------------------------

### Lint and Format Code with Ruff, Black, and Mypy

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/CONTRIBUTING.md

Runs linting and formatting tools on the project's source code. Ruff is used for linting, Black for formatting, and Mypy for static type checking.

```bash
# Run ruff
python -m ruff check src

# Run black
python -m black src

# Run mypy
python -m mypy src
```

--------------------------------

### Bash: Verify Clean Working Tree

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/dev/slash-commands/git-review-merge-dev.md

Checks if the Git working tree has any uncommitted changes. If modifications are found, it prints an error message and exits, ensuring operations start from a clean state. This is a precondition for safe Git operations.

```bash
test -z "$(git status --porcelain)" || { echo "Uncommitted changes found. Stash or commit first."; exit 1; }
```
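
The same precondition can be sketched in Python, for scripts that drive Git through `subprocess`. The function names here are hypothetical; only the `git status --porcelain` call comes from the snippet above.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """True when `git status --porcelain` printed nothing."""
    return porcelain_output.strip() == ""


def require_clean_tree() -> None:
    """Exit with status 1 if the working tree has uncommitted changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True,
        text=True,
        check=True,
    )
    if not is_clean(result.stdout):
        # SystemExit with a string prints it to stderr and exits with status 1.
        raise SystemExit("Uncommitted changes found. Stash or commit first.")
```

Keeping `is_clean` separate from the `subprocess` call makes the decision logic testable without a repository.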

--------------------------------

### Run LLM Interactive Proxy Application

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/CONTRIBUTING.md

Executes the LLM Interactive Proxy application using the Python module system. Supports running with default settings, a custom configuration file, or different backend providers.

```bash
# Run with default settings
python -m src.core.cli

# Run with custom configuration
python -m src.core.cli --config path/to/config.yaml

# Run with different backends
python -m src.core.cli --default-backend openrouter
python -m src.core.cli --default-backend gemini
python -m src.core.cli --default-backend gemini-cli-oauth-personal
python -m src.core.cli --default-backend anthropic
```

--------------------------------

### YAML Configuration for Context Window Limits

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Shows how to set the context window, maximum input tokens, and maximum output tokens per model, either in a backend-specific file or under `model_defaults` in the main config. This allows fine-grained control over request limits.

```yaml
# Backend-specific configuration (e.g., config/backends/custom/backend.yaml)
models:
  "your-model-name":
    limits:
      context_window: 262144        # Total context window size (tokens)
      max_input_tokens: 200000      # Input token limit (tokens)
      max_output_tokens: 62144      # Output token limit (tokens)
      requests_per_minute: 60       # Rate limits
      tokens_per_minute: 1000000

# Or in main config via model_defaults
model_defaults:
  "your-model-name":
    limits:
      context_window: 128000        # 128K context window
      max_input_tokens: 100000      # 100K input limit
```
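
In the backend-specific example, the input and output limits add up to the context window (200000 + 62144 = 262144). The sketch below shows one way a pre-flight check against such limits could look. `ModelLimits` and `check_request` are hypothetical names, and this is not the proxy's actual enforcement logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelLimits:
    context_window: int
    max_input_tokens: Optional[int] = None
    max_output_tokens: Optional[int] = None


def check_request(limits: ModelLimits, input_tokens: int, output_tokens: int) -> list[str]:
    """Return the names of violated limits (empty when the request fits)."""
    problems = []
    if limits.max_input_tokens is not None and input_tokens > limits.max_input_tokens:
        problems.append("max_input_tokens")
    if limits.max_output_tokens is not None and output_tokens > limits.max_output_tokens:
        problems.append("max_output_tokens")
    if input_tokens + output_tokens > limits.context_window:
        problems.append("context_window")
    return problems


limits = ModelLimits(context_window=262144, max_input_tokens=200000, max_output_tokens=62144)
print(check_request(limits, 150_000, 4_096))  # prints []
print(check_request(limits, 210_000, 4_096))  # prints ['max_input_tokens']
```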

--------------------------------

### Configure ZAI Backend (Bash)

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

Sets the ZAI API key as an environment variable to enable the proxy to use Zhipu AI models. This is for general access to ZAI services.

```bash
export ZAI_API_KEY="..."
python -m src.core.cli --default-backend zai
```

--------------------------------

### Configure Model Name Rewrites for Routing

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This command sets up model name aliases that route requests to different providers based on the requested model name. For example, a name matching `^gpt-(.*)` is rewritten to `openrouter:openai/gpt-\1`, and one matching `^claude-(.*)` to `anthropic:claude-\1`, where `\1` stands for the text captured by the group. This allows flexible backend routing.

```bash
python -m src.core.cli \
  --default-backend openrouter \
  --model-alias '^gpt-(.*)=openrouter:openai/gpt-\1' \
  --model-alias '^claude-(.*)=anthropic:claude-\1'
```
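
The flag is repeated once per rule. As an illustration of how a repeated `pattern=replacement` flag can be collected in order, here is a small `argparse` sketch. It assumes the value is split at the first `=`; the proxy's own parser may behave differently, and `parse_alias` is a hypothetical helper.

```python
import argparse


def parse_alias(value: str) -> tuple[str, str]:
    """Split 'pattern=replacement' at the first '='."""
    pattern, sep, replacement = value.partition("=")
    if not sep or not pattern:
        raise argparse.ArgumentTypeError(f"expected PATTERN=REPLACEMENT, got {value!r}")
    return pattern, replacement


parser = argparse.ArgumentParser()
# action="append" keeps every occurrence of the flag, in command-line order.
parser.add_argument("--model-alias", action="append", type=parse_alias, default=[])

args = parser.parse_args([
    "--model-alias", r"^gpt-(.*)=openrouter:openai/gpt-\1",
    "--model-alias", r"^claude-(.*)=anthropic:claude-\1",
])
for pattern, replacement in args.model_alias:
    print(pattern, "=>", replacement)
```

Single quotes in the shell command serve the same purpose as the raw strings here: they keep the backslash in `\1` intact.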

--------------------------------

### Wire Capture Configuration Options

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Details various YAML configuration parameters for tuning wire capture behavior, including buffer size, flush intervals, maximum entries per flush, and file rotation settings.

```yaml
logging:
  capture_file: "logs/wire_capture.log"
  # Performance tuning
  capture_buffer_size: 65536          # 64KB buffer (default)
  capture_flush_interval: 1.0         # Flush every 1 second
  capture_max_entries_per_flush: 100  # Max entries per flush
  # Rotation
  capture_max_bytes: 104857600        # 100MB per file
  capture_max_files: 5                # Keep 5 rotated files
  capture_total_max_bytes: 524288000  # 500MB total cap
```
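
The byte values are binary units (1 MB = 1024 × 1024 bytes here). A quick arithmetic check of the figures, as a sketch; whether the active file counts toward the five rotated files is not stated in the source.

```python
MIB = 1024 * 1024

capture_buffer_size = 64 * 1024      # 64KB
capture_max_bytes = 100 * MIB        # 100MB per file
capture_max_files = 5
capture_total_max_bytes = 500 * MIB  # 500MB total cap

assert capture_buffer_size == 65536
assert capture_max_bytes == 104857600
assert capture_total_max_bytes == 524288000
# Five full files exactly fill the total cap.
assert capture_max_files * capture_max_bytes == capture_total_max_bytes
```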

--------------------------------

### Structured JSON Response with OpenAI API

Source: https://context7.com/matdev83/llm-interactive-proxy/llms.txt

This example sends a POST request to the proxy's OpenAI-compatible `/v1/responses` endpoint to obtain a structured JSON response. The request specifies the model, the conversation messages, and a JSON schema for the output. The response is expected to be a validated JSON object matching the provided schema.

```shell
curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-key" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Extract person info: John Doe, age 30, works at Acme Corp"}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person_info",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "employer": {"type": "string"}
          },
          "required": ["name", "age", "employer"],
          "additionalProperties": false
        }
      }
    }
  }'
```
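
The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above; `build_request` and `send` are hypothetical helper names, and `your-proxy-key` remains a placeholder.

```python
import json
import urllib.request

PROXY_URL = "http://localhost:8000/v1/responses"  # same endpoint as the curl call
PROXY_KEY = "your-proxy-key"                      # placeholder


def build_request() -> urllib.request.Request:
    """Build the POST request without sending it."""
    payload = {
        "model": "gpt-4",
        "messages": [
            {"role": "user", "content": "Extract person info: John Doe, age 30, works at Acme Corp"}
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "person_info",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"},
                        "employer": {"type": "string"},
                    },
                    "required": ["name", "age", "employer"],
                    "additionalProperties": False,
                },
            },
        },
    }
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {PROXY_KEY}",
        },
        method="POST",
    )


def send() -> dict:
    """Send the request; requires the proxy to be running locally."""
    with urllib.request.urlopen(build_request()) as resp:
        return json.load(resp)
```

Call `send()` once the proxy is listening on port 8000; `build_request()` alone is enough to inspect the payload.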

--------------------------------

### Force Model and Context Window with CLI Arguments

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This snippet demonstrates how to use command-line arguments to force a specific model and set a fixed context window size for all requests. It's useful for testing, enforcing model usage, and controlling token costs.

```bash
python -m src.core.cli \
  --default-backend gemini-cli-oauth-personal \
  --force-model gemini-2.5-pro \
  --disable-auth \
  --port 8000
```

```bash
python -m src.core.cli \
  --default-backend openai \
  --force-context-window 8000 \
  --disable-auth \
  --port 8000
```

--------------------------------

### Migrating Test Stage Service Registration (Python)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/src/core/testing/README.md

Illustrates how to migrate service registration within test stages. It demonstrates moving the service registration logic from the 'execute' method to the '_register_services' method and using safe registration patterns provided by the framework.

```python
# Before
async def execute(self, services, config):
    mock = AsyncMock(spec=IService)
    services.add_instance(IService, mock)

# After
async def _register_services(self, services, config):
    mock = self.create_safe_session_service_mock()
    self.safe_register_instance(services, IService, mock)
```

--------------------------------

### YAML Configuration for Model Name Rewrites

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/dev/features/model-name-rewrites/PRD.md

Example configuration snippet for the `model_aliases` feature in `config.yaml`. This demonstrates how to define rules for rewriting model names using regular expressions and capture groups. The rules are processed sequentially, with the first match determining the rewrite.

```yaml
model_aliases:
  # Statically replace a specific model
  - pattern: "^claude-3-sonnet-20240229$"
    replacement: "gemini-cli-oauth-personal:gemini-1.5-flash"

  # Dynamically replace any GPT model, keeping the version
  - pattern: "^gpt-(.*)"
    replacement: "openrouter:openai/gpt-\1"

  # Catch-all for any other model
  - pattern: ".*"
    replacement: "gemini-cli-oauth-personal:gemini-1.5-pro"
```
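
Because the first matching rule wins, rule order matters: the catch-all `.*` must come last or it would shadow every other rule. The sketch below illustrates that behavior with Python's `re` module. It assumes the patterns are matched from the start of the name and that `\1` follows Python backreference syntax; `rewrite_model` is a hypothetical function, not the proxy's implementation.

```python
import re

# Rules mirror the YAML above, in the same order.
MODEL_ALIASES = [
    (r"^claude-3-sonnet-20240229$", r"gemini-cli-oauth-personal:gemini-1.5-flash"),
    (r"^gpt-(.*)", r"openrouter:openai/gpt-\1"),
    (r".*", r"gemini-cli-oauth-personal:gemini-1.5-pro"),
]


def rewrite_model(model: str) -> str:
    """Return the replacement from the first matching rule."""
    for pattern, replacement in MODEL_ALIASES:
        match = re.match(pattern, model)
        if match:
            # expand() substitutes \1, \2, ... with the captured groups.
            return match.expand(replacement)
    return model


for name in ("claude-3-sonnet-20240229", "gpt-4o", "mistral-large"):
    print(name, "->", rewrite_model(name))
```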

--------------------------------

### Setting Environment Variables for Claude Integration (Bash)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Environment variables that point a Claude (Anthropic API) client at the proxy. The API URL is the proxy's local address, port 8001 in this example, and the API key is the proxy key.

```bash
export ANTHROPIC_API_URL=http://localhost:8001
export ANTHROPIC_API_KEY="<your-proxy-key>"  # quoted so the shell does not treat < and > as redirections
```

--------------------------------

### Create and Activate Python Virtual Environment

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/CONTRIBUTING.md

Creates a Python virtual environment named '.venv' and then activates it. This isolates project dependencies.

```bash
python -m venv .venv
# Windows:
# .\.venv\Scripts\activate
# Unix:
source .venv/bin/activate
```

--------------------------------

### Fix DI Violation: Manual Service Instantiation

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/CONTRIBUTING.md

Demonstrates the incorrect way of manually instantiating a service ('CommandProcessor') within a method and the correct approach using dependency injection. The corrected version injects the service via the constructor, adhering to SOLID principles.

```python
def handle_request(self, request):
    processor = CommandProcessor(self.config)  # VIOLATION!
    return processor.process(request)
```

```python
def __init__(self, command_processor: ICommandProcessor):
    self.command_processor = command_processor

def handle_request(self, request):
    return self.command_processor.process(request)  # CORRECT
```
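
A self-contained version of the corrected pattern is sketched below. The `ICommandProcessor` protocol, its `process` signature, and the `UppercaseProcessor` stand-in are assumptions made for illustration; the repository's real interface may differ.

```python
from typing import Protocol


class ICommandProcessor(Protocol):
    """Hypothetical interface; only the method used below is declared."""

    def process(self, request: str) -> str: ...


class RequestHandler:
    """Receives its processor from the caller instead of constructing one."""

    def __init__(self, command_processor: ICommandProcessor) -> None:
        self.command_processor = command_processor

    def handle_request(self, request: str) -> str:
        return self.command_processor.process(request)


class UppercaseProcessor:
    """Stand-in implementation used only for this example."""

    def process(self, request: str) -> str:
        return request.upper()


handler = RequestHandler(UppercaseProcessor())
print(handler.handle_request("hello"))  # prints HELLO
```

Because the handler depends only on the interface, a test can pass in a fake processor without touching configuration.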

--------------------------------

### Configure Custom Model Backends with Limits (YAML)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

This YAML configuration defines custom backend models with specific limits for context window, input tokens, and requests per minute. It helps manage costs, ensure agent compatibility, and tune performance by setting appropriate thresholds for different models.

```yaml
backend_type: "custom"
models:
  "large-context-model":
    limits:
      context_window: 262144      # 256K total context window
      max_input_tokens: 200000    # 200K input limit (leaves room for response)
      requests_per_minute: 30     # Conservative rate limits
  "small-fast-model":
    limits:
      context_window: 8192        # 8K context window
      max_input_tokens: 6000      # 6K input limit
      requests_per_minute: 120    # Higher rate for smaller model
```

--------------------------------

### Use EnforcedMockFactory for Mock Creation

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/src/core/testing/README.md

Shows the recommended way to create mocks using EnforcedMockFactory. These mocks are guaranteed to be properly configured, helping to prevent potential coroutine warnings.

```python
from src.core.testing.interfaces import EnforcedMockFactory

# These are guaranteed to be properly configured
session_service = EnforcedMockFactory.create_session_service_mock()
backend_service = EnforcedMockFactory.create_backend_service_mock()
```

--------------------------------

### Generate Review Summary File

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/dev/slash-commands/git-review-merge-dev.md

Bash script to create a temporary file for documenting review notes. It uses a heredoc to pre-fill the file with standard review checklist items and prompts the user to fill in the details.

```bash
REVIEW_SUMMARY_FILE=$(mktemp)
cat <<'EOF' >"$REVIEW_SUMMARY_FILE"
## Review Notes
- Scope alignment (PR description vs diff): <fill>
- Tests run & coverage decision: <fill>
- Residual risks / follow-ups: <fill or 'None'>
- Checklist verdict: <fill, e.g. 'All items confirmed ✅'>
EOF
echo "Edit $REVIEW_SUMMARY_FILE and replace each <fill> token with your notes before continuing."
```
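
The script asks the reviewer to replace every `<fill>` token. The source snippet does not show whether that is verified afterwards; one possible check is sketched here, with `unfilled_lines` as a hypothetical helper.

```python
import re

# Matches <fill>, <fill or 'None'>, <fill, e.g. '...'> and similar placeholders.
FILL_TOKEN = re.compile(r"<fill[^>]*>")


def unfilled_lines(summary: str) -> list[str]:
    """Return the lines that still contain a <fill...> placeholder."""
    return [line for line in summary.splitlines() if FILL_TOKEN.search(line)]


template = """## Review Notes
- Scope alignment (PR description vs diff): <fill>
- Tests run & coverage decision: <fill>
- Residual risks / follow-ups: <fill or 'None'>
- Checklist verdict: <fill, e.g. 'All items confirmed'>
"""
print(len(unfilled_lines(template)))  # prints 4
```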

--------------------------------

### Configure Planning Phase Strong Model Overrides (Environment Variables)

Source: https://github.com/matdev83/llm-interactive-proxy/blob/dev/README.md

Environment variables for configuring the planning phase feature. These allow setting the enablement status, the strong model to use, maximum turns and file writes, and specific model parameters like temperature and top_p. These settings are overridden by CLI flags.

```bash
export PLANNING_PHASE_ENABLED=true
export PLANNING_PHASE_STRONG_MODEL=backend:model
export PLANNING_PHASE_MAX_TURNS=10
export PLANNING_PHASE_MAX_FILE_WRITES=1
export PLANNING_PHASE_TEMPERATURE=0.2
export PLANNING_PHASE_TOP_P=0.9
export PLANNING_PHASE_REASONING_EFFORT=high
export PLANNING_PHASE_THINKING_BUDGET=8000
```
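
To make the expected value types explicit, the sketch below reads these variables into a typed structure. It is not the proxy's configuration loader: `PlanningPhaseSettings`, `load_planning_settings`, the `None` defaults for unset values, and the rule that only the string `true` enables the feature are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class PlanningPhaseSettings:
    enabled: bool = False
    strong_model: Optional[str] = None
    max_turns: Optional[int] = None
    max_file_writes: Optional[int] = None
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    reasoning_effort: Optional[str] = None
    thinking_budget: Optional[int] = None


def load_planning_settings(env: Mapping[str, str] = os.environ) -> PlanningPhaseSettings:
    """Read PLANNING_PHASE_* variables; unset values stay None."""

    def get(name: str, cast: Callable):
        raw = env.get(f"PLANNING_PHASE_{name}")
        return cast(raw) if raw is not None else None

    return PlanningPhaseSettings(
        enabled=env.get("PLANNING_PHASE_ENABLED", "false").lower() == "true",
        strong_model=get("STRONG_MODEL", str),
        max_turns=get("MAX_TURNS", int),
        max_file_writes=get("MAX_FILE_WRITES", int),
        temperature=get("TEMPERATURE", float),
        top_p=get("TOP_P", float),
        reasoning_effort=get("REASONING_EFFORT", str),
        thinking_budget=get("THINKING_BUDGET", int),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.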