### Install Runpod from GitHub (Latest)

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Install the latest development version from the main branch of the GitHub repository using pip or uv.

```bash
pip install git+https://github.com/runpod/runpod-python.git
```

```bash
uv add git+https://github.com/runpod/runpod-python.git
```

--------------------------------

### Install Runpod from PyPI

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Install the stable release of the Runpod Python library using pip or uv.

```bash
pip install runpod
```

```bash
uv add runpod
```

--------------------------------

### Basic Serverless Worker Example

Source: https://github.com/runpod/runpod-python/blob/main/README.md

A simple worker script: it defines a handler that checks whether a number is even, then starts the Runpod serverless worker. Run this script as the container's default start command.

```python
# my_worker.py

import runpod

def is_even(job):

    job_input = job["input"]
    the_number = job_input["number"]

    if not isinstance(the_number, int):
        return {"error": "Silly human, you need to pass an integer."}

    if the_number % 2 == 0:
        return True

    return False

runpod.serverless.start({"handler": is_even})
```
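
Because the handler is an ordinary function that takes a job dictionary, its logic can be checked without starting the serverless worker. The sketch below repeats the handler without the `runpod` import so it runs anywhere, and calls it with hand-built job dictionaries. The sample payloads are illustrative assumptions, not output from the platform.

```python
# Local check of the handler logic; no Runpod import or API key needed.

def is_even(job):
    job_input = job["input"]
    the_number = job_input["number"]

    if not isinstance(the_number, int):
        return {"error": "Silly human, you need to pass an integer."}

    return the_number % 2 == 0


# Hand-built jobs shaped like {"input": {...}} (illustrative payloads).
even_result = is_even({"input": {"number": 4}})
odd_result = is_even({"input": {"number": 7}})
bad_result = is_even({"input": {"number": "four"}})

print(even_result, odd_result, bad_result)
```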

--------------------------------

### Install Runpod from GitHub (Specific Branch/Tag)

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Install a specific branch or tag/release of the Runpod Python library from GitHub.

```bash
pip install git+https://github.com/runpod/runpod-python.git@branch-name
```

```bash
pip install git+https://github.com/runpod/runpod-python.git@v1.0.0
```

--------------------------------

### Install Test Dependencies

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Ensure all necessary dependencies for testing are installed using the `uv sync` command with the `test` group. This is crucial for avoiding import errors during benchmarking.

```bash
uv sync --group test
```

--------------------------------

### Check Docker Installation

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Verify that Docker is installed and accessible on your system before proceeding with the build process.

```bash
docker --version
```

--------------------------------

### Runpod Credentials File Example

Source: https://github.com/runpod/runpod-python/blob/main/docs/getting_Started.md

Example TOML structure for storing your Runpod API key. Save this file to `~/.runpod/credentials.toml`.

```toml
[profile]
api_key = "YOUR_RUNPOD_API_KEY"
```

--------------------------------

### Example CUDA Initialization Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Indicates that CUDA devices were successfully initialized, reporting the number of devices that are ready.

```log
CUDA initialization passed: 2 device(s) initialized successfully
```

--------------------------------

### Install Runpod in Editable Mode

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Install the Runpod Python library in editable mode for development purposes by cloning the repository and installing with pip.

```bash
git clone https://github.com/runpod/runpod-python.git
cd runpod-python
pip install -e .
```

--------------------------------

### Automate Branch Cold Start Benchmarking

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Run the cold start benchmark script on the current branch or a specified branch. It handles git operations and dependency installation.

```bash
./scripts/benchmark_cold_start.sh
```

```bash
./scripts/benchmark_cold_start.sh main
```

```bash
./scripts/benchmark_cold_start.sh main feature/lazy-loading
```

--------------------------------

### Complete Fitness Checks Example

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

A complete example that defines several fitness checks with the `@runpod.serverless.register_fitness_check` decorator: GPU availability and memory, model file existence, disk space, environment variables, and API reachability.

```python
import runpod
import os
import torch
import shutil
from pathlib import Path
import aiohttp

# GPU checks
@runpod.serverless.register_fitness_check
def check_gpu():
    """Verify GPU is available."""
    if not torch.cuda.is_available():
        raise RuntimeError("GPU not available")

@runpod.serverless.register_fitness_check
def check_gpu_memory():
    """Verify GPU has sufficient memory."""
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / (1024**3)
    if gpu_memory < 8:
        raise RuntimeError(f"GPU memory too low: {gpu_memory:.1f}GB (need 8GB)")

# File checks
@runpod.serverless.register_fitness_check
def check_models_exist():
    """Verify model files exist."""
    model_path = Path("/models/model.safetensors")
    if not model_path.exists():
        raise RuntimeError(f"Model not found: {model_path}")

# Resource checks
@runpod.serverless.register_fitness_check
def check_disk_space():
    """Verify sufficient disk space."""
    stat = shutil.disk_usage("/")
    free_gb = stat.free / (1024**3)
    if free_gb < 50:
        raise RuntimeError(f"Insufficient disk space: {free_gb:.1f}GB free")

# Environment checks
@runpod.serverless.register_fitness_check
def check_environment():
    """Verify environment variables."""
    required = ["API_KEY", "MODEL_ID"]
    missing = [v for v in required if not os.environ.get(v)]
    if missing:
        raise RuntimeError(f"Missing env vars: {', '.join(missing)}")

# Async API check
@runpod.serverless.register_fitness_check
async def check_api():
    """Verify API is reachable."""
    try:
        async with aiohttp.ClientSession() as session:
            async with session.get("https://api.example.com/health", timeout=5) as resp:
                if resp.status != 200:
                    raise RuntimeError(f"API returned {resp.status}")
    except Exception as e:
        raise RuntimeError(f"Cannot reach API: {e}")

def handler(job):
    """Process job."""
    job_input = job["input"]
    # Your processing code here
    return {"output": "success"}

if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})

```

--------------------------------

### Run Cold Start Benchmark with Pytest

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Execute the cold start benchmark tests, either through pytest or by running the test module directly with Python. Results are saved to JSON files.

```bash
uv run pytest tests/test_performance/test_cold_start.py -v
```

```bash
uv run python tests/test_performance/test_cold_start.py
```

--------------------------------

### RunPod Serverless Handler Example

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

A basic Runpod serverless handler. On GPU workers, the handler runs only after the automatic GPU health check passes.

```python
import runpod

# GPU health check runs automatically on GPU workers
# No manual registration needed!

def handler(job):
    """Your handler runs after GPU health check passes."""
    return {"output": "success"}

if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})
```

--------------------------------

### Local Development Server Usage

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Start the local development server, which is built on FastAPI. `--rp_serve_api` enables the local serving API, and `--rp_api_port` and `--rp_api_host` set the port and host it listens on.

```bash
python worker.py --rp_serve_api --rp_api_port 8000 --rp_api_host localhost
```

--------------------------------

### CI/CD Integration for Benchmarking

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

This YAML snippet shows how to add a cold start benchmark to a GitHub Actions workflow, including running the test and uploading the results artifact.

```yaml
- name: Run cold start benchmark
  run: |
    uv run pytest tests/test_performance/test_cold_start.py --timeout=120

- name: Upload benchmark results
  uses: actions/upload-artifact@v4
  with:
    name: benchmark-results
    path: benchmark_results/cold_start_latest.json
```

--------------------------------

### RunPod TOML Project Configuration Example

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/runpod.toml.md

This TOML file defines the configuration for a project on the Runpod platform, including project metadata, resource allocation, environment variables, and runtime settings.

```toml
# Runpod Project Configuration

title = "My Project"

[project]
uuid = "00000000"
name = "My Project"
base_image = "runpod/base:0.0.0"
gpu_types = ["NVIDIA RTX 3090"]
gpu_count = 1
storage_id = "00000000"
volume_mount_path = "/runpod-volume"
ports = "8080/http, 22/tcp"
container_disk_size_gb = 10

[project.env_vars]
VAR_NAME_1 = "value1"
VAR_NAME_2 = "value2"


[template]
model_type = "default"
model_name = "None"

[runtime]
python_version = "3.10"
handler_path = "handler.py"
requirements_path = "requirements.txt"

```

--------------------------------

### Manage GPU Cloud Pods

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Examples of managing Runpod GPU Cloud pods: listing, retrieving, creating, stopping, resuming, and terminating them.

```python
import runpod

runpod.api_key = "your_runpod_api_key_found_under_settings"

# Get all my pods
pods = runpod.get_pods()

# Create a pod with GPU
pod = runpod.create_pod("test", "runpod/stack", "NVIDIA GeForce RTX 3070")

# Get a specific pod (uses the pod created above)
pod = runpod.get_pod(pod.id)

# Create a pod with CPU
pod = runpod.create_pod("test", "runpod/stack", instance_id="cpu3c-2-4")

# Stop the pod
runpod.stop_pod(pod.id)

# Resume the pod
runpod.resume_pod(pod.id)

# Terminate the pod
runpod.terminate_pod(pod.id)
```

--------------------------------

### Example CUDA Version Check Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Shows a successful CUDA version check, specifying the detected version and the minimum required version.

```log
CUDA version check passed: 12.2 (minimum: 11.8)
```
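
When comparing a detected version against a minimum like the one in this log line, compare numeric components rather than raw strings, because `"9.0"` sorts after `"11.8"` as text. A minimal sketch, assuming plain dotted version strings; `parse_version` and `meets_minimum` are hypothetical helpers and not part of the runpod package.

```python
def parse_version(text):
    """Turn a dotted version string such as "12.2" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(detected, minimum):
    """Return True when detected >= minimum, comparing component-wise."""
    return parse_version(detected) >= parse_version(minimum)


print(meets_minimum("12.2", "11.8"))
print(meets_minimum("9.0", "11.8"))
```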

--------------------------------

### Example Network Connectivity Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Demonstrates a successful network connectivity check, including the target IP and the response latency.

```log
Network connectivity passed: Connected to 8.8.8.8 (45ms)
```

--------------------------------

### Add, Start, and Stop Checkpoints with Checkpoints Class

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_debugger.md

Use the `Checkpoints` class to add, start, and stop timing checkpoints. This class is a singleton and aggregates timings until `get_debugger_output` is called.

```python
from runpod.serverless.utils.rp_debugger import Checkpoints

checkpoints = Checkpoints()

# Add a checkpoint
checkpoints.add('checkpoint_name')

# Start a checkpoint
checkpoints.start('checkpoint_name')

# Stop a checkpoint
checkpoints.stop('checkpoint_name')
```

--------------------------------

### Example Memory Check Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Shows a successful memory availability check, indicating the amount of available and total system memory.

```log
Memory check passed: 12.00GB available (of 16.00GB total)
```

--------------------------------

### Download multiple files

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_download.md

This example demonstrates how to download multiple files concurrently by providing a list of URLs. The function returns a list of absolute paths for all successfully downloaded files.

```python
from runpod.serverless.utils import download_files_from_urls

job_id = "job_123"
urls = [
    "https://example.com/file1.txt",
    "https://example.com/file2.png",
    "https://example.com/file3.pdf"
]

downloaded_files = download_files_from_urls(job_id, urls)

for i, file_path in enumerate(downloaded_files):
    print(f"Downloaded file {i + 1}: {file_path}")
```

--------------------------------

### Example GPU Compute Benchmark Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Shows a successful GPU compute benchmark, reporting the time taken for a matrix multiplication operation.

```log
GPU compute benchmark passed: Matrix multiply completed in 25ms
```

--------------------------------

### API Key Precedence Example

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Illustrates API key precedence: a key passed to an `Endpoint` instance overrides the global `runpod.api_key`.

```python
import runpod

# Example showing precedence
runpod.api_key = "GLOBAL_KEY"

# This endpoint uses GLOBAL_KEY
endpoint1 = runpod.Endpoint("ENDPOINT_ID")

# This endpoint uses ENDPOINT_KEY (overrides global)
endpoint2 = runpod.Endpoint("ENDPOINT_ID", api_key="ENDPOINT_KEY")

# All requests from endpoint2 will use ENDPOINT_KEY
result = endpoint2.run_sync({"input": "data"})
```
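
The rule reduces to "use the endpoint's own key when one was given, otherwise fall back to the global key". The sketch below models only that rule. `resolve_api_key` is a hypothetical helper for illustration and makes no claim about how the library stores or looks up keys internally.

```python
from typing import Optional


def resolve_api_key(endpoint_key: Optional[str], global_key: Optional[str]) -> Optional[str]:
    """Prefer the per-endpoint key; fall back to the global key."""
    return endpoint_key if endpoint_key is not None else global_key


print(resolve_api_key(None, "GLOBAL_KEY"))
print(resolve_api_key("ENDPOINT_KEY", "GLOBAL_KEY"))
```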

--------------------------------

### Example Disk Space Check Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Indicates a successful disk space check, showing the amount of free space and its percentage of the total disk size.

```log
Disk space check passed: 50.00GB free (50.0% available)
```
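
A line in this shape can be derived from `shutil.disk_usage`. The sketch below is an assumption about how such a report might be computed, not the library's built-in check; `format_disk_report` is a hypothetical helper that only formats the numbers and applies no threshold.

```python
import shutil

GIB = 1024 ** 3


def format_disk_report(free_bytes, total_bytes):
    """Format free space in GB and as a percentage of the total."""
    free_gb = free_bytes / GIB
    free_percent = 100 * free_bytes / total_bytes
    return f"Disk space check passed: {free_gb:.2f}GB free ({free_percent:.1f}% available)"


usage = shutil.disk_usage("/")
print(format_disk_report(usage.free, usage.total))
```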

--------------------------------

### Get All GPUs

Source: https://github.com/runpod/runpod-python/blob/main/docs/api/queries.md

Fetches a list of all available GPUs from Runpod. Requires the API key to be set.

```python
import runpod

runpod.api_key = "your_runpod_api_key"

gpus = runpod.get_gpus()

for gpu in gpus:
    print(gpu)
```

--------------------------------

### Configure GPU Test Timeout

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Example of setting environment variables to customize the automatic GPU fitness check behavior, such as adjusting the timeout.

```python
import os

# Adjust timeout (default: 30 seconds)
os.environ["RUNPOD_GPU_TEST_TIMEOUT"] = "60"

# Override binary path (for custom/patched versions)
os.environ["RUNPOD_BINARY_GPU_TEST_PATH"] = "/custom/path/gpu_test"

# Cap the number of error messages parsed from gpu_test output (default: 10)
os.environ["RUNPOD_GPU_MAX_ERROR_MESSAGES"] = "20"

# Skip auto-registration of this check (primarily for testing)
os.environ["RUNPOD_SKIP_GPU_CHECK"] = "true"
```

--------------------------------

### Run Endpoint Asynchronously

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Submit a run request to an endpoint without waiting for it to finish, then check its status. The output is retrieved separately with `output()`, which blocks until the run completes.

```python
import runpod

runpod.api_key = "your_runpod_api_key"

endpoint = runpod.Endpoint("ENDPOINT_ID")

run_request = endpoint.run(
    {"your_model_input_key": "your_model_input_value"}
)

# Check the status of the endpoint run request
print(run_request.status())

# Get the output of the endpoint run request, blocking until the endpoint run is complete.
print(run_request.output())
```

--------------------------------

### Lazy Loading Dependencies in Python

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Demonstrates lazy importing of heavy libraries like boto3 and fastapi to reduce cold start times. Import the library only when the function that uses it is called.

```python
# Before
import boto3
import fastapi

# After
def use_boto():
    import boto3  # Lazy load only when needed
```
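
The effect of a function-level import can be observed through `sys.modules`. The sketch below uses the standard-library `csv` module as a lightweight stand-in for a heavy dependency; `parse_csv_row` is a hypothetical example function. Note that `loaded_before` may already be `True` if something else in the process imported `csv` earlier.

```python
import sys


def parse_csv_row(line):
    import csv  # imported on first call, not at module load
    return next(csv.reader([line]))


loaded_before = "csv" in sys.modules  # depends on what else has run
row = parse_csv_row("a,b,c")
loaded_after = "csv" in sys.modules   # the call above guarantees the import

print(loaded_before, loaded_after, row)
```

After the first call the module is cached in `sys.modules`, so later calls pay only a dictionary lookup.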

--------------------------------

### Minimal Runpod Serverless Worker

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker.md

A basic worker file that imports the runpod library, defines a handler function, and starts the serverless process.

```python
import runpod

def handler(job):
    # Handle the job and return the output
    return {"output": "Job completed successfully"}

runpod.serverless.start({"handler": handler})
```

--------------------------------

### Compile GPU Test Binary in Dockerfile

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Compile the GPU test binary directly within your Dockerfile. This example uses nvcc with specific CUDA compute capabilities and links necessary libraries.

```dockerfile
# Compile gpu_test inside the container
COPY build_tools/gpu_test.c /tmp/
RUN cd /tmp && nvcc -O3 -arch=sm_70,sm_75,sm_80,sm_86 \
    -o /usr/local/bin/gpu_test gpu_test.c -lnvidia-ml -lcudart_static
```

--------------------------------

### Successful GPU Test Binary Compilation Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Example output indicating a successful compilation of the gpu_test binary, including details on CUDA and Ubuntu versions used, and the final binary information.

```text
Compiling gpu_test binary...
CUDA Version: 11.8.0
Ubuntu Version: ubuntu22.04
Output directory: .../runpod/serverless/binaries
Compilation successful
Binary successfully created at: .../runpod/serverless/binaries/gpu_test
Binary info:
/path/to/gpu_test: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), ...
```

--------------------------------

### Serverless Worker with Fitness Checks

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Example of a Python worker script that includes fitness checks for GPU availability and disk space using decorators. These checks run at startup to ensure the environment is ready.

```python
# my_worker.py

import runpod
import torch

# Register fitness checks using the decorator
@runpod.serverless.register_fitness_check
def check_gpu_available():
    """Verify GPU is available."""
    if not torch.cuda.is_available():
        raise RuntimeError("GPU not available")

@runpod.serverless.register_fitness_check
def check_disk_space():
    """Verify sufficient disk space."""
    import shutil
    stat = shutil.disk_usage("/")
    free_gb = stat.free / (1024**3)
    if free_gb < 10:
        raise RuntimeError(f"Insufficient disk space: {free_gb:.2f}GB free")

def handler(job):
    job_input = job["input"]
    # Your handler code here
    return {"output": "success"}

# Fitness checks run before handler initialization (production only)
runpod.serverless.start({"handler": handler})
```

--------------------------------

### Upload Local File to S3 Bucket

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_upload.md

Use this function to upload a local file to an S3-compatible bucket. Ensure boto3 is installed and S3 credentials are configured. Optionally specify a bucket name and prefix.

```python
from runpod.serverless.utils import upload_file_to_bucket

# Define your bucket credentials
bucket_creds = {
    'endpointUrl': 'https://your-bucket-endpoint-url.com',
    'accessId': 'your_key_id',
    'accessSecret': 'your_secret_access_key'
}

# Define the file name and file location
file_name = 'example.txt'
file_location = '/path/to/your/local/file/example.txt'

# Optional: Define a bucket_name and prefix
bucket_name = 'custom-bucket-name'
prefix = 'your-prefix'

# Upload the file and get the presigned URL
presigned_url = upload_file_to_bucket(file_name, file_location, bucket_creds, bucket_name, prefix)

# Print the presigned URL
print(f"Presigned URL: {presigned_url}")
```

--------------------------------

### Basic GPU Test Binary Build

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Navigate to the build tools directory and execute the compile script to build the gpu_test binary. The output is placed in ../runpod/serverless/binaries/gpu_test.

```bash
cd build_tools
./compile_gpu_test.sh
```

--------------------------------

### Runpod CLI Help Overview

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Display the main help message for the Runpod CLI, showing available top-level commands.

```bash
runpod --help
```

--------------------------------

### Comparing Multiple Approaches Workflow

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Compare multiple optimization branches by running the benchmark on each one, then passing the result files to the comparison script.

```bash
# Compare three different optimization branches
./scripts/benchmark_cold_start.sh main > results_main.txt
./scripts/benchmark_cold_start.sh feature/approach-1 > results_1.txt
./scripts/benchmark_cold_start.sh feature/approach-2 > results_2.txt

# Then compare a branch against the baseline (repeat for each branch)
uv run python scripts/compare_benchmarks.py \
  benchmark_results/cold_start_main_*.json \
  benchmark_results/cold_start_approach-1_*.json
```

--------------------------------

### Run Tests with uv or pip

Source: https://github.com/runpod/runpod-python/blob/main/CONTRIBUTING.md

Execute tests to ensure changes do not break existing functionality. uv is recommended for speed.

```bash
# Using uv (recommended - faster)
uv sync --group test
uv run pytest

# Or using pip
pip install '.[test]'
pytest
```

--------------------------------

### Get Specific GPU

Source: https://github.com/runpod/runpod-python/blob/main/docs/api/queries.md

Fetches details for a specific GPU by its ID. Requires the API key to be set.

```python
import runpod

runpod.api_key = "your_runpod_api_key"

gpu_id = "NVIDIA A100 80GB PCIe"
gpu = runpod.get_gpu(gpu_id)

print(gpu)
```

--------------------------------

### Build GPU Test Binary with Custom Ubuntu Version

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Specify a UBUNTU_VERSION environment variable to use a different Ubuntu base image for the build. The default is Ubuntu 22.04.

```bash
cd build_tools
UBUNTU_VERSION=ubuntu20.04 ./compile_gpu_test.sh
```

```bash
cd build_tools
UBUNTU_VERSION=ubuntu22.04 ./compile_gpu_test.sh
```

--------------------------------

### Test Input JSON File

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/local_testing.md

Create this JSON file in your project's root directory to provide input for local testing.

```json
{
  "input": {
    "your_model_input_key": "your_model_input_value"
  }
}
```

--------------------------------

### Check CUDA Driver Version

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Use nvidia-smi to check the installed CUDA driver version on your system. This helps diagnose 'version mismatch' errors.

```bash
# Check CUDA driver version
nvidia-smi
```

--------------------------------

### Register Fitness Checks in Order

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Fitness checks are executed in the order they are registered using the `@runpod.serverless.register_fitness_check` decorator. This example shows two checks that will run sequentially.

```python
import runpod

@runpod.serverless.register_fitness_check
def check_first():
    print("This runs first")

@runpod.serverless.register_fitness_check
def check_second():
    print("This runs second")
```

--------------------------------

### Runpod Pod Management

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Manage your Runpod pods. Use 'list' to see available pods and 'create' to launch a new one.

```bash
runpod pod list
```

```bash
runpod pod create
```

--------------------------------

### Register Asynchronous Fitness Checks

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Register asynchronous functions as fitness checks using the `@runpod.serverless.register_fitness_check` decorator. This example demonstrates checking external API connectivity.

```python
import runpod
import aiohttp

@runpod.serverless.register_fitness_check
async def check_api_connectivity():
    """Check if external API is accessible."""
    async with aiohttp.ClientSession() as session:
        try:
            async with session.get("https://api.example.com/health", timeout=5) as resp:
                if resp.status != 200:
                    raise RuntimeError(f"API health check failed: {resp.status}")
        except Exception as e:
            raise RuntimeError(f"Cannot connect to API: {e}")

def handler(job):
    return {"output": "success"}

if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})
```

--------------------------------

### Navigate to Local Repository

Source: https://github.com/runpod/runpod-python/blob/main/CONTRIBUTING.md

Change the current directory to your local runpod-python repository.

```bash
cd runpod-python
```

--------------------------------

### Heartbeat API Endpoint

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Specifies the GET endpoint for sending heartbeats to the RunPod platform. Allows reporting multiple job IDs and indicating if a retry is needed.

```http
GET {RUNPOD_WEBHOOK_PING}?job_id={comma_separated_ids}&retry_ping={0|1}
```
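
A request URL in this shape can be assembled with the standard library. This is a sketch under assumptions: `build_ping_url` is a hypothetical helper, the base URL is a placeholder for whatever `RUNPOD_WEBHOOK_PING` holds, and keeping the commas unescaped is a guess at what the platform expects.

```python
from urllib.parse import urlencode


def build_ping_url(base_url, job_ids, retry_ping=False):
    """Build the heartbeat URL with comma-separated job IDs."""
    query = urlencode(
        {"job_id": ",".join(job_ids), "retry_ping": int(retry_ping)},
        safe=",",  # keep commas literal instead of %2C
    )
    return f"{base_url}?{query}"


print(build_ping_url("https://example.invalid/ping", ["job-a", "job-b"]))
```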

--------------------------------

### Runpod Pod Connection

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Connect to a running Runpod pod.

```bash
runpod pod connect
```

--------------------------------

### Configure Built-in Checks in Python

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Shows how to set environment variables in Python to configure the thresholds for built-in system checks.

```python
import os

os.environ["RUNPOD_MIN_MEMORY_GB"] = "8.0"
os.environ["RUNPOD_MIN_DISK_PERCENT"] = "15.0"
```
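
Environment variables are always strings, which is why the thresholds above are written as quoted values. A minimal sketch of reading such a value back as a number; `read_float_env` and the fallback values are hypothetical placeholders and are not the library's defaults.

```python
import os


def read_float_env(name, fallback):
    """Read an environment variable as a float; use fallback if unset or invalid."""
    raw = os.environ.get(name)
    if raw is None:
        return fallback
    try:
        return float(raw)
    except ValueError:
        return fallback


os.environ["RUNPOD_MIN_MEMORY_GB"] = "8.0"
min_memory_gb = read_float_env("RUNPOD_MIN_MEMORY_GB", fallback=4.0)
print(min_memory_gb)
```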

--------------------------------

### API Health Check with Retries

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Implement retry logic for fitness checks that might encounter transient network issues. This example uses aiohttp and asyncio for asynchronous retries.

```python
import runpod
import aiohttp
import asyncio

@runpod.serverless.register_fitness_check
async def check_api_with_retry():
    """Check API connectivity with retries."""
    max_retries = 3
    for attempt in range(max_retries):
        try:
            async with aiohttp.ClientSession() as session:
                async with session.get("https://api.example.com/health", timeout=5) as resp:
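                    if resp.status == 200:
                        return
                    # Treat a non-200 response as a failed attempt so it is retried
                    raise RuntimeError(f"API returned {resp.status}")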
        except Exception as e:
            if attempt == max_retries - 1:
                raise RuntimeError(f"API check failed after {max_retries} attempts: {e}")
            await asyncio.sleep(1)  # Wait before retry
```
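
The retry pattern can also be factored into a small helper and exercised without any network access. In this sketch `retry_async` and `flaky` are hypothetical names; `flaky` stands in for a network call that fails twice before succeeding.

```python
import asyncio


async def retry_async(operation, max_retries=3, delay=0.0):
    """Await operation() until it succeeds or max_retries attempts are used."""
    last_error = None
    for attempt in range(max_retries):
        try:
            return await operation()
        except Exception as error:
            last_error = error
            if attempt < max_retries - 1:
                await asyncio.sleep(delay)  # wait before the next attempt
    raise RuntimeError(f"Failed after {max_retries} attempts: {last_error}")


attempts = []


async def flaky():
    """Fail twice, then succeed (stand-in for a network call)."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = asyncio.run(retry_async(flaky))
print(result, len(attempts))
```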

--------------------------------

### Job Acquisition API Endpoint

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Defines the GET endpoints for acquiring jobs from the RunPod platform. Supports fetching a single job or a batch, and indicates if jobs are currently in progress.

```http
GET {RUNPOD_WEBHOOK_GET_JOB}?job_in_progress={0|1}
```

```http
GET {RUNPOD_WEBHOOK_GET_JOB}/batch?batch_size={N}&job_in_progress={0|1}
```
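
Both URL forms can be produced by one helper. A sketch under assumptions: `build_get_job_url` is a hypothetical function, and the base URL is a placeholder for the value of `RUNPOD_WEBHOOK_GET_JOB`.

```python
from urllib.parse import urlencode


def build_get_job_url(base_url, job_in_progress, batch_size=None):
    """Return the single-job URL, or the /batch URL when batch_size is set."""
    params = {}
    path = base_url
    if batch_size is not None:
        path = f"{base_url}/batch"
        params["batch_size"] = batch_size
    params["job_in_progress"] = int(job_in_progress)
    return f"{path}?{urlencode(params)}"


print(build_get_job_url("https://example.invalid/job", job_in_progress=False))
print(build_get_job_url("https://example.invalid/job", job_in_progress=True, batch_size=5))
```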

--------------------------------

### Configure Runpod API Key

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/start_here.md

Store your Runpod API key. Optionally, provide the key directly or use the --profile flag to manage multiple keys.

```bash
runpod config
```

```bash
runpod config YOUR_API_KEY
```

```bash
runpod config --profile my-profile
```

--------------------------------

### Verify GPU Binary Properties

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Check the file type, size, and executability of the compiled GPU binary.

```bash
# Check binary info
file runpod/serverless/binaries/gpu_test
```

```bash
# Check binary size
ls -lh runpod/serverless/binaries/gpu_test
```

```bash
# Verify executable
test -x runpod/serverless/binaries/gpu_test && echo "Binary is executable"
```

--------------------------------

### Enable Worker Refresh on Start

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker.md

Configure the worker to automatically refresh after every job completion, even if the handler encounters an error. This is useful for complex operations requiring a clean state.

```python
from runpod.serverless import start

def handler(job):
    # Handle the job and return the output
    return {"output": "Job completed successfully"}

start({"handler": handler, "refresh_worker": True})
```

--------------------------------

### Runpod CLI Configuration

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Configure your Runpod API key for authentication. You will be prompted for your API key and profile name.

```bash
runpod config
```

--------------------------------

### Verify Binary Integrity

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Use the 'file' command to check the integrity and architecture of the compiled binary. This helps diagnose 'cannot execute binary' errors.

```bash
# Verify binary integrity
file runpod/serverless/binaries/gpu_test
```

--------------------------------

### Get Combined Debugger Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_debugger.md

Call `get_debugger_output` to retrieve a dictionary containing system information and all recorded checkpoint timings. This function clears the debugger's internal state after retrieval.

```python
from runpod.serverless.utils.rp_debugger import get_debugger_output

output = get_debugger_output()
```

--------------------------------

### Testing a Performance Optimization Workflow

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

This bash script outlines the steps to test a performance optimization by saving a baseline, switching to a feature branch, running the benchmark, and comparing results.

```bash
# 1. Save baseline on main branch
git checkout main
./scripts/benchmark_cold_start.sh
cp benchmark_results/cold_start_latest.json benchmark_results/cold_start_baseline.json

# 2. Switch to feature branch
git checkout feature/my-optimization

# 3. Run benchmark and compare
./scripts/benchmark_cold_start.sh
uv run python scripts/compare_benchmarks.py \
  benchmark_results/cold_start_baseline.json \
  benchmark_results/cold_start_latest.json
```

--------------------------------

### Copy Pre-compiled Binary to Dockerfile

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Include a pre-compiled GPU test binary in your Docker image by copying it to a location like /usr/local/bin.

```dockerfile
# Copy pre-compiled binary from runpod-python
COPY runpod/serverless/binaries/gpu_test /usr/local/bin/
```

--------------------------------

### Upload In-Memory Object to S3 Bucket

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_upload.md

Use this function to upload data directly from memory (as bytes) to an S3-compatible bucket. Ensure boto3 is installed and S3 credentials are configured. The bucket name must be included in the bucket_creds dictionary for this function.

```python
from runpod.serverless.utils import upload_in_memory_object

# Define your bucket credentials
bucket_creds = {
    'endpointUrl': 'https://your-bucket-endpoint-url.com',
    'accessId': 'your_key_id',
    'accessSecret': 'your_secret_access_key',
    'bucketName': 'your_bucket_name'
}

# Define the file name and file data (bytes)
file_name = 'example.txt'
file_data = b'This is an example text.'

# Optional: Define a bucket_name and prefix
bucket_name = 'custom-bucket-name'
prefix = 'your-prefix'

# Upload the in-memory object and get the presigned URL
presigned_url = upload_in_memory_object(file_name, file_data, bucket_creds, bucket_name, prefix)

# Print the presigned URL
print(f"Presigned URL: {presigned_url}")
```

--------------------------------

### Runpod Launch Pod with Template

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Launch a new Runpod pod using a YAML template file. This command is part of the 'launch pod' subcommand.

```bash
runpod launch pod --template-file template.yaml
```

--------------------------------

### Heartbeat Process Architecture

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

This diagram shows the heartbeat mechanism where the main worker process forks a separate process for pinging. The heartbeat process periodically loads job state from disk and sends an HTTP GET request to the RunPod API.

```mermaid
graph TB
    MAIN[Main Worker Process] -->|fork| PING[Heartbeat Process]
    PING -->|every 10s| LOAD[Load Job State from Disk]
    LOAD --> GET[HTTP GET Ping Endpoint]
    GET -->|job_ids| API[RunPod API]

    style PING fill:#f57c00,stroke:#e65100,stroke-width:3px,color:#fff
    style LOAD fill:#d32f2f,stroke:#b71c1c,stroke-width:3px,color:#fff
```

--------------------------------

### JobsProgress Class Methods

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Lists the key methods available in the `JobsProgress` singleton class for managing active jobs, including adding, removing, and retrieving job information.

```python
add(job)           # Add job to set, persist to disk
remove(job)        # Remove job from set, persist to disk
get_job_list()     # Return comma-separated job IDs
get_job_count()    # Return number of active jobs
_save_state()      # Serialize set to pickle file with lock
_load_state()      # Deserialize set from pickle file with lock
```
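The method list above can be pictured with a small stand-in class. This is a minimal sketch rather than the library's implementation: the `JobTracker` name, the explicit state-file path, the use of `threading.Lock`, and plain string job IDs are all assumptions made for illustration.

```python
import os
import pickle
import tempfile
import threading


class JobTracker:
    """Hypothetical stand-in for a job set that persists itself to a pickle file."""

    def __init__(self, state_path):
        self._state_path = state_path
        self._lock = threading.Lock()
        self._jobs = set()
        self._load_state()

    def _save_state(self):
        # Serialize the whole set while holding the lock.
        with self._lock, open(self._state_path, "wb") as fh:
            pickle.dump(self._jobs, fh)

    def _load_state(self):
        # Restore the set if a state file already exists.
        with self._lock:
            if os.path.exists(self._state_path):
                with open(self._state_path, "rb") as fh:
                    self._jobs = pickle.load(fh)

    def add(self, job_id):
        self._jobs.add(job_id)
        self._save_state()

    def remove(self, job_id):
        self._jobs.discard(job_id)
        self._save_state()

    def get_job_list(self):
        # Comma-separated IDs, sorted so the output is deterministic.
        return ",".join(sorted(self._jobs))

    def get_job_count(self):
        return len(self._jobs)


state_file = os.path.join(tempfile.mkdtemp(), "jobs.pkl")
tracker = JobTracker(state_file)
tracker.add("job-1")
tracker.add("job-2")
tracker.remove("job-1")

# A second instance reading the same file sees the persisted state.
reloaded = JobTracker(state_file)
print(reloaded.get_job_list(), reloaded.get_job_count())
```

Persisting on every `add` and `remove` is what would let a separate process read the current job set from disk without sharing memory with the worker.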

--------------------------------

### CUDA Initialization Failure Scenario

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Illustrates an error message when CUDA device initialization fails, often due to the device being busy or unavailable.

```log
ERROR  | Fitness check failed: _cuda_init_check | RuntimeError: Failed to initialize GPU 0: CUDA error: CUDA-capable device(s) is/are busy or unavailable
```

--------------------------------

### Recompile GPU Test Binary

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

If the binary is corrupted or for the wrong architecture, try recompiling it. Navigate to the build_tools directory before running the compile script.

```bash
# Try recompiling
cd build_tools && ./compile_gpu_test.sh
```

--------------------------------

### Configure Built-in Checks in Dockerfile

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Demonstrates how to set environment variables in a Dockerfile to configure the thresholds for various built-in system checks.

```dockerfile
# In your Dockerfile or container config
ENV RUNPOD_MIN_MEMORY_GB=8.0
ENV RUNPOD_MIN_DISK_PERCENT=15.0
ENV RUNPOD_MIN_CUDA_VERSION=12.0
ENV RUNPOD_NETWORK_CHECK_TIMEOUT=10
ENV RUNPOD_GPU_BENCHMARK_TIMEOUT=2
```

--------------------------------

### Compare Two Benchmark Result Files

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Analyze and visualize differences between two benchmark runs using colored terminal output. Takes baseline and optimized JSON files as input.

```bash
uv run python scripts/compare_benchmarks.py <baseline.json> <optimized.json>
```

--------------------------------

### Run GPU Test Binary

Source: https://github.com/runpod/runpod-python/blob/main/runpod/serverless/binaries/README.md

Execute the pre-compiled GPU health check binary to test CUDA GPU memory allocation and report system information. Ensure the binary is in the correct path.

```bash
./runpod/serverless/binaries/gpu_test
```

--------------------------------

### Integrate All RP Debugger Utilities

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_debugger.md

Combine `Checkpoints`, `LineTimer`, `FunctionTimer`, and `get_debugger_output` to comprehensively profile serverless function execution. Timings are aggregated and cleared upon calling `get_debugger_output`.

```python
from rp_debugger import Checkpoints, LineTimer, FunctionTimer, get_debugger_output

checkpoints = Checkpoints()

checkpoints.add('checkpoint_name')
checkpoints.start('checkpoint_name')

with LineTimer('my_block_of_code'):
    # Your code here
    pass

checkpoints.stop('checkpoint_name')

@FunctionTimer
def my_function():
    # Your code here
    pass

my_function()

output = get_debugger_output()

print(output)
```

--------------------------------

### Heartbeat Key Methods

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

These are the key methods involved in the heartbeat functionality. `start_ping` initiates the process, `ping_loop` handles the periodic sending of pings, and `_send_ping` performs the actual request after loading job state.

```python
start_ping()           # Fork process and start ping loop
ping_loop()            # Infinite loop sending pings every PING_INTERVAL
_send_ping()           # Load job state, construct URL, HTTP GET
```
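The three steps above (load job state, construct the URL, send the request on a fixed interval) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: `load_job_ids`, `build_ping_url`, the `job_id` query parameter name, the pickle state file, and the injectable `send` callable are all hypothetical.

```python
import os
import pickle
import tempfile
import time
from urllib.parse import urlencode

PING_INTERVAL = 10  # seconds; an assumed value for this sketch


def load_job_ids(state_path):
    """Read the persisted job set from disk; a missing file means no active jobs."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path, "rb") as fh:
        return pickle.load(fh)


def build_ping_url(base_url, job_ids):
    """Attach the active job IDs to the ping URL as a query parameter."""
    if not job_ids:
        return base_url
    return f"{base_url}?{urlencode({'job_id': ','.join(sorted(job_ids))})}"


def ping_loop(state_path, base_url, send, interval=PING_INTERVAL, max_pings=None):
    """Send one ping per interval; `max_pings` bounds the loop for demos and tests."""
    sent = 0
    while max_pings is None or sent < max_pings:
        send(build_ping_url(base_url, load_job_ids(state_path)))
        sent += 1
        time.sleep(interval)


# Demo: persist two job IDs, then "send" two pings by recording the URLs.
state_file = os.path.join(tempfile.mkdtemp(), "jobs.pkl")
with open(state_file, "wb") as fh:
    pickle.dump({"job-2", "job-1"}, fh)

sent_urls = []
ping_loop(state_file, "https://example.invalid/ping", sent_urls.append,
          interval=0, max_pings=2)
print(sent_urls[0])
```

In a real worker the loop would run in its own process, for example through `multiprocessing.Process(target=ping_loop, ...)`, with `send` performing the HTTP GET.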

--------------------------------

### Async HTTP Client Session Configuration

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Configures an aiohttp.ClientSession with unlimited connections for efficient HTTP communication. This is useful for high-throughput scenarios.

```python
import aiohttp


class AsyncClientSession(aiohttp.ClientSession):
    def __init__(self):
        connector = aiohttp.TCPConnector(limit=0)  # Unlimited connections
        super().__init__(connector=connector)
```

--------------------------------

### Runpod Launch Endpoint with Template

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Launch a new Runpod endpoint using a YAML template file. This command is part of the 'launch endpoint' subcommand.

```bash
runpod launch endpoint --template-file template.yaml
```

--------------------------------

### Run Compiled GPU Test Binary

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Execute the compiled binary on a GPU machine to test its functionality and verify expected output.

```bash
# Run the compiled binary
./runpod/serverless/binaries/gpu_test

# Expected output:
# Linux Kernel Version: 5.15.0
# CUDA Driver Version: 12.2
# Found X GPUs:
# GPU 0: [GPU Name] (UUID: ...)
# GPU 0 memory allocation test passed.
# ...
```

--------------------------------

### Run Benchmark Comparison Script

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Compare two benchmark result files using the provided Python script. Requires baseline and optimized JSON files as arguments.

```bash
uv run python scripts/compare_benchmarks.py benchmark_results/cold_start_baseline.json benchmark_results/cold_start_latest.json
```

--------------------------------

### get_gpus

Source: https://github.com/runpod/runpod-python/blob/main/docs/api/queries.md

Retrieves a list of all available GPUs and their specifications.

````APIDOC
## get_gpus

### Description
Retrieves a list of all available GPUs and their specifications.

### Method
```python
runpod.get_gpus()
```

### Parameters
None

### Response
Returns a list of dictionaries, where each dictionary represents a GPU with its details.

#### Response Example
```json
[
  {"id": "NVIDIA A100 80GB PCIe", "displayName": "A100 80GB", "memoryInGb": 80},
  {"id": "NVIDIA A100-SXM4-80GB", "displayName": "A100 SXM 80GB", "memoryInGb": 80},
  {"id": "NVIDIA A30", "displayName": "A30", "memoryInGb": 24}
]
```
````
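
Because the response is a plain list of dictionaries, it can be filtered with ordinary Python. The sketch below uses the sample entries from the response example instead of a live call, and `gpus_with_min_memory` is a hypothetical helper, not part of the library.

```python
# Sample data copied from the response example; a real call would return the live list.
gpus = [
    {"id": "NVIDIA A100 80GB PCIe", "displayName": "A100 80GB", "memoryInGb": 80},
    {"id": "NVIDIA A100-SXM4-80GB", "displayName": "A100 SXM 80GB", "memoryInGb": 80},
    {"id": "NVIDIA A30", "displayName": "A30", "memoryInGb": 24},
]


def gpus_with_min_memory(gpu_list, min_gb):
    """Hypothetical helper: keep the IDs of GPUs with at least `min_gb` of memory."""
    return [gpu["id"] for gpu in gpu_list if gpu["memoryInGb"] >= min_gb]


large = gpus_with_min_memory(gpus, 48)
print(large)  # ['NVIDIA A100 80GB PCIe', 'NVIDIA A100-SXM4-80GB']
```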

--------------------------------

### Set Global API Key (Default)

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Demonstrates setting the global API key, which is used by default for all endpoint operations.

```python
import runpod

# Set global API key
runpod.api_key = "your_runpod_api_key"

# All endpoints will use this key by default
endpoint = runpod.Endpoint("ENDPOINT_ID")
result = endpoint.run_sync({"input": "data"})
```

--------------------------------

### Clone runpod-python Repository

Source: https://github.com/runpod/runpod-python/blob/main/CONTRIBUTING.md

Clone the forked runpod-python repository to your local machine.

```bash
git clone https://github.com/<your-username>/runpod-python.git
```

--------------------------------

### Fast GPU Availability Check

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Implement quick checks for essential resources like GPU availability. Avoid time-consuming operations such as model training within fitness checks.

```python
# Good: Quick checks
@runpod.serverless.register_fitness_check
def check_gpu():
    import torch
    if not torch.cuda.is_available():
        raise RuntimeError("GPU not available")

# Avoid: Time-consuming operations
@runpod.serverless.register_fitness_check
def slow_check():
    import torch
    # Don't: Train a model or process large data
    model.train()  # This is too slow!
```

--------------------------------

### RunPod Serverless Configuration Schema

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Defines the structure for configuring the RunPod serverless client, including handler, arguments, and concurrency settings. Ensure 'handler' is a Callable and other arguments match the specified types.

```python
config = {
    "handler": Callable,                    # User-defined handler function
    "rp_args": {
        "rp_log_level": str,                # ERROR, WARN, INFO, DEBUG
        "rp_debugger": bool,                # Enable debugger output
        "rp_serve_api": bool,               # Start local FastAPI server
        "rp_api_port": int,                 # FastAPI port (default: 8000)
        "rp_api_host": str,                 # FastAPI host (default: localhost)
        "rp_api_concurrency": int,          # FastAPI workers (default: 1)
        "test_input": dict,                 # Local test job input
    },
    "concurrency_modifier": Callable,       # Dynamic concurrency adjustment
    "refresh_worker": bool,                 # Kill worker after job completion
    "return_aggregate_stream": bool,        # Aggregate streaming outputs
    "reference_counter_start": float,       # Performance benchmarking timestamp
}
```
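
To make the type annotations above concrete, here is a minimal sketch of a pre-flight check that compares a config dictionary against them. `validate_config` and `RP_ARG_TYPES` are hypothetical names introduced for illustration; the library may validate its configuration differently or not at all.

```python
RP_ARG_TYPES = {
    "rp_log_level": str,
    "rp_debugger": bool,
    "rp_serve_api": bool,
    "rp_api_port": int,
    "rp_api_host": str,
    "rp_api_concurrency": int,
    "test_input": dict,
}


def validate_config(config):
    """Hypothetical check: return a list of problems (empty when the config fits)."""
    problems = []
    if not callable(config.get("handler")):
        problems.append("handler must be callable")
    modifier = config.get("concurrency_modifier")
    if modifier is not None and not callable(modifier):
        problems.append("concurrency_modifier must be callable")
    for key in ("refresh_worker", "return_aggregate_stream"):
        if key in config and not isinstance(config[key], bool):
            problems.append(f"{key} must be bool")
    for key, value in config.get("rp_args", {}).items():
        expected = RP_ARG_TYPES.get(key)
        if expected is None:
            continue  # Keys outside the schema are ignored in this sketch.
        # bool is a subclass of int, so reject it explicitly for int fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            problems.append(f"rp_args.{key} must be {expected.__name__}")
    return problems


good = {"handler": lambda job: job["input"], "rp_args": {"rp_api_port": 8000}}
bad = {
    "handler": "not a function",
    "refresh_worker": "yes",
    "rp_args": {"rp_api_port": "8000"},
}

print(validate_config(good))  # []
print(validate_config(bad))
```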

--------------------------------

### Generator User Handler

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Shows how to implement a generator-based handler using `yield`. This is suitable for streaming results incrementally as they become available, rather than waiting for the entire computation to complete.

```python
from typing import Generator


def handler(job: dict) -> Generator[dict, None, None]:
    # Each yield sends one partial result back to the caller
    for i in range(10):
        yield {"partial": i}
```
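
A generator handler produces nothing until it is iterated, so whatever consumes it decides whether to forward each chunk immediately or collect them all. The sketch below shows both styles in plain Python; `collect_stream` is a hypothetical consumer and the shortened three-item handler is only for the demo.

```python
from typing import Generator


def handler(job: dict) -> Generator[dict, None, None]:
    for i in range(3):
        yield {"partial": i}


def collect_stream(job: dict) -> list:
    """Hypothetical consumer: gather every yielded chunk into one list."""
    return [chunk for chunk in handler(job)]


# Incremental consumption: pull one chunk at a time.
stream = handler({"input": {}})
first = next(stream)
print(first)  # {'partial': 0}

# Aggregated consumption: wait for everything, then return a single list.
chunks = collect_stream({"input": {}})
print(chunks)  # [{'partial': 0}, {'partial': 1}, {'partial': 2}]
```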

--------------------------------

### Check Environment Variables

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Ensures that essential environment variables (API_KEY, MODEL_PATH, CONFIG_URL) are set. Raises a RuntimeError listing any missing variables.

```python
import runpod
import os

@runpod.serverless.register_fitness_check
def check_environment():
    """Verify required environment variables are set."""
    required_vars = ["API_KEY", "MODEL_PATH", "CONFIG_URL"]
    missing = [var for var in required_vars if not os.environ.get(var)]

    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

--------------------------------

### Test Input Argument

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/local_testing.md

Pass the input as a command-line argument to your Python handler file for local testing.

```bash
python your_handler.py --test_input '{"input": {"your_model_input_key": "your_model_input_value"}}'
```
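
For intuition about what the flag carries, the sketch below parses a `--test_input` argument into a Python dictionary with `argparse` and `json`. It is an illustration only: `parse_test_input` is a hypothetical function and the library's own argument handling may differ.

```python
import argparse
import json


def parse_test_input(argv):
    """Hypothetical parser: read --test_input and decode it as JSON."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--test_input", type=json.loads, default=None)
    # Ignore any other flags so the sketch tolerates extra arguments.
    args, _unknown = parser.parse_known_args(argv)
    return args.test_input


job = parse_test_input(
    ["--test_input", '{"input": {"your_model_input_key": "your_model_input_value"}}']
)
print(job["input"])  # {'your_model_input_key': 'your_model_input_value'}
```

Single quotes around the JSON on the command line keep the shell from interpreting the inner double quotes.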

--------------------------------

### Create a New Branch

Source: https://github.com/runpod/runpod-python/blob/main/CONTRIBUTING.md

Create a new branch for your code edits.

```bash
git checkout -b name-of-your-branch
```

--------------------------------

### Run Job Acquisition and Processing Tasks

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

This snippet shows how to concurrently run tasks for fetching jobs and processing them using asyncio. It utilizes an AsyncClientSession for API interactions and asyncio.gather to await both tasks.

```python
import asyncio


async def run():
    async with AsyncClientSession() as session:
        jobtake_task = asyncio.create_task(get_jobs(session))
        jobrun_task = asyncio.create_task(run_jobs(session))
        await asyncio.gather(jobtake_task, jobrun_task)
```
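
The same two-task pattern can be tried without any network access by connecting the tasks through an `asyncio.Queue`. This is a self-contained sketch: the queue-based signatures of `get_jobs` and `run_jobs`, the `None` sentinel, and the string job IDs are assumptions and do not reflect the library's internals.

```python
import asyncio


async def get_jobs(queue, job_ids):
    """Producer: enqueue each job, then a None sentinel to signal completion."""
    for job_id in job_ids:
        await queue.put(job_id)
    await queue.put(None)


async def run_jobs(queue, results):
    """Consumer: process jobs until the sentinel arrives."""
    while True:
        job_id = await queue.get()
        if job_id is None:
            break
        results.append(f"done:{job_id}")


async def run(job_ids):
    # A bounded queue makes the producer wait when the consumer falls behind.
    queue = asyncio.Queue(maxsize=2)
    results = []
    jobtake_task = asyncio.create_task(get_jobs(queue, job_ids))
    jobrun_task = asyncio.create_task(run_jobs(queue, results))
    await asyncio.gather(jobtake_task, jobrun_task)
    return results


results = asyncio.run(run(["a", "b", "c"]))
print(results)  # ['done:a', 'done:b', 'done:c']
```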

--------------------------------

### Check Async Model Loading

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Asynchronously verifies if models can be loaded from a checkpoint file. Raises a RuntimeError if the loading process fails.

```python
import runpod
import aiofiles.os

@runpod.serverless.register_fitness_check
async def check_models_loadable():
    """Verify models can be loaded (async)."""
    import torch

    try:
        # Test load model
        model = torch.load("/models/checkpoint.pt")
        del model  # Free memory
    except Exception as e:
        # Chain the original exception so the traceback keeps the root cause
        raise RuntimeError(f"Failed to load model: {e}") from e
```
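
Note that a synchronous call such as `torch.load` still occupies the event loop while it runs, even inside an `async def`. One option, sketched here with the standard library, is to hand the blocking call to a worker thread with `asyncio.to_thread` (Python 3.9+). `blocking_load` is a hypothetical stand-in for the real load and `check_loadable` is not a library function.

```python
import asyncio
import time


def blocking_load(path):
    """Stand-in for a slow, synchronous load such as reading a checkpoint."""
    time.sleep(0.05)
    return {"path": path}


async def check_loadable(path):
    """Run the blocking load in a thread so other coroutines can keep running."""
    try:
        model = await asyncio.to_thread(blocking_load, path)
    except Exception as exc:
        raise RuntimeError(f"Failed to load model: {exc}") from exc
    return model is not None


ok = asyncio.run(check_loadable("/models/checkpoint.pt"))
print(ok)  # True
```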

--------------------------------

### Grouped GPU and Model Fitness Checks

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Organize fitness checks into logical groups, such as all GPU-related checks together and all model-related checks together, for better maintainability.

```python
# GPU checks
@runpod.serverless.register_fitness_check
def check_gpu_available():
    ...  # Placeholder body; a comment alone is not valid Python

@runpod.serverless.register_fitness_check
def check_gpu_memory():
    ...

# Model checks
@runpod.serverless.register_fitness_check
def check_model_files():
    ...

@runpod.serverless.register_fitness_check
def check_model_loadable():
    ...
```

--------------------------------

### Check Git Status and Restore

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Use these commands to check the current git status and manually restore the repository if a script failed mid-execution. This is useful for troubleshooting benchmark script failures.

```bash
git status
```

```bash
git checkout <original-branch>
git stash pop
```

--------------------------------

### Check Disk Space

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Verifies that the root partition has sufficient free disk space. Raises a RuntimeError if free space is below the required amount (50 GB in this example; adjust `required_gb` to suit your workload).

```python
import runpod
import shutil

@runpod.serverless.register_fitness_check
def check_disk_space():
    """Verify sufficient disk space for operations."""
    stat = shutil.disk_usage("/")
    free_gb = stat.free / (1024**3)
    required_gb = 50  # Adjust based on your needs

    if free_gb < required_gb:
        raise RuntimeError(
            f"Insufficient disk space: {free_gb:.2f}GB free, "
            f"need at least {required_gb}GB"
        )
```

--------------------------------

### Dynamically Scale Concurrency with Queue Draining

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Rescales concurrency by replacing the job queue with a new `asyncio.Queue` whose `maxsize` is the new concurrency value. Before resizing, it waits for the current queue to drain, polling once per second, so the resize is delayed for as long as occupancy stays above zero.

```python
async def set_scale(self):
    # Ask the user-supplied modifier for the new concurrency level
    new_concurrency = self.concurrency_modifier(self.current_concurrency)

    # Wait for queue to drain before resizing
    while self.current_occupancy() > 0:
        await asyncio.sleep(1)

    self.jobs_queue = asyncio.Queue(maxsize=new_concurrency)
```
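
The drain-then-resize step can be exercised in isolation. The sketch below is a simplified model, not the library's scaler: the `Scaler` class, the `poll_seconds` parameter, and measuring occupancy with `qsize()` alone are assumptions for illustration.

```python
import asyncio


class Scaler:
    """Hypothetical holder for a bounded job queue that can be resized."""

    def __init__(self, concurrency):
        self.current_concurrency = concurrency
        self.jobs_queue = asyncio.Queue(maxsize=concurrency)

    def current_occupancy(self):
        return self.jobs_queue.qsize()

    async def set_scale(self, new_concurrency, poll_seconds=0.01):
        # Replacing a non-empty queue would drop its items, so wait until it is empty.
        while self.current_occupancy() > 0:
            await asyncio.sleep(poll_seconds)
        self.jobs_queue = asyncio.Queue(maxsize=new_concurrency)
        self.current_concurrency = new_concurrency


async def demo():
    scaler = Scaler(concurrency=1)
    await scaler.jobs_queue.put("job-1")

    async def drain():
        # Simulate a worker taking the queued job a little later.
        await asyncio.sleep(0.05)
        scaler.jobs_queue.get_nowait()

    await asyncio.gather(scaler.set_scale(4), drain())
    return scaler.jobs_queue.maxsize


new_size = asyncio.run(demo())
print(new_size)  # 4
```

An `asyncio.Queue` cannot change its `maxsize` after construction, which is why the sketch swaps in a new queue instead of resizing the old one.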

--------------------------------

### Build GPU Test Binary with Custom CUDA Version

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Specify a CUDA_VERSION environment variable to target a specific CUDA version during the build. The default is CUDA 11.8.0.

```bash
cd build_tools
CUDA_VERSION=12.1.0 ./compile_gpu_test.sh
```

```bash
cd build_tools
CUDA_VERSION=11.8.0 ./compile_gpu_test.sh
```

--------------------------------

### Runpod SSH Key Management

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Manage SSH keys for accessing your Runpod instances. Use 'list-keys' to view existing keys and 'add-key' to add a new one.

```bash
runpod ssh list-keys
```

```bash
runpod ssh add-key
```

--------------------------------

### Run Endpoint Synchronously

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Send a run request to an endpoint and block until the result is ready. If the job does not complete within 90 seconds, the call returns the job status instead of the result.

```python
import runpod

endpoint = runpod.Endpoint("ENDPOINT_ID")

run_request = endpoint.run_sync(
    {"your_model_input_key": "your_model_input_value"}
)

# Returns the job results if completed within 90 seconds; otherwise returns the job status.
print(run_request)
```

--------------------------------

### Optional Webhook Environment Variables

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Lists optional environment variables for streaming output and heartbeats. These provide additional control and monitoring capabilities for the worker's operation.

```bash
RUNPOD_WEBHOOK_POST_STREAM="https://api.runpod.io/v2/.../stream/$ID"
RUNPOD_WEBHOOK_PING="https://api.runpod.io/v2/.../ping"
RUNPOD_POD_ID="worker-12345"
RUNPOD_POD_HOSTNAME="worker-12345.runpod.io"
RUNPOD_PING_INTERVAL="10"

```
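
A worker-side reader for these variables might look like the sketch below. It is an assumption-laden illustration: `load_webhook_settings` and `stream_url_for` are hypothetical helpers, the fallback interval of 10 simply mirrors the example value, and substituting a job ID for the `$ID` placeholder is an inference from the URL shape shown above.

```python
import os


def load_webhook_settings(environ=os.environ):
    """Hypothetical reader: collect the optional variables, using None when unset."""
    return {
        "stream_url": environ.get("RUNPOD_WEBHOOK_POST_STREAM"),
        "ping_url": environ.get("RUNPOD_WEBHOOK_PING"),
        "pod_id": environ.get("RUNPOD_POD_ID"),
        # Fallback of 10 mirrors the example value; it is an assumption here.
        "ping_interval": int(environ.get("RUNPOD_PING_INTERVAL", "10")),
    }


def stream_url_for(settings, job_id):
    """Fill the `$ID` placeholder in the stream URL with a concrete job ID."""
    template = settings["stream_url"]
    return None if template is None else template.replace("$ID", job_id)


# A plain dict stands in for the process environment in this demo.
example_env = {
    "RUNPOD_WEBHOOK_POST_STREAM": "https://example.invalid/stream/$ID",
    "RUNPOD_PING_INTERVAL": "10",
}
settings = load_webhook_settings(example_env)
print(stream_url_for(settings, "job-123"))  # https://example.invalid/stream/job-123
```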