### Install Runpod from GitHub (Latest)

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Install the latest development version from the main branch of the GitHub repository using pip or uv.

```bash
pip install git+https://github.com/runpod/runpod-python.git
```

```bash
uv add git+https://github.com/runpod/runpod-python.git
```

--------------------------------

### Install Runpod from PyPI

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Install the stable release of the Runpod Python library using pip or uv.

```bash
pip install runpod
```

```bash
uv add runpod
```

--------------------------------

### Basic Serverless Worker Example

Source: https://github.com/runpod/runpod-python/blob/main/README.md

A simple Python script that defines a handler function to check whether a number is even and then starts the Runpod serverless API. This script should be run as the default container start command.

```python
# my_worker.py

import runpod

def is_even(job):

    job_input = job["input"]
    the_number = job_input["number"]

    if not isinstance(the_number, int):
        return {"error": "Silly human, you need to pass an integer."}

    if the_number % 2 == 0:
        return True

    return False

runpod.serverless.start({"handler": is_even})
```

--------------------------------

### Install Runpod from GitHub (Specific Branch/Tag)

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Install a specific branch or tag/release of the Runpod Python library from GitHub.

```bash
pip install git+https://github.com/runpod/runpod-python.git@branch-name
```

```bash
pip install git+https://github.com/runpod/runpod-python.git@v1.0.0
```

--------------------------------

### Install Test Dependencies

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Install all dependencies needed for testing by running `uv sync` with the `test` group. This avoids import errors during benchmarking.
```bash
uv sync --group test
```

--------------------------------

### Check Docker Installation

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Verify that Docker is installed and accessible on your system before proceeding with the build process.

```bash
docker --version
```

--------------------------------

### Runpod Credentials File Example

Source: https://github.com/runpod/runpod-python/blob/main/docs/getting_Started.md

Example TOML file structure for storing a Runpod API key. Save this file to ~/.runpod/credentials.toml.

```toml
[profile]
api_key = "YOUR_RUNPOD_API_KEY"
```

--------------------------------

### Example CUDA Initialization Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Indicates that CUDA devices were successfully initialized, reporting the number of devices that are ready.

```log
CUDA initialization passed: 2 device(s) initialized successfully
```

--------------------------------

### Install Runpod in Editable Mode

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Install the Runpod Python library in editable mode for development purposes by cloning the repository and installing with pip.

```bash
git clone https://github.com/runpod/runpod-python.git
cd runpod-python
pip install -e .
```

--------------------------------

### Automate Branch Cold Start Benchmarking

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Run the cold start benchmark script on the current branch or a specified branch. It handles git operations and dependency installation.
```bash
./scripts/benchmark_cold_start.sh
```

```bash
./scripts/benchmark_cold_start.sh main
```

```bash
./scripts/benchmark_cold_start.sh main feature/lazy-loading
```

--------------------------------

### Complete Fitness Checks Example

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

This comprehensive example demonstrates how to define various fitness checks using the `@runpod.serverless.register_fitness_check` decorator. It includes checks for GPU availability and memory, model file existence, disk space, environment variables, and API reachability.

```python
import runpod
import os
import torch
import shutil
from pathlib import Path
import aiohttp

# GPU checks
@runpod.serverless.register_fitness_check
def check_gpu():
    """Verify GPU is available."""
    if not torch.cuda.is_available():
        raise RuntimeError("GPU not available")

@runpod.serverless.register_fitness_check
def check_gpu_memory():
    """Verify GPU has sufficient memory."""
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / (1024**3)
    if gpu_memory < 8:
        raise RuntimeError(f"GPU memory too low: {gpu_memory:.1f}GB (need 8GB)")

# File checks
@runpod.serverless.register_fitness_check
def check_models_exist():
    """Verify model files exist."""
    model_path = Path("/models/model.safetensors")
    if not model_path.exists():
        raise RuntimeError(f"Model not found: {model_path}")

# Resource checks
@runpod.serverless.register_fitness_check
def check_disk_space():
    """Verify sufficient disk space."""
    stat = shutil.disk_usage("/")
    free_gb = stat.free / (1024**3)
    if free_gb < 50:
        raise RuntimeError(f"Insufficient disk space: {free_gb:.1f}GB free")

# Environment checks
@runpod.serverless.register_fitness_check
def check_environment():
    """Verify environment variables."""
    required = ["API_KEY", "MODEL_ID"]
    missing = [v for v in required if not os.environ.get(v)]
    if missing:
        raise RuntimeError(f"Missing env vars: {', '.join(missing)}")

# Async API check
@runpod.serverless.register_fitness_check
async def check_api():
    """Verify API is reachable."""
    try:
        async with aiohttp.ClientSession() as session:
            async with session.get("https://api.example.com/health", timeout=5) as resp:
                if resp.status != 200:
                    raise RuntimeError(f"API returned {resp.status}")
    except Exception as e:
        raise RuntimeError(f"Cannot reach API: {e}")

def handler(job):
    """Process job."""
    job_input = job["input"]
    # Your processing code here
    return {"output": "success"}

if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})
```

--------------------------------

### Run Cold Start Benchmark with Pytest

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Execute the cold start benchmark tests using pytest. Results are saved to JSON files.

```bash
uv run pytest tests/test_performance/test_cold_start.py -v
```

```bash
uv run python tests/test_performance/test_cold_start.py
```

--------------------------------

### RunPod Serverless Handler Example

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

A basic example of a RunPod serverless handler function. This code runs after the automatic GPU health check passes on GPU workers.

```python
import runpod

# GPU health check runs automatically on GPU workers
# No manual registration needed!

def handler(job):
    """Your handler runs after GPU health check passes."""
    return {"output": "success"}

if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})
```

--------------------------------

### Local Development Server Usage

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

This command starts the local FastAPI development server. The flags enable the RunPod serving API and set the port and host it listens on.
```bash
python worker.py --rp_serve_api --rp_api_port 8000 --rp_api_host localhost
```

--------------------------------

### CI/CD Integration for Benchmarking

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

This YAML snippet shows how to add a cold start benchmark to a GitHub Actions workflow, including running the test and uploading the results artifact.

```yaml
- name: Run cold start benchmark
  run: |
    uv run pytest tests/test_performance/test_cold_start.py --timeout=120

- name: Upload benchmark results
  uses: actions/upload-artifact@v3
  with:
    name: benchmark-results
    path: benchmark_results/cold_start_latest.json
```

--------------------------------

### RunPod TOML Project Configuration Example

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/runpod.toml.md

This TOML file defines the configuration for a project on the Runpod platform, including project metadata, resource allocation, environment variables, and runtime settings.

```toml
# Runpod Project Configuration

title = "My Project"

[project]
uuid = "00000000"
name = "My Project"
base_image = "runpod/base:0.0.0"
gpu_types = ["NVIDIA RTX 3090"]
gpu_count = 1
storage_id = "00000000"
volume_mount_path = "/runpod-volume"
ports = "8080/http, 22/tcp"
container_disk_size_gb = 10

[project.env_vars]
VAR_NAME_1 = "value1"
VAR_NAME_2 = "value2"

[template]
model_type = "default"
model_name = "None"

[runtime]
python_version = "3.10"
handler_path = "handler.py"
requirements_path = "requirements.txt"
```

--------------------------------

### Manage GPU Cloud Pods

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Provides examples for interacting with RunPod GPU Cloud pods, including retrieval, creation, and termination.
```python
import runpod

runpod.api_key = "your_runpod_api_key_found_under_settings"

# Get all my pods
pods = runpod.get_pods()

# Get a specific pod (assumes the account already has at least one pod)
pod = runpod.get_pod(pods[0]["id"])

# Create a pod with GPU
pod = runpod.create_pod("test", "runpod/stack", "NVIDIA GeForce RTX 3070")

# Create a pod with CPU
pod = runpod.create_pod("test", "runpod/stack", instance_id="cpu3c-2-4")

# Stop the pod
runpod.stop_pod(pod["id"])

# Resume the pod
runpod.resume_pod(pod["id"])

# Terminate the pod
runpod.terminate_pod(pod["id"])
```

--------------------------------

### Example CUDA Version Check Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Shows a successful CUDA version check, specifying the detected version and the minimum required version.

```log
CUDA version check passed: 12.2 (minimum: 11.8)
```

--------------------------------

### Example Network Connectivity Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Demonstrates a successful network connectivity check, including the target IP and the response latency.

```log
Network connectivity passed: Connected to 8.8.8.8 (45ms)
```

--------------------------------

### Add, Start, and Stop Checkpoints with Checkpoints Class

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_debugger.md

Use the `Checkpoints` class to add, start, and stop timing checkpoints. This class is a singleton and aggregates timings until `get_debugger_output` is called.
```python
from rp_debugger import Checkpoints

checkpoints = Checkpoints()

# Add a checkpoint
checkpoints.add('checkpoint_name')

# Start a checkpoint
checkpoints.start('checkpoint_name')

# Stop a checkpoint
checkpoints.stop('checkpoint_name')
```

--------------------------------

### Example Memory Check Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Shows a successful memory availability check, indicating the amount of available and total system memory.

```log
Memory check passed: 12.00GB available (of 16.00GB total)
```

--------------------------------

### Download multiple files

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_download.md

This example demonstrates how to download multiple files concurrently by providing a list of URLs. The function returns a list of absolute paths for all successfully downloaded files.

```python
from runpod.serverless.utils import download_files_from_urls

job_id = "job_123"
urls = [
    "https://example.com/file1.txt",
    "https://example.com/file2.png",
    "https://example.com/file3.pdf"
]

downloaded_files = download_files_from_urls(job_id, urls)

for i, file_path in enumerate(downloaded_files):
    print(f"Downloaded file {i + 1}: {file_path}")
```

--------------------------------

### Example GPU Compute Benchmark Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Shows a successful GPU compute benchmark, reporting the time taken for a matrix multiplication operation.

```log
GPU compute benchmark passed: Matrix multiply completed in 25ms
```

--------------------------------

### API Key Precedence Example

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Illustrates the precedence order for API keys: a key passed to an `Endpoint` instance overrides the global key.
```python
import runpod

# Example showing precedence
runpod.api_key = "GLOBAL_KEY"

# This endpoint uses GLOBAL_KEY
endpoint1 = runpod.Endpoint("ENDPOINT_ID")

# This endpoint uses ENDPOINT_KEY (overrides global)
endpoint2 = runpod.Endpoint("ENDPOINT_ID", api_key="ENDPOINT_KEY")

# All requests from endpoint2 will use ENDPOINT_KEY
result = endpoint2.run_sync({"input": "data"})
```

--------------------------------

### Example Disk Space Check Log Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Indicates a successful disk space check, showing the amount of free space and its percentage of the total disk size.

```log
Disk space check passed: 50.00GB free (50.0% available)
```

--------------------------------

### Get All GPUs

Source: https://github.com/runpod/runpod-python/blob/main/docs/api/queries.md

Fetches a list of all available GPUs from RunPod. Requires the API key to be set.

```python
import runpod

runpod.api_key = "your_runpod_api_key"

gpus = runpod.get_gpus()

for gpu in gpus:
    print(gpu)
```

--------------------------------

### Configure GPU Test Timeout

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Example of setting environment variables to customize the automatic GPU fitness check behavior, such as adjusting the timeout.
```python
import os

# Adjust timeout (default: 30 seconds)
os.environ["RUNPOD_GPU_TEST_TIMEOUT"] = "60"

# Override binary path (for custom/patched versions)
os.environ["RUNPOD_BINARY_GPU_TEST_PATH"] = "/custom/path/gpu_test"

# Cap the number of error messages parsed from gpu_test output (default: 10)
os.environ["RUNPOD_GPU_MAX_ERROR_MESSAGES"] = "20"

# Skip auto-registration of this check (primarily for testing)
os.environ["RUNPOD_SKIP_GPU_CHECK"] = "true"
```

--------------------------------

### Run Endpoint Asynchronously

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Submit a run request to an endpoint without waiting for it to finish, then check its status. The output is retrieved separately.

```python
import runpod

endpoint = runpod.Endpoint("ENDPOINT_ID")

run_request = endpoint.run(
    {"your_model_input_key": "your_model_input_value"}
)

# Check the status of the endpoint run request
print(run_request.status())

# Get the output of the endpoint run request, blocking until the endpoint run is complete.
print(run_request.output())
```

--------------------------------

### Lazy Loading Dependencies in Python

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Demonstrates lazy importing of heavy libraries like boto3 and fastapi to reduce cold start times. Import the library only when the function that uses it is called.

```python
# Before
import boto3
import fastapi

# After
def use_boto():
    import boto3  # Lazy load only when needed
```

--------------------------------

### Minimal Runpod Serverless Worker

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker.md

A basic worker file that imports the runpod library, defines a handler function, and starts the serverless process.
```python
import runpod

def handler(job):
    # Handle the job and return the output
    return {"output": "Job completed successfully"}

runpod.serverless.start({"handler": handler})
```

--------------------------------

### Compile GPU Test Binary in Dockerfile

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Compile the GPU test binary directly within your Dockerfile. This example uses nvcc with specific CUDA compute capabilities and links necessary libraries.

```dockerfile
# Or compile in container
COPY build_tools/gpu_test.c /tmp/
RUN cd /tmp && nvcc -O3 -arch=sm_70,sm_75,sm_80,sm_86 \
    -o /usr/local/bin/gpu_test gpu_test.c -lnvidia-ml -lcudart_static
```

--------------------------------

### Successful GPU Test Binary Compilation Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Example output indicating a successful compilation of the gpu_test binary, including details on CUDA and Ubuntu versions used, and the final binary information.

```text
Compiling gpu_test binary...
CUDA Version: 11.8.0
Ubuntu Version: ubuntu22.04
Output directory: .../runpod/serverless/binaries
Compilation successful
Binary successfully created at: .../runpod/serverless/binaries/gpu_test
Binary info:
/path/to/gpu_test: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), ...
```

--------------------------------

### Serverless Worker with Fitness Checks

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Example of a Python worker script that includes fitness checks for GPU availability and disk space using decorators. These checks run at startup to ensure the environment is ready.
```python
# my_worker.py

import runpod
import torch

# Register fitness checks using the decorator
@runpod.serverless.register_fitness_check
def check_gpu_available():
    """Verify GPU is available."""
    if not torch.cuda.is_available():
        raise RuntimeError("GPU not available")

@runpod.serverless.register_fitness_check
def check_disk_space():
    """Verify sufficient disk space."""
    import shutil
    stat = shutil.disk_usage("/")
    free_gb = stat.free / (1024**3)
    if free_gb < 10:
        raise RuntimeError(f"Insufficient disk space: {free_gb:.2f}GB free")

def handler(job):
    job_input = job["input"]
    # Your handler code here
    return {"output": "success"}

# Fitness checks run before handler initialization (production only)
runpod.serverless.start({"handler": handler})
```

--------------------------------

### Upload Local File to S3 Bucket

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_upload.md

Use this function to upload a local file to an S3-compatible bucket. Ensure boto3 is installed and S3 credentials are configured. Optionally specify a bucket name and prefix.
```python
from runpod.serverless.utils import upload_file_to_bucket

# Define your bucket credentials
bucket_creds = {
    'endpointUrl': 'https://your-bucket-endpoint-url.com',
    'accessId': 'your_key_id',
    'accessSecret': 'your_secret_access_key'
}

# Define the file name and file location
file_name = 'example.txt'
file_location = '/path/to/your/local/file/example.txt'

# Optional: Define a bucket_name and prefix
bucket_name = 'custom-bucket-name'
prefix = 'your-prefix'

# Upload the file and get the presigned URL
presigned_url = upload_file_to_bucket(file_name, file_location, bucket_creds, bucket_name, prefix)

# Print the presigned URL
print(f"Presigned URL: {presigned_url}")
```

--------------------------------

### Basic GPU Test Binary Build

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Navigate to the build tools directory and execute the compile script to build the gpu_test binary. The output is placed in ../runpod/serverless/binaries/gpu_test.

```bash
cd build_tools
./compile_gpu_test.sh
```

--------------------------------

### Runpod CLI Help Overview

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Display the main help message for the Runpod CLI, showing available top-level commands.

```bash
runpod --help
```

--------------------------------

### Comparing Multiple Approaches Workflow

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

This bash script demonstrates how to compare multiple optimization branches by running benchmarks for each and then using a comparison script.
```bash
# Compare three different optimization branches
./scripts/benchmark_cold_start.sh main > results_main.txt
./scripts/benchmark_cold_start.sh feature/approach-1 > results_1.txt
./scripts/benchmark_cold_start.sh feature/approach-2 > results_2.txt

# Then compare each against baseline
uv run python scripts/compare_benchmarks.py \
  benchmark_results/cold_start_main_*.json \
  benchmark_results/cold_start_approach-1_*.json
```

--------------------------------

### Run Tests with uv or pip

Source: https://github.com/runpod/runpod-python/blob/main/CONTRIBUTING.md

Execute tests to ensure changes do not break existing functionality. uv is recommended for speed.

```bash
# Using uv (recommended - faster)
uv sync --group test
uv run pytest

# Or using pip
pip install '.[test]'
pytest
```

--------------------------------

### Get Specific GPU

Source: https://github.com/runpod/runpod-python/blob/main/docs/api/queries.md

Fetches details for a specific GPU by its ID. Requires the API key to be set.

```python
import runpod

gpu_id = "NVIDIA A100 80GB PCIe"
gpu = runpod.get_gpu(gpu_id)
print(gpu)
```

--------------------------------

### Build GPU Test Binary with Custom Ubuntu Version

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Set the UBUNTU_VERSION environment variable to use a different Ubuntu base image for the build. The default is Ubuntu 22.04.

```bash
cd build_tools
UBUNTU_VERSION=ubuntu20.04 ./compile_gpu_test.sh
```

```bash
cd build_tools
UBUNTU_VERSION=ubuntu22.04 ./compile_gpu_test.sh
```

--------------------------------

### Test Input JSON File

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/local_testing.md

Create this JSON file in your project's root directory to provide input for local testing.
```json
{
    "input": {
        "your_model_input_key": "your_model_input_value"
    }
}
```

--------------------------------

### Check CUDA Driver Version

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Use nvidia-smi to check the installed CUDA driver version on your system. This helps diagnose 'version mismatch' errors.

```bash
# Check CUDA driver version
nvidia-smi
```

--------------------------------

### Register Fitness Checks in Order

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Fitness checks are executed in the order they are registered using the `@runpod.serverless.register_fitness_check` decorator. This example shows two checks that will run sequentially.

```python
import runpod

@runpod.serverless.register_fitness_check
def check_first():
    print("This runs first")

@runpod.serverless.register_fitness_check
def check_second():
    print("This runs second")
```

--------------------------------

### Runpod Pod Management

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Manage your Runpod pods. Use 'list' to see available pods and 'create' to launch a new one.

```bash
runpod pod list
```

```bash
runpod pod create
```

--------------------------------

### Register Asynchronous Fitness Checks

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Register asynchronous functions as fitness checks using the `@runpod.serverless.register_fitness_check` decorator. This example demonstrates checking external API connectivity.
```python
import runpod
import aiohttp

@runpod.serverless.register_fitness_check
async def check_api_connectivity():
    """Check if external API is accessible."""
    async with aiohttp.ClientSession() as session:
        try:
            async with session.get("https://api.example.com/health", timeout=5) as resp:
                if resp.status != 200:
                    raise RuntimeError(f"API health check failed: {resp.status}")
        except Exception as e:
            raise RuntimeError(f"Cannot connect to API: {e}")

def handler(job):
    return {"output": "success"}

if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})
```

--------------------------------

### Navigate to Local Repository

Source: https://github.com/runpod/runpod-python/blob/main/CONTRIBUTING.md

Change the current directory to your local runpod-python repository.

```bash
cd runpod-python
```

--------------------------------

### Heartbeat API Endpoint

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Specifies the GET endpoint for sending heartbeats to the RunPod platform. Allows reporting multiple job IDs and indicating if a retry is needed.

```http
GET {RUNPOD_WEBHOOK_PING}?job_id={comma_separated_ids}&retry_ping={0|1}
```

--------------------------------

### Runpod Pod Connection

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Connect to a running Runpod pod.

```bash
runpod pod connect
```

--------------------------------

### Configure Built-in Checks in Python

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Shows how to set environment variables in Python to configure the thresholds for built-in system checks.
```python
import os

os.environ["RUNPOD_MIN_MEMORY_GB"] = "8.0"
os.environ["RUNPOD_MIN_DISK_PERCENT"] = "15.0"
```

--------------------------------

### API Health Check with Retries

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Implement retry logic for fitness checks that might encounter transient network issues. This example uses aiohttp and asyncio for asynchronous retries.

```python
import runpod
import aiohttp
import asyncio

@runpod.serverless.register_fitness_check
async def check_api_with_retry():
    """Check API connectivity with retries."""
    max_retries = 3
    for attempt in range(max_retries):
        try:
            async with aiohttp.ClientSession() as session:
                async with session.get("https://api.example.com/health", timeout=5) as resp:
                    if resp.status == 200:
                        return
                    # Treat a non-200 response as a failure so it is retried
                    # instead of letting the check pass silently
                    raise RuntimeError(f"API returned {resp.status}")
        except Exception as e:
            if attempt == max_retries - 1:
                raise RuntimeError(f"API check failed after {max_retries} attempts: {e}")
            await asyncio.sleep(1)  # Wait before retry
```

--------------------------------

### Job Acquisition API Endpoint

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Defines the GET endpoints for acquiring jobs from the RunPod platform. Supports fetching a single job or a batch, and indicates if jobs are currently in progress.

```http
GET {RUNPOD_WEBHOOK_GET_JOB}?job_in_progress={0|1}
```

```http
GET {RUNPOD_WEBHOOK_GET_JOB}/batch?batch_size={N}&job_in_progress={0|1}
```

--------------------------------

### Configure Runpod API Key

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/start_here.md

Store your Runpod API key. Optionally, provide the key directly or use the --profile flag to manage multiple keys.
```bash
runpod config
```

```bash
runpod config YOUR_API_KEY
```

```bash
runpod config --profile my-profile
```

--------------------------------

### Verify GPU Binary Properties

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Check the file type, size, and executability of the compiled GPU binary.

```bash
# Check binary info
file runpod/serverless/binaries/gpu_test
```

```bash
# Check binary size
ls -lh runpod/serverless/binaries/gpu_test
```

```bash
# Verify executable
test -x runpod/serverless/binaries/gpu_test && echo "Binary is executable"
```

--------------------------------

### Enable Worker Refresh on Start

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker.md

Configure the worker to automatically refresh after every job completion, even if the handler encounters an error. This is useful for complex operations requiring a clean state.

```python
from runpod.serverless import start

def handler(job):
    # Handle the job and return the output
    return {"output": "Job completed successfully"}

start({"handler": handler, "refresh_worker": True})
```

--------------------------------

### Runpod CLI Configuration

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Configure your Runpod API key for authentication. You will be prompted for your API key and profile name.

```bash
runpod config
```

--------------------------------

### Verify Binary Integrity

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Use the 'file' command to check the integrity and architecture of the compiled binary. This helps diagnose 'cannot execute binary' errors.
```bash
# Verify binary integrity
file runpod/serverless/binaries/gpu_test
```

--------------------------------

### Get Combined Debugger Output

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_debugger.md

Call `get_debugger_output` to retrieve a dictionary containing system information and all recorded checkpoint timings. This function clears the debugger's internal state after retrieval.

```python
from rp_debugger import get_debugger_output

output = get_debugger_output()
```

--------------------------------

### Testing a Performance Optimization Workflow

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

This bash script outlines the steps to test a performance optimization by saving a baseline, switching to a feature branch, running the benchmark, and comparing results.

```bash
# 1. Save baseline on main branch
git checkout main
./scripts/benchmark_cold_start.sh
cp benchmark_results/cold_start_latest.json benchmark_results/cold_start_baseline.json

# 2. Switch to feature branch
git checkout feature/my-optimization

# 3. Run benchmark and compare
./scripts/benchmark_cold_start.sh
uv run python scripts/compare_benchmarks.py \
  benchmark_results/cold_start_baseline.json \
  benchmark_results/cold_start_latest.json
```

--------------------------------

### Copy Pre-compiled Binary to Dockerfile

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Include a pre-compiled GPU test binary in your Docker image by copying it to a location like /usr/local/bin.

```dockerfile
# Copy pre-compiled binary from runpod-python
COPY runpod/serverless/binaries/gpu_test /usr/local/bin/
```

--------------------------------

### Upload In-Memory Object to S3 Bucket

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_upload.md

Use this function to upload data directly from memory (as bytes) to an S3-compatible bucket.
Ensure boto3 is installed and S3 credentials are configured. The bucket name must be included in the bucket_creds dictionary for this function.

```python
from runpod.serverless.utils import upload_in_memory_object

# Define your bucket credentials
bucket_creds = {
    'endpointUrl': 'https://your-bucket-endpoint-url.com',
    'accessId': 'your_key_id',
    'accessSecret': 'your_secret_access_key',
    'bucketName': 'your_bucket_name'
}

# Define the file name and file data (bytes)
file_name = 'example.txt'
file_data = b'This is an example text.'

# Optional: Define a bucket_name and prefix
bucket_name = 'custom-bucket-name'
prefix = 'your-prefix'

# Upload the in-memory object and get the presigned URL
presigned_url = upload_in_memory_object(file_name, file_data, bucket_creds, bucket_name, prefix)

# Print the presigned URL
print(f"Presigned URL: {presigned_url}")
```

--------------------------------

### Runpod Launch Pod with Template

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Launch a new Runpod pod using a YAML template file. This command is part of the 'launch pod' subcommand.

```bash
runpod launch pod --template-file template.yaml
```

--------------------------------

### Heartbeat Process Architecture

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

This diagram shows the heartbeat mechanism where the main worker process forks a separate process for pinging. The heartbeat process periodically loads job state from disk and sends an HTTP GET request to the RunPod API.
```mermaid
graph TB
    MAIN[Main Worker Process] -->|fork| PING[Heartbeat Process]
    PING -->|every 10s| LOAD[Load Job State from Disk]
    LOAD --> GET[HTTP GET Ping Endpoint]
    GET -->|job_ids| API[RunPod API]

    style PING fill:#f57c00,stroke:#e65100,stroke-width:3px,color:#fff
    style LOAD fill:#d32f2f,stroke:#b71c1c,stroke-width:3px,color:#fff
```

--------------------------------

### JobsProgress Class Methods

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Lists the key methods available in the `JobsProgress` singleton class for managing active jobs, including adding, removing, and retrieving job information.

```python
add(job)          # Add job to set, persist to disk
remove(job)       # Remove job from set, persist to disk
get_job_list()    # Return comma-separated job IDs
get_job_count()   # Return number of active jobs
_save_state()     # Serialize set to pickle file with lock
_load_state()     # Deserialize set from pickle file with lock
```

--------------------------------

### CUDA Initialization Failure Scenario

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Illustrates an error message when CUDA device initialization fails, often due to the device being busy or unavailable.

```log
ERROR | Fitness check failed: _cuda_init_check | RuntimeError: Failed to initialize GPU 0: CUDA error: CUDA-capable device(s) is/are busy or unavailable
```

--------------------------------

### Recompile GPU Test Binary

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

If the binary is corrupted or for the wrong architecture, try recompiling it. Navigate to the build_tools directory before running the compile script.

```bash
# Try recompiling
cd build_tools && ./compile_gpu_test.sh
```

--------------------------------

### Configure Built-in Checks in Dockerfile

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Demonstrates how to set environment variables in a Dockerfile to configure the thresholds for various built-in system checks.

```dockerfile
# In your Dockerfile or container config
ENV RUNPOD_MIN_MEMORY_GB=8.0
ENV RUNPOD_MIN_DISK_PERCENT=15.0
ENV RUNPOD_MIN_CUDA_VERSION=12.0
ENV RUNPOD_NETWORK_CHECK_TIMEOUT=10
ENV RUNPOD_GPU_BENCHMARK_TIMEOUT=2
```

--------------------------------

### Compare Two Benchmark Result Files

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Analyze and visualize differences between two benchmark runs using colored terminal output. Takes baseline and optimized JSON files as input.

```bash
uv run python scripts/compare_benchmarks.py
```

--------------------------------

### Run GPU Test Binary

Source: https://github.com/runpod/runpod-python/blob/main/runpod/serverless/binaries/README.md

Execute the pre-compiled GPU health check binary to test CUDA GPU memory allocation and report system information. Ensure the binary is in the correct path.

```bash
./runpod/serverless/binaries/gpu_test
```

--------------------------------

### Integrate All RP Debugger Utilities

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/utils/rp_debugger.md

Combine `Checkpoints`, `LineTimer`, `FunctionTimer`, and `get_debugger_output` to comprehensively profile serverless function execution. Timings are aggregated and cleared upon calling `get_debugger_output`.

```python
from rp_debugger import Checkpoints, LineTimer, FunctionTimer, get_debugger_output

checkpoints = Checkpoints()
checkpoints.add('checkpoint_name')

checkpoints.start('checkpoint_name')
with LineTimer('my_block_of_code'):
    # Your code here
    pass
checkpoints.stop('checkpoint_name')

@FunctionTimer
def my_function():
    # Your code here
    pass

my_function()

output = get_debugger_output()
print(output)
```

--------------------------------

### Heartbeat Key Methods

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

These are the key methods involved in the heartbeat functionality. `start_ping` initiates the process, `ping_loop` handles the periodic sending of pings, and `_send_ping` performs the actual request after loading job state.

```python
start_ping()   # Fork process and start ping loop
ping_loop()    # Infinite loop sending pings every PING_INTERVAL
_send_ping()   # Load job state, construct URL, HTTP GET
```

--------------------------------

### Async HTTP Client Session Configuration

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Configures an aiohttp.ClientSession with unlimited connections for efficient HTTP communication. This is useful for high-throughput scenarios.

```python
import aiohttp

class AsyncClientSession(aiohttp.ClientSession):
    def __init__(self):
        connector = aiohttp.TCPConnector(limit=0)  # Unlimited connections
        super().__init__(connector=connector)
```

--------------------------------

### Runpod Launch Endpoint with Template

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Launch a new Runpod endpoint using a YAML template file. This command is part of the 'launch endpoint' subcommand.

```bash
runpod launch endpoint --template-file template.yaml
```

--------------------------------

### Run Compiled GPU Test Binary

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Execute the compiled binary on a GPU machine to test its functionality and verify expected output.

```bash
# Run the compiled binary
./runpod/serverless/binaries/gpu_test

# Expected output:
# Linux Kernel Version: 5.15.0
# CUDA Driver Version: 12.2
# Found X GPUs:
# GPU 0: [GPU Name] (UUID: ...)
# GPU 0 memory allocation test passed.
# ...
```

--------------------------------

### Run Benchmark Comparison Script

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Compare two benchmark result files using the provided Python script. Requires baseline and optimized JSON files as arguments.

```bash
uv run python scripts/compare_benchmarks.py benchmark_results/cold_start_baseline.json benchmark_results/cold_start_latest.json
```

--------------------------------

### get_gpus

Source: https://github.com/runpod/runpod-python/blob/main/docs/api/queries.md

Retrieves a list of all available GPUs and their specifications.

```APIDOC
## get_gpus

### Description
Retrieves a list of all available GPUs and their specifications.

### Method
```python
runpod.get_gpus()
```

### Parameters
None

### Response
Returns a list of dictionaries, where each dictionary represents a GPU with its details.

#### Response Example
```json
[
  {'id': 'NVIDIA A100 80GB PCIe', 'displayName': 'A100 80GB', 'memoryInGb': 80},
  {'id': 'NVIDIA A100-SXM4-80GB', 'displayName': 'A100 SXM 80GB', 'memoryInGb': 80},
  {'id': 'NVIDIA A30', 'displayName': 'A30', 'memoryInGb': 24}
]
```
```

--------------------------------

### Set Global API Key (Default)

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Demonstrates setting the global API key, which is used by default for all endpoint operations.

```python
import runpod

# Set global API key
runpod.api_key = "your_runpod_api_key"

# All endpoints will use this key by default
endpoint = runpod.Endpoint("ENDPOINT_ID")
result = endpoint.run_sync({"input": "data"})
```

--------------------------------

### Clone runpod-python Repository

Source: https://github.com/runpod/runpod-python/blob/main/CONTRIBUTING.md

Clone the forked runpod-python repository to your local machine.

```bash
git clone https://github.com//runpod-python.git
```

--------------------------------

### Fast GPU Availability Check

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Implement quick checks for essential resources like GPU availability. Avoid time-consuming operations such as model training within fitness checks.

```python
# Good: Quick checks
@runpod.serverless.register_fitness_check
def check_gpu():
    import torch
    if not torch.cuda.is_available():
        raise RuntimeError("GPU not available")

# Avoid: Time-consuming operations
@runpod.serverless.register_fitness_check
def slow_check():
    import torch
    # Don't: Train a model or process large data
    model.train()  # This is too slow!
```

--------------------------------

### RunPod Serverless Configuration Schema

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Defines the structure for configuring the RunPod serverless client, including handler, arguments, and concurrency settings. Ensure 'handler' is a Callable and other arguments match the specified types.

```python
config = {
    "handler": Callable,                # User-defined handler function
    "rp_args": {
        "rp_log_level": str,            # ERROR, WARN, INFO, DEBUG
        "rp_debugger": bool,            # Enable debugger output
        "rp_serve_api": bool,           # Start local FastAPI server
        "rp_api_port": int,             # FastAPI port (default: 8000)
        "rp_api_host": str,             # FastAPI host (default: localhost)
        "rp_api_concurrency": int,      # FastAPI workers (default: 1)
        "test_input": dict,             # Local test job input
    },
    "concurrency_modifier": Callable,   # Dynamic concurrency adjustment
    "refresh_worker": bool,             # Kill worker after job completion
    "return_aggregate_stream": bool,    # Aggregate streaming outputs
    "reference_counter_start": float,   # Performance benchmarking timestamp
}
```

--------------------------------

### Generator User Handler

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Shows how to implement a generator-based handler using `yield`. This is suitable for streaming results incrementally as they become available, rather than waiting for the entire computation to complete.

```python
from typing import Generator

def handler(job: dict) -> Generator[dict, None, None]:
    for i in range(10):
        yield {"partial": i}
```

--------------------------------

### Check Environment Variables

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Ensures that essential environment variables (API_KEY, MODEL_PATH, CONFIG_URL) are set. Raises a RuntimeError listing any missing variables.

```python
import runpod
import os

@runpod.serverless.register_fitness_check
def check_environment():
    """Verify required environment variables are set."""
    required_vars = ["API_KEY", "MODEL_PATH", "CONFIG_URL"]
    missing = [var for var in required_vars if not os.environ.get(var)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

--------------------------------

### Test Input Argument

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/local_testing.md

Pass the input as a command-line argument to your Python handler file for local testing.

```bash
python your_handler.py --test_input '{"input": {"your_model_input_key": "your_model_input_value"}}'
```

--------------------------------

### Create a New Branch

Source: https://github.com/runpod/runpod-python/blob/main/CONTRIBUTING.md

Create a new branch for your code edits.

```bash
git checkout -b name-of-your-branch
```

--------------------------------

### Run Job Acquisition and Processing Tasks

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

This snippet shows how to concurrently run tasks for fetching jobs and processing them using asyncio. It utilizes an AsyncClientSession for API interactions and asyncio.gather to await both tasks.

```python
async def run():
    async with AsyncClientSession() as session:
        jobtake_task = asyncio.create_task(get_jobs(session))
        jobrun_task = asyncio.create_task(run_jobs(session))
        await asyncio.gather(jobtake_task, jobrun_task)
```

--------------------------------

### Check Async Model Loading

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Asynchronously verifies if models can be loaded from a checkpoint file. Raises a RuntimeError if the loading process fails.

```python
import runpod
import aiofiles.os

@runpod.serverless.register_fitness_check
async def check_models_loadable():
    """Verify models can be loaded (async)."""
    import torch
    try:
        # Test load model
        model = torch.load("/models/checkpoint.pt")
        del model  # Free memory
    except Exception as e:
        raise RuntimeError(f"Failed to load model: {e}")
```

--------------------------------

### Grouped GPU and Model Fitness Checks

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Organize fitness checks into logical groups, such as all GPU-related checks together and all model-related checks together, for better maintainability.

```python
# GPU checks
@runpod.serverless.register_fitness_check
def check_gpu_available():
    ...

@runpod.serverless.register_fitness_check
def check_gpu_memory():
    ...

# Model checks
@runpod.serverless.register_fitness_check
def check_model_files():
    ...

@runpod.serverless.register_fitness_check
def check_model_loadable():
    ...
```

--------------------------------

### Check Git Status and Restore

Source: https://github.com/runpod/runpod-python/blob/main/scripts/README.md

Use these commands to check the current git status and manually restore the repository if a script failed mid-execution. This is useful for troubleshooting benchmark script failures.

```bash
git status
```

```bash
git checkout
git stash pop
```

--------------------------------

### Check Disk Space

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker_fitness_checks.md

Verifies if there is sufficient free disk space on the root partition. Raises a RuntimeError if the free space is less than the required amount (50GB by default).

```python
import runpod
import shutil

@runpod.serverless.register_fitness_check
def check_disk_space():
    """Verify sufficient disk space for operations."""
    stat = shutil.disk_usage("/")
    free_gb = stat.free / (1024**3)
    required_gb = 50  # Adjust based on your needs
    if free_gb < required_gb:
        raise RuntimeError(
            f"Insufficient disk space: {free_gb:.2f}GB free, "
            f"need at least {required_gb}GB"
        )
```

--------------------------------

### Dynamically Scale Concurrency with Queue Draining

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

This function demonstrates dynamic scaling of concurrency by creating a new asyncio.Queue with a specified maxsize. It includes a crucial step to wait for the current queue to drain before resizing, which can be a blocking operation.

```python
async def set_scale():
    new_concurrency = concurrency_modifier(current_concurrency)

    # Wait for queue to drain before resizing
    while current_occupancy() > 0:
        await asyncio.sleep(1)

    self.jobs_queue = asyncio.Queue(maxsize=new_concurrency)
```

--------------------------------

### Build GPU Test Binary with Custom CUDA Version

Source: https://github.com/runpod/runpod-python/blob/main/docs/serverless/gpu_binary_compilation.md

Specify a CUDA_VERSION environment variable to target a specific CUDA version during the build. The default is CUDA 11.8.0.

```bash
cd build_tools
CUDA_VERSION=12.1.0 ./compile_gpu_test.sh
```

```bash
cd build_tools
CUDA_VERSION=11.8.0 ./compile_gpu_test.sh
```

--------------------------------

### Runpod SSH Key Management

Source: https://github.com/runpod/runpod-python/blob/main/docs/cli/references/command_line_interface.md

Manage SSH keys for accessing your Runpod instances. Use 'list-keys' to view existing keys and 'add-key' to add a new one.

```bash
runpod ssh list-keys
```

```bash
runpod ssh add-key
```

--------------------------------

### Run Endpoint Synchronously

Source: https://github.com/runpod/runpod-python/blob/main/README.md

Initiate a run request to an endpoint and wait for the result. Returns job status if not completed within 90 seconds.

```python
endpoint = runpod.Endpoint("ENDPOINT_ID")

run_request = endpoint.run_sync(
    {"your_model_input_key": "your_model_input_value"}
)

# Returns the job results if completed within 90 seconds, otherwise, returns the job status.
print(run_request)
```

--------------------------------

### Optional Webhook Environment Variables

Source: https://github.com/runpod/runpod-python/blob/main/ARCHITECTURE.md

Lists optional environment variables for streaming output and heartbeats. These provide additional control and monitoring capabilities for the worker's operation.

```bash
RUNPOD_WEBHOOK_POST_STREAM="https://api.runpod.io/v2/.../stream/$ID"
RUNPOD_WEBHOOK_PING="https://api.runpod.io/v2/.../ping"
RUNPOD_POD_ID="worker-12345"
RUNPOD_POD_HOSTNAME="worker-12345.runpod.io"
RUNPOD_PING_INTERVAL="10"
```
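
As a rough illustration of how a worker process might consume the variables above, the sketch below reads them from an environment mapping using only the standard library. The helper name `load_ping_settings`, the dictionary keys, and the 10-second fallback are assumptions made for this example; they are not taken from the runpod library, which may parse these variables differently.

```python
import os


def load_ping_settings(env=None):
    """Collect the optional webhook settings from an environment mapping.

    Hypothetical helper for illustration only; not part of the runpod API.
    """
    env = os.environ if env is None else env
    return {
        "stream_url": env.get("RUNPOD_WEBHOOK_POST_STREAM"),
        "ping_url": env.get("RUNPOD_WEBHOOK_PING"),
        "pod_id": env.get("RUNPOD_POD_ID"),
        "pod_hostname": env.get("RUNPOD_POD_HOSTNAME"),
        # Assumed fallback of 10 seconds, mirroring the example value above.
        "ping_interval": int(env.get("RUNPOD_PING_INTERVAL", "10")),
    }


# Pass a plain dict instead of os.environ to try the helper locally.
settings = load_ping_settings({"RUNPOD_POD_ID": "worker-12345"})
print(settings["pod_id"], settings["ping_interval"])
```

Unset variables come back as `None`, so calling code can skip streaming or heartbeats when the corresponding URL is missing.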