### Clone Example Repository and Install Dependencies

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

To interact with the deployed API, clone the example repository, navigate to the vLLM directory, and install the necessary `openai` Python library. This sets up your local environment for communication with the Beam-hosted LLM.

```bash
git clone https://github.com/beam-cloud/examples.git
cd examples/vllm
pip install openai
```

--------------------------------

### Install Beam Client and Configure Authentication

Source: https://github.com/beam-cloud/examples/blob/main/jupyter_notebooks/beam-notebook.ipynb

Install the beam-client package and configure your authentication token. This is the initial setup required to interact with the Beam platform from your notebook.

```python
# Install beam-client
!pip install beam-client
```

```python
# Import the Beam client
import beam
```

```python
# Add your Beam API key
!beam configure default --token [YOUR-BEAM-TOKEN]
```

```python
!beam config select default
```

--------------------------------

### Install Beam Client

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/sdxl_turbo_streaming/README.md

Installs the Beam Python client library. Ensure your virtual environment is activated.

```bash
pip3 install beam-client
```

--------------------------------

### Example Deployment Command

Source: https://github.com/beam-cloud/examples/blob/main/endpoints/README.md

This is a concrete example of deploying the `multiply` function from `app.py` with the endpoint name `my-app`.

```bash
beam deploy app.py:multiply --name my-app
```

--------------------------------

### Install Beam and Reflex

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/sdxl/README.md

Install the necessary libraries for Beam and Reflex. Ensure you are in a Python virtual environment.

```bash
pip3 install reflex beam-client
```

--------------------------------

### Install MCP Server Dependencies

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/README.md

Installs server dependencies using uv. Ensure Python 3.11+ and uv are installed. Update the uv path in server.py if necessary.

```bash
uv sync
```

--------------------------------

### Install Frontend Dependencies

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/sdxl_turbo_streaming/README.md

Installs Node.js dependencies for the Next.js frontend. Ensure Node.js is installed and meets Next.js requirements.

```bash
cd frontend
npm install
```

--------------------------------

### Install Beam and Reflex

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/sdxl_turbo/README.md

Create and activate a Python virtual environment, then install the libraries for Beam and Reflex.

```bash
python3 -m virtualenv .venv && source .venv/bin/activate
pip3 install reflex beam-client
```

--------------------------------

### Example API Request

Source: https://github.com/beam-cloud/examples/blob/main/web_scraping/headless_chrome/README.md

An example of a JSON request body to capture a screenshot of 'example.com'.

```json
{
  "url": "https://example.com"
}
```
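
As a minimal sketch, the body above could be sent from Python with only the standard library. The endpoint URL, the token, and the helper name `build_screenshot_request` are placeholders rather than values from the source. The `Authorization: Bearer` header is assumed to match the curl examples elsewhere in these docs. The request is only built here; nothing is sent until `urlopen` is called.

```python
import json
import urllib.request


def build_screenshot_request(endpoint_url: str, token: str, page_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the JSON body shown above."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder endpoint and token: replace with the values from your own deployment.
req = build_screenshot_request(
    "https://example.invalid/your-endpoint", "YOUR_AUTH_TOKEN", "https://example.com"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```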

--------------------------------

### Install MCP Server

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/README.md

Installs the Beam MCP server for use with Claude Desktop. After installation, interact with the server via the Claude Desktop app.

```bash
mcp install server.py
```

--------------------------------

### Example API Call with Curl

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/llama/README.md

Example of how to make a POST request to the deployed endpoint using curl.

```bash
curl -X POST 'https://app.beam.cloud/endpoint/llama-ft/v2' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer {YOUR_AUTH_TOKEN}' \
-d '{}'
```

--------------------------------

### Run Frontend Development Server

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/sdxl_turbo_streaming/README.md

Starts the Next.js development server for the frontend. Make sure the .env file is configured with Beam credentials.

```bash
cd frontend
npm run dev
```

--------------------------------

### Run Reflex Frontend

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/sdxl/README.md

Start the Reflex development server to run the frontend application. Navigate to the frontend directory before running this command.

```bash
cd frontend && reflex run
```

--------------------------------

### Example Deployment Output

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/gemma/README.md

The output from the beam deploy command indicates the stages of building, syncing, and deploying the image. It also provides invocation details.

```bash
=> Building image
=> Syncing files 
=> Deploying 
=> Deployed 🎉 
=> Invocation details 
curl -X POST 'https://app.beam.cloud/endpoint/gemma-ft/v2' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer {YOUR_AUTH_TOKEN}' \
-d '{}'
```

--------------------------------

### Deploy Clarity Upscaler Server

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/clarity_upscaler/README.md

Deploy the clarity-upscaler server using the beam CLI. Ensure Beam is installed and the repository is cloned before deployment.

```bash
beam deploy app.py:clarity_upscaler_server
```

--------------------------------

### Example API Call with Python Requests

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/llama/README.md

Example of how to make a POST request to the deployed endpoint using the Python requests library.

```python
import requests

response = requests.post(
    "https://app.beam.cloud/endpoint/llama-ft/v2", 
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_AUTH_TOKEN"
    }, 
    json={
        "prompt": "hi"
    }
)
```

--------------------------------

### Create Virtual Environment

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/sdxl_turbo_streaming/README.md

Creates and activates a Python virtual environment for project dependencies. Run this before installing packages.

```bash
python3 -m virtualenv .venv && source .venv/bin/activate
```

--------------------------------

### Monitor Gemma Fine-Tuning Progress

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/gemma/README.md

Example output from the terminal during Gemma fine-tuning, showing build, sync, and training progress with loss and epoch information.

```bash
=> Building image 
=> Syncing files 
...
=> Running function: <finetune:gemma_fine_tune> 
Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]
...
Generating train split: 12947 examples [00:00, 114393.80 examples/s]
...
Map:  93%|#########2| 12000/12947 [00:13<00:01, 921.12 examples/s]
...
  1%|          | 6/809 [00:08<16:35,  1.24s/it]
...
{'loss': 1.617, 'grad_norm': 0.4805833399295807, 'learning_rate': 0.00019752781211372064, 'epoch': 0.01}
...
```
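
The metric lines in this output are plain Python dict literals, so they can be pulled out of a captured log for plotting or monitoring. The sketch below assumes the log has been saved as text; the helper name `parse_metrics` is hypothetical and is not part of Beam or the example repository.

```python
import ast


def parse_metrics(log_text: str) -> list[dict]:
    """Return every dict-literal line (e.g. {'loss': ...}) found in the log."""
    metrics = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("{") and line.endswith("}"):
            try:
                value = ast.literal_eval(line)
            except (ValueError, SyntaxError):
                continue  # not a plain Python literal; skip it
            if isinstance(value, dict):
                metrics.append(value)
    return metrics


sample_log = """=> Running function: <finetune:gemma_fine_tune>
  1%|          | 6/809 [00:08<16:35,  1.24s/it]
{'loss': 1.617, 'grad_norm': 0.4805833399295807, 'learning_rate': 0.00019752781211372064, 'epoch': 0.01}
"""

for m in parse_metrics(sample_log):
    print(f"epoch {m['epoch']}: loss {m['loss']}")
```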

--------------------------------

### Example API Response

Source: https://github.com/beam-cloud/examples/blob/main/web_scraping/headless_chrome/README.md

The API response will contain an 'output_url' pointing to the captured screenshot.

```json
{
  "output_url": "https://app.beam.cloud/output/id/9dfbb7a1-a3de-489c-a602-423b4c859f84"
}
```
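
A small sketch of reading `output_url` out of a response body like the one above. The helper name `extract_output_url` is hypothetical. Stripping whitespace is a defensive assumption, not documented behavior of the API.

```python
import json


def extract_output_url(response_text: str) -> str:
    """Parse the JSON body and return its 'output_url', stripped of stray whitespace."""
    payload = json.loads(response_text)
    url = payload.get("output_url", "").strip()
    if not url:
        raise ValueError("response did not contain an 'output_url'")
    return url


sample = '{"output_url": "https://app.beam.cloud/output/id/9dfbb7a1-a3de-489c-a602-423b4c859f84"}'
print(extract_output_url(sample))
```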

--------------------------------

### Example API Call to Clarity Upscaler

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/clarity_upscaler/README.md

This Python script demonstrates how to make an API call to the deployed Clarity Upscaler service. It assumes the service is accessible via API.

```python
import requests

# Assuming the upscaler is deployed to a local server or accessible endpoint
# Replace with your actual API endpoint
api_url = "http://localhost:8000/upscale"

# Path to the image you want to upscale
image_path = "path/to/your/image.png"

with open(image_path, "rb") as f:
    files = {"file": f}
    try:
        response = requests.post(api_url, files=files)
        response.raise_for_status()  # Raise an exception for bad status codes

        # Assuming the response contains the upscaled image data
        with open("upscaled_image.png", "wb") as out_f:
            out_f.write(response.content)
        print("Image upscaled successfully and saved as upscaled_image.png")

    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")

```

--------------------------------

### Detailed Video Generation Prompt

Source: https://github.com/beam-cloud/examples/blob/main/video_models/mochi1/README.md

This is a detailed example of a prompt for video generation, specifying camera movement, subject, environment, and lighting.

```json
{
    "prompt": "The camera follows behind a rugged green Jeep with a black snorkel as it speeds along a narrow dirt trail cutting through a dense jungle. Thick vines hang from towering trees with sprawling canopies, their leaves forming a vibrant green tunnel above the vehicle. Mud splashes up from the Jeep’s tires as it powers through a shallow stream crossing the path. Sunlight filters through gaps in the trees, casting dappled golden light over the scene. The dirt trail twists sharply into the distance, overgrown with wild ferns and tropical plants. The vehicle is seen from the rear, leaning into the curve as it maneuvers through the untamed terrain, emphasizing the adventure of the rugged journey. The surrounding jungle is alive with texture and color, with distant mountains barely visible through the mist and an overcast sky heavy with the promise of rain."
}
```

--------------------------------

### ComfyUI API Example Request

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/comfy_ui/controlnet/README.md

An example of a JSON request body for the /generate endpoint, specifying a detailed prompt and an image URL.

```json
{
  "prompt": "A photorealistic golden retriever sitting in a field of flowers, soft light, professional lens, background blur",
  "image_url": "https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcRHWNgtJvfMe4yIAOCPTpxRjPalxseEhGfMvW_B3Y3RLjhvwtCIHaYKt0D3K1h2qcWqqvvroakXVGKcAFnTDk7HhA"
}
```

--------------------------------

### Deploy Mochi-1 App

Source: https://github.com/beam-cloud/examples/blob/main/video_models/mochi1/README.md

Use this command to deploy the Mochi-1 application. Ensure you have the Beam CLI installed and configured.

```bash
beam deploy app.py:generate_video
```

--------------------------------

### Interact with vLLM API using OpenAI SDK

Source: https://github.com/beam-cloud/examples/blob/main/vllm/vision_models/README.md

This Python script demonstrates how to use the OpenAI SDK to send a multimodal request (text and image) to a vLLM API endpoint. Ensure the OpenAI SDK is installed (`pip install openai`).

```python
import base64
import requests
from openai import OpenAI

openai_api_key = "your-beam-token"
openai_api_base = "https://your-beam-app/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

image_url = "https://tinypng.com/static/images/boat.png"

chat_completion_from_url = client.chat.completions.create(
        messages=[{
            "role":
            "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_url
                    },
                },
            ],
        }],
        model="InternVL2_5-8B"
    )
print(chat_completion_from_url)
```
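
The script imports `base64` but only passes a remote image URL. A common pattern with OpenAI-compatible vision APIs is to inline a local image as a base64 data URL instead. Whether a given deployment accepts data URLs is an assumption to verify, and the helper name `to_data_url` is hypothetical.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL usable in an 'image_url' content part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in bytes for illustration; in practice read them from a file opened in "rb" mode.
fake_image = b"\x89PNG\r\n\x1a\n"
image_part = {"type": "image_url", "image_url": {"url": to_data_url(fake_image)}}
print(image_part["image_url"]["url"])
```

The resulting `image_part` dict has the same shape as the `image_url` entry in the `content` list of the request.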

--------------------------------

### Create ComfyUI Server Deployment Script

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

This script sets up a Beam Pod with ComfyUI, installs necessary dependencies, downloads a specified model, and launches the server. Ensure the model variables are correctly set for your desired model.

```python
from beam import Image, Pod
ORG_NAME = "Comfy-Org"
REPO_NAME = "flux1-schnell"
WEIGHTS_FILE = "flux1-schnell-fp8.safetensors"
COMMIT = "f2808ab17fe9ff81dcf89ed0301cf644c281be0a"
image = (
    Image()
    .add_commands(["apt update && apt install git -y"])
    .add_python_packages(
        [
            "fastapi[standard]==0.115.4",
            "comfy-cli==1.3.5",
            "huggingface_hub[hf_transfer]==0.26.2",
        ]
    )
    .add_commands(
        [
            "comfy --skip-prompt install --nvidia --version 0.3.10",
            "comfy node install was-node-suite-comfyui@1.0.2",
            "mkdir -p /root/comfy/ComfyUI/models/checkpoints/",
            f"huggingface-cli download {ORG_NAME}/{REPO_NAME} {WEIGHTS_FILE} --cache-dir /comfy-cache",
            f"ln -s /comfy-cache/models--{ORG_NAME}--{REPO_NAME}/snapshots/{COMMIT}/{WEIGHTS_FILE} /root/comfy/ComfyUI/models/checkpoints/{WEIGHTS_FILE}",
        ]
    )
)
comfyui_server = Pod(
    image=image,
    ports=[8000],
    cpu=12,
    memory="32Gi",
    gpu="A100-40",
    entrypoint=["sh", "-c", "comfy launch -- --listen 0.0.0.0 --port 8000"],
)
res = comfyui_server.create()
print("✨ ComfyUI hosted at:", res.url)
```

--------------------------------

### InstantID Beam Endpoint Response Example

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/instant_id/README.md

This is an example of the JSON response you can expect from the InstantID Beam endpoint after a successful image generation request. It contains a URL to the generated output.

```json
{
  "output_url": "https://app.beam.cloud/output/id/f43f5411-d96b-48b1-bab6-28b2defb9b36"
}
```

--------------------------------

### Deploy Zonos TTS API

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/zonos/README.md

Use this command to deploy the Zonos TTS API application. Ensure you have the Beam CLI installed and configured.

```bash
beam deploy app.py:generate
```

--------------------------------

### Download GenBank data with volumes and remote execution

Source: https://github.com/beam-cloud/examples/blob/main/bioinformatics/dnabert/README.md

This example demonstrates downloading data from GenBank using Biopython, writing it to a volume, and executing the function remotely. It requires setting `Entrez.email` and uses `env.is_remote()` to conditionally import necessary libraries.

```python
from beam import function, Image, Volume, env
import os

if env.is_remote():
    from Bio import Entrez, SeqIO

image = Image(python_packages=["biopython"])
BEAM_VOLUME_PATH = "./seq"


@function(volumes=[Volume(name="seq", mount_path=BEAM_VOLUME_PATH)], image=image)
def download(accession_number):
    Entrez.email = "your.email@example.com"

    with Entrez.efetch(
        db="nucleotide", id=accession_number, rettype="gb", retmode="text"
    ) as handle:
        record = SeqIO.read(handle, "genbank")

    file_path = os.path.join(BEAM_VOLUME_PATH, f"{accession_number}.gb")
    # Pass the path directly so Biopython opens and closes the file itself
    SeqIO.write(record, file_path, "genbank")

    print(
        f"Sequence ID: {record.id}\nLength: {len(record.seq)}\nDescription: {record.description}"
    )


if __name__ == "__main__":
    download.remote("CM004190.1")
```

--------------------------------

### Run Chat Script for API Interaction

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

Execute the `chat.py` script to start an interactive command-line chat application. This script will prompt you for the URL of your deployed function and allow you to communicate with the LLM.

```bash
python chat.py
```

--------------------------------

### Example API Invocation using curl

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

This curl command demonstrates how to invoke the deployed vLLM API. Replace 'YOUR_TOKEN' with your actual API token. The command sends a POST request with an empty JSON payload.

```bash
curl -X POST 'https://internvl-15c4487-v4.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer YOUR_TOKEN' \
-d '{}'
```

--------------------------------

### Create ComfyUI API Script with ASGI

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

This script exposes ComfyUI workflows as APIs using Beam's ASGI support. It sets up the environment, installs dependencies, downloads models, and defines an ASGI application for programmatic image generation.

```python
from beam import Image, asgi, Output
image = (
    Image()
    .add_commands(["apt update && apt install git -y"])
    .add_python_packages(
        [
            "fastapi[standard]==0.115.4",
            "comfy-cli",
            "huggingface_hub[hf_transfer]==0.26.2",
        ]
    )
    .add_commands(
        [
            "yes | comfy install --nvidia --version 0.3.10",
            "comfy node install was-node-suite-comfyui@1.0.2",
            "mkdir -p /root/comfy/ComfyUI/models/checkpoints/",
            "huggingface-cli download Comfy-Org/flux1-schnell flux1-schnell-fp8.safetensors --cache-dir /comfy-cache",
            "ln -s /comfy-cache/models--Comfy-Org--flux1-schnell/snapshots/f2808ab17fe9ff81dcf89ed0301cf644c281be0a/flux1-schnell-fp8.safetensors /root/comfy/ComfyUI/models/checkpoints/flux1-schnell-fp8.safetensors",
        ]
    )
)


def init_models():
    import subprocess
    cmd = "comfy launch --background"
    subprocess.run(cmd, shell=True, check=True)

@asgi(
    name="comfy",
    image=image,
    on_start=init_models,
    cpu=8,
    memory="32Gi",
    gpu="A100-40",
    timeout=-1,
)
def handler():
    from fastapi import FastAPI, HTTPException
    import subprocess
    import json
    from pathlib import Path
    import uuid
    from typing import Dict
    app = FastAPI()
    # This is where you specify the path to your workflow file.
    # Make sure "workflow_api.json" exists in the same directory as this script.
    WORKFLOW_FILE = Path(__file__).parent / "wor"
```

--------------------------------

### Invoking the Inference Endpoint

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/gemma/README.md

When calling the deployed inference endpoint, include a prompt in the request body. The example sends the prompt `"hi"` as JSON.

```bash
curl -X POST 'https://app.beam.cloud/endpoint/gemma-ft/v2' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer {YOUR_AUTH_TOKEN}' \
-d '{"prompt": "hi"}'
```

--------------------------------

### Parler TTS Model Loading and Setup

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

Loads the Parler TTS model and tokenizer for conditional generation. This function is intended to be called once at startup using Beam's `on_start`.

```python
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer
import soundfile as sf
import uuid

def load_models():
    model = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler-tts-mini-v1").to("cuda:0")
    tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-v1")
    return model, tokenizer
```

--------------------------------

### Define a Beam Function Loading Local Weights

Source: https://github.com/beam-cloud/examples/blob/main/jupyter_notebooks/beam-notebook.ipynb

Define a Beam function that loads local model weights. Ensure the weights file is in the same directory as the notebook. This example uses PyTorch.

```python
from beam import function, Image

# Load local model weights
WEIGHTS_PATH = "./weights.pth"

@function(cpu=2, memory="1Gi", image=Image(python_packages=["torch"]))
def handler():
    import torch
    # Load model weights from a local file
    model = torch.load(WEIGHTS_PATH)
    model.eval()
    
    return {"success": "true"} 
```

--------------------------------

### Llama3 LoRA Fine-tuning Configuration

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/llama/README.md

This Python script configures and performs LoRA PEFT fine-tuning on the Llama3 model. It includes setup for CUDA, model and tokenizer loading, LoRA configuration, dataset preparation, and training arguments. The fine-tuned model and tokenizer are saved to a specified output directory.

```python
# finetune.py
def llama_fine_tune():
    import os
    import torch
    from datasets import load_dataset
    from transformers import (
        AutoTokenizer,
        AutoModelForCausalLM,
        TrainingArguments,
        Trainer,
        DataCollatorForLanguageModeling,
    )
    from peft import LoraConfig, get_peft_model, TaskType

    os.environ["TOKENIZERS_PARALLELISM"] = "false"

    if not torch.cuda.is_available():
        return "CUDA is not available"

    torch.set_float32_matmul_precision("high")

    # Load the Llama3 model and tokenizer
    model = AutoModelForCausalLM.from_pretrained(
        WEIGHT_PATH, device_map="auto", attn_implementation="eager", use_cache=False
    )
    tokenizer = AutoTokenizer.from_pretrained(WEIGHT_PATH, use_fast=False)
    
    # Set the pad_token to eos_token
    tokenizer.pad_token = tokenizer.eos_token


    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "v_proj"],
        lora_dropout=0.05,
        bias="none",
        task_type=TaskType.CAUSAL_LM,
    )

    model = get_peft_model(model, lora_config)

    # Load the Yelp Reviews dataset from Hugging Face
    dataset = load_dataset(DATASET_PATH)

    def prepare_dataset(examples):
        return tokenizer(examples["text"], padding="max_length", truncation=True)

    tokenized_dataset = dataset.map(prepare_dataset, batched=True)

    training_args = TrainingArguments(
        # This output directory is on our mounted volume
        output_dir="./llama-ft/llama-finetuned",
        num_train_epochs=1,
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        weight_decay=0.01,
        logging_steps=10,
        save_steps=100,
        save_total_limit=3,
        fp16=True,
        gradient_checkpointing=False,
        remove_unused_columns=False,
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=tokenized_dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
    )

    trainer.train()

    # Save the LoRA adapter and tokenizer to the mounted volume so the inference endpoint can access them.
    model.save_pretrained("./llama-ft/llama-finetuned")
    tokenizer.save_pretrained("./llama-ft/llama-finetuned")
```
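
To make the training configuration concrete, the sketch below works through the arithmetic implied by the arguments above. The device count and dataset size are assumptions chosen for illustration and do not come from the source. The step count is approximate, since the exact value depends on how the trainer rounds partial batches.

```python
import math

# Values copied from the TrainingArguments and LoraConfig above.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
lora_r = 16
lora_alpha = 32

# Assumptions for illustration only (not stated in the source).
num_devices = 1
num_examples = 10_000

# Examples consumed per optimizer update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_devices
# Approximate optimizer updates in one epoch.
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
# Standard LoRA scaling factor applied to the adapter update: alpha / r.
lora_scaling = lora_alpha / lora_r

print(effective_batch_size, steps_per_epoch, lora_scaling)
```

With `logging_steps=10` and `save_steps=100`, a run of this hypothetical size would log roughly every 160 examples and checkpoint roughly every 1,600.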

--------------------------------

### Configure API Request

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/whisperx_stt/README.md

Set up your authentication token, Beam deployment URL, and the audio file URL in this Python script before sending a request.

```python
AUTH_TOKEN = "BEAM_AUTH_TOKEN"
BEAM_URL = "id/8836f704-b521-4e1c-8979-bc74c97dc47b"
AUDIO_URL = ""
```
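
Because `AUDIO_URL` is empty in the snippet, a quick check before sending can catch unfilled settings. This is a sketch only. The helper name `missing_settings` is hypothetical, and treating `"BEAM_AUTH_TOKEN"` as an unfilled placeholder is an assumption.

```python
def missing_settings(auth_token: str, beam_url: str, audio_url: str) -> list[str]:
    """Return the names of settings that are empty or still hold the placeholder."""
    problems = []
    # Assumption: the literal "BEAM_AUTH_TOKEN" is a placeholder, not a real token.
    if not auth_token or auth_token == "BEAM_AUTH_TOKEN":
        problems.append("AUTH_TOKEN")
    if not beam_url:
        problems.append("BEAM_URL")
    if not audio_url:
        problems.append("AUDIO_URL")
    return problems


# With the values exactly as shown above, two settings still need attention.
print(missing_settings("BEAM_AUTH_TOKEN", "id/8836f704-b521-4e1c-8979-bc74c97dc47b", ""))
```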

--------------------------------

### Deploy SDXL Backend

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/sdxl/README.md

Deploy the backend application using the Beam CLI. This command assumes your app is in `app.py` and the entry point is `generate`.

```bash
cd backend && beam deploy app.py:generate
```

--------------------------------

### Example Parler TTS API Request

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/parler-tts/README.md

An example of a JSON payload for the Parler TTS API, specifying the text and a detailed voice description.

```json
{
    "prompt": "On Beam run AI workloads anywhere with zero complexity. One line of Python, global GPUs, full control",
    "description": "A female speaker delivers a slightly expressive and animated speech with a moderate speed and pitch. The recording is of very high quality, with the speaker's voice sounding clear and very close up."
}
```

--------------------------------

### Zonos TTS API Example Request Payload

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/zonos/README.md

An example JSON payload for the Zonos TTS API, specifying the text to be converted into speech.

```json
{
    "text": "On Beam run AI workloads anywhere with zero complexity. One line of Python, global GPUs, full control"
}
```

--------------------------------

### WhisperX Transcription Response Example

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/whisperx_stt/README.md

This is an example of the JSON response received after sending an audio file for transcription. It includes segmented text and word-level details.

```json
{"result":{"segments":[{"start":0.309,"end":3.133,"text":" My thought, I have nobody by a beauty and will as you t'ward.","words":[{"word":"My","start":0.309,"end":0.45,"score":0.93},{"word":"thought,","start":0.49,"end":0.85,"score":0.863},{"word":"I","start":0.91,"end":0.97,"score":0.999},{"word":"have","start":1.01,"end":1.191,"score":0.874},{"word":"nobody","start":1.251,"end":1.571,"score":0.91},{"word":"by","start":1.611,"end":1.751,"score":0.863},{"word":"a","start":1.791,"end":1.832,"score":0.975},{"word":"beauty","start":1.872,"end":2.152,"score":0.836},{"word":"and","start":2.192,"end":2.272,"score":0.82},{"word":"will","start":2.292,"end":2.472,"score":0.853},{"word":"as","start":2.512,"end":2.593,"score":0.838},{"word":"you","start":2.613,"end":2.753,"score":0.842},{"word":"t'ward.","start":2.793,"end":3.133,"score":0.217}]},{"start":3.874,"end":9.943,"text":"Mr. Rochester is sub, and that so don't find simpus, and devoted abode, to hath might in a","words":[{"word":"Mr.","start":3.874,"end":4.175,"score":0.563},{"word":"Rochester","start":4.235,"end":4.756,"score":0.94},{"word":"is","start":4.836,"end":4.916,"score":0.816},{"word":"sub,","start":4.936,"end":5.236,"score":0.877},{"word":"and","start":5.276,"end":5.356,"score":0.802},{"word":"that","start":5.397,"end":5.577,"score":0.948},{"word":"so","start":5.617,"end":5.777,"score":0.982},{"word":"don't","start":5.817,"end":6.017,"score":0.863},{"word":"find","start":6.057,"end":6.358,"score":0.873},{"word":"simpus,","start":6.398,"end":6.839,"score":0.865},{"word":"and","start":7.399,"end":7.499,"score":0.884},{"word":"devoted","start":7.54,"end":7.92,"score":0.969},{"word":"abode,","start":8.0,"end":8.461,"score":0.635},{"word":"to","start":9.102,"end":9.222,"score":0.839},{"word":"hath","start":9.262,"end":9.402,"score":0.65},{"word":"might","start":9.442,"end":9.703,"score":0.855},{"word":"in","start":9.783,"end":9.883,"score":0.8},{"word":"a","start":9.923,"end":9.943,"score":0.97}]}],"word_segments":[{"word":"My","start":0.309,"end":0.45,"score":0.93},{"word":"thought,","start":0.49,"end":0.85,"score":0.863},{"word":"I","start":0.91,"end":0.97,"score":0.999},{"word":"have","start":1.01,"end":1.191,"score":0.874},{"word":"nobody","start":1.251,"end":1.571,"score":0.91},{"word":"by","start":1.611,"end":1.751,"score":0.863},{"word":"a","start":1.791,"end":1.832,"score":0.975},{"word":"beauty","start":1.872,"end":2.152,"score":0.836},{"word":"and","start":2.192,"end":2.272,"score":0.82},{"word":"will","start":2.292,"end":2.472,"score":0.853},{"word":"as","start":2.512,"end":2.593,"score":0.838},{"word":"you","start":2.613,"end":2.753,"score":0.842},{"word":"t'ward.","start":2.793,"end":3.133,"score":0.217},{"word":"Mr.","start":3.874,"end":4.175,"score":0.563},{"word":"Rochester","start":4.235,"end":4.756,"score":0.94},{"word":"is","start":4.836,"end":4.916,"score":0.816},{"word":"sub,","start":4.936,"end":5.236,"score":0.877},{"word":"and","start":5.276,"end":5.356,"score":0.802},{"word":"that","start":5.397,"end":5.577,"score":0.948},{"word":"so","start":5.617,"end":5.777,"score":0.982},{"word":"don't","start":5.817,"end":6.017,"score":0.863},{"word":"find","start":6.057,"end":6.358,"score":0.873},{"word":"simpus,","start":6.398,"end":6.839,"score":0.865},{"word":"and","start":7.399,"end":7.499,"score":0.884},{"word":"devoted","start":7.54,"end":7.92,"score":0.969},{"word":"abode,","start":8.0,"end":8.461,"score":0.635},{"word":"to","start":9.102,"end":9.222,"score":0.839},{"word":"hath","start":9.262,"end":9.402,"score":0.65},{"word":"might","start":9.442,"end":9.703,"score":0.855},{"word":"in","start":9.783,"end":9.883,"score":0.8},{"word":"a","start":9.923,"end":9.943,"score":0.97}]}}
```
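
A brief sketch of consuming a response with this shape: it prints one timestamped line per segment and flags words with a low alignment score. The helper names and the 0.5 threshold are illustrative assumptions. The sample is a trimmed-down copy of the response above.

```python
def format_segments(response: dict) -> list[str]:
    """Turn each segment into a '[start - end] text' line."""
    lines = []
    for seg in response["result"]["segments"]:
        lines.append(f"[{seg['start']:.2f} - {seg['end']:.2f}] {seg['text'].strip()}")
    return lines


def low_confidence_words(response: dict, threshold: float = 0.5) -> list[str]:
    """List words whose alignment score falls below the threshold."""
    return [
        w["word"]
        for w in response["result"]["word_segments"]
        if w.get("score", 1.0) < threshold
    ]


# Trimmed-down sample with the same shape as the response above.
sample = {
    "result": {
        "segments": [
            {
                "start": 0.309,
                "end": 3.133,
                "text": " My thought, I have nobody by a beauty and will as you t'ward.",
            }
        ],
        "word_segments": [
            {"word": "My", "start": 0.309, "end": 0.45, "score": 0.93},
            {"word": "t'ward.", "start": 2.793, "end": 3.133, "score": 0.217},
        ],
    }
}
print(format_segments(sample))
print(low_confidence_words(sample))
```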

--------------------------------

### Create Preview Environment

Source: https://github.com/beam-cloud/examples/blob/main/huggingface_inference/README.md

Use this command to create a temporary preview environment for your Huggingface inference application. This is useful for testing before deploying.

```bash
beam serve app.py:predict
```

--------------------------------

### Run Benchmarking Script

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/whisper_stt/README.md

Execute this Python script to call the deployed API and measure inference and cold boot times.

```bash
python request.py
```

--------------------------------

### ComfyUI API Example Response

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/comfy_ui/controlnet/README.md

The expected JSON response from the /generate endpoint, containing the URL of the generated image.

```json
{
  "output_url": "https://app.beam.cloud/output/id/5cc90408-2c40-424f-bb3f-731268e7f100"
}
```

--------------------------------

### List Fine-Tuned Gemma Model Files

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/gemma/README.md

Output from the Beam CLI showing the files generated after fine-tuning the Gemma model, including adapter weights and checkpoints.

```bash
❯ beam ls gemma-ft/gemma-2b-finetuned

  Name                                                Size   Modified Time   IsDir
 ──────────────────────────────────────────────────────────────────────────────────
  gemma-2b-finetuned/README.md                    4.97 KiB   Aug 10 2024     No
  gemma-2b-finetuned/adapter_config.json          644.00 B   Aug 10 2024     No
  gemma-2b-finetuned/adapter_model.safetensors   12.20 MiB   Aug 10 2024     No
  gemma-2b-finetuned/checkpoint-700              36.70 MiB   Aug 01 2024     Yes
  gemma-2b-finetuned/checkpoint-800              36.70 MiB   Aug 01 2024     Yes
  gemma-2b-finetuned/checkpoint-809              36.70 MiB   Aug 01 2024     Yes
  gemma-2b-finetuned/special_tokens_map.json      555.00 B   Aug 10 2024     No
  gemma-2b-finetuned/tokenizer.json              16.71 MiB   Aug 10 2024     No
  gemma-2b-finetuned/tokenizer_config.json       45.21 KiB   Aug 10 2024     No

  9 items | 139.06 MiB used
```

--------------------------------

### Deploy OpenAI-Compatible APIs with vLLM

Source: https://github.com/beam-cloud/examples/blob/main/vllm/README.md

Deploy multiple LLMs as OpenAI-compatible APIs using the VLLM wrapper class. Command-line arguments for `vllm serve` can be passed via the `vllm_args` field. Each model supports different features, including multi-modal, chat, and tool calling.

```bash
beam deploy models.py:internvl
```

```bash
beam deploy models.py:yicoder_chat
```

```bash
beam deploy models.py:mistral_instruct
```

```bash
beam deploy models.py:deepseek_r1
```

--------------------------------

### Deploy Headless Browser App

Source: https://github.com/beam-cloud/examples/blob/main/web_scraping/headless_chrome/README.md

Deploy the application on Beam using the provided command. Ensure your application file is named 'app.py' and the browser entry point is 'browser'.

```bash
beam deploy app.py:browser
```

--------------------------------

### Upload Weights and Dataset to Beam Volume

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/llama/README.md

Use these CLI commands to create a Beam volume and copy your local model weights and fine-tuning dataset to it. Ensure your weights are in a directory named 'local_weights' and your dataset is in 'local_dataset'.

```bash
$ beam volume create llama-ft

$ beam cp local_weights llama-ft/weights

$ beam cp local_dataset llama-ft/data
```

--------------------------------

### Define a Beam Function with GPU Acceleration

Source: https://github.com/beam-cloud/examples/blob/main/jupyter_notebooks/beam-notebook.ipynb

Define a Beam function that utilizes GPU acceleration. This example specifies an A100-40 GPU and demonstrates how to check for NVIDIA drivers.

```python
from beam import function, Image


# Runs on an A100-40 GPU in the cloud
@function(gpu="A100-40", cpu=4, memory="32Gi", image=Image(python_packages=["torch"]))
def handler():
    import subprocess
    
    # Print the nvidia-smi report (GPU model, driver version, CUDA version)
    print(subprocess.check_output(["nvidia-smi"], text=True))

    return {"gpu":"true"}
```

--------------------------------

### List Fine-Tuning Output Files

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/llama/README.md

Lists the files generated after the fine-tuning process completes, showing their names, sizes, modification times, and whether they are directories.

```bash
$ beam ls llama-ft/llama-finetuned
```

--------------------------------

### Deploy Fine-tuning Job to Beam

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/llama/README.md

Add these lines to your Python script to enable deployment to Beam. This allows you to run the fine-tuning process on Beam's serverless GPUs. Execute the script using `python finetune.py`.

```python
# finetune.py
# Deploy to beam by running `$ python finetune.py` in the terminal
from beam import Volume, Image, function, env
```

--------------------------------

### Execute a remote function and verify downloaded files

Source: https://github.com/beam-cloud/examples/blob/main/bioinformatics/dnabert/README.md

Run the Python script containing the `download` function. After execution, use the `beam ls` command to list files in the specified volume and confirm the download.

```sh
$ python download-dna.py

=> Running function: <download-dna:download>
Sequence ID: CM004190.1
Length: 33445071
Description: Pan troglodytes isolate Yerkes chimp pedigree #C0471 (Clint) chromosome 21, whole genome shotgun sequence
=> Function complete <8bb1e80b-533c-4f91-8eba-e8a6f899ed7c>
```

```sh
$ beam ls seq

  Name                           Size   Modified Time   IsDir
 ─────────────────────────────────────────────────────────────
  CM004190.1.gb              7.92 KiB   5 minutes ago   No
```

--------------------------------

### Deploy Beam Application

Source: https://github.com/beam-cloud/examples/blob/main/bioinformatics/dnabert/README.md

Deploy your Beam application from the shell by specifying the Python file and the main function.

```shell
beam deploy app.py:main
```

--------------------------------

### Deploy vLLM Application

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

Deploy the configured vLLM application by running `beam deploy` with the Python file and the name of the `VLLM` object defined in it (here `models.py:internvl`). This command builds the image, syncs files, and deploys the API.

```bash
beam deploy models.py:internvl
```

--------------------------------

### Zonos TTS API Request Example

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/zonos/README.md

Send a POST request to the Zonos TTS API endpoint with the specified JSON payload to convert text to speech. The 'text' field contains the content to be synthesized.

```json
{
    "text": "Your text to convert to speech"
}
```

--------------------------------

### Deploy InstantID Model to Beam

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/instant_id/README.md

Deploy the InstantID model as a Beam endpoint using this command. Replace `app.py:generate_image` with your specific application and entry point.

```bash
beam deploy app.py:generate_image
```

--------------------------------

### Deploy Inference Endpoint

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/gemma/README.md

Use the beam CLI to deploy your inference script. Specify the entry point and give your endpoint a name.

```bash
beam deploy inference.py:predict --name gemma-ft
```

--------------------------------

### Zonos TTS API cURL Example

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/zonos/README.md

This cURL command sends a POST request to the Zonos TTS API. It sets the content-type and authorization headers and passes the text to convert as a JSON payload. Replace the endpoint URL and bearer token with the values for your own deployment.

```bash
curl -X POST 'https://ff6e671a-c43d-468e-ab21-df87c8d87afb.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [YOUR-AUTH-TOKEN]' \
-d '{ "text": "On Beam run AI workloads anywhere with zero complexity. One line of Python, global GPUs, full control"}'
```
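
The same request can be sketched in Python with only the standard library. This is a minimal illustration, not code from the Zonos example: `build_tts_request` is a hypothetical helper, and the URL and token below are placeholders. The sketch only builds the request, since the source does not describe the response format.

```python
import json
import urllib.request


def build_tts_request(endpoint_url: str, auth_token: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL example."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {auth_token}",
        },
    )


# Placeholders (assumptions): substitute your own endpoint URL and auth token.
request = build_tts_request(
    "https://your-app-id.app.beam.cloud",
    "[YOUR-AUTH-TOKEN]",
    "Your text to convert to speech",
)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

Keeping the request construction separate from the network call makes it easy to inspect the headers and payload before anything is sent.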

--------------------------------

### Configure Gemma Inference Endpoint on Beam

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/gemma/README.md

This decorator sets up a Beam endpoint for Gemma inference. It specifies the endpoint name, model loading on start, volume for weights, CPU, memory, GPU, Python version, packages, and autoscaling.

```python
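from beam import Image, QueueDepthAutoscaler, Volume, endpoint

# MOUNT_PATH and load_finetuned_model are assumed to be defined earlier in the script.


@endpoint(
    name="gemma-inference",
    on_start=load_finetuned_model,  # runs once when a container starts
    volumes=[Volume(name="gemma-ft", mount_path=MOUNT_PATH)],  # fine-tuned weights
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(
        python_version="python3.9",
        python_packages=["transformers==4.42.0", "torch", "peft"],
    ),
    # Scale out to at most 5 containers, one task per container
    autoscaler=QueueDepthAutoscaler(max_containers=5, tasks_per_container=1),
)
def predict(context, **inputs):
    ...  # inference logic goes here (function body not shown in the source)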
```

--------------------------------

### Define vLLM Server Configuration with Beam SDK

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

Configure a vLLM server using the Beam SDK's VLLM class. This class accepts arguments mirroring the vLLM command-line tool. Ensure correct model name, resource allocation (CPU, memory, GPU), and vLLM-specific arguments like `trust_remote_code`, `max_model_len`, and `limit_mm_per_prompt` are set.

```python
from beam.integrations import VLLM, VLLMArgs

INTERNVL2_5 = "OpenGVLab/InternVL2_5-8B"

internvl = VLLM(
    name=INTERNVL2_5.split("/")[-1],
    cpu=8,
    memory="32Gi",
    gpu="A10G",
    gpu_count=2,
    vllm_args=VLLMArgs(
        model=INTERNVL2_5,
        served_model_name=[INTERNVL2_5],
        trust_remote_code=True,
        max_model_len=4096,
        gpu_memory_utilization=0.95,
        limit_mm_per_prompt={"image": 2},
    )
)
```

--------------------------------

### Deploy Whisper App to Beam

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/whisper_stt/README.md

Use this command to deploy the `app.py` file containing the Whisper inference function to Beam.

```bash
beam deploy app.py:transcribe
```

--------------------------------

### Invoke Faster Whisper API with URL

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/faster_whisper/README.md

Invoke the deployed Faster Whisper transcription API using cURL. This example shows how to send an audio file URL for transcription. Ensure you replace placeholders with your actual endpoint ID and authentication token.

```bash
curl -X POST 'https://app.beam.cloud/endpoint/id/[YOUR-ENDPOINT-ID]' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [YOUR-AUTH-TOKEN]' \
-d '{"url":"http://commondatastorage.googleapis.com/codeskulptor-demos/DDR_assets/Kangaroo_MusiQue_-_The_Neverwritten_Role_Playing_Game.mp3"}'
```

--------------------------------

### Upload Model with Python Script

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/instant_id/README.md

Use this command to upload the InstantID model to Beam. Ensure you have the necessary Python environment set up.

```bash
python upload.py
```

--------------------------------

### Interact with Deployed vLLM APIs via Chat Client

Source: https://github.com/beam-cloud/examples/blob/main/vllm/README.md

Run a Python chat client script to interact with deployed vLLM models. The script prompts for the deployment URL and allows users to chat with the model, including providing image links for multi-modal models.

```bash
python chat.py
```

```text
Welcome to the CLI Chat Application!
Type 'quit' to exit the conversation.
Enter the app URL: https://internvl-instruct-15c4487-v1.app.beam.cloud
Model OpenGVLab/InternVL2_5-8B is ready
Question: What is in this image?
Image link (press enter to skip): https://upload.wikimedia.org/wikipedia/commons/8/86/Wood.duck.arp.jpg
Assistant:  The image captures a vibrant wood duck in mid-flight, its wings spread wide as it soars through a lush field dotted with yellow flowers. The duck's head is adorned with striking red and black markings, while its body is a mix of green, white, and brown feathers. The perspective of the photo is from below, placing the duck in the center and giving a sense of its impressive wingspan. The background is a vivid green, filled with various shades of green and yellow flowers, providing a stark contrast to the duck's colorful plumage. The image is a beautiful representation of wildlife in its natural habitat
```
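
The source does not show the contents of `chat.py`. As a rough sketch of the kind of payload such a client might send, the snippet below builds a `messages` list in the OpenAI chat-completions format, attaching an image link when one is given. `build_messages` is a hypothetical helper, not part of the example repository.

```python
from typing import Optional


def build_messages(question: str, image_url: Optional[str] = None) -> list:
    """Return an OpenAI-style chat `messages` list, attaching an image if given."""
    if not image_url:
        # Text-only turn: content is a plain string
        return [{"role": "user", "content": question}]
    # Multi-modal turn: content is a list of typed parts
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


messages = build_messages(
    "What is in this image?",
    "https://upload.wikimedia.org/wikipedia/commons/8/86/Wood.duck.arp.jpg",
)
```

A list like this could then be passed as the `messages` argument to `client.chat.completions.create(...)` from the `openai` package, with the client's `base_url` pointed at your deployment.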

--------------------------------

### Call Deployed Beam Endpoint with Curl

Source: https://github.com/beam-cloud/examples/blob/main/language_models/llama3_8b/README.md

Example of how to send a POST request to your deployed Beam endpoint using curl. Ensure you replace [ENDPOINT-ID] and [AUTH-TOKEN] with your actual values. The request body includes system and user messages for the Llama 3 model.

```sh
curl -X POST 'https://app.beam.cloud/endpoint/id/[ENDPOINT-ID]' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [AUTH-TOKEN]' \
-d '{
    "messages": [
        {"role": "system", "content": "You are a yoda chatbot who always responds in yoda speak!"},
        {"role": "user", "content": "Who are you?"}
    ]
}'
```

--------------------------------

### Configure MCP Server in Claude Desktop

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/README.md

JSON configuration for the MCP server in Claude Desktop. Update the 'command' and 'args' to point to your uv executable and server script.

```json
{
  "mcpServers": {
    "beam-mcp": {
      "command": "/path/to/your/uv",
      "args": [
        "--directory",
        "/path/to/your/server",
        "run",
        "server.py"
      ]
    }
  }
}
```
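
If you prefer to generate this configuration rather than edit it by hand, a small script along these lines could work. It is only a sketch: `build_mcp_config` is a hypothetical helper, and the two paths are the same placeholders used above.

```python
import json


def build_mcp_config(uv_path: str, server_dir: str) -> dict:
    """Return the Claude Desktop MCP configuration as a dictionary."""
    return {
        "mcpServers": {
            "beam-mcp": {
                "command": uv_path,
                "args": ["--directory", server_dir, "run", "server.py"],
            }
        }
    }


# Placeholder paths (assumptions): replace with your uv executable and server directory.
config = build_mcp_config("/path/to/your/uv", "/path/to/your/server")
print(json.dumps(config, indent=2))
```

On many systems, `shutil.which("uv")` returns the absolute path of the `uv` executable, which can be passed as `uv_path`.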

--------------------------------

### Run Fine-Tuning Job via Beam CLI

Source: https://github.com/beam-cloud/examples/blob/main/finetuning/llama/README.md

Runs the fine-tuning function by executing the script with Python. The Beam SDK then builds the image, syncs your files, and executes the function.

```bash
python finetune.py
```

--------------------------------

### Serve Faster Whisper App

Source: https://github.com/beam-cloud/examples/blob/main/audio_and_transcription/faster_whisper/README.md

Serve the Faster Whisper transcription application using the beam CLI. This command makes the transcription API available for invocation.

```bash
beam serve app.py:transcribe
```

--------------------------------

### Deploy ComfyUI App on Beam

Source: https://github.com/beam-cloud/examples/blob/main/image_generation/comfy_ui/controlnet/README.md

Use this command to deploy the ComfyUI image generation application to Beam.

```bash
beam deploy app.py:handler
```

--------------------------------

### Equivalent vLLM Command Line Tool Command

Source: https://github.com/beam-cloud/examples/blob/main/mcp_server/docs.txt

This `vllm serve` invocation is the command-line equivalent of the Python SDK configuration. It specifies the model, enables remote code trust, sets the maximum model length to 4096 tokens, and limits each prompt to two images.

```bash
vllm serve OpenGVLab/InternVL2_5-8B --trust-remote-code \
--max-model-len 4096 --limit-mm-per-prompt image=2
```
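
The correspondence between the snake_case Python arguments and the kebab-case flags can be illustrated with a short sketch. `to_cli_flags` is a hypothetical helper written for this illustration; it is not how Beam or vLLM translate arguments internally, and it formats dictionary values the way the command above writes them.

```python
def to_cli_flags(args: dict) -> list:
    """Translate snake_case keyword arguments into --kebab-case CLI flags."""
    flags = []
    for name, value in args.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            flags.append(flag)  # boolean switches take no value
        elif value is False or value is None:
            continue  # disabled or unset options are omitted
        elif isinstance(value, dict):
            # e.g. {"image": 2} becomes "image=2"
            pairs = ",".join(f"{k}={v}" for k, v in value.items())
            flags.extend([flag, pairs])
        else:
            flags.extend([flag, str(value)])
    return flags


flags = to_cli_flags(
    {
        "trust_remote_code": True,
        "max_model_len": 4096,
        "limit_mm_per_prompt": {"image": 2},
    }
)
command = " ".join(["vllm", "serve", "OpenGVLab/InternVL2_5-8B", *flags])
print(command)
```

Running the sketch prints the same command as above on a single line.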

--------------------------------

### Execute Beam Application

Source: https://github.com/beam-cloud/examples/blob/main/bioinformatics/dnabert/README.md

This command executes the Beam application, initiating the image building, file syncing, and remote function execution process. It shows the expected output logs during the job execution.

```sh
$ python app.py

=> Running function: <app:main>
=> Building image
=> Using cached image
=> Syncing files
=> Files synced
=> Running function: <app:generate_embeddings>
=> Running function: <app:generate_embeddings>
=> Running function: <app:generate_embeddings>
=> Running function: <app:generate_embeddings>
=> Function complete <4c4ea387-f8cf-499e-9470-aba852f9a6c3>
Embedding saved to: {'output_url': 'https://app.beam.cloud/output/id/53f03dfe-6564-4f89-8657-888dd01ceb62'}
=> Function complete <b38ca7f5-462d-4b96-a244-fc2ffb6ef2ad>
Embedding saved to: {'output_url': 'https://app.beam.cloud/output/id/9a74a538-004d-4203-a09f-21dd9c488b67'}
=> Function complete <f962cbe9-966f-42e3-9f6b-e609996984b4>
Embedding saved to: {'output_url': 'https://app.beam.cloud/output/id/1c62ebce-6a65-429e-b32d-07bc6fe6b2bd'}
=> Function complete <3a988ff8-1fa0-4804-898a-1e4dc3d30c46>
Embedding saved to: {'output_url': 'https://app.beam.cloud/output/id/881afdc9-e13a-4aa7-8599-b41be5f60f80'}
=> Function complete <436eadd4-b5d4-45af-8f33-a00c8f3e3b77>
```