### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Installs WasmEdge and its necessary plugins for Whisper. This command downloads and executes an installation script.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Getting Started with Obsidian-local-gpt

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/obsidian.md

Provides a step-by-step guide to setting up and using the Obsidian-local-gpt plugin for AI-powered note-taking.

```text
1. Set up the Obsidian-local-gpt plugin in your Obsidian app.
2. Explore the various AI-powered features to enhance your productivity.
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Installs WasmEdge version 0.14.1 using a curl script. This is the initial step to set up the environment.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s -- -v 0.14.1
```

--------------------------------

### Install WasmEdge Runtime

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

Installs WasmEdge, a WebAssembly runtime, along with the AI inference plugin (WASI-NN) necessary for running LLM models. This script handles the setup of the core runtime and its AI capabilities.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/llava.md

Installs WasmEdge runtime and AI inference plugin using a script from GitHub.

```shell
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

Installs the WasmEdge runtime, a high-performance LLM runtime, using a provided script.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Start Llama-Nexus Inference Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Starts the Llama-Nexus server in the background using the specified configuration file.

```bash
nohup ./llama-nexus --config config.toml &
```
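
Because `nohup ... &` returns immediately, a script that follows it may need to wait until the server accepts connections. The helper below is a minimal sketch of such a readiness check using only the Python standard library. The function name `wait_for_port` and its default timeouts are illustrative assumptions, not part of Llama-Nexus.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 9095)` would block until a server configured with `port = 9095` is reachable, or return `False` after 30 seconds.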

--------------------------------

### Download Whisper Model

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Downloads a Whisper model in GGML format. The example downloads the 'ggml-medium.bin' model from Hugging Face.

```bash
curl -LO https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.bin
```

--------------------------------

### Start LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/llava.md

Starts the LlamaEdge API server with the downloaded models and UI files, making the chatbot accessible via a web browser.

```shell
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llava-v1.6-vicuna-7b-Q5_K_M.gguf llama-api-server.wasm -p vicuna-llava -c 4096 --llava-mmproj llava-v1.6-vicuna-7b-mmproj-model-f16.gguf -m llava-v1.6-vicuna-7b
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/gemma-3.md

Installs the WasmEdge runtime, a high-performance, lightweight, and cross-platform LLM runtime, using a provided installation script.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Install and Start FlowiseAI

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/flowiseai-tool-call.md

Installs FlowiseAI globally using npm and starts the FlowiseAI server, making its UI accessible at http://localhost:3000.

```bash
npm install -g flowise
npx flowise start
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/medgemma-4b.md

Installs the WasmEdge runtime, a high-performance LLM runtime essential for running the MedGemma model.

```shell
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### API Response Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

An example JSON response from the LlamaEdge API server's chat completions endpoint, showing the assistant's answer to the query about the capital of Texas.

```json
{"id":"chatcmpl-5f0b5247-7afc-45f8-bc48-614712396a05","object":"chat.completion","created":1751945744,"model":"Mistral-Small-3.1-24B-Instruct-2503-Q5_K_M","choices":[{"index":0,"message":{"content":"The capital of Texas is Austin.","role":"assistant"},"finish_reason":"stop","logprobs":null}],"usage":{"prompt_tokens":38,"completion_tokens":8,"total_tokens":46}}
```
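
A client usually needs only a few fields from this response. The following sketch parses the example body above with Python's `json` module and extracts the assistant's answer and the token counts. The helper name `summarize_completion` is hypothetical.

```python
import json

# Response body copied verbatim from the example above.
RAW = """{"id":"chatcmpl-5f0b5247-7afc-45f8-bc48-614712396a05","object":"chat.completion","created":1751945744,"model":"Mistral-Small-3.1-24B-Instruct-2503-Q5_K_M","choices":[{"index":0,"message":{"content":"The capital of Texas is Austin.","role":"assistant"},"finish_reason":"stop","logprobs":null}],"usage":{"prompt_tokens":38,"completion_tokens":8,"total_tokens":46}}"""


def summarize_completion(body: str) -> dict:
    """Extract the answer text and token usage from a chat.completion body."""
    data = json.loads(body)
    choice = data["choices"][0]
    usage = data["usage"]
    return {
        "answer": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage["total_tokens"],
        # Sanity check: prompt and completion tokens should sum to the total.
        "tokens_add_up": usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"],
    }


summary = summarize_completion(RAW)
print(summary["answer"])
```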

--------------------------------

### Start the API server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Starts the Stable Diffusion API server using WasmEdge, specifying the model name and the downloaded model file. The server defaults to port 8080.

```bash
wasmedge --dir .:. sd-api-server.wasm --model-name sd-v2.1 --model v2-1_768-nonema-pruned-f16.gguf
```

--------------------------------

### LlamaEdge API Server Output Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/medgemma-4b.md

Example output from the LlamaEdge API server upon successful execution, showing configuration details and the listening address.

```text
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:544: model_name: medgemma-4b
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:553: model_alias: default
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:573: ctx_size: 4098
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:593: batch_size: 512

...
[2025-05-29 17:07:46.935] [info] llama_api_server in llama-api-server/src/main.rs:907: running_mode: chat
[2025-05-29 17:07:46.935] [info] llama_api_server in llama-api-server/src/main.rs:917: plugin_ggml_version: b5201 (commit 85f36e5e)
[2025-05-29 17:07:46.936] [info] llama_api_server in llama-api-server/src/main.rs:952: Listening on 0.0.0.0:8080
```
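
When the server runs unattended, it can help to pull settings such as the listening address out of this startup log. The sketch below assumes every log line follows the `[time] [level] target in file:line: message` shape seen in the example. The regular expression and the `parse_startup_log` helper are illustrative and not part of LlamaEdge.

```python
import re

# Matches lines shaped like the example output above.
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] (?P<target>\S+) in (?P<src>\S+): (?P<message>.*)$"
)

SAMPLE = """\
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:544: model_name: medgemma-4b
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:573: ctx_size: 4098
...
[2025-05-29 17:07:46.936] [info] llama_api_server in llama-api-server/src/main.rs:952: Listening on 0.0.0.0:8080
"""


def parse_startup_log(text: str) -> dict:
    """Collect `key: value` messages and the listening address from a startup log."""
    settings = {}
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip elided ("...") or unrelated lines
        message = match.group("message")
        if message.startswith("Listening on "):
            settings["listen_addr"] = message[len("Listening on "):]
        elif ": " in message:
            key, value = message.split(": ", 1)
            settings[key] = value
    return settings


settings = parse_startup_log(SAMPLE)
print(settings["listen_addr"])
```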

--------------------------------

### LlamaEdge API Server Output Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/gemma-3.md

Example output from the LlamaEdge API server (running on WasmEdge) upon successful startup, showing initialization logs, server version, model information, and the listening address.

```text
[2025-05-18 11:23:09.970] [info] llama_api_server in llama-api-server/src/main.rs:202: LOG LEVEL: info
[2025-05-18 11:23:09.973] [info] llama_api_server in llama-api-server/src/main.rs:205: SERVER VERSION: 0.18.5
[2025-05-18 11:23:09.976] [info] llama_api_server in llama-api-server/src/main.rs:544: model_name: Qwen2.5-VL-7B-Instruct

...

[2025-05-18 11:23:10.531] [info] llama_api_server in llama-api-server/src/main.rs:917: plugin_ggml_version: b5361 (commit cf0a43bb)
[2025-05-18 11:23:10.533] [info] llama_api_server in llama-api-server/src/main.rs:952: Listening on 0.0.0.0:8080
```

--------------------------------

### Run LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

Starts the LlamaEdge API server using WasmEdge. It preloads the downloaded LLM model with `--nn-preload` and selects the model's prompt template with `-p llama-3-chat`.

```bash
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-3.2-1B-Instruct-Q5_K_M.gguf llama-api-server.wasm -p llama-3-chat
```

--------------------------------

### WasmEdge LLM Prompt Examples

Source: https://github.com/llamaedge/docs/blob/main/docs/inference-sdk/basic-llm-app.md

Provides example prompts to demonstrate the code completion capabilities of the LLM.

```text
USER:
def print_hello_world():

USER:
fn is_prime(n: u64) -> bool {

USER:
Write a Rust function to check if an input number is prime:
```

--------------------------------

### Install OpenAI Python Library

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/intro.md

Installs the official OpenAI Python library using pip. This library is used to interact with the LlamaEdge API.

```bash
pip install openai
```
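
Under the hood, an OpenAI-compatible client sends a JSON body to a chat completions endpoint. The sketch below builds that request with the Python standard library so the shape is visible without a running server. The base URL `http://localhost:8080/v1` and the model name `default` are assumptions to adjust for your own setup. The helpers `build_chat_request` and `ask` are hypothetical.

```python
import json
import urllib.request

# Assumption: a LlamaEdge API server listening locally on port 8080.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(question: str, model: str = "default", base_url: str = BASE_URL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": question},
        ],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> str:
    """Send the request and return the assistant's reply (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(question)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request("What is the capital of Texas?")
print(request.full_url)
```

With the `openai` package installed, the same call is typically made by pointing the client's base URL at the local server instead of building the request by hand.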

--------------------------------

### Download the portable API server app

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads the lightweight and cross-platform sd-api-server.wasm application from the latest release.

```bash
curl -LO https://github.com/LlamaEdge/sd-api-server/releases/latest/download/sd-api-server.wasm
```

--------------------------------

### Clone and Build WasmEdge LLM Example

Source: https://github.com/llamaedge/docs/blob/main/docs/inference-sdk/basic-llm-app.md

Clones the WasmEdge WASINN examples repository, navigates to the basic GGML directory, and builds the application for wasm32-wasip1.

```bash
git clone https://github.com/second-state/WasmEdge-WASINN-examples
cd WasmEdge-WASINN-examples
cd wasmedge-ggml/basic

cargo build --target wasm32-wasip1 --release
cp target/wasm32-wasip1/release/wasmedge-ggml-basic.wasm .
```

--------------------------------

### Clone and Setup Demo Agent

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/tool-call.md

Clones the `llm_todo` repository from GitHub and installs the necessary Python dependencies. This agent demonstrates LLM interaction with a SQL database.

```bash
git clone https://github.com/second-state/llm_todo
cd llm_todo
pip install -r requirements.txt
```

--------------------------------

### Run the MCP Weather Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Starts the weather MCP server (`cardea-weather-mcp-server`) with the streamable HTTP transport, listening on port 8002. Ensure the port is open for external access.

```bash
./cardea-weather-mcp-server --transport stream-http --socket-addr 0.0.0.0:8002
```

--------------------------------

### Start LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/continue.md

Starts the LlamaEdge API server with specified models for coding and embeddings, including prompt templates and batch sizes.

```bash
wasmedge --dir .:. \
    --nn-preload default:GGML:AUTO:Codestral-22B-v0.1-hf-Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
    llama-api-server.wasm \
    --model-alias default,embedding \
    --model-name Codestral-22B-v0.1-hf-Q5_K_M,nomic-embed-text-v1.5.f16 \
    --prompt-template mistral-instruct,embedding \
    --batch-size 128,8192 \
    --ctx-size 32768,8192
```
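
The comma-separated options in this command appear to be matched by position: the first value of each flag describes the `default` model and the second describes the `embedding` model. The sketch below makes that pairing explicit, assuming positional matching. The `ModelSlot` class and `pair_model_options` helper are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class ModelSlot:
    alias: str
    name: str
    prompt_template: str
    batch_size: int
    ctx_size: int


def pair_model_options(aliases, names, templates, batch_sizes, ctx_sizes):
    """Zip comma-separated flag values into one record per model."""
    columns = [value.split(",") for value in (aliases, names, templates, batch_sizes, ctx_sizes)]
    # Every flag must list the same number of models, or the pairing is ambiguous.
    if len({len(column) for column in columns}) != 1:
        raise ValueError("each option must list the same number of values")
    return [ModelSlot(a, n, t, int(b), int(c)) for a, n, t, b, c in zip(*columns)]


slots = pair_model_options(
    "default,embedding",
    "Codestral-22B-v0.1-hf-Q5_K_M,nomic-embed-text-v1.5.f16",
    "mistral-instruct,embedding",
    "128,8192",
    "32768,8192",
)
for slot in slots:
    print(slot)
```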

--------------------------------

### Run Translation Agent

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/translation-agent.md

Executes the example translation script. This command navigates to the examples directory and runs the Python script to perform the translation.

```shell
cd examples
python example_script.py
```

--------------------------------

### Install Llama-Nexus

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/quick-start.md

Downloads and extracts the Llama-Nexus software for Linux on x86. Ensure you have `curl` and `tar` installed. Download for other platforms is available via the provided link.

```bash
curl -LO https://github.com/LlamaEdge/llama-nexus/releases/latest/download/llama-nexus-unknown-linux-gnu-x86_64.tar.gz

tar xvf llama-nexus-unknown-linux-gnu-x86_64.tar.gz
```
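
The archive name depends on the operating system and CPU architecture. The sketch below picks a download URL for the current machine. It lists only the two asset names that appear in these docs (Linux x86_64 and Apple Silicon). Whether the Apple Silicon asset is also published under `latest/download` is an assumption, and the `asset_url` helper is hypothetical.

```python
import platform

RELEASE_BASE = "https://github.com/LlamaEdge/llama-nexus/releases/latest/download"

# Keys are (platform.system(), platform.machine()) pairs.
KNOWN_ASSETS = {
    ("Linux", "x86_64"): "llama-nexus-unknown-linux-gnu-x86_64.tar.gz",
    ("Darwin", "arm64"): "llama-nexus-apple-darwin-aarch64.tar.gz",
}


def asset_url(system=None, machine=None) -> str:
    """Return the download URL for a platform, defaulting to the current machine."""
    key = (system or platform.system(), machine or platform.machine())
    try:
        name = KNOWN_ASSETS[key]
    except KeyError:
        raise LookupError(f"no known llama-nexus asset for {key}; check the releases page") from None
    return f"{RELEASE_BASE}/{name}"


print(asset_url("Linux", "x86_64"))
```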

--------------------------------

### Run Whisper API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Starts the Whisper API server using WasmEdge. The server requires the Wasm file and the Whisper model file. It defaults to running on port 8080.

```bash
wasmedge --dir .:. whisper-api-server.wasm -m ggml-medium.bin
```

--------------------------------

### Configure Llama-Nexus Server Port

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Sets the host and port for the Llama-Nexus server in the `config.toml` file.

```toml
[server]
host = "0.0.0.0" # The host to listen on
port = 9095      # The port to listen on
```

--------------------------------

### Run LlamaEdge API Server with Gemma-3

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/gemma-3.md

Starts the LlamaEdge API server using WasmEdge, loading the Gemma-3 model and mmproj file. This command configures the prompt template, context size, and model name, making the API accessible on port 8080.

```bash
wasmedge --dir .:. --nn-preload default:GGML:AUTO:gemma-3-4b-it-Q5_K_M.gguf \
  llama-api-server.wasm \
  --prompt-template gemma-3 \
  --llava-mmproj gemma-3-4b-it-mmproj-f16.gguf \
  --ctx-size 4096 \
  --model-name gemma-3-4b
```

--------------------------------

### General Assistance Example

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/obsidian.md

Demonstrates how selecting text and using the 'General help' feature provides contextual information. Shows an example of a response when the model lacks specific training data for the selected text.

```text
The information you're looking for is not present in this context.

If you need to know the format and dates of KubeCon + CloudNativeCon + Open Source Summit + AI_dev China 2024, I suggest searching for official announcements or websites related to these events.
```

--------------------------------

### Run LlamaEdge API Server with Qwen 2.5 VL Model

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

Starts the WasmEdge runtime with the LlamaEdge API server, loading the Qwen 2.5 VL model and configuring it for vision-language tasks. This command requires the model files and the API server WASM binary to be present in the current directory.

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Qwen2.5-VL-7B-Instruct-Q5_K_M.gguf \
  llama-api-server.wasm \
  --model-name Qwen2.5-VL-7B-Instruct \
  --prompt-template qwen2-vision \
  --llava-mmproj Qwen2.5-VL-7B-Instruct-vision.gguf \
  --ctx-size 4096
```

--------------------------------

### Start LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/agent-zero.md

Starts the LlamaEdge API server with specified models and configurations. It preloads models and sets aliases, names, prompt templates, batch sizes, and context sizes.

```bash
wasmedge --dir .:. \
    --nn-preload default:GGML:AUTO:Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
    llama-api-server.wasm \
    --model-alias default,embedding \
    --model-name Meta-Llama-3.1-8B-Instruct-Q5_K_M,nomic-embed-text-v1.5.f16 \
    --prompt-template llama-3-chat,embedding \
    --batch-size 128,8192 \
    --ctx-size 32768,8192
```

--------------------------------

### Clone and Build WasmEdge LLM Example

Source: https://github.com/llamaedge/docs/blob/main/docs/inference-sdk/chatbot-llm-app.md

Steps to clone the WasmEdge WASINN examples repository, navigate to the llama directory, build the application using cargo, and copy the compiled WASM file.

```shell
git clone https://github.com/second-state/WasmEdge-WASINN-examples
cd WasmEdge-WASINN-examples
cd wasmedge-ggml/llama
cargo build --target wasm32-wasip1 --release
cp target/wasm32-wasip1/release/wasmedge-ggml-llama.wasm .
```

--------------------------------

### Start LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/translation-agent.md

Starts the LlamaEdge API server with specified models and configurations. It preloads the Gemma-2-9B and embedding models and sets up aliases, model names, prompt templates, batch sizes, and context sizes.

```bash
wasmedge --dir .:. \
    --nn-preload default:GGML:AUTO:gemma-2-9b-it-Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
    llama-api-server.wasm \
    --model-alias default,embedding \
    --model-name gemma-2-9b-it-Q5_K_M,nomic-embed-text-v1.5.f16 \
    --prompt-template gemma-instruct,embedding \
    --batch-size 128,8192 \
    --ctx-size 8192,8192
```

--------------------------------

### Download Qwen 2.5 VL Model Files

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

Downloads the Qwen 2.5 VL 7B Instruct model in GGUF format and its associated mmproj file, required for vision-language tasks.

```bash
curl -LO https://huggingface.co/second-state/Qwen2.5-VL-7B-Instruct-GGUF/resolve/main/Qwen2.5-VL-7B-Instruct-Q5_K_M.gguf
curl -LO https://huggingface.co/second-state/Qwen2.5-VL-7B-Instruct-GGUF/resolve/main/Qwen2.5-VL-7B-Instruct-vision.gguf
```

--------------------------------

### Download LlamaEdge API Server App

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

Downloads the compiled WebAssembly binary for the LlamaEdge API server. This lightweight, cross-platform application provides an OpenAI-compatible API for interacting with LLMs.

```shell
curl -LO https://github.com/second-state/LlamaEdge/releases/latest/download/llama-api-server.wasm
```

--------------------------------

### Download and Extract MCP Servers

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Downloads the cardea-mcp-servers release for Linux and extracts the archive. This is the first step in setting up the weather MCP server.

```bash
curl -LO https://github.com/cardea-mcp/cardea-mcp-servers/releases/download/0.8.0/cardea-mcp-servers-unknown-linux-gnu-x86_64.tar.gz
tar xvf cardea-mcp-servers-unknown-linux-gnu-x86_64.tar.gz
```

--------------------------------

### API Request Response Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/medgemma-4b.md

An example of a successful response from the llama-api-server after processing an API request containing image data. It includes details about the generated content and token usage.

```json
{
    "id": "chatcmpl-e5f777db-c913-45ab-b37f-e2c499c8fa0b",
    "object": "chat.completion",
    "created": 1747652210,
    "model": "medgemma-4b",
    "choices": [
        {
            "index": 0,
            "message": {
                "content": "There is a round, dense opacity in the right lower lobe of the lung. This could be a mass or nodule, and further investigation would be needed to determine its nature.",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "logprobs": null
        }
    ],
    "usage": {
        "prompt_tokens": 27,
        "completion_tokens": 68,
        "total_tokens": 95
    }
}
```

--------------------------------

### Download and Extract Llama-Nexus

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Downloads the llama-nexus release for Apple Darwin (aarch64) and extracts the archive. This prepares the inference server.

```bash
curl -LO https://github.com/LlamaEdge/llama-nexus/releases/download/0.6.0/llama-nexus-apple-darwin-aarch64.tar.gz
tar xvf llama-nexus-apple-darwin-aarch64.tar.gz
```

--------------------------------

### Download Stable Diffusion Plugin for CUDA 12.0 (Ubuntu)

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads the WasmEdge stable diffusion plugin for Ubuntu systems with CUDA 12.0 support and extracts it to the WasmEdge plugin directory.

```bash
# Download the stable diffusion plugin for cuda 12.0
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasmedge_stablediffusion-cuda-12.0-0.14.1-ubuntu20.04_x86_64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasmedge_stablediffusion-cuda-12.0-0.14.1-ubuntu20.04_x86_64.tar.gz -C $HOME/.wasmedge/plugin
```

--------------------------------

### Successful API Response Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

An example of a successful JSON response from the LlamaEdge API server after processing a multimodal request. It includes details about the completion, model used, and token usage.

```json
{
    "id": "chatcmpl-4367085d-6451-4896-bbd8-a5090604394d",
    "object": "chat.completion",
    "created": 1747369554,
    "model": "Qwen2-VL-2B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "content": "mixed berries in a paper bowl",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "logprobs": null
        }
    ],
    "usage": {
        "prompt_tokens": 27,
        "completion_tokens": 8,
        "total_tokens": 35
    }
}
```

--------------------------------

### Install and Extract cardea-mcp-servers

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/agentic-search.md

Downloads and extracts the cardea-mcp-servers for Linux on x86. This is a prerequisite for starting the agentic search MCP server.

```bash
curl -LO https://github.com/cardea-mcp/cardea-mcp-servers/releases/download/0.8.0/cardea-mcp-servers-unknown-linux-gnu-x86_64.tar.gz

gunzip cardea-mcp-servers-unknown-linux-gnu-x86_64.tar.gz
tar xvf cardea-mcp-servers-unknown-linux-gnu-x86_64.tar
```

--------------------------------

### Download Whisper API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Downloads the Whisper API server as a WebAssembly (Wasm) file. This server provides an OpenAI-compatible API interface for Whisper.

```bash
curl -LO https://github.com/LlamaEdge/whisper-api-server/releases/download/0.3.9/whisper-api-server.wasm
```

--------------------------------

### Download Stable Diffusion Plugin for CUDA 11.0 (Ubuntu)

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads the WasmEdge stable diffusion plugin for Ubuntu systems with CUDA 11 support (the release asset is labeled `cuda-11.3`) and extracts it to the WasmEdge plugin directory.

```bash
# Download the stable diffusion plugin for cuda 11.0
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasmedge_stablediffusion-cuda-11.3-0.14.1-ubuntu20.04_x86_64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasmedge_stablediffusion-cuda-11.3-0.14.1-ubuntu20.04_x86_64.tar.gz -C $HOME/.wasmedge/plugin
```

--------------------------------

### Whisper API Translation Response

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Example JSON response from the Whisper API server after a successful translation request. It includes the translated text with timestamps.

```json
{
  "text": "[00:00:00.000 --> 00:00:04.000]  This is a Chinese broadcast."
}
```

--------------------------------

### Whisper API Transcription Response

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Example JSON response from the Whisper API server after a successful transcription request. It includes the transcribed text with timestamps.

```json
{
    "text": "[00:00:00.000 --> 00:00:03.540]  This is a test record for Whisper.cpp"
}
```
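
The `text` field packs timestamps and text into a single string. The sketch below splits it into segments, assuming each segment has the `[HH:MM:SS.mmm --> HH:MM:SS.mmm]  text` shape shown in the example. The `parse_segments` helper is illustrative and not part of the Whisper API server.

```python
import json
import re

# One segment: "[start --> end]" followed by text up to the next "[" or the end.
SEGMENT = re.compile(
    r"\[(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})\]\s*([^\[]*)"
)

RESPONSE = '{"text": "[00:00:00.000 --> 00:00:03.540]  This is a test record for Whisper.cpp"}'


def to_millis(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert timestamp parts to integer milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_segments(body: str) -> list:
    """Return (start_ms, end_ms, text) tuples from a Whisper response body."""
    text = json.loads(body)["text"]
    segments = []
    for match in SEGMENT.finditer(text):
        parts = match.groups()
        segments.append((to_millis(*parts[0:4]), to_millis(*parts[4:8]), parts[8].strip()))
    return segments


segments = parse_segments(RESPONSE)
print(segments)
```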

--------------------------------

### Download LLM Models

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/llava.md

Downloads the Llava-v1.6-Vicuna-7B model and its corresponding mmproj model from Hugging Face.

```shell
curl -LO https://huggingface.co/second-state/Llava-v1.6-Vicuna-7B-GGUF/resolve/main/llava-v1.6-vicuna-7b-Q5_K_M.gguf
curl -LO https://huggingface.co/second-state/Llava-v1.6-Vicuna-7B-GGUF/resolve/main/llava-v1.6-vicuna-7b-mmproj-model-f16.gguf
```

--------------------------------

### Download Stable Diffusion Plugin for Mac Apple Silicon

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads the WasmEdge stable diffusion plugin specifically for Mac Apple Silicon (arm64 architecture) and extracts it to the WasmEdge plugin directory.

```bash
# Download the stable diffusion plugin for Mac Apple Silicon
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasmedge_stablediffusion-0.14.1-darwin_arm64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasmedge_stablediffusion-0.14.1-darwin_arm64.tar.gz -C $HOME/.wasmedge/plugin

rm $HOME/.wasmedge/plugin/libwasmedgePluginWasiNN.dylib
```

--------------------------------

### Llama-API-Server Successful Response

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/gemma-3.md

Example of a successful JSON response from the llama-api-server after processing a multimodal request. It includes the model's generated content, usage statistics, and other metadata.

```json
{
    "id": "chatcmpl-e5f777db-c913-45ab-b37f-e2c499c8fa0b",
    "object": "chat.completion",
    "created": 1747652210,
    "model": "gemma-3-4b",
    "choices": [
        {
            "index": 0,
            "message": {
                "content": "mixed berries in a paper bowl",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "logprobs": null
        }
    ],
    "usage": {
        "prompt_tokens": 27,
        "completion_tokens": 8,
        "total_tokens": 35
    }
}
```

--------------------------------

### Download LlamaEdge API Server App

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

Downloads the compiled binary for the LlamaEdge API server, which provides an OpenAI-compatible API for LLMs.

```bash
curl -LO https://github.com/second-state/LlamaEdge/releases/latest/download/llama-api-server.wasm
```

--------------------------------

### Download Whisper Plugin for Mac Apple Silicon

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Downloads the Whisper plugin specifically for Mac Apple Silicon architecture. The plugin is then extracted to the WasmEdge plugin directory.

```bash
# Download the whisper plugin for Mac Apple Silicon
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasi_nn-whisper-0.14.1-darwin_arm64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasi_nn-whisper-0.14.1-darwin_arm64.tar.gz -C $HOME/.wasmedge/plugin
```

--------------------------------

### Install Python Dependencies

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/langchain.md

Installs all required Python dependencies for the chatbot application from the 'requirements.txt' file.

```shell
pip install -r requirements.txt
```

--------------------------------

### Download the Stable Diffusion model

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads a specific Stable Diffusion model (v2-1_768-nonema-pruned-f16.gguf) from Hugging Face. Links to a collection of other models are also provided.

```bash
curl -LO https://huggingface.co/second-state/stable-diffusion-2-1-GGUF/resolve/main/v2-1_768-nonema-pruned-f16.gguf
```

--------------------------------

### Install Dependencies

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/agent-zero.md

Installs the required Python dependencies for the Agent Zero application using pip.

```bash
pip install -r requirements.txt
```

--------------------------------

### Download LLM Model

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

Downloads the Llama 3.2 1B Instruct model in GGUF format from Hugging Face. This model is finetuned for instruction following and is required for the LlamaEdge API server.

```shell
curl -LO https://huggingface.co/second-state/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q5_K_M.gguf
```

--------------------------------

### Start llama-nexus Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/agentic-search.md

Starts the llama-nexus inference server in the background using the specified configuration file.

```bash
nohup ./llama-nexus --config config.toml &
```

--------------------------------

### Download LlamaEdge API Server App

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/llava.md

Downloads the compiled Wasm binary for the LlamaEdge API server.

```shell
curl -LO https://github.com/second-state/LlamaEdge/releases/latest/download/llama-api-server.wasm
```

--------------------------------

### Install WasmEdge Runtime with GGML Plugin

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/langchain.md

Installs the WasmEdge runtime, including the wasi_nn-ggml plugin, which is necessary for running LLM models.

```shell
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/flux.md

Installs WasmEdge version 0.14.1 using a curl script. This is the foundational step for running Wasm applications.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s -- -v 0.14.1
```

--------------------------------

### Download Whisper Plugin for CUDA 11.0 (Ubuntu)

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Downloads the Whisper plugin for Ubuntu systems with CUDA 11 (the release asset is labeled `cuda-11.3`). The plugin is then extracted to the WasmEdge plugin directory.

```bash
# Download the whisper plugin for cuda 11.0
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasi_nn-whisper-cuda-11.3-0.14.1-ubuntu20.04_x86_64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasi_nn-whisper-cuda-11.3-0.14.1-ubuntu20.04_x86_64.tar.gz -C $HOME/.wasmedge/plugin
```

--------------------------------

### Start Llama-Nexus Service

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/quick-start.md

Starts the Llama-Nexus service using a specified configuration file. By default, it listens on port 3389.

```bash
./llama-nexus --config config.toml
```