### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Installs WasmEdge and its necessary plugins for Whisper. This command downloads and executes an installation script.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Getting Started with Obsidian-local-gpt

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/obsidian.md

Provides a step-by-step guide to setting up and using the Obsidian-local-gpt plugin for AI-powered note-taking.

```text
1. Set up the Obsidian-local-gpt plugin in your Obsidian app.
2. Explore the various AI-powered features to enhance your productivity.
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Installs WasmEdge version 0.14.1 using a curl script. This is the initial step to set up the environment.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s -- -v 0.14.1
```

--------------------------------

### Install WasmEdge Runtime

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

Installs WasmEdge, a WebAssembly runtime, along with the AI inference plugin (WASI-NN) necessary for running LLM models. This script handles the setup of the core runtime and its AI capabilities.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/llava.md

Installs the WasmEdge runtime and AI inference plugin using a script from GitHub.
```shell
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

Installs the WasmEdge runtime, a high-performance LLM runtime, using a provided script.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Start Llama-Nexus Inference Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Starts the Llama-Nexus server in the background using the specified configuration file.

```bash
nohup ./llama-nexus --config config.toml &
```

--------------------------------

### Download Whisper Model

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Downloads a Whisper model in GGML format. The example downloads the 'ggml-medium.bin' model from Hugging Face.

```bash
curl -LO https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.bin
```

--------------------------------

### Start LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/llava.md

Starts the LlamaEdge API server with the downloaded models and UI files, making the chatbot accessible via a web browser.

```shell
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llava-v1.6-vicuna-7b-Q5_K_M.gguf llama-api-server.wasm -p vicuna-llava -c 4096 --llava-mmproj llava-v1.6-vicuna-7b-mmproj-model-f16.gguf -m llava-v1.6-vicuna-7b
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/gemma-3.md

Installs the WasmEdge runtime, a high-performance, lightweight, and cross-platform LLM runtime, using a provided installation script.
```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### Install and Start FlowiseAI

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/flowiseai-tool-call.md

Installs FlowiseAI globally using npm and starts the FlowiseAI server, making its UI accessible at http://localhost:3000.

```bash
npm install -g flowise
npx flowise start
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/medgemma-4b.md

Installs the WasmEdge runtime, a high-performance LLM runtime essential for running the MedGemma model.

```shell
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

--------------------------------

### API Response Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

An example JSON response from the LlamaEdge API server's chat completions endpoint, showing the assistant's answer to the query about the capital of Texas.

```json
{"id":"chatcmpl-5f0b5247-7afc-45f8-bc48-614712396a05","object":"chat.completion","created":1751945744,"model":"Mistral-Small-3.1-24B-Instruct-2503-Q5_K_M","choices":[{"index":0,"message":{"content":"The capital of Texas is Austin.","role":"assistant"},"finish_reason":"stop","logprobs":null}],"usage":{"prompt_tokens":38,"completion_tokens":8,"total_tokens":46}}
```

--------------------------------

### Start the API server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Starts the Stable Diffusion API server using WasmEdge, specifying the model name and the downloaded model file. The server defaults to port 8080.

```bash
wasmedge --dir .:. sd-api-server.wasm --model-name sd-v2.1 --model v2-1_768-nonema-pruned-f16.gguf
```

--------------------------------

### LlamaEdge API Server Output Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/medgemma-4b.md

Example output from the LlamaEdge API server upon successful execution, showing configuration details and the listening address.

```text
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:544: model_name: medgemma-4b
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:553: model_alias: default
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:573: ctx_size: 4098
[2025-05-29 17:07:46.398] [info] llama_api_server in llama-api-server/src/main.rs:593: batch_size: 512
...
[2025-05-29 17:07:46.935] [info] llama_api_server in llama-api-server/src/main.rs:907: running_mode: chat
[2025-05-29 17:07:46.935] [info] llama_api_server in llama-api-server/src/main.rs:917: plugin_ggml_version: b5201 (commit 85f36e5e)
[2025-05-29 17:07:46.936] [info] llama_api_server in llama-api-server/src/main.rs:952: Listening on 0.0.0.0:8080
```

--------------------------------

### WasmEdge API Server Output Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/gemma-3.md

Example output from the WasmEdge API server upon successful execution, showing initialization logs, server version, model information, and the listening port.

```text
[2025-05-18 11:23:09.970] [info] llama_api_server in llama-api-server/src/main.rs:202: LOG LEVEL: info
[2025-05-18 11:23:09.973] [info] llama_api_server in llama-api-server/src/main.rs:205: SERVER VERSION: 0.18.5
[2025-05-18 11:23:09.976] [info] llama_api_server in llama-api-server/src/main.rs:544: model_name: Qwen2.5-VL-7B-Instruct
...
[2025-05-18 11:23:10.531] [info] llama_api_server in llama-api-server/src/main.rs:917: plugin_ggml_version: b5361 (commit cf0a43bb)
[2025-05-18 11:23:10.533] [info] llama_api_server in llama-api-server/src/main.rs:952: Listening on 0.0.0.0:8080
```

--------------------------------

### Run LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

Starts the LlamaEdge API server using WasmEdge. It preloads the downloaded LLM model and specifies the prompt template for inference.

```bash
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-3.2-1B-Instruct-Q5_K_M.gguf llama-api-server.wasm -p llama-3-chat
```

--------------------------------

### WasmEdge LLM Prompt Examples

Source: https://github.com/llamaedge/docs/blob/main/docs/inference-sdk/basic-llm-app.md

Provides example prompts to demonstrate the code completion capabilities of the LLM.

```text
USER: def print_hello_world():
USER: fn is_prime(n: u64) -> bool {
USER: Write a Rust function to check if an input number is prime:
```

--------------------------------

### Install OpenAI Python Library

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/intro.md

Installs the official OpenAI Python library using pip. This library is used to interact with the LlamaEdge API.

```bash
pip install openai
```

--------------------------------

### Download the portable API server app

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads the lightweight and cross-platform sd-api-server.wasm application from the latest release.
```bash
curl -LO https://github.com/LlamaEdge/sd-api-server/releases/latest/download/sd-api-server.wasm
```

--------------------------------

### Clone and Build WasmEdge LLM Example

Source: https://github.com/llamaedge/docs/blob/main/docs/inference-sdk/basic-llm-app.md

Clones the WasmEdge WASINN examples repository, navigates to the basic GGML directory, and builds the application for wasm32-wasip1.

```bash
git clone https://github.com/second-state/WasmEdge-WASINN-examples
cd WasmEdge-WASINN-examples
cd wasmedge-ggml/basic
cargo build --target wasm32-wasip1 --release
cp target/wasm32-wasip1/release/wasmedge-ggml-basic.wasm .
```

--------------------------------

### Clone and Setup Demo Agent

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/tool-call.md

Clones the `llm_todo` repository from GitHub and installs the necessary Python dependencies. This agent demonstrates LLM interaction with a SQL database.

```bash
git clone https://github.com/second-state/llm_todo
cd llm_todo
pip install -r requirements.txt
```

--------------------------------

### Run the MCP Weather Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Starts the cardea-mcp-weather-server, making it accessible via HTTP stream on port 8002. Ensure the port is open for external access.

```bash
./cardea-weather-mcp-server --transport stream-http --socket-addr 0.0.0.0:8002
```

--------------------------------

### Start LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/continue.md

Starts the LlamaEdge API server with specified models for coding and embeddings, including prompt templates and batch sizes.

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Codestral-22B-v0.1-hf-Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
  llama-api-server.wasm \
  --model-alias default,embedding \
  --model-name Codestral-22B-v0.1-hf-Q5_K_M,nomic-embed-text-v1.5.f16 \
  --prompt-template mistral-instruct,embedding \
  --batch-size 128,8192 \
  --ctx-size 32768,8192
```

--------------------------------

### Run Translation Agent

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/translation-agent.md

Executes the example translation script. This command navigates to the examples directory and runs the Python script to perform the translation.

```shell
cd examples
python example_script.py
```

--------------------------------

### Install Llama-Nexus

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/quick-start.md

Downloads and extracts the Llama-Nexus software for Linux on x86. Ensure you have `curl` and `tar` installed. Downloads for other platforms are available via the provided link.

```bash
curl -LO https://github.com/LlamaEdge/llama-nexus/releases/latest/download/llama-nexus-unknown-linux-gnu-x86_64.tar.gz
# Extract the downloaded .tar.gz archive (tar detects the gzip compression)
tar xvf llama-nexus-unknown-linux-gnu-x86_64.tar.gz
```

--------------------------------

### Run Whisper API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Starts the Whisper API server using WasmEdge. The server requires the Wasm file and the Whisper model file. It defaults to running on port 8080.

```bash
wasmedge --dir .:. whisper-api-server.wasm -m ggml-medium.bin
```

--------------------------------

### Configure Llama-Nexus Server Port

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Sets the host and port for the Llama-Nexus server in the `config.toml` file.
```toml
[server]
host = "0.0.0.0" # The host to listen on
port = 9095      # The port to listen on
```

--------------------------------

### Run LlamaEdge API Server with Gemma-3

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/gemma-3.md

Starts the LlamaEdge API server using WasmEdge, loading the Gemma-3 model and mmproj file. This command configures the prompt template, context size, and model name, making the API accessible on port 8080.

```bash
wasmedge --dir .:. --nn-preload default:GGML:AUTO:gemma-3-4b-it-Q5_K_M.gguf \
  llama-api-server.wasm \
  --prompt-template gemma-3 \
  --llava-mmproj gemma-3-4b-it-mmproj-f16.gguf \
  --ctx-size 4096 \
  --model-name gemma-3-4b
```

--------------------------------

### General Assistance Example

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/obsidian.md

Demonstrates how selecting text and using the 'General help' feature provides contextual information. Shows an example of a response when the model lacks specific training data for the selected text.

```text
The information you're looking for is not present in this context. If you need to know the format and dates of KubeCon + CloudNativeCon + Open Source Summit + AI_dev China 2024, I suggest searching for official announcements or websites related to these events.
```

--------------------------------

### Run LlamaEdge API Server with Qwen 2.5 VL Model

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

Starts the WasmEdge runtime with the LlamaEdge API server, loading the Qwen 2.5 VL model and configuring it for vision-language tasks. This command requires the model files and the API server WASM binary to be present in the current directory.

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Qwen2.5-VL-7B-Instruct-Q5_K_M.gguf \
  llama-api-server.wasm \
  --model-name Qwen2.5-VL-7B-Instruct \
  --prompt-template qwen2-vision \
  --llava-mmproj Qwen2.5-VL-7B-Instruct-vision.gguf \
  --ctx-size 4096
```

--------------------------------

### Start LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/agent-zero.md

Starts the LlamaEdge API server with specified models and configurations. It preloads models and sets aliases, names, prompt templates, batch sizes, and context sizes.

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
  llama-api-server.wasm \
  --model-alias default,embedding \
  --model-name Meta-Llama-3.1-8B-Instruct-Q5_K_M,nomic-embed-text-v1.5.f16 \
  --prompt-template llama-3-chat,embedding \
  --batch-size 128,8192 \
  --ctx-size 32768,8192
```

--------------------------------

### Clone and Build WasmEdge LLM Example

Source: https://github.com/llamaedge/docs/blob/main/docs/inference-sdk/chatbot-llm-app.md

Steps to clone the WasmEdge WASINN examples repository, navigate to the llama directory, build the application using cargo, and copy the compiled WASM file.

```shell
git clone https://github.com/second-state/WasmEdge-WASINN-examples
cd WasmEdge-WASINN-examples
cd wasmedge-ggml/llama
cargo build --target wasm32-wasip1 --release
cp target/wasm32-wasip1/release/wasmedge-ggml-llama.wasm .
```

--------------------------------

### Start LlamaEdge API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/translation-agent.md

Starts the LlamaEdge API server with specified models and configurations. It preloads the Gemma-2-9B and embedding models and sets up aliases, model names, prompt templates, batch sizes, and context sizes.

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:gemma-2-9b-it-Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
  llama-api-server.wasm \
  --model-alias default,embedding \
  --model-name gemma-2-9b-it-Q5_K_M,nomic-embed-text-v1.5.f16 \
  --prompt-template gemma-instruct,embedding \
  --batch-size 128,8192 \
  --ctx-size 8192,8192
```

--------------------------------

### Download Qwen 2.5 VL Model Files

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

Downloads the Qwen 2.5 VL 7B Instruct model in GGUF format and its associated mmproj file, required for vision-language tasks.

```bash
curl -LO https://huggingface.co/second-state/Qwen2.5-VL-7B-Instruct-GGUF/resolve/main/Qwen2.5-VL-7B-Instruct-Q5_K_M.gguf
curl -LO https://huggingface.co/second-state/Qwen2.5-VL-7B-Instruct-GGUF/resolve/main/Qwen2.5-VL-7B-Instruct-vision.gguf
```

--------------------------------

### Download LlamaEdge API Server App

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

Downloads the compiled WebAssembly binary for the LlamaEdge API server. This lightweight, cross-platform application provides an OpenAI-compatible API for interacting with LLMs.

```shell
curl -LO https://github.com/second-state/LlamaEdge/releases/latest/download/llama-api-server.wasm
```

--------------------------------

### Download and Extract MCP Servers

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Downloads the cardea-mcp-servers release for Linux and extracts the archive. This is the first step in setting up the weather MCP server.
```bash
curl -LO https://github.com/cardea-mcp/cardea-mcp-servers/releases/download/0.8.0/cardea-mcp-servers-unknown-linux-gnu-x86_64.tar.gz
tar xvf cardea-mcp-servers-unknown-linux-gnu-x86_64.tar.gz
```

--------------------------------

### API Request Response Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/medgemma-4b.md

An example of a successful response from the llama-api-server after processing an API request containing image data. It includes details about the generated content and token usage.

```json
{
  "id": "chatcmpl-e5f777db-c913-45ab-b37f-e2c499c8fa0b",
  "object": "chat.completion",
  "created": 1747652210,
  "model": "medgemma-4b",
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "There is a round, dense opacity in the right lower lobe of the lung. This could be a mass or nodule, and further investigation would be needed to determine its nature.",
        "role": "assistant"
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 27,
    "completion_tokens": 68,
    "total_tokens": 95
  }
}
```

--------------------------------

### Download and Extract Llama-Nexus

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/quick-start-with-mcp.md

Downloads the llama-nexus release for Apple Darwin (aarch64) and extracts the archive. This prepares the inference server.

```bash
curl -LO https://github.com/LlamaEdge/llama-nexus/releases/download/0.6.0/llama-nexus-apple-darwin-aarch64.tar.gz
tar xvf llama-nexus-apple-darwin-aarch64.tar.gz
```

--------------------------------

### Download Stable Diffusion Plugin for CUDA 12.0 (Ubuntu)

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads the WasmEdge stable diffusion plugin for Ubuntu systems with CUDA 12.0 support and extracts it to the WasmEdge plugin directory.
```bash
# Download the stable diffusion plugin for cuda 12.0
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasmedge_stablediffusion-cuda-12.0-0.14.1-ubuntu20.04_x86_64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasmedge_stablediffusion-cuda-12.0-0.14.1-ubuntu20.04_x86_64.tar.gz -C $HOME/.wasmedge/plugin
```

--------------------------------

### Successful API Response Example

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

An example of a successful JSON response from the LlamaEdge API server after processing a multimodal request. It includes details about the completion, model used, and token usage.

```json
{
  "id": "chatcmpl-4367085d-6451-4896-bbd8-a5090604394d",
  "object": "chat.completion",
  "created": 1747369554,
  "model": "Qwen2-VL-2B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "mixed berries in a paper bowl",
        "role": "assistant"
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 27,
    "completion_tokens": 8,
    "total_tokens": 35
  }
}
```

--------------------------------

### Install and Extract cardea-mcp-servers

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/agentic-search.md

Downloads and extracts the cardea-mcp-servers for Linux on x86. This is a prerequisite for starting the agentic search MCP server.

```bash
curl -LO https://github.com/cardea-mcp/cardea-mcp-servers/releases/download/0.8.0/cardea-mcp-servers-unknown-linux-gnu-x86_64.tar.gz
gunzip cardea-mcp-servers-unknown-linux-gnu-x86_64.tar.gz
tar xvf cardea-mcp-servers-unknown-linux-gnu-x86_64.tar
```

--------------------------------

### Download Whisper API Server

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Downloads the Whisper API server as a WebAssembly (Wasm) file. This server provides an OpenAI-compatible API interface for Whisper.
```bash
curl -LO https://github.com/LlamaEdge/whisper-api-server/releases/download/0.3.9/whisper-api-server.wasm
```

--------------------------------

### Download Stable Diffusion Plugin for CUDA 11.0 (Ubuntu)

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads the WasmEdge stable diffusion plugin for Ubuntu systems with CUDA 11.0 support and extracts it to the WasmEdge plugin directory.

```bash
# Download the stable diffusion plugin for cuda 11.0
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasmedge_stablediffusion-cuda-11.3-0.14.1-ubuntu20.04_x86_64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasmedge_stablediffusion-cuda-11.3-0.14.1-ubuntu20.04_x86_64.tar.gz -C $HOME/.wasmedge/plugin
```

--------------------------------

### Whisper API Translation Response

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Example JSON response from the Whisper API server after a successful translation request. It includes the translated text with timestamps.

```json
{
  "text": "[00:00:00.000 --> 00:00:04.000] This is a Chinese broadcast."
}
```

--------------------------------

### Whisper API Transcription Response

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Example JSON response from the Whisper API server after a successful transcription request. It includes the transcribed text with timestamps.

```json
{
  "text": "[00:00:00.000 --> 00:00:03.540] This is a test record for Whisper.cpp"
}
```

--------------------------------

### Download LLM Models

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/llava.md

Downloads the Llava-v1.6-Vicuna-7B model and its corresponding mmproj model from Hugging Face.
```shell
curl -LO https://huggingface.co/second-state/Llava-v1.6-Vicuna-7B-GGUF/resolve/main/llava-v1.6-vicuna-7b-Q5_K_M.gguf
curl -LO https://huggingface.co/second-state/Llava-v1.6-Vicuna-7B-GGUF/resolve/main/llava-v1.6-vicuna-7b-mmproj-model-f16.gguf
```

--------------------------------

### Download Stable Diffusion Plugin for Mac Apple Silicon

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads the WasmEdge stable diffusion plugin specifically for Mac Apple Silicon (arm64 architecture) and extracts it to the WasmEdge plugin directory.

```bash
# Download the stable diffusion plugin for Mac Apple Silicon
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasmedge_stablediffusion-0.14.1-darwin_arm64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasmedge_stablediffusion-0.14.1-darwin_arm64.tar.gz -C $HOME/.wasmedge/plugin
rm $HOME/.wasmedge/plugin/libwasmedgePluginWasiNN.dylib
```

--------------------------------

### Llama-API-Server Successful Response

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/gemma-3.md

Example of a successful JSON response from the llama-api-server after processing a multimodal request. It includes the model's generated content, usage statistics, and other metadata.
```json
{
  "id": "chatcmpl-e5f777db-c913-45ab-b37f-e2c499c8fa0b",
  "object": "chat.completion",
  "created": 1747652210,
  "model": "gemma-3-4b",
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "mixed berries in a paper bowl",
        "role": "assistant"
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 27,
    "completion_tokens": 8,
    "total_tokens": 35
  }
}
```

--------------------------------

### Download LlamaEdge API Server App

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/qwen2-5.md

Downloads the compiled binary for the LlamaEdge API server, which provides an OpenAI-compatible API for LLMs.

```bash
curl -LO https://github.com/second-state/LlamaEdge/releases/latest/download/llama-api-server.wasm
```

--------------------------------

### Download Whisper Plugin for Mac Apple Silicon

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Downloads the Whisper plugin specifically for Mac Apple Silicon architecture. The plugin is then extracted to the WasmEdge plugin directory.

```bash
# Download the whisper plugin for Mac Apple Silicon
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasi_nn-whisper-0.14.1-darwin_arm64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasi_nn-whisper-0.14.1-darwin_arm64.tar.gz -C $HOME/.wasmedge/plugin
```

--------------------------------

### Install Python Dependencies

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/langchain.md

Installs all required Python dependencies for the chatbot application from the 'requirements.txt' file.

```shell
pip install -r requirements.txt
```

--------------------------------

### Download the Stable Diffusion model

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/quick-start-sd.md

Downloads a specific Stable Diffusion model (v2-1_768-nonema-pruned-f16.gguf) from Hugging Face. Links to a collection of other models are also provided.

```bash
curl -LO https://huggingface.co/second-state/stable-diffusion-2-1-GGUF/resolve/main/v2-1_768-nonema-pruned-f16.gguf
```

--------------------------------

### Install Dependencies

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/agent-zero.md

Installs the required Python dependencies for the Agent Zero application using pip.

```bash
pip install -r requirements.txt
```

--------------------------------

### Download LLM Model

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/llm/quick-start-llm.md

Downloads the Llama 3.2 1B Instruct model in GGUF format from Hugging Face. This model is finetuned for instruction following and is required for the LlamaEdge API server.

```shell
curl -LO https://huggingface.co/second-state/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q5_K_M.gguf
```

--------------------------------

### Start llama-nexus Server

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/mcp/agentic-search.md

Starts the llama-nexus inference server in the background using the specified configuration file.

```bash
nohup ./llama-nexus --config config.toml &
```

--------------------------------

### Download LlamaEdge API Server App

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/multimodal/llava.md

Downloads the compiled Wasm binary for the LlamaEdge API server.

```shell
curl -LO https://github.com/second-state/LlamaEdge/releases/latest/download/llama-api-server.wasm
```

--------------------------------

### Install WasmEdge Runtime with GGML Plugin

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/openai-api/langchain.md

Installs the WasmEdge runtime, including the wasi_nn-ggml plugin, which is necessary for running LLM models.
```shell
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml
```

--------------------------------

### Install WasmEdge

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/text-to-image/flux.md

Installs WasmEdge version 0.14.1 using a curl script. This is the foundational step for running Wasm applications.

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s -- -v 0.14.1
```

--------------------------------

### Download Whisper Plugin for CUDA 11.0 (Ubuntu)

Source: https://github.com/llamaedge/docs/blob/main/docs/ai-models/speech-to-text/quick-start-whisper.md

Downloads the Whisper plugin for Ubuntu systems with CUDA 11.0. The plugin is then extracted to the WasmEdge plugin directory.

```bash
# Download the whisper plugin for cuda 11.0
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasi_nn-whisper-cuda-11.3-0.14.1-ubuntu20.04_x86_64.tar.gz

# Unzip the plugin to $HOME/.wasmedge/plugin
tar -xzf WasmEdge-plugin-wasi_nn-whisper-cuda-11.3-0.14.1-ubuntu20.04_x86_64.tar.gz -C $HOME/.wasmedge/plugin
```

--------------------------------

### Start Llama-Nexus Service

Source: https://github.com/llamaedge/docs/blob/main/docs/llama-nexus/quick-start.md

Starts the Llama-Nexus service using a specified configuration file. By default, it listens on port 3389.

```bash
./llama-nexus --config config.toml
```
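
--------------------------------

### Send a Test Chat Request (illustrative sketch)

Once a server from the entries above is listening, a quick request can confirm that it answers. The following is a minimal sketch, not taken from the LlamaEdge docs: the helper names `chat_request_body` and `send_chat` are hypothetical, the `/v1/chat/completions` path is assumed from the OpenAI-compatible API these servers are described as exposing, and the model name in the usage comment is a placeholder. It also assumes the prompt contains no quotes or backslashes, because no JSON escaping is done.

```bash
#!/bin/sh
# Sketch only: helper names, endpoint path, and model name are assumptions.

# Build a minimal chat-completions request body.
# No JSON escaping is performed, so keep quotes and backslashes out of the inputs.
chat_request_body() {
  model=$1
  prompt=$2
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' "$model" "$prompt"
}

# POST the body to the server; -f makes curl exit non-zero on HTTP errors.
send_chat() {
  base_url=$1
  model=$2
  prompt=$3
  chat_request_body "$model" "$prompt" |
    curl -sf -X POST "$base_url/v1/chat/completions" \
      -H 'Content-Type: application/json' \
      --data-binary @-
}

# Usage (requires a running server; adjust host, port, and model name):
#   send_chat http://localhost:3389 my-model "What is the capital of Texas?"
```

Keeping the body builder separate from the network call makes it possible to inspect the JSON before sending it, which helps when a server rejects a request.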