### Setup Virtual Environment

Source: https://github.com/ggml-org/whisper.cpp/blob/master/tests/earnings21/README.md

Create and activate a virtual environment before installing dependencies.

```bash
$ python3 -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt
```

--------------------------------

### Setup Python Environment

Source: https://github.com/ggml-org/whisper.cpp/blob/master/tests/librispeech/README.md

Install dependencies for WER score calculation, optionally using a virtual environment.

```bash
$ pip install -r requirements.txt
```

```bash
$ python3 -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt
```

--------------------------------

### Install whisper.cpp with Conan

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Installs pre-built binaries or builds whisper.cpp from source using Conan. Ensure you have Conan installed and configured.

```bash
conan install --requires="whisper-cpp/[*]" --build=missing
```

--------------------------------

### Setup Python Virtual Environment for OpenVINO (Windows)

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Sets up a Python virtual environment and installs necessary dependencies for OpenVINO model conversion on Windows.

```powershell
cd models
python -m venv openvino_conv_env
openvino_conv_env\Scripts\activate
python -m pip install --upgrade pip
pip install -r requirements-openvino.txt
```

--------------------------------

### Setup Python Virtual Environment for OpenVINO (Linux/macOS)

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Sets up a Python virtual environment and installs necessary dependencies for OpenVINO model conversion on Linux and macOS.
```bash
cd models
python3 -m venv openvino_conv_env
source openvino_conv_env/bin/activate
python -m pip install --upgrade pip
pip install -r requirements-openvino.txt
```

--------------------------------

### Real-time Audio Input Example

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Performs real-time inference on microphone audio using the `whisper-stream` tool. Requires SDL2 to be installed. Adjust model path and parameters as needed.

```bash
cmake -B build -DWHISPER_SDL2=ON
cmake --build build -j --config Release
./build/bin/whisper-stream -m ./models/ggml-base.en.bin -t 8 --step 500 --length 5000
```

--------------------------------

### Set up Virtual Environment and Install Requirements

Source: https://github.com/ggml-org/whisper.cpp/blob/master/tests/librispeech-parakeet/README.md

Set up a Python virtual environment and install the required packages for WER score computation.

```bash
python3 -m venv venv
. venv/bin/activate
pip install -r requirements.txt
```

--------------------------------

### Configure Example Subdirectories

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/CMakeLists.txt

Includes various example subdirectories based on build conditions like EMSCRIPTEN, CMAKE_JS_VERSION, or the presence of WHISPER_SDL2.
```cmake
# examples

include_directories(${CMAKE_CURRENT_SOURCE_DIR})

if (EMSCRIPTEN)
    add_subdirectory(whisper.wasm)
    add_subdirectory(stream.wasm)
    add_subdirectory(command.wasm)
    add_subdirectory(bench.wasm)
    add_subdirectory(wchess)
elseif(CMAKE_JS_VERSION)
    add_subdirectory(addon.node)
else()
    add_subdirectory(cli)
    add_subdirectory(bench)
    add_subdirectory(server)
    add_subdirectory(quantize)
    add_subdirectory(vad-speech-segments)
    add_subdirectory(parakeet-cli)
    add_subdirectory(parakeet-quantize)
    if (WHISPER_SDL2)
        add_subdirectory(stream)
        add_subdirectory(command)
        add_subdirectory(talk-llama)
        add_subdirectory(lsp)
        if (GGML_SYCL)
            add_subdirectory(sycl)
        endif ()
    endif (WHISPER_SDL2)

    add_subdirectory(deprecation-warning)
endif()

if (WHISPER_SDL2)
    add_subdirectory(wchess)
endif (WHISPER_SDL2)
```

--------------------------------

### Start Local HTTP Server

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/bench.wasm/README.md

Run the provided Python script to start a local HTTP server for serving the benchmark files.

```console
python3 examples/server.py
```

--------------------------------

### Setup OpenVINO Environment (Windows CMD)

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Runs the OpenVINO setup batch script to configure the environment for building whisper.cpp with OpenVINO support on Windows Command Prompt.

```batch
C:\Path\To\w_openvino_toolkit_windows_2023.0.0.10926.b4452d56304_x86_64\setupvars.bat
```

--------------------------------

### Set Target Properties and Install

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/src/ggml-hexagon/htp/CMakeLists.txt

Configures export compile commands for the target and installs the HTP library.
```cmake
set_target_properties(${HTP_LIB} PROPERTIES EXPORT_COMPILE_COMMANDS ON)

install(TARGETS ${HTP_LIB})
```

--------------------------------

### Generate and Install Pkg-config File

Source: https://github.com/ggml-org/whisper.cpp/blob/master/CMakeLists.txt

Configures and installs the pkg-config file (.pc) for the 'whisper' and 'parakeet' libraries. This allows other build systems to easily find and link against the installed libraries.

```cmake
configure_file(cmake/whisper.pc.in
        "${CMAKE_CURRENT_BINARY_DIR}/whisper.pc"
        @ONLY)

install(FILES "${CMAKE_CURRENT_BINARY_DIR}/whisper.pc"
        DESTINATION ${CMAKE_INSTALL_LIBDIR}/pkgconfig)
```

```cmake
configure_file(cmake/parakeet.pc.in
        "${CMAKE_CURRENT_BINARY_DIR}/parakeet.pc"
        @ONLY)

install(FILES "${CMAKE_CURRENT_BINARY_DIR}/parakeet.pc"
        DESTINATION ${CMAKE_INSTALL_LIBDIR}/pkgconfig)
```

--------------------------------

### Install Whisper Server Executable

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/server/CMakeLists.txt

Installs the 'whisper-server' target as a runtime executable.

```cmake
install(TARGETS ${TARGET} RUNTIME)
```

--------------------------------

### Install Metal Library Files

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/src/ggml-metal/CMakeLists.txt

Configures installation rules for Metal source and binary files with specific permissions.

```cmake
if (NOT GGML_METAL_EMBED_LIBRARY)
    install(
        FILES src/ggml-metal/ggml-metal.metal
        PERMISSIONS
            OWNER_READ
            OWNER_WRITE
            GROUP_READ
            WORLD_READ
        DESTINATION ${CMAKE_INSTALL_BINDIR})

    install(
        FILES ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/default.metallib
        DESTINATION ${CMAKE_INSTALL_BINDIR}
    )
endif()
```

--------------------------------

### Run Whisper and VAD Examples

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/addon.node/README.md

Execute the basic whisper transcription or the VAD performance comparison example.
```shell
cd examples/addon.node

node index.js --language='language' --model='model-path' --fname_inp='file-path'
```

```shell
node vad-example.js
```

--------------------------------

### Setup OpenVINO Environment (Linux)

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Sources the OpenVINO setup script to configure the environment for building whisper.cpp with OpenVINO support on Linux.

```bash
source /path/to/l_openvino_toolkit_ubuntu22_2023.0.0.10926.b4452d56304_x86_64/setupvars.sh
```

--------------------------------

### Set Installation Paths

Source: https://github.com/ggml-org/whisper.cpp/blob/master/CMakeLists.txt

Configures installation directories for libraries, headers, and binaries using CMake variables. These paths can be overridden by users.

```cmake
set(WHISPER_INCLUDE_INSTALL_DIR ${CMAKE_INSTALL_INCLUDEDIR} CACHE PATH "Location of header files")
set(WHISPER_LIB_INSTALL_DIR ${CMAKE_INSTALL_LIBDIR} CACHE PATH "Location of library files")
set(WHISPER_BIN_INSTALL_DIR ${CMAKE_INSTALL_BINDIR} CACHE PATH "Location of binary files")
```

```cmake
set(PARAKEET_INCLUDE_INSTALL_DIR ${CMAKE_INSTALL_INCLUDEDIR} CACHE PATH "Location of header files")
set(PARAKEET_LIB_INSTALL_DIR ${CMAKE_INSTALL_LIBDIR} CACHE PATH "Location of library files")
set(PARAKEET_BIN_INSTALL_DIR ${CMAKE_INSTALL_BINDIR} CACHE PATH "Location of binary files")
```

--------------------------------

### Build and Run Whisper CLI Example

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Build the whisper-cli example using CMake and then use it to transcribe an audio file. The CLI currently supports 16-bit WAV files.
```bash
# build the project
cmake -B build
cmake --build build -j --config Release

# transcribe an audio file
./build/bin/whisper-cli -f samples/jfk.wav
```

--------------------------------

### Sample Execution Output

Source: https://github.com/ggml-org/whisper.cpp/blob/master/bindings/javascript/README.md

Example output showing model loading, system information, and transcription results from running the test script.

```text
$ node --experimental-wasm-threads --experimental-wasm-simd ../tests/test-whisper.js

whisper_model_load: loading model from 'whisper.bin'
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 2
whisper_model_load: adding 1607 extra tokens
whisper_model_load: mem_required = 506.00 MB
whisper_model_load: ggml ctx size = 140.60 MB
whisper_model_load: memory size = 22.83 MB
whisper_model_load: model size = 140.54 MB

system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | NEON = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 1 | BLAS = 0 |

operator(): processing 176000 samples, 11.0 sec, 8 threads, 1 processors, lang = en, task = transcribe ...

[00:00:00.000 --> 00:00:11.000] And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.

whisper_print_timings: load time = 162.37 ms
whisper_print_timings: mel time = 183.70 ms
whisper_print_timings: sample time = 4.27 ms
whisper_print_timings: encode time = 8582.63 ms / 1430.44 ms per layer
whisper_print_timings: decode time = 436.16 ms / 72.69 ms per layer
whisper_print_timings: total time = 9370.90 ms
```

--------------------------------

### Install Parakeet CLI Executable

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/parakeet-cli/CMakeLists.txt

Installs the built parakeet-cli executable to the runtime directory.

```cmake
install(TARGETS ${TARGET} RUNTIME)
```

--------------------------------

### Build whisper.cpp VAD Example

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/vad-speech-segments/README.md

Build the VAD speech segmentation example using CMake. Ensure you have a compatible build environment set up.

```console
cmake -S . -B build
cmake --build build -j8 --target vad-speech-segments
```

--------------------------------

### Build and test whisper.cpp Go bindings

Source: https://github.com/ggml-org/whisper.cpp/blob/master/bindings/go/README.md

Commands to clone the repository, compile the library, and run tests or build examples.

```bash
git clone https://github.com/ggml-org/whisper.cpp.git
cd whisper.cpp/bindings/go
make test
```

```bash
make examples
```

```bash
GGML_CUDA=1 make examples
```

--------------------------------

### Install Node.js Addon Dependencies

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/addon.node/README.md

Install the necessary dependencies for the addon project.

```shell
npm install
```

--------------------------------

### Install CMake Package Files

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/CMakeLists.txt

Installs the generated ggml-config.cmake and ggml-version.cmake files into the CMake package directory.
```cmake
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/ggml-config.cmake
              ${CMAKE_CURRENT_BINARY_DIR}/ggml-version.cmake
        DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/ggml)
```

--------------------------------

### Install ggml Targets

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/CMakeLists.txt

Installs the ggml library and its public headers, as well as the ggml-base library.

```cmake
install(TARGETS ggml LIBRARY PUBLIC_HEADER)
install(TARGETS ggml-base LIBRARY)
```

--------------------------------

### Install whisper.nvim script

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/whisper.nvim/README.md

Copy the whisper.nvim script to a directory in your PATH and make it executable. This makes the script available system-wide.

```bash
cp examples/whisper.nvim/whisper.nvim ~/bin/
chmod u+x ~/bin/whisper.nvim
```

--------------------------------

### Verify OpenCL Installation

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README_sycl.md

Use clinfo to verify that the Intel GPU driver is correctly installed and recognized by the system.

```bash
sudo apt install clinfo
sudo clinfo -l
```

--------------------------------

### Example Core ML Inference Output

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Example output from whisper-cli when using a Core ML model, showing the initialization of the Core ML model and system information including Core ML support.

```text
$ ./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav

...

whisper_init_state: loading Core ML model from 'models/ggml-base.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded

system_info: n_threads = 4 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | COREML = 1 |

...
```

--------------------------------

### Verify Device Selection

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README_sycl.md

Example log output confirming the selected device ID.

```text
Using device **0** (Intel(R) Arc(TM) A770 Graphics) as main device
```

--------------------------------

### Build with OpenBLAS CPU Support

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Build whisper.cpp with CPU acceleration for the encoder using OpenBLAS. Ensure OpenBLAS is installed.

```bash
cmake -B build -DGGML_BLAS=1
cmake --build build -j --config Release
```

--------------------------------

### Initialize Chessboard

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/wchess/wchess.wasm/index-tmpl.html

Initializes the chessboard UI component with the starting position.

```javascript
var board = Chessboard('chessboard', 'start')
var move_count = 0;
```

--------------------------------

### SYCL Device Output Log

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README_sycl.md

Example output showing detected SYCL devices and their compute capabilities.
```text
found 4 SYCL devices:
  Device 0: Intel(R) Arc(TM) A770 Graphics, compute capability 1.3,
    max compute_units 512, max work group size 1024, max sub group size 32, global mem size 16225243136
  Device 1: Intel(R) FPGA Emulation Device, compute capability 1.2,
    max compute_units 24, max work group size 67108864, max sub group size 64, global mem size 67065057280
  Device 2: 13th Gen Intel(R) Core(TM) i7-13700K, compute capability 3.0,
    max compute_units 24, max work group size 8192, max sub group size 64, global mem size 67065057280
  Device 3: Intel(R) Arc(TM) A770 Graphics, compute capability 3.0,
    max compute_units 512, max work group size 1024, max sub group size 32, global mem size 16225243136
```

--------------------------------

### Build whisper-command with SDL2

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/command/README.md

Install the required SDL2 dependency and compile the project using CMake.

```bash
# Install SDL2
# On Debian based linux distributions:
sudo apt-get install libsdl2-dev

# On Fedora Linux:
sudo dnf install SDL2 SDL2-devel

# Install SDL2 on Mac OS
brew install sdl2

cmake -B build -DWHISPER_SDL2=ON
cmake --build build --config Release
```

--------------------------------

### Build whisper.cpp with Ascend NPU Support

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Use this command to build whisper.cpp with Ascend NPU acceleration enabled. Ensure you have the CANN toolkit installed.

```bash
cmake -B build -DGGML_CANN=1
cmake --build build -j --config Release
```

--------------------------------

### Example Preprocessor Directive

Source: https://github.com/ggml-org/whisper.cpp/blob/master/CONTRIBUTING.md

Demonstrates a basic preprocessor directive structure: an opening `#ifdef` and its closing `#endif`, with the guarded macro name repeated in a trailing comment.
```cpp
#ifdef FOO
#endif // FOO
```

--------------------------------

### Run Inference with Ascend NPU

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Execute inference examples after building with Ascend NPU support. This command assumes the build directory and model files are in their default locations.

```bash
./build/bin/whisper-cli -f samples/jfk.wav -m models/ggml-base.en.bin -t 8
```

--------------------------------

### Build and run whisper-talk-llama

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/talk-llama/README.md

Commands to install SDL2 dependencies, compile the project using CMake, and execute the tool with specified Whisper and LLaMA models.

```bash
# Install SDL2
# On Debian based linux distributions:
sudo apt-get install libsdl2-dev

# On Fedora Linux:
sudo dnf install SDL2 SDL2-devel

# Install SDL2 on Mac OS
brew install sdl2

# Build the "whisper-talk-llama" executable
cmake -B build -S . -DWHISPER_SDL2=ON
cmake --build build --config Release

# Run it
./build/bin/whisper-talk-llama -mw ./models/ggml-small.en.bin -ml ../llama.cpp/models/llama-13b/ggml-model-q4_0.gguf -p "Georgi" -t 8
```

--------------------------------

### Quick Demo: Build and Run with Make

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

A simplified command to download the base.en model and run inference on all sample WAV files in the samples folder.

```bash
make base.en
```

--------------------------------

### Initialize and process audio with whisper.cpp in Go

Source: https://github.com/ggml-org/whisper.cpp/blob/master/bindings/go/README.md

Demonstrates loading a model, creating a context, and processing audio samples to extract segments.
```go
package main

import (
	"fmt"

	"github.com/ggerganov/whisper.cpp/bindings/go/pkg/whisper"
)

func main() {
	var modelpath string  // Path to the model
	var samples []float32 // Samples to process

	// Load the model
	model, err := whisper.New(modelpath)
	if err != nil {
		panic(err)
	}
	defer model.Close()

	// Process samples
	context, err := model.NewContext()
	if err != nil {
		panic(err)
	}
	if err := context.Process(samples, nil, nil, nil); err != nil {
		// main has no return value, so fail loudly instead of returning err
		panic(err)
	}

	// Print out the results
	for {
		segment, err := context.NextSegment()
		if err != nil {
			break
		}
		fmt.Printf("[%6s->%6s] %s\n", segment.Start, segment.End, segment.Text)
	}
}
```

--------------------------------

### Run Whisper Web Server using Docker

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Start the whisper.cpp web server in a Docker container, exposing port 8080 and mounting a directory for models.

```shell
docker run -it --rm -p "8080:8080" \
  -v path/to/models:/models \
  whisper.cpp:main "whisper-server --host 127.0.0.1 -m /models/ggml-base.bin"
```

--------------------------------

### Generate Karaoke-Style Movie with Whisper CLI

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Use the whisper-cli with the -owts argument to generate karaoke-style movies. Requires ffmpeg to be installed. The generated .wts file is then sourced to play the movie.
```bash
./build/bin/whisper-cli -m ./models/ggml-base.en.bin -f ./samples/jfk.wav -owts
source ./samples/jfk.wav.wts
ffplay ./samples/jfk.wav.mp4
```

```bash
./build/bin/whisper-cli -m ./models/ggml-base.en.bin -f ./samples/mm0.wav -owts
source ./samples/mm0.wav.wts
ffplay ./samples/mm0.wav.mp4
```

```bash
./build/bin/whisper-cli -m ./models/ggml-base.en.bin -f ./samples/gb0.wav -owts
source ./samples/gb0.wav.wts
ffplay ./samples/gb0.wav.mp4
```

--------------------------------

### Set Target Properties and Install Headers

Source: https://github.com/ggml-org/whisper.cpp/blob/master/CMakeLists.txt

Sets the public header for the 'whisper' and 'parakeet' targets and installs them. This ensures headers are available after installation.

```cmake
set_target_properties(whisper PROPERTIES PUBLIC_HEADER ${CMAKE_CURRENT_SOURCE_DIR}/include/whisper.h)
install(TARGETS whisper LIBRARY PUBLIC_HEADER)
```

```cmake
set_target_properties(parakeet PROPERTIES PUBLIC_HEADER ${CMAKE_CURRENT_SOURCE_DIR}/include/parakeet.h)
install(TARGETS parakeet LIBRARY PUBLIC_HEADER)
```

--------------------------------

### Install Requirements

Source: https://github.com/ggml-org/whisper.cpp/blob/master/tests/earnings21/README.md

Install Python dependencies for WER score calculation.

```bash
$ pip install -r requirements.txt
```

--------------------------------

### Compile and Download Model

Source: https://github.com/ggml-org/whisper.cpp/blob/master/tests/librispeech/README.md

Build the project and download the required tiny model.

```bash
$ # Execute the commands below in the project root dir.
$ cmake -B build
$ cmake --build build --config Release
$ ./models/download-ggml-model.sh tiny
```

--------------------------------

### Install FFmpeg Dependencies

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Install the necessary FFmpeg development libraries for Debian/Ubuntu or RHEL/Fedora systems.
```bash
# Debian/Ubuntu
sudo apt install libavcodec-dev libavformat-dev libavutil-dev

# RHEL/Fedora
sudo dnf install libavcodec-free-devel libavformat-free-devel libavutil-free-devel
```

--------------------------------

### Initialize oneAPI Environment

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README_sycl.md

Source the oneAPI environment variables and verify device availability using sycl-ls.

```bash
source /opt/intel/oneapi/setvars.sh
sycl-ls
```

--------------------------------

### Load Test Whisper.cpp Server with k6

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/server/README.md

Run this command to benchmark the Whisper.cpp server using the provided k6 script. Configure concurrency and request parameters via environment variables.

```bash
k6 run bench.js \
  --env FILE_PATH=/absolute/path/to/samples/jfk.wav \
  --env BASE_URL=http://127.0.0.1:8080 \
  --env ENDPOINT=/inference \
  --env CONCURRENCY=4 \
  --env TEMPERATURE=0.0 \
  --env TEMPERATURE_INC=0.2 \
  --env RESPONSE_FORMAT=json
```

--------------------------------

### List SYCL Devices

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README_sycl.md

Execute the binary to list available SYCL devices and their capabilities.

```bash
./build/bin/ls-sycl-device
```

```bash
./build/bin/main
```

--------------------------------

### Configure MSVC Targets for Examples

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/CMakeLists.txt

Applies MSVC-specific settings to targets related to ggml examples when GGML_BUILD_EXAMPLES is enabled.
```cmake
if (GGML_BUILD_EXAMPLES)
    configure_msvc_target(common-ggml)
    configure_msvc_target(common)
    configure_msvc_target(mnist-common)
    configure_msvc_target(mnist-eval)
    configure_msvc_target(mnist-train)
    configure_msvc_target(gpt-2-ctx)
    configure_msvc_target(gpt-2-alloc)
    configure_msvc_target(gpt-2-backend)
    configure_msvc_target(gpt-2-sched)
    configure_msvc_target(gpt-2-quantize)
    configure_msvc_target(gpt-2-batched)
    configure_msvc_target(gpt-j)
    configure_msvc_target(gpt-j-quantize)
    configure_msvc_target(magika)
    configure_msvc_target(yolov3-tiny)
    configure_msvc_target(sam)
    configure_msvc_target(simple-ctx)
    configure_msvc_target(simple-backend)
endif()
```

--------------------------------

### Guided Transcription

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/lsp/README.md

Initiates a guided transcription using a previously registered commandset. The server processes audio from a specified timestamp.

```APIDOC
## POST /guided

### Description
Performs a guided transcription using a specified commandset and timestamp.

### Method
POST

### Endpoint
/guided

### Parameters
#### Request Body
- **params.commandset_index** (integer) - Optional - The index of the commandset to use. Defaults to the most recently registered commandset if not provided.
- **params.timestamp** (unsigned integer) - Optional - The point in time (in milliseconds) from which audio processing should begin. If omitted, processing starts upon message receipt.

### Response
#### Success Response (200)
- **result.command_index** (integer) - The index of the detected command within the selected commandset.
- **result.command_text** (string) - The text of the detected command.
- **result.timestamp** (unsigned integer) - The point in time (in milliseconds) at which audio processing stopped. This can be used to mask latency in subsequent requests.

#### Response Example
{
  "result": {
    "command_index": 0,
    "command_text": "hello world",
    "timestamp": 1678886400000
  }
}

### Error Handling
- Ensure `commandset_index` is valid if provided.
```

--------------------------------

### Build and Test whisper.cpp for Node.js

Source: https://github.com/ggml-org/whisper.cpp/blob/master/bindings/javascript/README.md

Commands to set up the environment, download models, prepare audio samples, build the project using Emscripten, and execute tests.

```bash
# load emscripten
source /path/to/emsdk/emsdk_env.sh

# clone repo
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp

# grab base.en model
./models/download-ggml-model.sh base.en

# prepare PCM sample for testing
ffmpeg -i samples/jfk.wav -f f32le -acodec pcm_f32le samples/jfk.pcmf32

# build
mkdir build-em && cd build-em
emcmake cmake .. && make -j

# run test
node ../tests/test-whisper.js

# For Node.js versions prior to v16.4.0, experimental features need to be enabled:
node --experimental-wasm-threads --experimental-wasm-simd ../tests/test-whisper.js

# publish npm package
make publish-npm
```

--------------------------------

### Initialize WebAssembly Module and Audio Processing

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/command.wasm/index-tmpl.html

Sets up the Emscripten Module configuration and initializes global variables for audio context and model management.
```javascript
// web audio context
var context = null;

// audio data
var audio = null;
var audio0 = null;

// the command instance
var instance = null;

// model name
var model_whisper = null;

var Module = {
    print: printTextarea,
    printErr: printTextarea,
    setStatus: function(text) {
        printTextarea('js: ' + text);
    },
    monitorRunDependencies: function(left) {
    },
    preRun: function() {
        printTextarea('js: Preparing ...');
    },
    postRun: function() {
        printTextarea('js: Initialized successfully!');
    }
};
```

--------------------------------

### Set ggml Installation Paths

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/CMakeLists.txt

Configures the installation directories for ggml header, library, and binary files. These paths are cached for user customization.

```cmake
set(GGML_INSTALL_VERSION ${GGML_VERSION})
set(GGML_INCLUDE_INSTALL_DIR ${CMAKE_INSTALL_INCLUDEDIR} CACHE PATH "Location of header files")
set(GGML_LIB_INSTALL_DIR ${CMAKE_INSTALL_LIBDIR} CACHE PATH "Location of library files")
set(GGML_BIN_INSTALL_DIR ${CMAKE_INSTALL_BINDIR} CACHE PATH "Location of binary files")
```

--------------------------------

### ROCm Path Detection

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/src/ggml-hip/CMakeLists.txt

Determines the ROCm installation path, prioritizing the ROCM_PATH environment variable, then falling back to common installation directories.

```cmake
if (NOT EXISTS $ENV{ROCM_PATH})
    if (NOT EXISTS /opt/rocm)
        set(ROCM_PATH /usr)
    else()
        set(ROCM_PATH /opt/rocm)
    endif()
else()
    set(ROCM_PATH $ENV{ROCM_PATH})
endif()

list(APPEND CMAKE_PREFIX_PATH ${ROCM_PATH})
list(APPEND CMAKE_PREFIX_PATH "${ROCM_PATH}/lib64/cmake")
```

--------------------------------

### Build chessboard.js

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/wchess/wchess.wasm/chessboardjs-1.0.0/js/chessboard-1.0.0/README.md

Commands to build the chessboard.js project and its website.
```sh
# create a build in the build/ directory
npm run build
```

```sh
# re-build the website
npm run website
```

--------------------------------

### Install WER Score Requirements

Source: https://github.com/ggml-org/whisper.cpp/blob/master/tests/librispeech-parakeet/README.md

Install the Python dependencies required for computing the Word Error Rate (WER) score. This can be done using pip.

```bash
pip install -r requirements.txt
```

--------------------------------

### Run Benchmark

Source: https://github.com/ggml-org/whisper.cpp/blob/master/tests/librispeech/README.md

Execute the benchmark test suite.

```bash
$ make
```

--------------------------------

### Install Hexagon Runtime Libraries

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/src/ggml-hexagon/CMakeLists.txt

Installs the Hexagon Shared Kernel Libraries (skels) required at runtime. This is a general step for enabling Hexagon support.

```cmake
install(FILES ${HTP_SKELS} TYPE LIB)
```

--------------------------------

### Build and Run wchess Command-line Tool

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/wchess/README.md

Instructions for compiling the project with SDL2 support and executing the binary with a specified Whisper model.

```bash
mkdir build && cd build
cmake -DWHISPER_SDL2=1 ..
make -j

./bin/wchess -m ../models/ggml-base.en.bin
```

--------------------------------

### Benchmark All Models (Quantized)

Source: https://github.com/ggml-org/whisper.cpp/blob/master/scripts/bench-all-gg.txt

Compiles the project and runs the benchmarking script for all models with quantized precision (e.g., q8_0). This is useful for evaluating performance with reduced precision.

```bash
make -j && ./scripts/bench-all.sh 1 1 1
```

--------------------------------

### Install Python Dependencies for Core ML

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Install necessary Python packages for generating a Core ML model.
Python 3.11 and Xcode command-line tools are recommended.

```bash
pip install ane_transformers
pip install openai-whisper
pip install coremltools
```

--------------------------------

### Configure Package Configuration Files

Source: https://github.com/ggml-org/whisper.cpp/blob/master/CMakeLists.txt

Generates and installs the CMake package configuration files for the 'whisper' library (whisper-config.cmake and whisper-version.cmake) and, in a second equivalent block, for the 'parakeet' library (parakeet-config.cmake and parakeet-version.cmake). These files help other projects find and use the installed libraries.

```cmake
configure_package_config_file(
        ${CMAKE_CURRENT_SOURCE_DIR}/cmake/whisper-config.cmake.in
        ${CMAKE_CURRENT_BINARY_DIR}/whisper-config.cmake
    INSTALL_DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/whisper
    PATH_VARS
    WHISPER_INCLUDE_INSTALL_DIR
    WHISPER_LIB_INSTALL_DIR
    WHISPER_BIN_INSTALL_DIR )

write_basic_package_version_file(
        ${CMAKE_CURRENT_BINARY_DIR}/whisper-version.cmake
    VERSION ${WHISPER_INSTALL_VERSION}
    COMPATIBILITY SameMajorVersion)

install(FILES ${CMAKE_CURRENT_BINARY_DIR}/whisper-config.cmake
              ${CMAKE_CURRENT_BINARY_DIR}/whisper-version.cmake
        DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/whisper)
```

```cmake
configure_package_config_file(
        ${CMAKE_CURRENT_SOURCE_DIR}/cmake/parakeet-config.cmake.in
        ${CMAKE_CURRENT_BINARY_DIR}/parakeet-config.cmake
    INSTALL_DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/parakeet
    PATH_VARS
    PARAKEET_INCLUDE_INSTALL_DIR
    PARAKEET_LIB_INSTALL_DIR
    PARAKEET_BIN_INSTALL_DIR)

write_basic_package_version_file(
        ${CMAKE_CURRENT_BINARY_DIR}/parakeet-version.cmake
    VERSION ${WHISPER_INSTALL_VERSION}
    COMPATIBILITY SameMajorVersion)

install(FILES ${CMAKE_CURRENT_BINARY_DIR}/parakeet-config.cmake
              ${CMAKE_CURRENT_BINARY_DIR}/parakeet-version.cmake
        DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/parakeet)
```

--------------------------------

### Run Video Comparison Script

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README.md

Use this script to generate a video comparing different whisper models. It requires ffplay to play the generated video.
```bash
./scripts/bench-wts.sh samples/jfk.wav
ffplay ./samples/jfk.wav.all.mp4
```

--------------------------------

### Start Audio Recording

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/wchess/wchess.wasm/index-tmpl.html

Starts the audio recording process. It initializes the AudioContext and MediaRecorder, then captures audio chunks. Upon receiving data, it decodes the audio, processes it using OfflineAudioContext, and sends the samples to the whisper instance.

```javascript
function startRecording() {
    if (!context) {
        context = new AudioContext({
            sampleRate: kSampleRate,
            channelCount: 1,
            echoCancellation: false,
            autoGainControl: true,
            noiseSuppression: true,
        });
    }

    startTime = Date.now();

    var chunks = [];
    var stream = null;

    navigator.mediaDevices.getUserMedia({audio: true, video: false})
        .then(function(s) {
            stream = s;
            mediaRecorder = new MediaRecorder(stream);
            mediaRecorder.ondataavailable = function(e) {
                chunks.push(e.data);

                var blob = new Blob(chunks, { 'type': 'audio/ogg; codecs=opus' });
                var reader = new FileReader();

                reader.onload = function(event) {
                    var buf = new Uint8Array(reader.result);

                    context.decodeAudioData(buf.buffer, function(audioBuffer) {
                        var offlineContext = new OfflineAudioContext(audioBuffer.numberOfChannels, audioBuffer.length, audioBuffer.sampleRate);
                        var source = offlineContext.createBufferSource();
                        source.buffer = audioBuffer;
                        source.connect(offlineContext.destination);
                        source.start(0);

                        offlineContext.startRendering().then(function(renderedBuffer) {
                            let audio = renderedBuffer.getChannelData(0);
                            printTextarea('js: number of samples: ' + audio.length);
                            Module.set_audio(instance, audio);
                        });

                        mediaRecorder = null;
                        context = null;
                    });
                }

                reader.readAsArrayBuffer(blob);
            };

            mediaRecorder.onstop = function(e) {
                stream.getTracks().forEach(function(track) {
                    track.stop();
                });
            };

            mediaRecorder.start();
        })
        .catch(function(err) {
            printTextarea('js: error getting audio stream: ' + err);
        });
}
```

--------------------------------

### Build and Run All Benchmarks

Source: https://github.com/ggml-org/whisper.cpp/blob/master/scripts/bench-all-gg.txt

This command first builds the project using parallel jobs and then executes all benchmark scripts. It's used to generate the performance tables shown.

```bash
make -j && ./scripts/bench-all.sh 1 1 0
```

```bash
make -j && ./scripts/bench-all.sh 1 1 1
```

--------------------------------

### Build XCFramework

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/whisper.swiftui/README.md

Execute this script from the project root to generate the necessary XCFramework for the SwiftUI app.

```console
$ ./build-xcframework.sh
```

--------------------------------

### Create Dummy Core ML Model Directory

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/whisper.objc/README.md

If you do not wish to convert a Core ML model, create a dummy directory to satisfy the application's model path requirement.

```bash
mkdir models/ggml-base.en-encoder.mlmodelc
```

--------------------------------

### Configure ggml Package

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/CMakeLists.txt

Configures the ggml.pc file for pkg-config when GGML_STANDALONE is enabled, and installs it to the pkgconfig directory.

```cmake
configure_file(${CMAKE_CURRENT_SOURCE_DIR}/ggml.pc.in
               ${CMAKE_CURRENT_BINARY_DIR}/ggml.pc
               @ONLY)

install(FILES ${CMAKE_CURRENT_BINARY_DIR}/ggml.pc
        DESTINATION ${CMAKE_INSTALL_LIBDIR}/pkgconfig)
```

--------------------------------

### HIP Version Check

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/src/ggml-hip/CMakeLists.txt

Ensures that the installed ROCm/HIP version is at least V6.1, which is required for certain features.
```cmake
if (${hip_VERSION} VERSION_LESS 6.1)
    message(FATAL_ERROR "At least ROCM/HIP V6.1 is required")
endif()

message(STATUS "HIP and hipBLAS found")
```

--------------------------------

### Copy build files to server path

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/stream.wasm/README.md

Transfer the necessary build artifacts to your web server's directory.

```bash
# copy the produced page to your HTTP path
cp bin/stream.wasm/* /path/to/html/
cp bin/libstream.js /path/to/html/
cp bin/libstream.worker.js /path/to/html/
```

--------------------------------

### Build Parakeet CLI

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/parakeet-cli/README.md

Compile the parakeet-cli tool using CMake. Ensure you have a C++ compiler and CMake installed.

```console
$ cmake -B build -S .
$ cmake --build build --target parakeet-cli -j 12
```

--------------------------------

### Benchmark All Models (FP16)

Source: https://github.com/ggml-org/whisper.cpp/blob/master/scripts/bench-all-gg.txt

Compiles the project and runs the benchmarking script for all models with FP16 precision. This is useful for evaluating baseline performance.

```bash
make -j && ./scripts/bench-all.sh 1 1 0
```

--------------------------------

### Build whisper.xcframework

Source: https://github.com/ggml-org/whisper.cpp/blob/master/examples/whisper.objc/README.md

Build the whisper.xcframework using this command. This is a prerequisite for using the whisper.objc application.

```bash
./build-xcframework.sh
```

--------------------------------

### Find OpenVINO and OpenCL Packages

Source: https://github.com/ggml-org/whisper.cpp/blob/master/ggml/src/ggml-openvino/CMakeLists.txt

Locates the required OpenVINO and OpenCL components for the build. Ensure these are installed and discoverable by CMake.
```cmake
find_package(OpenVINO REQUIRED COMPONENTS Runtime Threading)
find_package(OpenCL REQUIRED)
```

--------------------------------

### Configure User Permissions for GPU Access

Source: https://github.com/ggml-org/whisper.cpp/blob/master/README_sycl.md

Add the current user to the video and render groups to ensure proper access to Intel GPU hardware.

```bash
sudo usermod -aG render username
sudo usermod -aG video username
```
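
Group changes made with `usermod` normally apply only to new login sessions, so it can help to confirm the membership afterwards. The following is a minimal sketch, not part of whisper.cpp: `has_group` is a hypothetical helper name, and the check assumes the standard `id -nG` command is available to list the current user's groups.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the whisper.cpp repository): succeeds when a
# space-separated list of group names contains the requested group.
has_group() {
    local groups_list="$1" wanted="$2" g
    for g in $groups_list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Groups of the current session, as reported by `id -nG`.
current_groups="$(id -nG)"

for grp in render video; do
    if has_group "$current_groups" "$grp"; then
        echo "$grp: ok"
    else
        echo "$grp: missing (log out and back in after running usermod)"
    fi
done
```

If either group is reported as missing right after running `usermod`, starting a fresh login session and re-running the check is usually the first thing to try.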