### Faster-Whisper Command-Line Usage Source: https://github.com/purfview/whisper-standalone-win/blob/main/README.md Examples of how to use the faster-whisper executable from the command line. Demonstrates specifying input files/folders, language, model size, task, and output directory. ```bash faster-whisper-xxl.exe "D:\videofile.mkv" --language English --model medium --output_dir source ``` ```bash faster-whisper-xxl.exe "D:\Folder" -l en -m turbo --sentence --batch_recursive ``` ```bash faster-whisper-xxl.exe "D:\videofile.mkv" -l ja -m medium --task translate --standard -o source ``` ```bash faster-whisper-xxl.exe --help ``` -------------------------------- ### Whisper Standalone Win: New Arguments and Parameters Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Introduces new command-line arguments and parameters for enhanced control over Whisper's functionality, including hallucination detection, timestamp clipping, and output formatting. ```English New args: `--hallucination_silence_threshold` `--clip_timestamps` New args: `--max_new_tokens` `--chunk_length` Additional option for `--prompt_reset_on_no_end` Various audio filters: `--ff_dump` `--ff_mp3` `--ff_sync` `--ff_rnndn_sh` `--ff_rnndn_xiph` `--ff_fftdn` `--ff_tempo` `--ff_gate` `--ff_speechnorm` `--ff_loudnorm` `--ff_silence_suppress` `--ff_lowhighpass` New parameters : `--max_comma_cent` `--min_dist_to_end` New parameters : `--prompt_max` `--reprompt` [enabled by default] `--prompt_reset_on_no_end` [enabled by default] New parameter : --standard_asia New parameters : `--sentence` `--standard` `--max_comma` `--max_gap` `--max_line_width` `--max_line_count` Exposed options : `--prefix` `--suppress_blank` `--without_timestamps` `--max_initial_timestamp` ``` -------------------------------- ### Whisper Standalone Win: Prompt and Hallucination Handling Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Describes improvements in prompt handling, including resetting prompts and avoiding common hallucinations, as well as new parameters for controlling prompt behavior. ```English Doesn't add to prompt segments triggered by the common hallucinations in hallucinations_list. Bugfix: `--prompt_reset_on_temperature` was broken. Regression since r139. Improved `--initial_prompt` default preset, there is alternative experimental `auto` preset. Improved `--sentence` with the ignore words list, words like `Mr.` and ect.. The last progress update is passed to the title bar. `--prompt_reset_on_no_end` Bugfix: -prompt=None wasn't working. ``` -------------------------------- ### Whisper Standalone Win: Dependency Updates Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Lists updates to key dependencies such as PyInstaller, PyAV, and CTranslate2, ensuring compatibility and leveraging the latest features and performance improvements. ```English Updated PyAV to v11.0.0 [ffmpeg 6.0] Updated PyInstaller to v6.4.0 Updated CTranslate2 to v3.24.0 Updated psutil to v5.9.7 Updated PyInstaller to v6.3.0 Updated CTranslate2 to v3.23.0 Updated Pyinstaller to v6.2.0 ``` -------------------------------- ### Whisper Standalone Win: Model Configuration and Defaults Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Covers changes in default model configurations, such as using fp16 for 'large-v3' and adjustments to parameters like hallucination silence thresholds. ```English large-v3 use fp16 model by default. [Removed other v3 options] Adjusted "--hallucination_silence_threshold"s score algo, and new `--hallucination_silence_th_temp` ``` -------------------------------- ### Whisper Standalone Win: Output and Formatting Improvements Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Details enhancements to output formats, including JSON aggregation, sentence boundary detection, and handling of punctuation for cleaner transcriptions. ```English JSON output has additional 'text' key with all text aggregated. Bugfix: The final line disappeared if the final word wasn't punctuated when using `--sentence`. Bugfix: The final word could disappear in some combination of the new parameters from r160.8. Improved `--skip` to work with all output folders. Improved `--skip` to work with all output formats. If output is "all" - checks only "srt". New `--check_files` argument. Added a catch for invalid `audio` input. Turns off word_timestamps if output format is "text". Improved `--one_word` with the second option. ``` -------------------------------- ### Whisper Standalone Win: Bug Fixes and Enhancements Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Covers bug fixes for various issues including prompt presets, custom prompt presets with translation tasks, and CPU-related filter problems. Also includes improvements to sentence detection. ```English Bugfix: PR705 was incorporated a bit incorrectly. Bugfix: Bug with the custom prompt presets and task "translate". Bugfix: With some CPUs rnndn filters were not working. Improved: `--sentence` doesn't treat ellipses (...) as the end of sentence, unless ellipses are at the end of a segment. Changed default of `--initial_prompt` to `auto`. Better error handling for the audio filters. Increased "recursion" limit from 1000 to 10000. Fix quirks with "The last progress update is passed to the title bar." [r160.11] Fix Linux exec issue with forward compatibility. Bugfix: Bug if couldn't detect the number of CPU's cores. Bugfix: Illogical "Avoid computing higher temperatures on no_speech". [It could cause bad hallucination loops] https://github.com/openai/whisper/pull/1903 https://github.com/SYSTRAN/faster-whisper/pull/625 Fix: Commit #165 was missing in `r167.1` release. Bugfix: Fixed bug on macOS -> https://github.com/Purfview/whisper-standalone-win/issues/86 ``` -------------------------------- ### Whisper Standalone Win: Miscellaneous Fixes and Features Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Includes various other fixes and features, such as UTF-8 character support in the console and handling of wildcard inputs with special characters. ```English UTF-8 chars are now supported in SE's "console". Bugfix: Wildcard input could fail if brackets were in a folder's name. Skip micro chunks. ``` -------------------------------- ### Standalone Faster-Whisper & Faster-Whisper-XXL Updates Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt This section details the changes, fixes, and new features introduced in various versions of the Standalone Faster-Whisper and Faster-Whisper-XXL projects. It includes updates to command-line arguments, model handling, VAD methods, and dependency versions. ```changelog ## r245.4 [XXL]: Fixed: 'unload_model' error when using batched inference ## r245.3 [XXL]: Fixed: Model didn't free VRAM when --model_preload False Fixed: "cuda:1" didn't work with MDX Fixed: Some issues in "One Click Transcribe.bat" Fixed: Matplotlib backend error [Colab] Fixed: cuDNN libs not found error [Linux] Changed: `--ff_mdx_kim2` to `--ff_vocal_extract mdx_kim2` Changed: `--mdx_device` to `--voc_device` Added: `distil-large-v3.5` model Updated: pyinstaller to 6.12.0 ## r245.2 [XXL]: Fixed: Audio track selection with `--ff_track` didn't work with `--ff_mdx_kim2`. New feature: Deal with inverted polarity/phase in audios with new args: `--ff_lc` & `--ff_invert`. New feature: `--model_preload` [its automatic]. Fixes FW model interference with `--ff_mdx_kim2`. New feature: `--diarize_ff` [its automatic]. Enables diarization after `--ff_...` filters. New feature: Output diarization embeddings with `--return_embeddings`/`-embeddings`. New feature: Diarize without transcription with `--diarize_only`. ## r245.1 [XXL]: Clean-up temp folder after Ctrl+C Fixed OOM error for `silero_v5_fw` on long audios New arg `--japanese` or `-ja` ## r239.1 [XXL]: New features/args: `--batched` `--unmerged` `--batch_size` `--multilingual` `--hotwords` `--rehot` `--ignore_dupe_prompt` New `vad_method`'s: `silero_v5` `silero_v4_fw` `silero_v5_fw` Removed not useful experimental options: no_speech_strict_lvl nullify_non_speech prompt_max reprompt's first option ## r194.5 [XXL]: Change: Enable auto/default prompt presets for translation. ## r194.4 [XXL]: Fix: --sentence was splitting digits containing dot. Fix: auditok Fix: Error with empty audio and pyannote_v3 vad Fix: Disable prompt_reset_on_no_end when HST is triggered Removed experimental condition in HST algo ## r194.2 [XXL]: Bugfix: Broken outputs when diarize/sentence and batch input [194.1]. Fix: --sentence was splitting words on hyphens. Fix: --sentence was splitting digits containing comma. Change: --sentence affects all output formats except json [previously only srt & vtt]. Improvements and bugfixes for json input. Added "One Click Transcribe" tool: https://github.com/Purfview/whisper-standalone-win/discussions/337 ## r194.1 [XXL]: New feature: CUDA support for `pyannote_v3` and `pyannote_onnx_v3` VADs New feature: Selective output formats, for example: `--output_format srt json` New feature: `--diarize` will auto-activate `--sentence` and all output formats will be affected except json New `--diarize` options: `pyannote_v3.0`, `pyannote_v3.1`, `reverb_v1`, `reverb_v2` New diarization args: `--num_speakers`, `--min_speakers`, `--max_speakers` `--diarize_dump` Bugfix: Bug in alternative subtitle writer with max_line_width [r189.1]. Bugfix: Bug in --sentence subtitle writer if "word" is space. Change: .cache dir is bound to exe dir instead of cwd. pyinstaller updated to 6.11.1 CTranslate2 updated to 4.4.0 [+ Compute Capability 5.0 support] onnxruntime_gpu updated to 1.18.0 [CUDA 12.x cuDNN 8.x] Python updated to 3.10.11 ## r193.1 [XXL]: New feature: Speaker Diarization with `--diarize`. New feature: `large-v3-turbo` model autodownload. New feature: JSON output has additional key: 'language'. New feature: `--postfix` works without --language. Bigfix: Rare bug in alternative subtitle writer if "words" is empty [r189.1]. Change: `--vad_alt_method` shorted to `--vad_method`. Change: Windows 7 not supported anymore. ## r192.3.4 [XXL]: New feature: Write intermediate files to temp folder. [ignored on dump] New feature: Can take JSON files as input to generate subtitles from it according to the settings. New alternative VAD method : `pyannote_v3` [previous same named option renamed to "pyannote_onnx_v3"] New arg: `--nullify_non_speech` [for WIP experiments] ## r192.3.3 [XXL]: CTranslate2 updated to v4.2.0 Transcription on CPU is up to ~26% faster. ## r192.3.2 [XXL]: Various improvements and bugfixes to `--vad_alt_method` ## r192.3: Bugfix: 'one_word' was broken in r192.2. ## r192.2: Allowed to use 'max_line_width' and 'max_line_count' without '--sentence'. ## r192.1.1 [XXL]: `--ff_mdx_kim2`: Preprocess audio with MDX23 Kim vocal v2 model. `--vad_alt_method`: Alternative VAD methods - "silero_v3", "silero_v4", "pyannote_v3", "auditok", "webrtc". ## r192.1: Bugfix: It was unnecessarily going through ffmpeg routines. [introduced in r189.1] ``` -------------------------------- ### Whisper Standalone Win: Model Support Expansion Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Details the addition of support for new Whisper models, including various 'Distil-Whisper' variants and the 'large-v3' model, enhancing transcription capabilities. ```English Supports "Distil-Whisper" models: `distil-large-v2` `distil-medium.en` `distil-small.en` `large-v3` now is using the official model. 'large-v3' model. 'Cantonese' language. ``` -------------------------------- ### Whisper Standalone Win: Model and Feature Updates Source: https://github.com/purfview/whisper-standalone-win/blob/main/changelog.txt Details updates related to model support (e.g., distil-large-v3), language-specific adjustments (Serbian Latin alphabet), and new functionalities like VAD auto-disable with clip timestamps. ```English Supports `distil-large-v3` model. Switched auto prompt for Serbian to the latin alphabet. Auto-disable VAD when --clip_timestamps is in use. JSON output has additional 'text' key with all text aggregated. Fixed --task=translate and --postfix bug. --highlight_words now supports --max_line_width and --max_line_count Changed patience default. Adjusted "--hallucination_silence_threshold"s score algo, and new `--hallucination_silence_th_temp` Auto pseudo-vad th offsets for "large-v3", can be disabled with --v3_offsets_off. `--language_detection_threshold` `--language_detection_segments` `--ff_track` - Audio track selection. `--ff_fc` - Front-Center channel selection. `--ff_dump` - Additionally prevents deletion of some intermediate audio files. `--vad_dump` - Writes VAD timings to srt file. Updated PyInstaller to v6.5.0 Fix Linux exec issue with forward compatibility. v2 ``` === COMPLETE CONTENT === This response contains all available snippets from this library. No additional content exists. Do not make further requests.