### Install Conda and Run Setup Script

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/Colab-Inference.ipynb

Installs condacolab and Anaconda, then executes the setup script to prepare the environment.

```python
%pip install -q condacolab
import condacolab
condacolab.install_from_url("https://repo.anaconda.com/archive/Anaconda3-2024.10-1-Linux-x86_64.sh")
!cd /content && bash setup.sh

```

--------------------------------

### Starting the API Server

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Starts the primary API server with specified pretrained models and reference audio.

````APIDOC
## Starting the API Server (api.py)

The primary API server provides REST endpoints for text-to-speech synthesis with support for streaming audio output and model switching.

```bash
python api.py \
    -s "GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2G2333k.pth" \
    -g "GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s1bert25hz-5kh-longer-epoch=12-step=369668.ckpt" \
    -dr "reference_audio.wav" \
    -dt "Hello, this is a reference text." \
    -dl "en" \
    -a "0.0.0.0" \
    -p 9880 \
    -sm "normal" \
    -mt "wav"
```
````

--------------------------------

### Install GPT-SoVITS on Linux

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Install GPT-SoVITS on Linux by running the install script. Specify the device (GPU or CPU) and the model source (Hugging Face or ModelScope). Optionally download UVR5 models.

```bash
bash install.sh --device <CU126|CU128|ROCM|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
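When the installer is driven from another script, it can help to validate the options before invoking it. The following is a minimal Python sketch that assumes only the Linux choices listed in the usage line above; `build_install_command` is a hypothetical helper and not part of GPT-SoVITS.

```python
# Hypothetical helper: assembles the install.sh invocation shown above.
# The allowed values mirror the Linux usage line; this is not GPT-SoVITS code.
DEVICES = ("CU126", "CU128", "ROCM", "CPU")
SOURCES = ("HF", "HF-Mirror", "ModelScope")


def build_install_command(device, source, download_uvr5=False):
    """Return the argv list for `bash install.sh ...` after validating options."""
    if device not in DEVICES:
        raise ValueError(f"unknown device {device!r}; expected one of {DEVICES}")
    if source not in SOURCES:
        raise ValueError(f"unknown source {source!r}; expected one of {SOURCES}")
    command = ["bash", "install.sh", "--device", device, "--source", source]
    if download_uvr5:
        command.append("--download-uvr5")
    return command


print(" ".join(build_install_command("CU126", "HF", download_uvr5=True)))
# bash install.sh --device CU126 --source HF --download-uvr5
```

The returned list can be handed to `subprocess.run(command, check=True)` from the repository root.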

--------------------------------

### Install Conda and Execute Setup Script

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/Colab-WebUI.ipynb

This notebook cell installs Conda with condacolab and then runs the setup.sh script defined in an earlier cell to configure the environment for GPT-SoVITS. Run it only after the cell that defines setup.sh.

```python
%pip install -q condacolab
import condacolab
condacolab.install_from_url("https://repo.anaconda.com/archive/Anaconda3-2024.10-1-Linux-x86_64.sh")
!cd /content && bash setup.sh
```

--------------------------------

### Install GPT-SoVITS on macOS

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Install GPT-SoVITS on macOS using the install script. Note that GPU training on Macs yields lower quality, so CPU is recommended. Specify the device (MPS or CPU) and model source.

```bash
bash install.sh --device <MPS|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```

--------------------------------

### Start WebUI for Training and Inference

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Launches the integrated WebUI for managing the GPT-SoVITS workflow, including training and inference. It can be started with a specific interface language or model version.

```bash
# Start WebUI (auto-detects latest model version)
python webui.py

# Start with specific language
python webui.py en

# Start with V1 models
python webui.py v1
```

--------------------------------

### Starting the V2 API Server

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Starts the V2 API server with enhanced features including parallel inference and streaming modes.

````APIDOC
## Starting the V2 API Server (api_v2.py)

The V2 API provides enhanced features including parallel inference, streaming modes, and batch processing with YAML configuration.

```bash
python api_v2.py \
    -c "GPT_SoVITS/configs/tts_infer.yaml" \
    -a "127.0.0.1" \
    -p 9880
```
````

--------------------------------

### Setup Environment Script for GPT-SoVITS

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/Colab-WebUI.ipynb

This shell script clones the GPT-SoVITS repository, creates and activates a Conda environment named 'GPTSoVITS', installs ipykernel, and runs the installation script. It's designed to be run once for environment setup.

```shell
set -e

cd /content

git clone https://github.com/RVC-Boss/GPT-SoVITS.git

cd GPT-SoVITS

if conda env list | awk '{print $1}' | grep -Fxq "GPTSoVITS"; then
    :
else
    conda create -n GPTSoVITS python=3.10 -y
fi

source activate GPTSoVITS

pip install ipykernel

bash install.sh --device CU126 --source HF --download-uvr5
```
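The environment check in the script (`conda env list | awk '{print $1}' | grep -Fxq "GPTSoVITS"`) takes the first column of each line and requires an exact match, so a name such as `GPTSoVITS-test` does not count. A rough Python equivalent, with a made-up listing used purely for illustration:

```python
def conda_env_exists(env_list_output, name):
    """Mimic `awk '{print $1}' | grep -Fxq NAME` on `conda env list` output."""
    for line in env_list_output.splitlines():
        fields = line.split()
        # awk's $1 is the first whitespace-separated field; grep -Fx then
        # demands an exact, whole-line match against it.
        if fields and fields[0] == name:
            return True
    return False


# Illustrative listing (hypothetical paths), shaped like `conda env list` output.
sample = """# conda environments:
#
base                  *  /usr/local
GPTSoVITS                /usr/local/envs/GPTSoVITS
"""
print(conda_env_exists(sample, "GPTSoVITS"))  # True
print(conda_env_exists(sample, "GPTSo"))      # False
```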

--------------------------------

### Install GPT-SoVITS WebUI on Windows

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Use this command to create a conda environment and install GPT-SoVITS with specified device and source options. The --DownloadUVR5 flag is optional.

```powershell
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
pwsh -F install.ps1 --Device <CU126|CU128|CPU> --Source <HF|HF-Mirror|ModelScope> [--DownloadUVR5]
```

--------------------------------

### Start GPT-SoVITS API Server

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Launches the primary API server with specified pretrained models and reference audio. Ensure all paths to model checkpoints and reference audio are correct.

```bash
python api.py \
    -s "GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2G2333k.pth" \
    -g "GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s1bert25hz-5kh-longer-epoch=12-step=369668.ckpt" \
    -dr "reference_audio.wav" \
    -dt "Hello, this is a reference text." \
    -dl "en" \
    -a "0.0.0.0" \
    -p 9880 \
    -sm "normal" \
    -mt "wav"
```

--------------------------------

### Setup Shell Script for GPT-SoVITS

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/Colab-Inference.ipynb

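This script clones the GPT-SoVITS repository, creates the GPT_weights and SoVITS_weights directories, creates a Conda environment named GPTSoVITS if one does not already exist, and runs install.sh to install the required packages.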

```shell
set -e

cd /content

git clone https://github.com/RVC-Boss/GPT-SoVITS.git

cd GPT-SoVITS

mkdir -p GPT_weights

mkdir -p SoVITS_weights

if conda env list | awk '{print $1}' | grep -Fxq "GPTSoVITS"; then
    :
else
    conda create -n GPTSoVITS python=3.10 -y
fi

source activate GPTSoVITS

pip install ipykernel

bash install.sh --device CU126 --source HF

```

--------------------------------

### Install FFmpeg on Ubuntu/Debian

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Install FFmpeg and libsox-dev on Ubuntu or Debian systems using the apt package manager.

```bash
sudo apt install ffmpeg
sudo apt install libsox-dev
```

--------------------------------

### Create Conda Environment for BigVGAN

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Example command to create a conda environment with specified Python and PyTorch versions, including CUDA support. Ensure you have conda installed.

```shell
conda create -n bigvgan python=3.10 pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda activate bigvgan
```

--------------------------------

### Start Inference-Only WebUI

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Launches a WebUI specifically for inference tasks, separate from the training functionalities.

```bash
python GPT_SoVITS/inference_webui.py
```

--------------------------------

### Train 16kHz Model with Checkpoint Path

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/tools/AP_BWE_main/README.md

Example command to train the 16kHz model, specifying both the configuration file and the checkpoint path. This allows for resuming training or organizing checkpoints.

```bash
CUDA_VISIBLE_DEVICES=0 python train_16k.py --config ../configs/config_2kto16k.json --checkpoint_path ../checkpoints/AP-BWE_2kto16k
```

--------------------------------

### Manually Install Dependencies for GPT-SoVITS

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/docs/cn/README.md

These commands set up a conda environment and install necessary Python dependencies from requirements files for GPT-SoVITS. It's recommended to use this method if the automated scripts do not work.

```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits

pip install -r extra-req.txt --no-deps
pip install -r requirements.txt
```

--------------------------------

### Run Local Gradio Demo

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Commands to install dependencies for the local Gradio demo and then run the demo application. Ensure you are in the BigVGAN directory.

```shell
pip install -r demo/requirements.txt
python demo/app.py
```

--------------------------------

### Install FFmpeg on macOS

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Install FFmpeg on macOS using the Homebrew package manager.

```bash
brew install ffmpeg
```

--------------------------------

### Install FFmpeg using Conda

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

For Conda users, install FFmpeg by activating the GPTSoVits environment and running the conda install command.

```bash
conda activate GPTSoVits
conda install ffmpeg
```

--------------------------------

### Install GPT-SoVITS WebUI on Linux

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/docs/cn/README.md

Use this bash command to create a conda environment and install dependencies for GPT-SoVITS on Linux. Specify the device (CUDA, ROCm, or CPU) and the package source.

```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <CU126|CU128|ROCM|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```

--------------------------------

### Install GPT-SoVITS WebUI on macOS

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/docs/cn/README.md

Use these commands to create a conda environment and install the GPT-SoVITS dependencies on macOS. Models trained on a Mac GPU tend to be significantly lower in quality than those trained on other devices, so CPU training is recommended. Specify the device (MPS or CPU) and the package source.

```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <MPS|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```

--------------------------------

### Start V2 API Server

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Initiates the V2 API server using a YAML configuration file for TTS inference. This version supports enhanced features like parallel inference and streaming.

```bash
# Start V2 API server with configuration file
python api_v2.py \
    -c "GPT_SoVITS/configs/tts_infer.yaml" \
    -a "127.0.0.1" \
    -p 9880
```

--------------------------------

### Inference Quickstart with Hugging Face Hub

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Python code to perform audio inference using a pretrained BigVGAN model from Hugging Face Hub. Loads a model, computes mel spectrogram, and generates a waveform. Set `use_cuda_kernel=True` for potentially faster inference.

```python
device = 'cuda'

import torch
import bigvgan
import librosa
from meldataset import get_mel_spectrogram

# instantiate the model. You can optionally set use_cuda_kernel=True for faster inference.
model = bigvgan.BigVGAN.from_pretrained('nvidia/bigvgan_v2_24khz_100band_256x', use_cuda_kernel=False)

# remove weight norm in the model and set to eval mode
model.remove_weight_norm()
model = model.eval().to(device)

# load wav file and compute mel spectrogram
wav_path = '/path/to/your/audio.wav'
wav, sr = librosa.load(wav_path, sr=model.h.sampling_rate, mono=True) # wav is np.ndarray with shape [T_time] and values in [-1, 1]
wav = torch.FloatTensor(wav).unsqueeze(0) # wav is FloatTensor with shape [B(1), T_time]

# compute mel spectrogram from the ground truth audio
mel = get_mel_spectrogram(wav, model.h).to(device) # mel is FloatTensor with shape [B(1), C_mel, T_frame]

# generate waveform from mel
with torch.inference_mode():
    wav_gen = model(mel) # wav_gen is FloatTensor with shape [B(1), 1, T_time] and values in [-1, 1]
wav_gen_float = wav_gen.squeeze(0).cpu() # wav_gen_float is FloatTensor with shape [1, T_time]

# you can convert the generated waveform to 16 bit linear PCM
wav_gen_int16 = (wav_gen_float * 32767.0).numpy().astype('int16') # wav_gen_int16 is np.ndarray with shape [1, T_time] and int16 dtype
```
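The final step above scales floating-point samples in [-1, 1] by 32767 to obtain 16-bit linear PCM. The same arithmetic can be sketched with the standard library alone; `float_to_pcm16` is a hypothetical helper, and the clipping step is an added safeguard that the snippet above does not perform.

```python
def float_to_pcm16(samples):
    """Scale floats in [-1, 1] to 16-bit integers, truncating toward zero."""
    pcm = []
    for value in samples:
        # Clipping is an addition of this sketch: it keeps slightly
        # out-of-range values from overflowing the int16 range.
        clipped = max(-1.0, min(1.0, value))
        # int() truncates toward zero, like a C-style float-to-int cast.
        pcm.append(int(clipped * 32767.0))
    return pcm


print(float_to_pcm16([0.0, 0.5, 1.0, -1.0, 1.2]))
# [0, 16383, 32767, -32767, 32767]
```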

--------------------------------

### Dataset Annotation Example

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/docs/cn/README.md

An example of a .list file entry, showing the expected format for a Chinese TTS annotation.

```plaintext
D:\GPT-SoVITS\xxx/xxx.wav|xxx|zh|我爱玩原神.
```

--------------------------------

### Inference for 16kHz with Output Directory

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/tools/AP_BWE_main/README.md

Example command for 16kHz inference, specifying the generator checkpoint and a custom output directory for the generated audio files.

```bash
python inference_16k.py --checkpoint_file ../checkpoints/2kto16k/g_2kto16k --output_dir ../generated_files/2kto16k
```

--------------------------------

### Install Requirements for v2

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Update the necessary packages when upgrading to v2. Make sure you have cloned the latest code from GitHub first.

```bash
pip install -r requirements.txt
```

--------------------------------

### Install BigVGAN Dependencies

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Commands to clone the BigVGAN repository and install its Python dependencies using pip. This should be run after activating the conda environment.

```shell
git clone https://github.com/NVIDIA/BigVGAN
cd BigVGAN
pip install -r requirements.txt
```

--------------------------------

### Dataset Format Example

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Defines the required format for training data, which is a pipe-separated list of audio file path, speaker name, language, and transcribed text. Supports multiple languages.

```text
# Format: audio_path|speaker_name|language|text
# Languages: zh (Chinese), en (English), ja (Japanese), ko (Korean), yue (Cantonese)

# Example training list file (training.list):
/data/audio/speaker1_001.wav|speaker1|en|Hello, this is a sample sentence.
/data/audio/speaker1_002.wav|speaker1|en|Another example of training data.
/data/audio/speaker1_003.wav|speaker1|zh|这是一个中文样本。
```
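A list file in this format can be checked before training with a few lines of Python. The sketch below is illustrative only: `parse_list_line` and `parse_list_text` are hypothetical helpers, the language set is copied from the comment above, and skipping `#` lines is a convenience of the sketch (whether GPT-SoVITS itself tolerates comment lines is not confirmed here).

```python
SUPPORTED_LANGUAGES = {"zh", "en", "ja", "ko", "yue"}


def parse_list_line(line):
    """Split one `audio_path|speaker_name|language|text` entry into a dict."""
    # maxsplit=3 keeps any '|' that appears inside the transcript itself.
    parts = line.rstrip("\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {len(parts)}")
    audio_path, speaker, language, text = parts
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    return {"audio_path": audio_path, "speaker": speaker,
            "language": language, "text": text}


def parse_list_text(content):
    """Parse several entries, skipping blank lines and '#' lines (sketch-only rule)."""
    entries = []
    for line in content.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        entries.append(parse_list_line(line))
    return entries


example = """/data/audio/speaker1_001.wav|speaker1|en|Hello, this is a sample sentence.
/data/audio/speaker1_003.wav|speaker1|zh|这是一个中文样本。
"""
for entry in parse_list_text(example):
    print(entry["language"], entry["audio_path"])
```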

--------------------------------

### Start UVR5 Vocal Separation WebUI

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Launches the WebUI for UVR5, a tool for separating vocals from accompaniment in audio files. Requires specifying the device, precision mode, and port.

```bash
# Start UVR5 WebUI
python tools/uvr5/webui.py "cuda" True 9873

# Parameters:
# - Device: "cuda" or "cpu"
# - is_half: True for half precision (faster on GPU)
# - port: WebUI port number
```
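Because the three parameters are positional, their order matters. A small Python sketch that assembles the argument list in the order shown above; `uvr5_command` is a hypothetical helper, and turning off half precision on CPU is an assumption of this sketch rather than documented behavior.

```python
import sys


def uvr5_command(device="cuda", is_half=True, port=9873):
    """Return argv for tools/uvr5/webui.py: device, is_half, port (positional)."""
    if device not in ("cuda", "cpu"):
        raise ValueError("device must be 'cuda' or 'cpu'")
    # Assumption of this sketch: half precision targets GPU use, so it is
    # switched off when the CPU is selected.
    if device == "cpu":
        is_half = False
    return [sys.executable, "tools/uvr5/webui.py", device, str(is_half), str(port)]


print(uvr5_command()[1:])
# ['tools/uvr5/webui.py', 'cuda', 'True', '9873']
```

The list can be passed to `subprocess.run` from the repository root.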

--------------------------------

### Launch GPT-SoVITS Web UI

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/Colab-Inference.ipynb

Starts the GPT-SoVITS web interface using the webui.py script within the activated Conda environment. Set is_share to True to create a public Gradio link.

```shell
!cd /content/GPT-SoVITS && source activate GPTSoVITS && export is_share=True && python webui.py

```

--------------------------------

### Basic TTS Inference via GET Request

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Synthesizes speech from text using the default reference audio configured at server startup.

````APIDOC
## Basic TTS Inference via GET Request

Synthesize speech from text using the default reference audio configured at server startup.

### Method
GET

### Endpoint
`http://127.0.0.1:9880/`

### Query Parameters
- **text** (string) - Required - The text to synthesize.
- **text_language** (string) - Required - The language of the text.
- **cut_punc** (string) - Optional - Punctuation character to split sentences.

### Request Example
```bash
# Simple TTS request using default reference audio
curl "http://127.0.0.1:9880?text=Hello%20world%2C%20this%20is%20a%20test.&text_language=en"

# With custom punctuation splitting
curl "http://127.0.0.1:9880?text=First%20sentence.%20Second%20sentence.&text_language=en&cut_punc=."
```

### Response
#### Success Response (200)
- **audio** (binary) - The synthesized audio stream.

#### Response Example
(Binary audio data)
````

--------------------------------

### Instantiate BigVGAN with CUDA Kernel

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Enable the fast CUDA inference kernel when initializing the BigVGAN model. This requires CUDA to be installed and compatible with your PyTorch build.

```python
generator = BigVGAN(h, use_cuda_kernel=True)
```

--------------------------------

### Successful CUDA Kernel Test Output

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Example output indicating a successful test of the CUDA fused kernel against the plain PyTorch BigVGAN inference. A low mean difference confirms correctness.

```shell
loading plain Pytorch BigVGAN
...
loading CUDA kernel BigVGAN with auto-build
Detected CUDA files, patching ldflags
Emitting ninja build file /path/to/your/BigVGAN/alias_free_activation/cuda/build/build.ninja..
Building extension module anti_alias_activation_cuda...
...
Loading extension module anti_alias_activation_cuda...
...
Loading '/path/to/your/bigvgan_generator.pt'
...
[Success] test CUDA fused vs. plain torch BigVGAN inference
 > mean_difference=0.0007238413265440613
...
```

--------------------------------

### Basic TTS Inference via GET Request

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Performs a simple text-to-speech synthesis request using the default reference audio. Supports custom punctuation splitting for more control over sentence segmentation.

```bash
# Simple TTS request using default reference audio
curl "http://127.0.0.1:9880?text=Hello%20world%2C%20this%20is%20a%20test.&text_language=en"
```

```bash
# With custom punctuation splitting
curl "http://127.0.0.1:9880?text=First%20sentence.%20Second%20sentence.&text_language=en&cut_punc=."
```
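The percent-encoded query strings in the curl commands above do not have to be written by hand. A minimal Python sketch using only the standard library; `build_tts_url` is a hypothetical helper, and the default base URL simply mirrors the examples.

```python
from urllib.parse import quote, urlencode


def build_tts_url(text, text_language, cut_punc=None, base="http://127.0.0.1:9880"):
    """Build the GET URL for a basic TTS request with percent-encoded parameters."""
    params = {"text": text, "text_language": text_language}
    if cut_punc is not None:
        params["cut_punc"] = cut_punc
    # quote_via=quote encodes spaces as %20 (the default, quote_plus, uses '+').
    return f"{base}?{urlencode(params, quote_via=quote)}"


print(build_tts_url("Hello world, this is a test.", "en"))
# http://127.0.0.1:9880?text=Hello%20world%2C%20this%20is%20a%20test.&text_language=en
```

The resulting string can be passed to curl or to any HTTP client once the server is running.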

--------------------------------

### Run WebUI for GPT-SoVITS

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Launches the GPT-SoVITS web UI. An optional language parameter can be specified; pass v1 as the first argument to start with V1 models.

```bash
python webui.py <language(optional)>
```

```bash
python webui.py v1 <language(optional)>
```

--------------------------------

### Install Dependencies Manually

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Manually install the project dependencies with pip after activating the GPTSoVits conda environment. The first command installs the packages in extra-req.txt with --no-deps, so their own dependencies are skipped; the second installs everything in requirements.txt normally.

```bash
pip install -r extra-req.txt --no-deps
pip install -r requirements.txt
```

--------------------------------

### Run Inference WebUI for GPT-SoVITS

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

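Launches the GPT-SoVITS inference web UI directly; an optional language parameter can be specified. Alternatively, start the main WebUI with the second command and open the inference page from there.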

```bash
python GPT_SoVITS/inference_webui.py <language(optional)>
```

```bash
python webui.py
```

--------------------------------

### Run UVR5 WebUI from Command Line

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Use this command to open the WebUI for UVR5. Replace placeholders with your specific device, precision, and port settings.

```bash
python tools/uvr5/webui.py "<infer_device>" <is_half> <webui_port_uvr5>
```

--------------------------------

### Change Default Reference Audio (GET)

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Alternatively, change the default reference audio using a GET request with query parameters. This method is less common for state changes.

```python
import requests

# GET method: the new defaults are passed as query parameters
requests.get(
    "http://127.0.0.1:9880/change_refer",
    params={
        "refer_wav_path": "new_reference.wav",
        "prompt_text": "New reference text.",
        "prompt_language": "en"
    }
)
```

--------------------------------

### Build Docker Image Locally

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Build the GPT-SoVITS Docker image locally. Specify the CUDA version and whether to use the lite version.

```bash
bash docker_build.sh --cuda 12.6 --lite
```

--------------------------------

### Launch GPT-SoVITS WebUI

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/Colab-WebUI.ipynb

This command launches the GPT-SoVITS WebUI. It navigates to the project directory, activates the Conda environment, sets the sharing option to true, and runs the webui.py script. This should be run after the environment is fully set up.

```shell
!cd /content/GPT-SoVITS && source activate GPTSoVITS && export is_share=True && python webui.py
```

--------------------------------

### Shutdown API Server

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Shut down the GPT-SoVITS API server using a GET request to the control endpoint. Use with caution.

```bash
# Shutdown the server
curl "http://127.0.0.1:9880/control?command=exit"
```

--------------------------------

### Build Docker Image Locally

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Build the Docker image locally using the provided script. Specify the CUDA version and optionally the '--lite' flag for a lightweight image.

```bash
bash docker_build.sh --cuda <12.6|12.8> [--lite]
```

--------------------------------

### Restart API Server

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Restart the GPT-SoVITS API server using a simple GET request to the control endpoint. Ensure the server is accessible.

```bash
# Restart the server
curl "http://127.0.0.1:9880/control?command=restart"
```

--------------------------------

### Initialize TTS and Run Inference

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Initializes the Text-to-Speech (TTS) system with specified configurations and runs inference to synthesize speech. Ensure reference audio and input parameters are correctly set.

```python
import soundfile as sf

from GPT_SoVITS.TTS_infer_pack.TTS import TTS, TTS_Config

config = TTS_Config({
    "device": "cuda",
    "is_half": True,
    "version": "v2",
    "t2s_weights_path": "GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s1bert25hz-5kh-longer-epoch=12-step=369668.ckpt",
    "vits_weights_path": "GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2G2333k.pth",
    "bert_base_path": "GPT_SoVITS/pretrained_models/chinese-roberta-wwm-ext-large",
    "cnhuhbert_base_path": "GPT_SoVITS/pretrained_models/chinese-hubert-base"
})

tts = TTS(config)

tts.set_ref_audio("reference.wav")

inputs = {
    "text": "Text to synthesize into speech.",
    "text_lang": "en",
    "ref_audio_path": "reference.wav",
    "prompt_text": "Reference audio text.",
    "prompt_lang": "en",
    "top_k": 15,
    "top_p": 1.0,
    "temperature": 1.0,
    "speed_factor": 1.0,
    "parallel_infer": True
}

# tts.run yields (sample_rate, audio) pairs; only the first one is saved here
for sr, audio in tts.run(inputs):
    sf.write("output.wav", audio, sr)
    break
```

--------------------------------

### Change Default Reference Audio

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Update the default reference audio used when no reference is provided in requests. This can be done via POST or GET requests.

````APIDOC
## POST /change_refer

### Description
Updates the default reference audio for voice synthesis.

### Method
POST

### Endpoint
`http://127.0.0.1:9880/change_refer`

### Parameters
#### Request Body
- **refer_wav_path** (string) - Required - Path to the new reference audio file.
- **prompt_text** (string) - Required - Transcription of the reference audio.
- **prompt_language** (string) - Required - Language of the reference audio.

### Request Example
```json
{
  "refer_wav_path": "new_reference.wav",
  "prompt_text": "New reference audio transcription.",
  "prompt_language": "en"
}
```

## GET /change_refer

### Description
Updates the default reference audio for voice synthesis using query parameters.

### Method
GET

### Endpoint
`http://127.0.0.1:9880/change_refer`

### Parameters
#### Query Parameters
- **refer_wav_path** (string) - Required - Path to the new reference audio file.
- **prompt_text** (string) - Required - Transcription of the reference audio.
- **prompt_language** (string) - Required - Language of the reference audio.

### Request Example
```
http://127.0.0.1:9880/change_refer?refer_wav_path=new_reference.wav&prompt_text=New%20reference%20text.&prompt_language=en
```
````
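The POST variant can be prepared from Python without third-party packages. The sketch below only builds the request object and does not send it; `change_refer_request` is a hypothetical helper, and the `Content-Type: application/json` header is an assumption based on the JSON request body shown above.

```python
import json
import urllib.request


def change_refer_request(refer_wav_path, prompt_text, prompt_language,
                         base="http://127.0.0.1:9880"):
    """Build (but do not send) a POST /change_refer request with a JSON body."""
    payload = {
        "refer_wav_path": refer_wav_path,
        "prompt_text": prompt_text,
        "prompt_language": prompt_language,
    }
    return urllib.request.Request(
        f"{base}/change_refer",
        data=json.dumps(payload).encode("utf-8"),
        # Assumed header: the documented body is JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = change_refer_request("new_reference.wav",
                           "New reference audio transcription.", "en")
print(req.get_method(), req.full_url)
# POST http://127.0.0.1:9880/change_refer
```

With the API server running, `urllib.request.urlopen(req)` would submit it.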

--------------------------------

### Create and Activate Conda Environment

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Use this command to create a new conda environment named GPTSoVits with Python 3.10 and then activate it. This is a prerequisite for most installation steps.

```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
```

--------------------------------

### Run Docker Compose Lite

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Use this command to run the GPT-SoVITS service with a smaller Docker image, suitable for resource-constrained environments.

```bash
docker compose run --service-ports GPT-SoVITS-CU126-Lite
```

--------------------------------

### Create Symbolic Links for LibriTTS Dataset

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Prepare the LibriTTS dataset by creating symbolic links to the root directory. This is necessary for the codebase to correctly reference training and validation files.

```shell
cd filelists/LibriTTS && \
ln -s /path/to/your/LibriTTS/train-clean-100 train-clean-100 && \
ln -s /path/to/your/LibriTTS/train-clean-360 train-clean-360 && \
ln -s /path/to/your/LibriTTS/train-other-500 train-other-500 && \
ln -s /path/to/your/LibriTTS/dev-clean dev-clean && \
ln -s /path/to/your/LibriTTS/dev-other dev-other && \
ln -s /path/to/your/LibriTTS/test-clean test-clean && \
ln -s /path/to/your/LibriTTS/test-other test-other && \
cd ../..
```
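The same links can be created from Python, which avoids repeating the dataset path seven times. This is a sketch under the directory layout shown above; `link_libritts` is a hypothetical helper, and skipping links that already exist is a choice made here so the function can be re-run.

```python
import os

LIBRITTS_SUBSETS = ("train-clean-100", "train-clean-360", "train-other-500",
                    "dev-clean", "dev-other", "test-clean", "test-other")


def link_libritts(dataset_root, filelist_dir="filelists/LibriTTS"):
    """Create one symlink per LibriTTS subset inside filelist_dir."""
    created = []
    for subset in LIBRITTS_SUBSETS:
        link_path = os.path.join(filelist_dir, subset)
        # Existing links are left alone so the function can be re-run safely.
        if os.path.islink(link_path):
            continue
        os.symlink(os.path.join(dataset_root, subset), link_path)
        created.append(link_path)
    return created
```

Calling `link_libritts("/path/to/your/LibriTTS")` from the BigVGAN directory would mirror the shell commands above.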

--------------------------------

### Download Models from Hugging Face

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/Colab-Inference.ipynb

Downloads GPT and SoVITS model checkpoints from Hugging Face based on user-defined repository and file paths.

```python
# Modify These
USER_ID = "AkitoP"
REPO_NAME = "GPT-SoVITS-v2-aegi"
BRANCH = "main"
GPT_PATH = "new_aegigoe-e100.ckpt"
SOVITS_PATH = "new_aegigoe_e60_s32220.pth"

# Do Not Modify
HF_BASE = "https://huggingface.co"
REPO_ID = f"{USER_ID}/{REPO_NAME}"
# Use /resolve/ for the raw file; /blob/ URLs return the HTML file viewer page
GPT_URL = f"{HF_BASE}/{REPO_ID}/resolve/{BRANCH}/{GPT_PATH}"
SOVITS_URL = f"{HF_BASE}/{REPO_ID}/resolve/{BRANCH}/{SOVITS_PATH}"

!cd "/content/GPT-SoVITS/GPT_weights" && wget "{GPT_URL}"
!cd "/content/GPT-SoVITS/SoVITS_weights" && wget "{SOVITS_URL}"

```
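On Hugging Face, direct file downloads are served from `/resolve/` URLs, whereas `/blob/` URLs point at the web page that displays the file. A small sketch of the URL construction, using the repository values from the cell above; `hf_resolve_url` is a hypothetical helper.

```python
from urllib.parse import quote


def hf_resolve_url(user_id, repo_name, branch, file_path):
    """Return the direct-download URL for a file in a Hugging Face repository."""
    repo_id = f"{user_id}/{repo_name}"
    # quote() leaves '/' intact by default, so nested paths keep their structure.
    return (f"https://huggingface.co/{repo_id}/resolve/"
            f"{quote(branch)}/{quote(file_path)}")


print(hf_resolve_url("AkitoP", "GPT-SoVITS-v2-aegi", "main", "new_aegigoe-e100.ckpt"))
# https://huggingface.co/AkitoP/GPT-SoVITS-v2-aegi/resolve/main/new_aegigoe-e100.ckpt
```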

--------------------------------

### Train BigVGAN Model with LibriTTS Dataset

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Initiate the training process for a BigVGAN-v2 model using the LibriTTS dataset. This command specifies the configuration file, dataset paths, and checkpoint location.

```shell
python train.py \
--config configs/bigvgan_v2_24khz_100band_256x.json \
--input_wavs_dir filelists/LibriTTS \
--input_training_file filelists/LibriTTS/train-full.txt \
--input_validation_file filelists/LibriTTS/val-full.txt \
--list_input_unseen_wavs_dir filelists/LibriTTS filelists/LibriTTS \
--list_input_unseen_validation_file filelists/LibriTTS/dev-clean.txt filelists/LibriTTS/dev-other.txt \
--checkpoint_path exp/bigvgan_v2_24khz_100band_256x
```

--------------------------------

### Enable CUDA Kernel via Command Line

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/GPT_SoVITS/BigVGAN/README.md

Activate the custom CUDA inference kernel for synthesis scripts by passing the `--use_cuda_kernel` flag. The kernel is built automatically on first use.

```shell
python inference.py --use_cuda_kernel ...
```

```shell
python inference_e2e.py --use_cuda_kernel ...
```

--------------------------------

### Docker Deployment with Docker Compose

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Deploys GPT-SoVITS using Docker Compose, specifying the CUDA version. This command runs the service with exposed ports for accessibility.

```bash
# Using Docker Compose (CUDA 12.6)
docker compose run --service-ports GPT-SoVITS-CU126
```

--------------------------------

### V2 API TTS Synthesis with Streaming

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Performs text-to-speech synthesis using the V2 API with streaming enabled for continuous audio output. This example demonstrates advanced parameters for batch processing, splitting strategies, and streaming modes. Requires the 'requests' library.

```python
import requests

# V2 API TTS request with streaming
response = requests.post(
    "http://127.0.0.1:9880/tts",
    json={
        "text": "This is a long text that will be synthesized with streaming support.",
        "text_lang": "en",
        "ref_audio_path": "reference.wav",
        "prompt_text": "Reference audio transcription.",
        "prompt_lang": "en",
        "top_k": 15,
        "top_p": 1.0,
        "temperature": 1.0,
        "text_split_method": "cut5",  # Splitting strategy
        "batch_size": 1,
        "batch_threshold": 0.75,
        "split_bucket": True,
        "speed_factor": 1.0,
        "fragment_interval": 0.3,
        "seed": 42,
        "parallel_infer": True,
        "repetition_penalty": 1.35,
        "sample_steps": 32,
        "super_sampling": False,
        "streaming_mode": 1,  # 0: disabled, 1: best quality, 2: medium, 3: fast
        "media_type": "wav"
    },
    stream=True
)

# Fail early on an HTTP error so an error body is not saved as audio
response.raise_for_status()

# Save streaming audio
with open("output.wav", "wb") as f:
    for chunk in response.iter_content(chunk_size=1024):
        f.write(chunk)
```
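The chunk-writing loop can be factored into a helper that works with any iterable of byte chunks, which makes it testable without a running server. This is a sketch only; `save_chunks` is a hypothetical name, and skipping empty chunks is a defensive choice made here.

```python
import os
import tempfile


def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if not chunk:  # ignore empty chunks
                continue
            f.write(chunk)
            total += len(chunk)
    return total


# Demo with in-memory chunks instead of a live HTTP response.
with tempfile.TemporaryDirectory() as tmp:
    print(save_chunks([b"RIFF", b"", b"data"], os.path.join(tmp, "demo.bin")))  # 8
```

With a live response, the call would be `save_chunks(response.iter_content(chunk_size=1024), "output.wav")`.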

--------------------------------

### Initialize Python TTS Class

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Demonstrates the initial import statements for using the GPT-SoVITS TTS functionality directly within a Python script. Further usage would involve instantiating TTS and TTS_Config classes.

```python
from GPT_SoVITS.TTS_infer_pack.TTS import TTS, TTS_Config
```

--------------------------------

### Download Models from ModelScope

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/Colab-Inference.ipynb

Downloads GPT and SoVITS model checkpoints from ModelScope using specified user, repository, branch, and file paths.

```python
# Modify These
USER_ID = "aihobbyist"
REPO_NAME = "GPT-SoVits-V2-models"
BRANCH = "master"
GPT_PATH = "Genshin_Impact/EN/GPT_GenshinImpact_EN_5.1.ckpt"
SOVITS_PATH = "Wuthering_Waves/CN/SV_WutheringWaves_CN_1.3.pth"

# Do Not Modify
HF_BASE = "https://www.modelscope.cn/models"
REPO_ID = f"{USER_ID}/{REPO_NAME}"
GPT_URL = f"{HF_BASE}/{REPO_ID}/resolve/{BRANCH}/{GPT_PATH}"
SOVITS_URL = f"{HF_BASE}/{REPO_ID}/resolve/{BRANCH}/{SOVITS_PATH}"

!cd "/content/GPT-SoVITS/GPT_weights" && wget "{GPT_URL}"
!cd "/content/GPT-SoVITS/SoVITS_weights" && wget "{SOVITS_URL}"

```

--------------------------------

### Command Line Interface for Batch TTS Synthesis

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Performs batch TTS synthesis using the command-line interface. Requires specifying paths for GPT and SoVITS models, reference audio, and target text files.

```bash
python GPT_SoVITS/inference_cli.py \
    --gpt_model "GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s1bert25hz-5kh-longer-epoch=12-step=369668.ckpt" \
    --sovits_model "GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2G2333k.pth" \
    --ref_audio "reference_audio.wav" \
    --ref_text "reference_text.txt" \
    --ref_language "英文" \
    --target_text "target_text.txt" \
    --target_language "英文" \
    --output_path "output/"
```
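In the command above, `--ref_text` and `--target_text` are given file paths rather than literal text, so the transcripts have to exist on disk first. A minimal sketch for preparing them; `write_cli_text_files` is a hypothetical helper, the file names mirror the example, and UTF-8 encoding is an assumption of this sketch.

```python
import tempfile
from pathlib import Path


def write_cli_text_files(ref_text, target_text, directory="."):
    """Write the reference and target transcripts to the file names used above."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    ref_path = directory / "reference_text.txt"
    target_path = directory / "target_text.txt"
    # UTF-8 is assumed here so that non-ASCII transcripts survive intact.
    ref_path.write_text(ref_text, encoding="utf-8")
    target_path.write_text(target_text, encoding="utf-8")
    return ref_path, target_path


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    ref, target = write_cli_text_files("Hello, this is a reference text.",
                                       "Text to synthesize.", tmp)
    print(ref.name, target.name)  # reference_text.txt target_text.txt
```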

--------------------------------

### Train 48kHz Model

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/tools/AP_BWE_main/README.md

Initiates the training process for the 48kHz speech bandwidth extension model. Requires a configuration file path. Checkpoints are saved in 'cp_model' by default.

```bash
cd train
CUDA_VISIBLE_DEVICES=0 python train_48k.py --config [config file path]
```

--------------------------------

### Run a Specific Docker Compose Service

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/README.md

Execute a specific GPT-SoVITS service using Docker Compose. Choose between full or lite versions, and different CUDA versions. Ensure you are in the project root directory.

```bash
docker compose run --service-ports <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128>
```
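The four service names follow a regular pattern: the CUDA version without its dot, plus an optional `-Lite` suffix. A small sketch that derives the name, assuming only the versions listed above; `compose_service_name` is a hypothetical helper.

```python
def compose_service_name(cuda_version, lite=False):
    """Map a CUDA version such as '12.6' to a name like GPT-SoVITS-CU126."""
    if cuda_version not in ("12.6", "12.8"):
        raise ValueError("cuda_version must be '12.6' or '12.8'")
    name = "GPT-SoVITS-CU" + cuda_version.replace(".", "")
    return name + "-Lite" if lite else name


print(compose_service_name("12.6", lite=True))  # GPT-SoVITS-CU126-Lite
print(compose_service_name("12.8"))             # GPT-SoVITS-CU128
```

The result is the service argument for `docker compose run --service-ports`.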

--------------------------------

### Train 16kHz Model

Source: https://github.com/rvc-boss/gpt-sovits/blob/main/tools/AP_BWE_main/README.md

Initiates the training process for the 16kHz speech bandwidth extension model. Requires a configuration file path. Checkpoints are saved in 'cp_model' by default.

```bash
cd train
CUDA_VISIBLE_DEVICES=0 python train_16k.py --config [config file path]
```

--------------------------------

### Switch SoVITS Model via API

Source: https://context7.com/rvc-boss/gpt-sovits/llms.txt

Dynamically changes the SoVITS model weights at runtime using a cURL command. Ensure the server is running and the provided path is correct.

```bash
# Switch SoVITS model
curl "http://127.0.0.1:9880/set_sovits_weights?weights_path=GPT_SoVITS/pretrained_models/s2Gv3.pth"
```