### Install ms-swift

Source: https://github.com/modelscope/ms-swift/blob/main/examples/notebook/qwen2_5-self-cognition/self-cognition-sft.ipynb

Install the ms-swift library using pip. This is a prerequisite for running the subsequent code examples.

```shell
# install ms-swift
pip install ms-swift -U
```

--------------------------------

### Install SWIFT with All Capabilities

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/GetStarted/SWIFT-installation.md

Install SWIFT with all available dependencies for full functionality using pip. This is the most comprehensive installation option.

```shell
pip install 'ms-swift[all]' -U
```

--------------------------------

### Install SWIFT from Git URL

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/GetStarted/SWIFT-installation.md

Install SWIFT directly from a Git repository URL using pip. Options for installing with extra dependencies are also shown.

```shell
pip install git+https://github.com/modelscope/ms-swift.git
```

```shell
pip install "ms-swift[all]@git+https://github.com/modelscope/ms-swift.git"
```

```shell
pip install "git+https://github.com/modelscope/ms-swift.git@release/3.12"
```

```shell
pip install "ms-swift[all]@git+https://github.com/modelscope/ms-swift.git@release/3.12"
```
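
The four commands above differ only in whether an extras group and a branch are present. As a rough illustration, the sketch below assembles the same requirement strings; `git_requirement` is a hypothetical helper written for this example and is not part of ms-swift or pip.

```python
REPO_URL = "git+https://github.com/modelscope/ms-swift.git"


def git_requirement(extras=None, ref=None):
    """Build a pip requirement string for a Git install (hypothetical helper)."""
    url = f"{REPO_URL}@{ref}" if ref else REPO_URL
    # Extras need the "name[extras]@url" direct-reference form.
    return f"ms-swift[{extras}]@{url}" if extras else url


combinations = [(None, None), ("all", None), (None, "release/3.12"), ("all", "release/3.12")]
for extras, ref in combinations:
    print(f'pip install "{git_requirement(extras, ref)}"')
```

Quoting the whole requirement, as the commands above do, keeps the shell from interpreting the square brackets.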

--------------------------------

### Install SWIFT using uv

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/GetStarted/SWIFT-installation.md

Install the SWIFT package using the `uv` package installer. The `--torch-backend=auto` option asks `uv` to select a PyTorch build that matches the detected hardware.

```shell
pip install uv
uv pip install 'ms-swift' --torch-backend=auto
```

--------------------------------

### Install ms-swift with Evaluation Support

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Evaluation.md

Install the ms-swift package with the 'eval' extra to enable evaluation capabilities. Alternatively, install from source.

```shell
pip install 'ms-swift[eval]' -U
```

```shell
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e '.[eval]'
```

--------------------------------

### Install ms-swift from source for Swift 3.x

Source: https://github.com/modelscope/ms-swift/blob/main/README.md

To install a specific version (e.g., Swift 3.x), check out the desired release branch before installing from source.

```shell
# git checkout release/3.12
pip install -e .
```

--------------------------------

### JSON Configuration Example

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Command-line-parameters.md

Example of a JSON configuration file specifying model and dataset.

```json
{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "dataset": "swift/self-cognition#500"
}
```
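
A configuration file like this can also be generated from a script. The following is a minimal sketch, assuming only the two keys shown above; the file name `sft_config.json` and the temporary directory are arbitrary choices for illustration.

```python
import json
import tempfile
from pathlib import Path

config = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "dataset": "swift/self-cognition#500",
}

# Write to a temporary directory; the file name is arbitrary.
config_path = Path(tempfile.mkdtemp()) / "sft_config.json"
config_path.write_text(json.dumps(config, indent=4), encoding="utf-8")

# Read the file back to confirm it is valid JSON with the expected keys.
loaded = json.loads(config_path.read_text(encoding="utf-8"))
print(loaded["model"], loaded["dataset"])
```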

--------------------------------

### YAML Configuration Example

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Command-line-parameters.md

Example of a YAML configuration file specifying model and dataset.

```yaml
model: "Qwen/Qwen2.5-7B-Instruct"
dataset: "swift/self-cognition#500"
```

--------------------------------

### Install ms-swift and Dependencies

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/MLLM-Registration.md

Install the ms-swift library and specific versions of transformers and qwen_omni_utils to avoid compatibility issues.

```shell
pip install "ms-swift>=4.0"

pip install "transformers==4.57.*" "qwen_omni_utils==0.0.8"
```

--------------------------------

### SWIFT Docker Image Examples

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/GetStarted/SWIFT-installation.md

Example Docker image tags for different SWIFT versions and configurations, including regional endpoints.

```text
# swift4.3.2
modelscope-registry.cn-hangzhou.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda13.0.3-py312-torch2.11.0-vllm0.23.0-modelscope1.37.1-swift4.3.2
modelscope-registry.cn-beijing.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda13.0.3-py312-torch2.11.0-vllm0.23.0-modelscope1.37.1-swift4.3.2
modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda13.0.3-py312-torch2.11.0-vllm0.23.0-modelscope1.37.1-swift4.3.2

# swift4.2.3
modelscope-registry.cn-hangzhou.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda13.0.3-py312-torch2.11.0-vllm0.21.0-modelscope1.36.3-swift4.2.3
modelscope-registry.cn-beijing.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda13.0.3-py312-torch2.11.0-vllm0.21.0-modelscope1.36.3-swift4.2.3
modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda13.0.3-py312-torch2.11.0-vllm0.21.0-modelscope1.36.3-swift4.2.3

# swift4.1.3
modelscope-registry.cn-hangzhou.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.9.1-py312-torch2.10.0-vllm0.19.1-modelscope1.35.4-swift4.1.3
modelscope-registry.cn-beijing.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.9.1-py312-torch2.10.0-vllm0.19.1-modelscope1.35.4-swift4.1.3
modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.9.1-py312-torch2.10.0-vllm0.19.1-modelscope1.35.4-swift4.1.3

# swift4.0.3
modelscope-registry.cn-hangzhou.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.8.1-py311-torch2.10.0-vllm0.17.1-modelscope1.34.0-swift4.0.3
modelscope-registry.cn-beijing.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.8.1-py311-torch2.10.0-vllm0.17.1-modelscope1.34.0-swift4.0.3
modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.8.1-py311-torch2.10.0-vllm0.17.1-modelscope1.34.0-swift4.0.3
```
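
Each image tag packs the component versions into hyphen-separated fields. As an illustration, the sketch below splits one of the tags listed above into its parts; `parse_tag` is a hypothetical helper, and it assumes every field is a lowercase name followed directly by a version number.

```python
import re

TAG = ("ubuntu22.04-cuda13.0.3-py312-torch2.11.0-"
       "vllm0.23.0-modelscope1.37.1-swift4.3.2")


def parse_tag(tag):
    """Split a tag into {component: version} pairs (hypothetical helper)."""
    parts = {}
    for field in tag.split("-"):
        # Each field is a lowercase name followed directly by its version.
        match = re.fullmatch(r"([a-z]+)([\d.]+)", field)
        if match is None:
            raise ValueError(f"unexpected field: {field!r}")
        parts[match.group(1)] = match.group(2)
    return parts


versions = parse_tag(TAG)
print(versions["swift"], versions["torch"], versions["cuda"])
```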

--------------------------------

### Build Documentation

Source: https://github.com/modelscope/ms-swift/blob/main/docs/README.md

Run this command in the root directory to build the project's documentation.

```shell
# in root directory:
make docs
```

--------------------------------

### Install SWIFT from Source with All Capabilities (Main Branch)

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/GetStarted/SWIFT-installation.md

After cloning the SWIFT repository, run this command from the repository root to install it in editable mode with all dependencies. This ensures all features are available for development.

```shell
pip install -e '.[all]'
```

--------------------------------

### Start Ray Cluster

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/Instruction/Ray.md

Commands to start a Ray cluster. The head node must be started first, and other nodes connect to it. Single-node setups can omit this step.

```bash
# 1. Start the Ray cluster (can be skipped on a single node)
ray start --head                        # head node
ray start --address=<head_ip>:6379      # other nodes
```

--------------------------------

### Install GPTQ v2 Quantization Dependencies

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Export-and-push.md

Install gptqmodel and optimum for GPTQ v2 quantization. Ensure compatibility with your CUDA setup.

```shell
pip install gptqmodel optimum -U
```

--------------------------------

### Run Swift Training Example

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/BestPractices/Metax-support.md

This command navigates to the ms-swift directory and executes a full training script. It is recommended to use the optimized Swift packages located in the /workspace directory.

```bash
# We assume that the ms-swift code is under /workspace
cd /workspace/ms-swift/
bash examples/train/full/train.sh
```

--------------------------------

### Install MindSpeed/Megatron-SWIFT Dependencies

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/NPU-support.md

Install optional dependencies for MindSpeed and Megatron-LM, including cloning repositories, installing packages, and setting environment variables. This setup is for advanced users requiring enhanced performance.

```shell
# 1. Clone Megatron-LM and switch to v0.16.0
git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM
git checkout core_v0.16.0
cd ..

# 2. Clone and install MindSpeed
git clone https://gitcode.com/Ascend/MindSpeed.git
cd MindSpeed
git checkout core_r0.16.0
pip install -e .
cd ..

# 3. Clone and install mcore-bridge
git clone https://github.com/modelscope/mcore-bridge.git
cd mcore-bridge
pip install -e .
cd ..

# 4. Download and install triton-ascend
pip install triton-ascend==3.2.1 --extra-index-url=https://triton-ascend.osinfra.cn/pypi/simple

# 5. Set environment variables
export PYTHONPATH=$PYTHONPATH:<your_local_megatron_lm_path>
export MEGATRON_LM_PATH=<your_local_megatron_lm_path>

# 6. Disable Megatron GDN if you need to fall back to the transformers GatedDeltaNet implementation
export USE_MCORE_GDN=0
```

--------------------------------

### Example dataset_info.json Configurations

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/Customization/Custom-dataset.md

These examples show different ways to structure the dataset_info.json file for specifying dataset metadata, including dataset IDs, paths, subsets, splits, and column mappings.

```json
[
  {
    "ms_dataset_id": "xxx/xxx"
  },
  {
    "dataset_path": "<dataset_dir/dataset_path>"
  },
  {
    "ms_dataset_id": "<dataset_id>",
    "subsets": ["v1"],
    "split": ["train", "validation"],
    "columns": {
      "input": "query",
      "output": "response"
    }
  },
  {
    "ms_dataset_id": "<dataset_id>",
    "hf_dataset_id": "<hf_dataset_id>",
    "subsets": [{
      "subset": "subset1",
      "columns": {
        "problem": "query",
        "content": "response"
      }
    },
    {
      "subset": "subset2",
      "columns": {
        "messages": "_",
        "new_messages": "messages"
      }
    }]
  }
]
```
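
To make the `columns` entries concrete, here is a minimal sketch of a key-renaming step. It assumes each `columns` entry maps an original column name to the name it should carry after loading; `apply_columns` is a hypothetical helper and does not reproduce ms-swift's own preprocessing.

```python
def apply_columns(record, columns):
    """Rename the keys of one record according to a `columns` mapping."""
    # Keys missing from the mapping keep their original name.
    return {columns.get(key, key): value for key, value in record.items()}


row = {"problem": "What is 2 + 2?", "content": "4"}
print(apply_columns(row, {"problem": "query", "content": "response"}))

# The subset2 pattern: move the old `messages` column to `_`
# so that `new_messages` can take over the `messages` name.
row = {"messages": ["old"], "new_messages": ["new"]}
print(apply_columns(row, {"messages": "_", "new_messages": "messages"}))
```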

--------------------------------

### Launch Web-UI

Source: https://github.com/modelscope/ms-swift/blob/main/README.md

Start the Web-UI, a low-barrier graphical interface for training and deployment. Set the interface language using the `SWIFT_UI_LANG` environment variable.

```shell
SWIFT_UI_LANG=en swift web-ui
```

--------------------------------

### Install ms-swift using pip

Source: https://github.com/modelscope/ms-swift/blob/main/README.md

Use this command to install the ms-swift library using pip. The -U flag ensures you get the latest version.

```shell
pip install ms-swift -U
```

--------------------------------

### Start External vLLM Server

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/BestPractices/GRPO-Multi-Modal-Training.md

Launches an external vLLM server for model inference with `swift rollout`. Ensure `CUDA_VISIBLE_DEVICES` is set correctly for your GPU setup.

```bash
CUDA_VISIBLE_DEVICES=6,7 \
swift rollout \
    --model Qwen/Qwen2.5-VL-3B-Instruct \
    --vllm_data_parallel_size 2
```

--------------------------------

### Elastic Training Command Example

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/Elastic.md

This example demonstrates how to launch a supervised fine-tuning (SFT) job with elastic training enabled. It configures DeepSpeed elasticity and specifies various training parameters.

```bash
model='<your model path>'
dataset='<your dataset>'
output='<your output dir>'
export CUDA_VISIBLE_DEVICES=0 # Set according to the actual GPU usage
# DeepSpeed type or configuration file path, e.g., zero1 or /xxx/ms-swift/swift/llm/ds_config/zero1.json
deepspeed_config_or_type='<deepspeed type or config path>'

dlrover-run --nnodes 1:$NODE_NUM --nproc_per_node=1  \
/opt/conda/lib/python3.10/site-packages/swift/cli/sft.py --model $model \
--model_type qwen3 \
--tuner_type lora  \
--torch_dtype bfloat16 \
--dataset $dataset \
--num_train_epochs 4 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--learning_rate 5e-7 \
--gradient_accumulation_steps 8 \
--eval_steps 500 \
--save_steps 10 \
--save_total_limit 20 \
--logging_steps 1 \
--output_dir $output \
--warmup_ratio 0.01 \
--dataloader_num_workers 4 \
--temperature 1.0 \
--system 'You are a helpful assistant.' \
--lora_rank 8 \
--lora_alpha 32 \
--target_modules all-linear \
--dataset_num_proc 1 \
--use_flash_ckpt true \
--callbacks deepspeed_elastic graceful_exit \
--deepspeed $deepspeed_config_or_type
```

--------------------------------

### SAPO Training Configuration

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/Instruction/GRPO/AdvancedResearch/SAPO.md

Configure SAPO for GRPO training by setting the loss type to 'sapo' and specifying positive and negative temperature parameters. This example shows a typical command-line setup for RLHF training with SAPO.

```bash
swift rlhf \
    --rlhf_type grpo \
    --loss_type sapo \
    --tau_pos 1.0 \
    --tau_neg 1.05 \
    # ... other arguments
```

--------------------------------

### Qwen3-Embedding Custom Instruction Example

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/BestPractices/Embedding.md

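Demonstrates how to use custom instructions with the Qwen3-Embedding model by prepending a system message to the user query. In this sample, the system message reads "Please answer in Chinese and output concise key points" and the user query reads "Introduce Qwen3-Embedding".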

```json
{"messages": [
  {"role": "system", "content": "请用中文回答，并输出简洁要点"},
  {"role": "user", "content": "介绍一下Qwen3-Embedding"}
]}
```
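
Records in this shape can be produced programmatically. The following is a minimal sketch that builds the sample above and serializes it as one JSON line; `build_record` is a hypothetical helper, and only the `messages` field shown in the sample is covered.

```python
import json


def build_record(query, instruction=None):
    """Build one record, optionally with a system instruction."""
    messages = []
    if instruction:
        messages.append({"role": "system", "content": instruction})
    messages.append({"role": "user", "content": query})
    return {"messages": messages}


record = build_record("介绍一下Qwen3-Embedding", "请用中文回答，并输出简洁要点")
# ensure_ascii=False keeps the Chinese text readable in the output.
line = json.dumps(record, ensure_ascii=False)
print(line)
```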

--------------------------------

### GRPO RLHF Training Configuration

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/BestPractices/GRPO.md

Configures and starts a GRPO Reinforcement Learning from Human Feedback (RLHF) training process. This setup utilizes vLLM for inference, a countdown tasks dataset, and specifies various training hyperparameters.

```bash
CUDA_VISIBLE_DEVICES=0,1 \
WANDB_API_KEY=your_wandb_key \
NPROC_PER_NODE=2 \
swift rlhf \
    --rlhf_type grpo \
    --model Qwen/Qwen2.5-3B-Instruct \
    --external_plugins examples/train/grpo/plugin/plugin.py \
    --reward_funcs external_countdown format \
    --use_vllm true \
    --vllm_mode server \
    --vllm_server_host 127.0.0.1 \
    --vllm_server_port 8000 \
    --tuner_type full \
    --torch_dtype bfloat16 \
    --dataset 'zouxuhong/Countdown-Tasks-3to4#50000' \
    --load_from_cache_file true \
    --max_length 2048 \
    --max_completion_length 1024 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --learning_rate 5e-7 \
    --gradient_accumulation_steps 8 \
    --eval_steps 500 \
    --save_steps 100 \
    --save_total_limit 20 \
    --logging_steps 1 \
    --output_dir output/GRPO_COUNTDOWN \
    --warmup_ratio 0.01 \
    --dataloader_num_workers 4 \
    --num_generations 8 \
    --temperature 1.0 \
    --system 'You are a helpful assistant. You first thinks about the reasoning process in the mind and then provides the user with the answer.' \
    --deepspeed zero3 \
    --log_completions true \
    --report_to wandb \
    --beta 0.001 \
    --num_iterations 1
```
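
The batch-related flags in this command interact. As a rough check of the arithmetic, the sketch below assumes that completions per optimizer step are counted as per-device batch size times number of processes times gradient accumulation steps, and that each prompt contributes `num_generations` completions; how ms-swift schedules them internally is not shown in this example.

```python
per_device_train_batch_size = 8
nproc_per_node = 2
gradient_accumulation_steps = 8
num_generations = 8

# Completions consumed per optimizer step across all training processes.
completions_per_step = (
    per_device_train_batch_size * nproc_per_node * gradient_accumulation_steps
)

# Each prompt contributes `num_generations` completions, so the total
# must divide evenly for every prompt group to be complete.
assert completions_per_step % num_generations == 0
prompts_per_step = completions_per_step // num_generations

print(completions_per_step, prompts_per_step)  # 128 16
```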

--------------------------------

### Agent Template Encoding Example

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/Instruction/Agent-support.md

Demonstrates how to encode data with an agent template using the ms-swift library. The snippet uses the model's default agent template (a commented-out line shows how to pass `agent_template` explicitly): it gets a processor and a template, sets the mode to `train`, and encodes the data into `input_ids` and `labels`.

```python
from swift import get_processor, get_template

tokenizer = get_processor('Qwen/Qwen3.5-2B')
template = get_template(tokenizer)  # use the default agent template
# template = get_template(tokenizer, agent_template='qwen3_5')
print(f'agent_template: {template._agent_template}')
data = {...}
template.set_mode('train')
encoded = template.encode(data)
print(f'[INPUT_IDS] {template.safe_decode(encoded["input_ids"])}\n')
print(f'[LABELS] {template.safe_decode(encoded["labels"])}')
```

--------------------------------

### Install mcore-bridge

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Megatron-SWIFT/Quick-start.md

Install the mcore-bridge library. The command to install from the main branch is commented out.

```shell
pip install mcore-bridge -U
# Install from main branch
# pip install git+https://github.com/modelscope/mcore-bridge.git
```

--------------------------------

### Start SWIFT Web UI

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/GetStarted/Web-UI.md

Use this command to launch the SWIFT Web UI for training and inference. Specify the language with the --lang flag.

```shell
swift web-ui --lang zh
# or en
swift web-ui --lang en
```

--------------------------------

### Check Torch Installation

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/Metax-support.md

Verify the installed torch version in the current environment after installation.

```bash
pip list | grep torch
# output:
# torch2.x.x+metax3.x.x.x
```
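
The same check can be made from Python without importing the package itself. This is a minimal sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("torch:", installed_version("torch"))
```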

--------------------------------

### Install ms-swift and Dependencies

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/Qwen3-Best-Practice.md

Install the ms-swift library and essential dependencies for training, including transformers, deepspeed for multi-GPU, liger-kernel for memory saving, and flash-attn for packing.

```bash
pip install ms-swift -U
pip install transformers

pip install deepspeed # for multi-GPU training
pip install liger-kernel # to save GPU memory resources
pip install flash-attn --no-build-isolation  # required for packing
```

--------------------------------

### Launch Training with YAML/JSON

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Command-line-parameters.md

Use `swift sft` with either a YAML or JSON configuration file to launch training jobs. The configuration file will be saved in the output directory.

```shell
swift sft xxx.yaml
```

```shell
swift sft xxx.json
```

--------------------------------

### Install SWIFT from Source (main branch)

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/GetStarted/SWIFT-installation.md

Clone the SWIFT repository and install it in editable mode using pip. This is useful for development or when needing the latest changes.

```shell
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .
```

```shell
uv pip install -e . --torch-backend=auto
```

```shell
pip install -e '.[all]'
```

--------------------------------

### Install SWIFT Wheel Package

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/GetStarted/SWIFT-installation.md

Install the base SWIFT package using pip. Use the '-U' flag to upgrade if already installed.

```shell
pip install 'ms-swift' -U
```

--------------------------------

### Full-Parameter Training Example

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/deepseek-v4.md

This example demonstrates full-parameter training on 64 GPUs. It requires lowering the learning rate and increasing parallelism. Adjust `pipeline_model_parallel_layout` according to your GPU configuration.

```default
--lr 1e-5 \
--min_lr 1e-6 \
--tensor_model_parallel_size 1 \
--expert_model_parallel_size 8 \
--pipeline_model_parallel_size 8 \
--pipeline_model_parallel_layout Et*5|t*5|t*6|t*6|t*6|t*5|t*5|t*5mL
```
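
When adjusting the layout string, it helps to confirm that the number of stages still matches the pipeline size. The sketch below is a rough check under stated assumptions: `|` separates pipeline stages, `t*N` stands for N transformer layers on a stage, and no virtual pipeline stages are used. The other letters in the layout are not interpreted here.

```python
import re

layout = "Et*5|t*5|t*6|t*6|t*6|t*5|t*5|t*5mL"
pipeline_model_parallel_size = 8

stages = layout.split("|")
# Assumption: "t*N" denotes N transformer layers placed on that stage.
layers_per_stage = [
    sum(int(n) for n in re.findall(r"t\*(\d+)", stage)) for stage in stages
]

# One layout segment per pipeline stage.
assert len(stages) == pipeline_model_parallel_size
print(layers_per_stage, sum(layers_per_stage))
```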

--------------------------------

### Install SWIFT using pip

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/GetStarted/SWIFT-installation.md

Install the SWIFT package using pip. Use the `[megatron]`, `[eval]`, or `[all]` options to install additional dependencies.

```shell
pip install 'ms-swift' -U
```

```shell
pip install 'ms-swift[megatron]' -U
```

```shell
pip install 'ms-swift[eval]' -U
```

```shell
pip install 'ms-swift[all]' -U
```

--------------------------------

### Hybrid Configuration: YAML with Command-line Arguments

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Command-line-parameters.md

Combine YAML configuration with command-line arguments for flexibility. This example uses a YAML file for general settings and specifies the adapter path via the command line.

```shell
CUDA_VISIBLE_DEVICES=0 \
swift infer examples/yaml/deepspeed/infer.yaml \
    --adapters output/vx-xxx/checkpoint-xxx
```
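
Conceptually, the hybrid form overlays command-line values on values read from the file. The sketch below illustrates that idea with plain dictionaries, assuming that command-line values take precedence when a key appears in both places; `merge_config` and the `max_new_tokens` entry are hypothetical and do not reflect ms-swift's argument parser or the actual contents of `infer.yaml`.

```python
def merge_config(file_values, cli_values):
    """Overlay command-line values on top of values read from a file."""
    merged = dict(file_values)
    merged.update(cli_values)  # assumed precedence: command line wins
    return merged


# Hypothetical contents of the YAML file, already parsed into a dict.
file_values = {"model": "Qwen/Qwen2.5-7B-Instruct", "max_new_tokens": 512}
cli_values = {"adapters": "output/vx-xxx/checkpoint-xxx"}

print(merge_config(file_values, cli_values))
```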

--------------------------------

### Qwen3-VL Image Description Example

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/Qwen3-VL-Best-Practice.md

Demonstrates how to use Qwen3-VL to describe an image provided via a URL. The model analyzes the image and provides a detailed textual description.

```text
<<< <image>describe the image.
Input an image path or URL <<< http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/cat.png
This is a beautifully detailed, close-up portrait of an adorable tabby kitten, rendered with a soft, painterly effect that gives it a gentle, dreamy quality.

Here's a breakdown of the image:

- **The Kitten:** The subject is a young, fluffy kitten with a classic tabby pattern. Its fur is a mix of white and soft grayish-brown stripes, with a prominent dark stripe running down the center of its forehead and over its nose. The kitten's face is predominantly white, with delicate markings around its eyes and cheeks.

- **The Eyes:** Its most captivating feature is its large, round, and expressive eyes. They are a striking shade of bright blue-gray, with dark pupils that give it an intense, curious, and slightly innocent gaze. The eyes are wide open, suggesting the kitten is alert and attentive.

- **The Expression:** The kitten's expression is sweet and innocent. Its small pink nose and slightly parted mouth give it a gentle, almost pleading look. Its whiskers are long and white, standing out against its fur.

- **The Style:** The image has a soft-focus, artistic quality, reminiscent of impressionist painting. The edges of the kitten's fur are slightly blurred, creating a halo effect that draws attention to its face. The background is softly blurred with muted tones of green and gray, which helps the kitten stand out as the clear focal point.

- **Overall Impression:** The image evokes feelings of warmth, cuteness, and tenderness. The kitten appears to be looking directly at the viewer, creating a sense of connection and affection.

This is a lovely and charming depiction of a young kitten, capturing its innocence and charm in a visually appealing and emotionally engaging way.
```

--------------------------------

### Install SWIFT from Source (release/3.12 branch)

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/GetStarted/SWIFT-installation.md

Clone a specific release branch of the SWIFT repository and install it in editable mode. This is for installing older versions.

```shell
git clone -b release/3.12 https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .
```

```shell
pip install -e '.[all]'
```

--------------------------------

### Install ms-swift and Dependencies

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/Qwen3_5-Best-Practice.md

Install the ms-swift library and essential dependencies for Qwen3.5 model training and inference. Ensure specific versions are used for compatibility.

```shell
pip install -U ms-swift
pip install -U "transformers>=5.9" "qwen_vl_utils>=0.0.14" peft liger-kernel
```

```shell
# flash-linear-attention
# If you encounter slow training issues, please refer to: https://github.com/fla-org/flash-linear-attention/issues/758
# Please use Python 3.12: https://github.com/fla-org/flash-linear-attention/issues/121
pip install -U "flash-linear-attention>=0.4.2" --no-build-isolation
```

```shell
# causal_conv1d
pip install -U git+https://github.com/Dao-AILab/causal-conv1d --no-build-isolation
```

```shell
# flash-attention
pip install "flash-attn==2.8.3" --no-build-isolation
```

```shell
# deepspeed training
pip install deepspeed
```

```shell
# vllm (torch2.10) for inference/deployment/RL
pip install -U "vllm>=0.17.0"
```

--------------------------------

### Install GPTQ Quantization Dependencies

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Export-and-push.md

Install auto_gptq and optimum for GPTQ quantization. Check the AutoGPTQ GitHub repository for the build that matches your CUDA version before installing.

```shell
pip install auto_gptq optimum -U
```

--------------------------------

### InfoNCE Loss Configuration Examples

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source/BestPractices/Embedding.md

Examples of how to configure the InfoNCE loss using environment variables. These settings control temperature, negative sampling, and masking of fake negatives.

```shell
export INFONCE_TEMPERATURE=0.2
export INFONCE_USE_BATCH=False
export INFONCE_HARD_NEGATIVES=10
export INFONCE_MASK_FAKE_NEGATIVE=True
export INFONCE_FAKE_NEG_MARGIN=0.2
export INFONCE_INCLUDE_QQ=True
export INFONCE_INCLUDE_DD=True
```
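
To show what the temperature setting influences, here is a minimal sketch of the generic InfoNCE loss for a single query, computed from similarity scores. It is the textbook formula, not ms-swift's implementation; `info_nce` is a hypothetical function and the similarity scores are made up for illustration.

```python
import math


def info_nce(pos_sim, neg_sims, temperature):
    """Generic InfoNCE loss for one query from similarity scores."""
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    # Negative log-softmax of the positive pair.
    return log_denominator - logits[0]


# Temperature 0.2 mirrors INFONCE_TEMPERATURE above.
print(info_nce(0.9, [0.3, 0.1], temperature=0.2))
# When the positive scores highest, a lower temperature shrinks the loss.
print(info_nce(0.9, [0.3, 0.1], temperature=0.05))
```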

--------------------------------

### Install ms-swift and torch_npu

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/BestPractices/NPU-support.md

Install the ms-swift package and the torch_npu library. Optional packages like deepspeed for memory reduction and evalscope for evaluation features can also be installed.

```shell
conda create -n swift-npu python=3.11 -y
conda activate swift-npu
source /usr/local/Ascend/ascend-toolkit/set_env.sh
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
pip install ms-swift -U
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .
pip install torch_npu==2.9.0 decorator
pip install deepspeed
pip install 'evalscope[opencompass]'
pip install vllm==0.18.0
pip install vllm-ascend==0.18.0
```

--------------------------------

### Multi-GPU Training with Torchrun

Source: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Frequently-asked-questions.md

This example demonstrates how to initiate multi-GPU training using torchrun. Ensure that DeepSpeed and device_map are not used simultaneously.

```bash
torchrun --nproc_per_node <num_gpus> your_script.py --your_args
```
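
Inside the launched script, each worker can find its position in the job through environment variables that `torchrun` sets. The following is a minimal sketch of reading them; `read_dist_env` is a hypothetical helper, and the fallback values cover the case where the script runs without `torchrun`.

```python
import os


def read_dist_env(env=os.environ):
    """Read the rank variables that torchrun exports to each worker."""
    return {
        "rank": int(env.get("RANK", 0)),
        "local_rank": int(env.get("LOCAL_RANK", 0)),
        "world_size": int(env.get("WORLD_SIZE", 1)),
    }


info = read_dist_env()
print(f"rank {info['rank']} of {info['world_size']} (local rank {info['local_rank']})")
```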