### Stage 1: Recognize Audio/Video CLI Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of using the command-line interface to perform stage 1 recognition on an audio or video file. This stage processes the input and saves recognition results.

```bash
python funclip/videoclipper.py --stage 1 \
    --file examples/video.mp4 \
    --output_dir ./output \
    --lang zh
```

--------------------------------

### Audio Input Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/types.md

Demonstrates how to load an audio file using librosa and format it into the expected tuple structure for audio input.

```python
import librosa
wav, sr = librosa.load('audio.wav', sr=None)
audio_input = (sr, wav)  # e.g., (48000, array([...]))
```

--------------------------------

### VideoClipper Argument Setup

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Defines the complete argument parser setup for the VideoClipper, including a detailed list of all available arguments, their types, requirements, defaults, choices, and descriptions.

```APIDOC
## VideoClipper Argument Setup

### Description
This section details the complete argument parser configuration for the VideoClipper, outlining all available command-line arguments.

### Full Argument List
| Argument | Type | Required | Default | Choices | Description |
|----------|------|----------|---------|---------|-------------|
| --stage | int | Yes | — | 1, 2 | Processing stage (1=recognize, 2=clip) |
| --file | str | Yes | — | — | Input file path |
| --sd_switch | str | No | "no" | "no", "yes" | Enable speaker diarization |
| --output_dir | str | No | "./output" | — | Output directory path |
| --dest_text | str | No | None | — | Text to clip (# separated) |
| --dest_spk | str | No | None | — | Speaker to clip (# separated) |
| --start_ost | int | No | 0 | — | Start offset (ms) |
| --end_ost | int | No | 0 | — | End offset (ms) |
| --output_file | str | No | None | — | Output file path |
| --lang | str | No | "zh" | — | Language (zh, en) |
```
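
The table above maps directly onto `argparse` calls. The following is a minimal sketch reconstructed from the table alone. It uses the standard-library `argparse.ArgumentParser` rather than FunClip's own `ArgumentParser` class, the function name `build_parser` is hypothetical, and the help strings are paraphrased from the Description column.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reconstruction of the documented arguments (not FunClip's source)."""
    parser = argparse.ArgumentParser(
        description="ClipVideo Argument",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("--stage", type=int, choices=(1, 2), required=True,
                        help="Processing stage: 1 = recognize, 2 = clip")
    parser.add_argument("--file", type=str, required=True, help="Input file path")
    parser.add_argument("--sd_switch", type=str, choices=("no", "yes"), default="no",
                        help="Enable speaker diarization")
    parser.add_argument("--output_dir", type=str, default="./output",
                        help="Output directory path")
    parser.add_argument("--dest_text", type=str, default=None,
                        help="Text to clip, '#'-separated")
    parser.add_argument("--dest_spk", type=str, default=None,
                        help="Speaker to clip, '#'-separated")
    parser.add_argument("--start_ost", type=int, default=0, help="Start offset (ms)")
    parser.add_argument("--end_ost", type=int, default=0, help="End offset (ms)")
    parser.add_argument("--output_file", type=str, default=None, help="Output file path")
    parser.add_argument("--lang", type=str, default="zh", help="Language (zh, en)")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--stage", "2", "--file", "video.mp4", "--end_ost", "100"])
print(args.stage, args.sd_switch, args.end_ost)
```

Omitted optional arguments fall back to the defaults listed in the table, so `args.sd_switch` is `"no"` and `args.output_dir` is `"./output"` here.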

--------------------------------

### Example Usage of Text2SRT

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/subtitle_utils.md

Demonstrates initializing Text2SRT and calling its text() method to get a formatted string.

```python
t2s = Text2SRT(['我', '是', 'AI'], [[0, 100], [100, 200], [200, 300]])
print(t2s.text())  # "我是 AI"
```
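
The output shown above concatenates the Chinese tokens and puts a space before the Latin token. One plausible joining rule that reproduces it is sketched below. The helper `join_tokens` is hypothetical and is not taken from FunClip's `Text2SRT` implementation, which may treat punctuation or mixed tokens differently.

```python
def join_tokens(tokens):
    """Join tokens: CJK tokens are concatenated, other tokens get a leading space."""
    if isinstance(tokens, str):
        return tokens
    out = ""
    for tok in tokens:
        # Treat a token as CJK only if every character is in the basic CJK block.
        is_cjk = bool(tok) and all("\u4e00" <= ch <= "\u9fff" for ch in tok)
        out += tok if is_cjk else " " + tok
    return out.strip()


print(join_tokens(["我", "是", "AI"]))
```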

--------------------------------

### Batch Processing Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/DOCUMENTATION_SUMMARY.txt

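Demonstrates a pattern for batch processing multiple files with FunClip: loop over a directory and process each file in turn.

```python
# Conceptual example: run stage 1 recognition on every .wav file in a directory.
# Assumption: each input gets its own output directory under output_dir.
import os
import subprocess

for filename in sorted(os.listdir('input_dir')):
    if filename.endswith('.wav'):
        input_path = os.path.join('input_dir', filename)
        output_path = os.path.join('output_dir', filename.replace('.wav', ''))
        os.makedirs(output_path, exist_ok=True)
        # Invoke the documented CLI once per file
        subprocess.run(['python', 'funclip/videoclipper.py', '--stage', '1',
                        '--file', input_path, '--output_dir', output_path],
                       check=True)
```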

--------------------------------

### Launching the Web Interface

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/module-overview.md

Starts a local web server for FunClip, allowing users to interact with the recognition and clipping features through a web browser. The language, ASR model, and port are set with `-l`, `-m`, and `-p`.

```bash
python funclip/launch.py -l zh -m paraformer -p 7860
# Visit localhost:7860 in browser
# Upload file → Recognize → Clip → Download
```

--------------------------------

### Install FunClip Python Requirements

Source: https://github.com/modelscope/funclip/blob/main/README.md

Clone the FunClip repository and install its Python dependencies using pip.

```shell
git clone https://github.com/alibaba-damo-academy/FunClip.git
cd FunClip
pip install -r ./requirements.txt
```

--------------------------------

### Hotword Syntax Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Hotwords are configured as a list of space-separated terms; supplying them improves recognition accuracy for those terms. The first block shows the pattern and the second a concrete example.

```text
热词1 热词2 热词3
```

```text
云栖大会 阿里巴巴 普惠设计
```

--------------------------------

### Install ffmpeg and imagemagick on Ubuntu

Source: https://github.com/modelscope/funclip/blob/main/README.md

Installs ffmpeg and imagemagick on Ubuntu systems and configures ImageMagick policy for read/write access.

```shell
apt-get -y update && apt-get -y install ffmpeg imagemagick
sed -i 's/none/read,write/g' /etc/ImageMagick-6/policy.xml
```

--------------------------------

### Parse Command Line Arguments

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of parsing command-line arguments using the get_parser() function. Demonstrates how to set stage, input file, and output directory.

```python
from funclip.videoclipper import get_parser

parser = get_parser()
args = parser.parse_args([
    '--stage', '1',
    '--file', 'video.mp4',
    '--output_dir', './output'
])

print(args.stage)  # 1
print(args.file)   # 'video.mp4'
```

--------------------------------

### Example Usage of get_commandline_args

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Shows how to use the get_commandline_args function to retrieve parsed command-line arguments. Assumes specific arguments were passed via the CLI and demonstrates accessing them.

```python
from funclip.utils.argparse_tools import get_commandline_args

args = get_commandline_args()
# Assume CLI was: python script.py --stage 1 --file video.mp4
print(args.stage)   # 1
print(args.file)    # 'video.mp4'
```

--------------------------------

### Example Usage of srt()

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/subtitle_utils.md

Demonstrates initializing Text2SRT and calling its srt() method to generate an SRT formatted string.

```python
t2s = Text2SRT(['我', '们'], [[500, 700]], offset=0)
print(t2s.srt(acc_ost=0.0))
```

--------------------------------

### Timestamp List Format Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of the timestamp output format, represented as a Python list of [start, end] time pairs.

```python
[[0, 100], [100, 200], [200, 300], ...]
```

--------------------------------

### Example Usage of time()

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/subtitle_utils.md

Demonstrates initializing Text2SRT and calling its time() method to retrieve the time range with an applied offset.

```python
t2s = Text2SRT(['我', '们'], [[500, 700]], offset=0)
start, end = t2s.time(acc_ost=1.5)
# Returns: (0.5 + 1.5, 0.7 + 1.5) = (2.0, 2.2)
```
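
The arithmetic in the comment above converts milliseconds to seconds and then adds the accumulated offset. A standalone sketch of that calculation follows; the function `time_range` is hypothetical, and subtracting `offset_ms` before the conversion is an assumption about how the `offset` constructor argument is used.

```python
def time_range(timestamps_ms, offset_ms=0, acc_ost=0.0):
    """Return (start_sec, end_sec) for a run of [start_ms, end_ms] pairs."""
    start_ms = timestamps_ms[0][0] - offset_ms   # first token start
    end_ms = timestamps_ms[-1][1] - offset_ms    # last token end
    return start_ms / 1000.0 + acc_ost, end_ms / 1000.0 + acc_ost


start, end = time_range([[500, 700]], offset_ms=0, acc_ost=1.5)
print(start, end)
```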

--------------------------------

### Argparse Setup for Gradio Launch Script

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Sets up the argument parser for the Gradio launch script, defining arguments for language, ASR model, sharing, port, and listening options.

```python
import argparse

parser = argparse.ArgumentParser(description='argparse testing')
parser.add_argument('--lang', '-l', type=str, default="zh", help="language")
parser.add_argument('--model', '-m', type=str, default="paraformer",
                   choices=["paraformer", "fun-asr-nano", "sensevoice"],
                   help="ASR model")
parser.add_argument('--share', '-s', action='store_true',
                   help="if to establish gradio share link")
parser.add_argument('--port', '-p', type=int, default=7860, help='port number')
parser.add_argument('--listen', action='store_true',
                   help="if to listen to all hosts")
```

--------------------------------

### Initialize VideoClipper ArgumentParser

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Initializes the ArgumentParser for VideoClipper, setting up the description and formatter class. This is part of the complete argument setup for the VideoClipper.

```python
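import argparse
from funclip.utils.argparse_tools import ArgumentParser

def get_parser():
    parser = ArgumentParser(
        description="ClipVideo Argument",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # ... the parser.add_argument(...) calls for each CLI option go here ...
    return parser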
```

--------------------------------

### Sentences List of Dicts Format Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of the sentences output format, where each sentence is a dictionary containing text, timestamp, and speaker information.

```python
[{'text': [...], 'timestamp': [...], 'spk': 0}, ...]
```

--------------------------------

### Raw Text Output Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of the plain text output format for recognition results.

```text
我们 的 设 计 能 力 。
```

--------------------------------

### Sentence Info Structure Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/types.md

Provides an example of the sentence information dictionary, showing recognized Chinese text tokens, their corresponding timestamps, and speaker ID.

```python
sentence = {
    'text': ['我', '爱', '自', '然', '语', '言', '处', '理'],
    'timestamp': [
        [100, 200], [200, 300], [300, 400], [400, 500],
        [500, 600], [600, 700], [700, 800], [800, 900]
    ],
    'spk': 0
}
```
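
Since each token carries its own `[start, end]` pair in milliseconds, the span of a whole sentence is the first token's start and the last token's end. A small sketch, assuming `text` is a list of tokens as in the example above (the helper `sentence_span` is hypothetical):

```python
def sentence_span(sentence):
    """Return (start_ms, end_ms, joined_text) for one sentence dictionary."""
    stamps = sentence["timestamp"]
    return stamps[0][0], stamps[-1][1], "".join(sentence["text"])


sentence = {
    "text": ["我", "爱", "自", "然", "语", "言", "处", "理"],
    "timestamp": [
        [100, 200], [200, 300], [300, 400], [400, 500],
        [500, 600], [600, 700], [700, 800], [800, 900],
    ],
    "spk": 0,
}

start_ms, end_ms, text = sentence_span(sentence)
print(f"spk{sentence['spk']}: {text} ({end_ms - start_ms} ms)")
```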

--------------------------------

### Install imagemagick on macOS

Source: https://github.com/modelscope/funclip/blob/main/README.md

Installs imagemagick on macOS using Homebrew and configures its policy file.

```shell
brew install imagemagick
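# BSD sed (the macOS default) requires an explicit backup suffix after -i; '' means no backup.
# Adjust the versioned Cellar path to match the ImageMagick version Homebrew installed.
sed -i '' 's/none/read,write/g' /usr/local/Cellar/imagemagick/7.1.1-8_1/etc/ImageMagick-7/policy.xml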
```

--------------------------------

### Stage 1: Recognition with Speaker Diarization CLI Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of using the command-line interface for stage 1 recognition with speaker diarization enabled. This allows for speaker identification during the recognition process.

```bash
python funclip/videoclipper.py --stage 1 \
    --file examples/video.mp4 \
    --output_dir ./output \
    --sd_switch yes \
    --lang zh
```

--------------------------------

### State File Content Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/types.md

State files created by write_state() store Python objects using repr() format. This example shows the content of a timestamp file.

```text
[[0, 100], [100, 200], [200, 300]]
```
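
Text written with `repr()` can be read back with `ast.literal_eval`, which accepts only Python literals and is therefore safer than `eval`. The round trip below is a sketch under that assumption: the file name is hypothetical, FunClip's own loader may work differently, and this approach does not cover non-literal state entries such as NumPy arrays or `VideoFileClip` objects.

```python
import ast
import os
import tempfile

timestamp = [[0, 100], [100, 200], [200, 300]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "timestamp")  # hypothetical state file name
    # Write the object in repr() form, as described above.
    with open(path, "w", encoding="utf-8") as f:
        f.write(repr(timestamp))
    # Read it back as a Python literal.
    with open(path, encoding="utf-8") as f:
        restored = ast.literal_eval(f.read())

print(restored == timestamp)
```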

--------------------------------

### Stage 2: Clip with Text Search CLI Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of using the command-line interface to perform stage 2 clipping based on a specific text search. This requires a previously completed stage 1 recognition.

```bash
python funclip/videoclipper.py --stage 2 \
    --file examples/video.mp4 \
    --output_dir ./output \
    --dest_text "待裁剪的文本" \
    --start_ost 0 \
    --end_ost 100 \
    --output_file ./output/clipped.mp4
```

--------------------------------

### State Dictionary Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/types.md

An example illustrating the structure and typical content of the state dictionary, including sample data for recognition results, timestamps, sentences, and optional video metadata.

```python
state = {
    'audio_input': (16000, np.array([...])),
    'recog_res_raw': '我 爱 自 然 语 言 处 理',
    'timestamp': [[0, 100], [100, 200], ...],
    'sentences': [
        {
            'text': ['我', '爱'],
            'timestamp': [[0, 100], [100, 200]],
            'spk': 0
        },
        ...
    ],
    'sd_sentences': [...],
    'video_filename': 'video.mp4',
    'clip_video_file': 'video_clip.mp4',
    'video': VideoFileClip(...)
}
```

--------------------------------

### Valid Offset Syntax Examples

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/errors.md

Illustrates valid and invalid syntax for the per-segment offsets that can be appended to the destination text passed to FunClip's `clip` method.

```python
"我们的设计能力[100, 200]"  # Valid: offsets applied
"我们的设计能力[abc, def]"  # Invalid: non-numeric offsets
"我们的设计能力[100]"       # Invalid: missing second offset
```
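
The three cases above can be checked before calling `clip`. The sketch below is a hypothetical validator, not FunClip's parser: the regular expression, the helper name `split_offsets`, and the acceptance of negative integers are all assumptions.

```python
import re

# Assumed pattern: text followed by "[start, end]" with integer millisecond offsets.
_OFFSET_RE = re.compile(
    r"^(?P<text>[^\[\]]*)\[\s*(?P<start>-?\d+)\s*,\s*(?P<end>-?\d+)\s*\]$"
)


def split_offsets(dest_text):
    """Return (text, start_ost, end_ost); offsets are None if absent or malformed."""
    match = _OFFSET_RE.match(dest_text)
    if match:
        return match["text"], int(match["start"]), int(match["end"])
    # No well-formed suffix: keep only the text before any bracket.
    return dest_text.split("[", 1)[0], None, None


for candidate in ("我们的设计能力[100, 200]", "我们的设计能力[abc, def]", "我们的设计能力[100]"):
    print(split_offsets(candidate))
```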

--------------------------------

### Stage 2: Clip with Speaker Filter CLI Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of using the command-line interface to perform stage 2 clipping based on a specific speaker ID. This requires a previously completed stage 1 recognition with speaker diarization enabled.

```bash
python funclip/videoclipper.py --stage 2 \
    --file examples/video.mp4 \
    --output_dir ./output \
    --dest_spk "spk0" \
    --output_file ./output/spk0.mp4
```

--------------------------------

### Launch Local FunClip Gradio Service

Source: https://github.com/modelscope/funclip/blob/main/README.md

Starts a local Gradio service for FunClip. Use '-m' to specify the ASR model, '-l' for language, '-p' for port, and '-s' for public access.

```shell
python funclip/launch.py
# '-m fun-asr-nano' for Fun-ASR-Nano model (higher accuracy, 31 languages)
# '-m sensevoice' for SenseVoice model (multilingual ASR + emotion + audio event detection)
# '-l en' for English audio recognition
# '-p xxx' for setting the port number
# '-s' for establishing a publicly accessible service (share link)
```

--------------------------------

### SRT File Format Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Shows the format of the total.srt file, including sequence numbers, timestamps, and recognized text. It can also include speaker information.

```plaintext
1
00:00:00,000 --> 00:00:02,500
识别文本

2  spk0
00:00:02,500 --> 00:00:05,000
带说话人信息的文本
```
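
SRT timestamps use the form `HH:MM:SS,mmm`, while recognition timestamps elsewhere in this document are in milliseconds. The following sketch formats one entry in the layout shown above. The helper names are hypothetical, and placing the speaker label after two spaces on the index line simply mirrors the sample.

```python
def ms_to_srt(ms):
    """Format a millisecond count as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(int(ms), 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def srt_entry(index, start_ms, end_ms, text, spk=None):
    """Build one SRT entry; the speaker label is appended to the index line if given."""
    header = f"{index}  {spk}" if spk is not None else str(index)
    return f"{header}\n{ms_to_srt(start_ms)} --> {ms_to_srt(end_ms)}\n{text}\n"


print(srt_entry(1, 0, 2500, "识别文本"))
print(srt_entry(2, 2500, 5000, "带说话人信息的文本", spk="spk0"))
```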

--------------------------------

### Example Usage of Custom ArgumentParser

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Demonstrates how to use the custom ArgumentParser to define and parse command-line arguments for a video clipping task. It shows adding required arguments and accessing parsed values.

```python
import argparse

from funclip.utils.argparse_tools import ArgumentParser

parser = ArgumentParser(
    description="ClipVideo Argument",
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)

parser.add_argument(
    "--stage",
    type=int,
    choices=(1, 2),
    help="Stage, 1 for recognizing and 2 for clipping",
    required=True
)

args = parser.parse_args()
print(f"Stage: {args.stage}")
```

--------------------------------

### Argument Parser Setup

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/DOCUMENTATION_SUMMARY.txt

The get_parser() function returns the argument parser object used by the CLI. This can be useful for understanding or extending the available command-line arguments.

```python
from funclip.videoclipper import get_parser

parser = get_parser()
args = parser.parse_args()
```

--------------------------------

### Launch FunClip with Fun-ASR-Nano or SenseVoice Models

Source: https://github.com/modelscope/funclip/blob/main/README.md

Run this command to try FunClip with the Fun-ASR-Nano model for higher accuracy across 31 languages, or with the SenseVoice model for emotion recognition and audio event detection.

```bash
python funclip/launch.py -m fun-asr-nano
```

```bash
python funclip/launch.py -m sensevoice
```

--------------------------------

### Launch FunClip with Arguments

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/INDEX.md

Use these arguments to launch the FunClip application, specifying the language, model, port, and sharing options.

```bash
# -l: language (zh, en); -m: model (paraformer, fun-asr-nano, sensevoice)
# -p: port number; -s: enable share link; --listen: listen on 0.0.0.0
python funclip/launch.py \
  -l zh \
  -m paraformer \
  -p 7860 \
  -s \
  --listen
```

--------------------------------

### SRT Subtitle Format Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Example of the SRT subtitle format, including sequence numbers and timestamps.

```text
1
00:00:00,000 --> 00:00:02,500
识别结果的第一句

2  spk0
00:00:02,500 --> 00:00:05,000
第二句（带说话人信息）
```
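
Reading such a file back is the inverse operation: split on blank lines, then take the index line, the timing line, and the text. The parser below is a minimal sketch for the layout shown above. The function names are hypothetical, and treating a second field on the index line as the speaker label is an assumption based on the sample.

```python
import re

_TIME_RE = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def _to_ms(stamp):
    """Convert 'HH:MM:SS,mmm' to milliseconds."""
    h, m, s, ms = (int(x) for x in _TIME_RE.fullmatch(stamp.strip()).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def parse_srt(srt_text):
    """Parse SRT text into dicts with index, optional speaker, start/end in ms, and text."""
    entries = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip incomplete blocks
        header = lines[0].split()
        start, end = lines[1].split("-->")
        entries.append({
            "index": int(header[0]),
            "spk": header[1] if len(header) > 1 else None,
            "start_ms": _to_ms(start),
            "end_ms": _to_ms(end),
            "text": "\n".join(lines[2:]),
        })
    return entries


sample = """1
00:00:00,000 --> 00:00:02,500
识别结果的第一句

2  spk0
00:00:02,500 --> 00:00:05,000
第二句（带说话人信息）
"""

for entry in parse_srt(sample):
    print(entry)
```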

--------------------------------

### Launch FunClip with Emotion and Event Detection Model

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Launches the Gradio web interface using the 'sensevoice' model, which supports emotion and audio event detection, via the '-m' argument.

```bash
python funclip/launch.py -m sensevoice
```

--------------------------------

### Listen on All Interfaces Launch

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Launches the Gradio interface to listen on all network interfaces (0.0.0.0) using the --listen flag.

```bash
# All network interfaces
python funclip/launch.py --listen
```

--------------------------------

### SRT File Format Example

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/types.md

Files with .srt extension use the standard SubRip format for subtitle timing and text. This example shows a single subtitle entry.

```text
1
00:00:00,000 --> 00:00:02,500
识别结果的第一句
```

--------------------------------

### Initialize and Recognize Audio with FunClip

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/README.md

This pattern initializes the AutoModel and VideoClipper, setting the language to Chinese for recognition. It requires the `funclip` and `funasr` libraries. Ensure `audio_input` is defined.

```python
from funclip.videoclipper import VideoClipper
from funasr import AutoModel

model = AutoModel(model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
clipper = VideoClipper(model)
clipper.lang = 'zh'

res_text, res_srt, state = clipper.recog(audio_input)
```

--------------------------------

### Run Stage 1 (Recognition) with Videoclipper

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/INDEX.md

Configure and execute the first stage of the videoclipper script, which handles audio recognition. Be sure to specify the input file, output directory, and language.

```bash
# --sd_switch yes enables speaker diarization; --lang accepts zh or en
python funclip/videoclipper.py \
  --stage 1 \
  --file input.mp4 \
  --output_dir ./output \
  --sd_switch yes \
  --lang zh
```

--------------------------------

### Launch with Fun-ASR-Nano Model and Chinese Language

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Launches the Gradio interface using the Fun-ASR-Nano model and Chinese language settings.

```bash
python funclip/launch.py -m fun-asr-nano -l zh
```

--------------------------------

### Launch FunClip for Network Access

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Launches the Gradio web interface to listen on all network interfaces (0.0.0.0) instead of just localhost, enabling remote access via the '--listen' argument.

```bash
python funclip/launch.py --listen
```

--------------------------------

### Launch with Public Share Link

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Launches the Gradio interface and creates a temporary public share link, valid for 72 hours.

```bash
python funclip/launch.py --share
# Creates temporary public URL for 72 hours
```

--------------------------------

### Launch FunClip for English Audio File Recognition

Source: https://github.com/modelscope/funclip/blob/main/README.md

Use this command to enable FunClip's ability to recognize and clip English audio files.

```bash
python funclip/launch.py -l en
```

--------------------------------

### Launch Gradio Interface

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Use this command to launch the Gradio web interface. Options can be appended to customize the launch.

```bash
python funclip/launch.py [options]
```

--------------------------------

### Launch Web Interface on Custom Port

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Launches the Gradio web interface on a specified custom port.

```bash
python funclip/launch.py --port 8080
```

--------------------------------

### CLI Entry Point

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/INDEX.md

The main entry point for the FunClip command-line interface.

```APIDOC
## CLI Entry Point

### `runner(stage, file, ...)`

#### Description
Main CLI entry point that orchestrates the execution based on the specified stage and file.

#### Parameters
- `stage` (int): The processing stage (1 = recognize, 2 = clip).
- `file` (str): The input file path.
- `...`: Additional command-line arguments.

#### Returns
- `None`

### `get_parser()`

#### Description
Create and return the argument parser for the CLI.

#### Returns
- `ArgumentParser`: An instance of `argparse.ArgumentParser`.

### `get_commandline_args()`

#### Description
Parse the command-line arguments provided by the user.

#### Returns
- `Namespace`: An object containing the parsed command-line arguments.
```

--------------------------------

### Get Argument Parser

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Retrieves the ArgumentParser object used for command-line interface arguments. This is essential for understanding and utilizing the CLI options.

```python
def get_parser():
    """Create and return the argument parser for the CLI."""
```

--------------------------------

### Get Time Range as Seconds

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/subtitle_utils.md

Returns the time range as a tuple of (start_seconds, end_seconds), with an optional time accumulation offset applied.

```python
def time(self, acc_ost=0.0):
    """Return (start_seconds, end_seconds), each shifted by acc_ost seconds."""
```

--------------------------------

### Multiple Text Segments with Offsets

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Clips multiple text segments from a video, allowing specification of start and end offsets for each segment.

```bash
python funclip/videoclipper.py \
    --stage 2 \
    --file video.mp4 \
    --output_dir ./asr_results \
    --dest_text "第一段#第二段#第三段" \
    --start_ost 100 \
    --end_ost -50 \
    --output_file ./multi_clips.mp4
```
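
In this command `--dest_text` carries three `#`-separated segments, and the same `--start_ost` and `--end_ost` values (in milliseconds) apply to every segment. The sketch below illustrates the effect under the assumption that the offsets are simply added to each matched start and end time. The function `plan_clips` and the match positions are hypothetical and only stand in for what recognition would return.

```python
def plan_clips(dest_text, matches_ms, start_ost=0, end_ost=0):
    """Pair each '#'-separated segment with its offset-adjusted (start, end) in ms."""
    segments = [seg for seg in dest_text.split("#") if seg]
    plan = []
    for seg in segments:
        start, end = matches_ms[seg]
        # Clamp the start at zero so a negative offset cannot precede the media.
        plan.append((seg, max(0, start + start_ost), end + end_ost))
    return plan


# Hypothetical positions (ms) where each segment was found in the transcript.
matches = {"第一段": [1000, 3000], "第二段": [5000, 8000], "第三段": [9000, 12000]}

plan = plan_clips("第一段#第二段#第三段", matches, start_ost=100, end_ost=-50)
for item in plan:
    print(item)
```

With `start_ost=100` and `end_ost=-50`, each clip starts 100 ms later and ends 50 ms earlier than the matched text.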

--------------------------------

### Timestamp List Structure

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/types.md

Represents time ranges for clipping video segments. Each element is a list containing the start and end time in milliseconds.

```python
[[start_ms, end_ms], [start_ms, end_ms], ...]
```

```python
# Direct timestamps
timestamp_list = [[500, 5850], [7120, 12940], [13240, 25620]]
```

```python
# From text matching
from funclip.utils.trans_utils import proc, pre_proc
ts = proc(recog_res_raw, timestamp, pre_proc("待裁剪文本"))
# Returns frame-unit timestamps, convert to ms: [[start/16, end/16], ...]
```
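
The division by 16 in the last comment is consistent with timestamps counted in audio samples at 16 kHz, where one millisecond spans 16 samples. Assuming that interpretation, the conversion in both directions can be sketched as follows (the helper names are hypothetical):

```python
def samples_to_ms(ts_samples, sample_rate=16000):
    """Convert [[start, end], ...] from sample counts to milliseconds."""
    per_ms = sample_rate // 1000  # 16 samples per millisecond at 16 kHz
    return [[start // per_ms, end // per_ms] for start, end in ts_samples]


def ms_to_samples(ts_ms, sample_rate=16000):
    """Convert [[start, end], ...] from milliseconds to sample counts."""
    per_ms = sample_rate // 1000
    return [[start * per_ms, end * per_ms] for start, end in ts_ms]


print(samples_to_ms([[8000, 93600]]))
print(ms_to_samples([[500, 5850]]))
```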

--------------------------------

### Full Pipeline: Recognition and Clipping in Python

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/module-overview.md

Initializes the AutoModel and VideoClipper, performs speech recognition on an audio file, and then clips the recognized content based on specified text.

```python
from funasr import AutoModel
from funclip.videoclipper import VideoClipper
import librosa

# Initialize
model = AutoModel(model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
clipper = VideoClipper(model)
clipper.lang = 'zh'

# Step 1: Recognize
wav, sr = librosa.load('video.mp4', sr=16000)
res_text, res_srt, state = clipper.recog((sr, wav))

# Step 2: Clip
(sr, clipped), msg, srt = clipper.clip(
    dest_text="待裁剪文本",
    start_ost=0, end_ost=100,
    state=state
)
```

--------------------------------

### Launch FunClip with High-Accuracy Multilingual Model

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Launches the Gradio web interface using the high-accuracy multilingual model 'fun-asr-nano' via the '-m' argument.

```bash
python funclip/launch.py -m fun-asr-nano
```

--------------------------------

### Launch Listening on All Network Interfaces

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Launches the Gradio interface to listen on all network interfaces (0.0.0.0), allowing access from other machines on the network.

```bash
python funclip/launch.py --listen
# Allows access from other machines on the network
```

--------------------------------

### Alibaba Qwen Configuration

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Specifies the service (DashScope/百炼) and supported models for Alibaba Qwen. Configuration requires setting the DashScope API key.

```python
Service: DashScope (百炼)

Models Supported:
- qwen_plus
- qwen_max
- qwen_turbo

API Key: Bailian platform API key

Configuration: Set via `dashscope.api_key = key`
```

--------------------------------

### Get Command Line Arguments

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/DOCUMENTATION_SUMMARY.txt

The get_commandline_args() function is used to retrieve and parse command-line arguments for FunClip. Ensure all required arguments are provided when running the CLI.

```python
from funclip.utils.argparse_tools import get_commandline_args

args = get_commandline_args()
print(args.file)  # input file path passed via --file
```

--------------------------------

### Command Line Interface (CLI)

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/DOCUMENTATION_SUMMARY.txt

Documentation for the command-line interface entry points and argument parsing.

```APIDOC
## Command Line Interface (CLI)

### Description
Entry points and utilities for interacting with FunClip via the command line.

### Functions

#### `runner()`
- **Description**: The main entry point for the CLI.
- **Parameters**: (Details not provided in source)
- **Returns**: (Details not provided in source)

#### `get_parser()`
- **Description**: Retrieves the argument parser configuration.
- **Parameters**: (Details not provided in source)
- **Returns**: (Details not provided in source)

### Stages

#### Stage 1 (Recognition)
- **Description**: Documentation for the recognition stage via CLI.

#### Stage 2 (Clipping)
- **Description**: Documentation for the clipping stage via CLI.

### Gradio Interface
- **Description**: Information on launching the Gradio interface.

### Usage Examples
- **Description**: Complete CLI usage examples are provided.
```

--------------------------------

### Main CLI Entry Point

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/DOCUMENTATION_SUMMARY.txt

The runner() function serves as the main entry point for the FunClip command-line interface (CLI). Use this to execute FunClip from the terminal.

```python
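from funclip.videoclipper import get_parser, runner

if __name__ == "__main__":
    # runner takes the parsed CLI arguments (stage, file, ...)
    args = get_parser().parse_args()
    runner(**vars(args))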
```

--------------------------------

### Clip Video Without Subtitles

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/VideoClipper.md

Use this example to clip a video segment based on specific text content without adding any subtitle overlay. Ensure the 'state' dictionary is correctly populated from video recognition.

```python
# Clip video containing specific text
clip_video_file, message, srt = clipper.video_clip(
    dest_text="待裁剪的文本",
    start_ost=100,
    end_ost=-50,
    state=state,
    add_sub=False
)
```

--------------------------------

### LLM-Based Clipping with OpenAI API

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/module-overview.md

Integrates with an LLM (like OpenAI's GPT) to get clipping suggestions based on subtitles. Parses the LLM response to extract timestamps and then uses these timestamps for precise clipping.

```python
# Assumes `clipper`, `state`, and `res_srt` come from a prior recognition step (clipper.recog)

# Get LLM suggestions
from funclip.llm.openai_api import openai_call
from funclip.utils.trans_utils import extract_timestamps

llm_response = openai_call(
    apikey="sk-...",
    model="gpt-3.5-turbo",
    system_content="你是视频剪辑助手...",
    user_content="这是待裁剪的视频SRT字幕：\n" + res_srt
)

# Parse and clip
timestamps = extract_timestamps(llm_response)
(sr, clipped), msg, srt = clipper.clip(
    dest_text="",
    start_ost=0, end_ost=0,
    state=state,
    timestamp_list=timestamps
)
```
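
The parsing step depends on the reply format the LLM was asked to produce. The sketch below assumes the prompt requests ranges written as `[HH:MM:SS,mmm-HH:MM:SS,mmm]` and returns them in milliseconds. It is a hypothetical stand-in, not FunClip's `extract_timestamps`, whose accepted format and return units may differ.

```python
import re

# Assumed reply format: "[00:00:02,500-00:00:05,000] some text"
_RANGE_RE = re.compile(
    r"\[(\d+):(\d+):(\d+),(\d+)\s*-\s*(\d+):(\d+):(\d+),(\d+)\]"
)


def extract_ranges_ms(llm_text):
    """Return [[start_ms, end_ms], ...] for every well-formed range in the reply."""
    ranges = []
    for match in _RANGE_RE.finditer(llm_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
        end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
        if end > start:  # ignore empty or reversed ranges
            ranges.append([start, end])
    return ranges


reply = "1. [00:00:02,500-00:00:05,000] 第二句\n2. [00:01:00,000-00:01:10,250] 另一句"
print(extract_ranges_ms(reply))
```

Filtering out reversed or empty ranges guards against malformed LLM output before anything is passed on for clipping.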

--------------------------------

### VideoClipper Constructor

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/VideoClipper.md

Initializes a VideoClipper instance with an optional FunASR model for speech recognition.

````APIDOC
## `__init__(funasr_model)`

### Description
Initializes a VideoClipper instance with a FunASR model.

### Parameters
- **funasr_model** (AutoModel | None) - Required - FunASR AutoModel instance for speech recognition. Pass None when only performing clipping operations with pre-computed state.

### Example
```python
from funasr import AutoModel
from funclip.videoclipper import VideoClipper

# Initialize with Chinese ASR model
funasr_model = AutoModel(
    model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    vad_model="damo/speech_fsmn_vad_zh-cn_16k-common-pytorch",
    punc_model="damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
    spk_model="damo/speech_campplus_sv_zh-cn_16k-common",
)
clipper = VideoClipper(funasr_model)
clipper.lang = 'zh'
```
````

--------------------------------

### Launch FunClip on Custom Port with Public Share

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Launches the Gradio web interface on a custom port (8080) and enables a public share link using '-p' and '-s' arguments.

```bash
python funclip/launch.py -p 8080 -s
```

--------------------------------

### Sentence Info Structure Definition

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/types.md

Defines the structure for recognized text, including text content, token-level timestamps, and optional speaker information. Text can be a string or a list of tokens, and timestamps are provided as millisecond start and end times.

```python
{
    'text': str | list[str],
    'timestamp': list[list[int, int]],
    'spk': int | str  # Optional
}
```

--------------------------------

### generate_srt_clip(sentence_list, start, end, begin_index=0, time_acc_ost=0.0)

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/subtitle_utils.md

Generates SRT subtitles for a specific time range within a larger transcript. Sentences that straddle the clip boundaries are trimmed to the range, and the function returns the next subtitle index so that consecutive clips can be numbered sequentially.

````APIDOC
## generate_srt_clip(sentence_list, start, end, begin_index=0, time_acc_ost=0.0)

### Description
Generates SRT subtitles for a specific time range within the full transcript.

### Parameters
- **sentence_list** (list[dict]) - Required - List of sentence dictionaries with 'text' and 'timestamp' keys.
- **start** (float) - Required - Start time in seconds.
- **end** (float) - Required - End time in seconds.
- **begin_index** (int) - Optional - Default: 0 - Starting subtitle index number (for multiple clips, use previous index-1).
- **time_acc_ost** (float) - Optional - Default: 0.0 - Time accumulation offset in seconds for multi-segment concatenation.

### Return Type
`tuple (str, list[tuple], int)`

Returns a tuple of:
- SRT subtitle string for the clipped range
- List of subtitle tuples: [((start_sec, end_sec), text), ...]
- Next index number (for consecutive clips)

### Details
Handles partial sentences that overlap with the clip boundaries. Splits tokens and timestamps appropriately.

### Example
```python
from funclip.utils.subtitle_utils import generate_srt_clip

sentences = [
    {
        'text': ['我', '们', '的', '设', '计', '能', '力'],
        'timestamp': [[100, 200], [200, 300], [300, 400], [400, 500], [500, 600], [600, 700], [700, 800]]
    }
]

# Get subtitles for time range 0.2s to 0.5s
srt, subs, next_idx = generate_srt_clip(
    sentences,
    start=0.2,
    end=0.5,
    begin_index=0,
    time_acc_ost=0.0
)

print(srt)
# Subtitle output with adjusted timestamps

print(subs)
# [((0.2, 0.5), '们的设计'), ...]

print(next_idx)
# 1 (next subtitle index)
```
````

--------------------------------

### Generate VAD Segments from Speaker Diarization

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/trans_utils.md

Generates Voice Activity Detection (VAD) segments, including start and end times in seconds and the corresponding audio data, from speaker diarization results. Requires audio data and speaker diarization information.

```python
from funclip.utils.trans_utils import generate_vad_data

# 'data' is an ndarray of audio data; 'sd_sentences' is a list of dicts, each with a 'ts_list' key
vad_segments = generate_vad_data(data, sd_sentences, sr=16000)
```
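
The behavior described above can be illustrated with a standalone sketch. It is not FunClip's implementation: `build_vad_segments` is a hypothetical name, and the code assumes that each `ts_list` holds `[start_ms, end_ms]` pairs and that the audio is a one-dimensional sequence of samples (a plain list stands in for the ndarray here).

```python
def build_vad_segments(samples, sd_sentences, sr=16000):
    """Return [start_sec, end_sec, audio_slice] for each diarized sentence."""
    segments = []
    for sentence in sd_sentences:
        # Assumption: ts_list is ordered, so the first and last pairs bound the sentence.
        start_sec = round(sentence["ts_list"][0][0] / 1000, 2)
        end_sec = round(sentence["ts_list"][-1][1] / 1000, 2)
        # Convert seconds to sample indices before slicing the audio.
        audio_slice = samples[int(start_sec * sr):int(end_sec * sr)]
        segments.append([start_sec, end_sec, audio_slice])
    return segments


samples = list(range(16000))  # one second of dummy audio at 16 kHz
sd_sentences = [
    {"ts_list": [[0, 250], [250, 500]]},
    {"ts_list": [[500, 1000]]},
]
for start_sec, end_sec, audio_slice in build_vad_segments(samples, sd_sentences):
    print(start_sec, end_sec, len(audio_slice))
```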

--------------------------------

### Public Share Link Launch

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Launches the Gradio interface and creates a temporary public shareable link using the -s flag.

```bash
# Public share link (temporary)
python funclip/launch.py -s
```

--------------------------------

### Generate SRT subtitles for a time clip

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/subtitle_utils.md

Generates SRT subtitles for a specific time range (start to end in seconds) from a list of sentences. It handles partial sentences at boundaries and returns the SRT string, a list of subtitle tuples, and the next index number.

```python
from funclip.utils.subtitle_utils import generate_srt_clip

sentences = [
    {
        'text': ['我', '们', '的', '设', '计', '能', '力'],
        'timestamp': [[100, 200], [200, 300], [300, 400], [400, 500], [500, 600], [600, 700], [700, 800]]
    }
]

# Get subtitles for time range 0.2s to 0.5s
srt, subs, next_idx = generate_srt_clip(
    sentences,
    start=0.2,
    end=0.5,
    begin_index=0,
    time_acc_ost=0.0
)

print(srt)
# Subtitle output with adjusted timestamps

print(subs)
# [((0.2, 0.5), '们的设计'), ...]

print(next_idx)
# 1 (next subtitle index)
```
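
Boundary handling of this kind can be sketched as a token filter over a time window. The helper below is illustrative only: `select_tokens` is a hypothetical name, and the strict-overlap rule (a token that merely touches a boundary is excluded) is an assumption, so the library's own boundary behavior may differ.

```python
def select_tokens(sentence, start_s, end_s):
    """Keep the tokens whose [start_ms, end_ms] span overlaps the window."""
    start_ms, end_ms = round(start_s * 1000), round(end_s * 1000)
    tokens, stamps = [], []
    for token, (tok_start, tok_end) in zip(sentence["text"], sentence["timestamp"]):
        # Assumed rule: strict overlap with the window.
        if tok_end > start_ms and tok_start < end_ms:
            tokens.append(token)
            stamps.append([tok_start, tok_end])
    return tokens, stamps


sentence = {
    "text": ["我", "们", "的", "设", "计", "能", "力"],
    "timestamp": [[100, 200], [200, 300], [300, 400], [400, 500],
                  [500, 600], [600, 700], [700, 800]],
}
tokens, stamps = select_tokens(sentence, 0.25, 0.55)
print("".join(tokens))  # 们的设计
```

The window 0.25 s to 0.55 s cuts through the second and fifth tokens, so both partially covered tokens are kept along with the two between them.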

--------------------------------

### Run Stage 2 (Clipping) with Videoclipper

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/INDEX.md

Configure and execute the second stage of the videoclipper script for clipping video content based on text. Specify input file, output directory, target text, and timing offsets.

```bash
# --start_ost and --end_ost are offsets in milliseconds
python funclip/videoclipper.py \
  --stage 2 \
  --file input.mp4 \
  --output_dir ./output \
  --dest_text "text#to#find" \
  --start_ost 0 \
  --end_ost 100 \
  --output_file output.mp4
```
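
When the stage 2 command is driven from a script, the argument list can be assembled in Python. This is a sketch under stated assumptions: `build_stage2_command` is a hypothetical helper, the flags are the ones shown in the command above, and the target phrases are joined with `#` as in its `--dest_text` value.

```python
import shlex


def build_stage2_command(file, phrases, output_dir="./output",
                         start_ost=0, end_ost=0, output_file=None):
    """Assemble the argv list for a stage 2 (clipping) run."""
    cmd = [
        "python", "funclip/videoclipper.py",
        "--stage", "2",
        "--file", file,
        "--output_dir", output_dir,
        "--dest_text", "#".join(phrases),  # phrases are '#'-separated
        "--start_ost", str(start_ost),     # milliseconds
        "--end_ost", str(end_ost),         # milliseconds
    ]
    if output_file is not None:
        cmd += ["--output_file", output_file]
    return cmd


cmd = build_stage2_command("input.mp4", ["text", "to", "find"],
                           end_ost=100, output_file="output.mp4")
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with the `#` separator.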

--------------------------------

### Initialize VideoClipper with FunASR Model

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/VideoClipper.md

Initializes the VideoClipper with a FunASR AutoModel instance. Set the language attribute ('zh' or 'en') after initialization. This is required for speech recognition functionalities.

```python
from funasr import AutoModel
from funclip.videoclipper import VideoClipper

# Initialize with Chinese ASR model
funasr_model = AutoModel(
    model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    vad_model="damo/speech_fsmn_vad_zh-cn_16k-common-pytorch",
    punc_model="damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
    spk_model="damo/speech_campplus_sv_zh-cn_16k-common",
)
clipper = VideoClipper(funasr_model)
clipper.lang = 'zh'
```

--------------------------------

### SenseVoice Model Launch

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Launches the Gradio interface using the SenseVoice model, which supports emotion and event detection, specified by the -m flag.

```bash
# SenseVoice (emotion + event detection)
python funclip/launch.py -m sensevoice
```

--------------------------------

### Handle Output Directory

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Ensures the output directory is correctly formatted by stripping trailing slashes and creates the directory if it does not exist.

```python
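import os

output_dir = "./output/"  # example value; normally taken from --output_dir
# Strip trailing slashes, then create the directory if it is missing.
while output_dir.endswith('/'):
    output_dir = output_dir[:-1]
if not os.path.exists(output_dir):
    os.mkdir(output_dir)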
```
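
The snippet above relies on `os.mkdir`, which creates a single directory level and raises an error if the parent is missing. As an alternative sketch (not the project's code; `prepare_output_dir` is a hypothetical name), `os.makedirs` with `exist_ok=True` handles nested paths and repeated calls:

```python
import os
import tempfile


def prepare_output_dir(output_dir):
    """Strip trailing slashes and create the directory, including parents."""
    normalized = output_dir.rstrip("/") or "/"  # keep the root path intact
    os.makedirs(normalized, exist_ok=True)      # no error if it already exists
    return normalized


with tempfile.TemporaryDirectory() as tmp:
    created = prepare_output_dir(os.path.join(tmp, "runs", "clip1") + "//")
    print(os.path.isdir(created))  # True
```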

--------------------------------

### FunClip Runner Function Signature

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Defines the main entry point for command-line video/audio clipping operations. It accepts parameters for processing stage, input file, output directory, clipping criteria, and language.

```python
def runner(stage, file, sd_switch, output_dir, dest_text, dest_spk, start_ost, end_ost, 
           output_file, config=None, lang='zh'):
    pass
```

--------------------------------

### Recognition Stage CLI Parameters

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Use these parameters for the recognition stage (Stage 1) of the videoclipper.py runner. The example specifies the input file and output directory, enables speaker diarization, and sets the language to English.

```bash
--stage 1 --file input.mp4 --output_dir ./asr_results --sd_switch yes --lang en
```

--------------------------------

### Full Pipeline for Audio Clipping

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/INDEX.md

Use this pattern for a complete audio recognition and clipping process. It requires importing AutoModel and VideoClipper, and setting the language for recognition.

```python
import librosa
from funasr import AutoModel
from funclip.videoclipper import VideoClipper

model = AutoModel(model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
clipper = VideoClipper(model)
clipper.lang = 'zh'

# Load the audio as a (sample_rate, waveform) tuple
wav, sr = librosa.load('audio.wav', sr=None)
audio_input = (sr, wav)

# Recognize
res_text, res_srt, state = clipper.recog(audio_input)

# Clip
(sr, audio), msg, srt = clipper.clip(
    dest_text="text",
    start_ost=0, end_ost=100,
    state=state
)
```

--------------------------------

### Fun-ASR-Nano Model Launch

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/command_line.md

Launches the Gradio interface with the Fun-ASR-Nano model, selected with the -m flag. The source describes this model as supporting 31 languages with higher accuracy.

```bash
# Fun-ASR-Nano (31 languages, higher accuracy)
python funclip/launch.py -m fun-asr-nano
```

--------------------------------

### Minimal Recognition (Defaults)

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/api-reference/argparse_tools.md

Performs minimal recognition using default settings for output directory, language, and SD switch.

```bash
python funclip/videoclipper.py --stage 1 --file video.mp4
# Uses defaults: output_dir="./output", lang="zh", sd_switch="no"
```
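
How these defaults resolve can be shown with a small `argparse` sketch. It mirrors only the arguments and defaults named above; the real parser lives in FunClip's argparse tools and may define more options or different help text, so treat `build_parser` as a hypothetical stand-in.

```python
import argparse


def build_parser():
    """Sketch of the stage 1 arguments with the defaults listed above."""
    parser = argparse.ArgumentParser(description="Sketch of videoclipper.py arguments")
    parser.add_argument("--stage", type=int, required=True, choices=(1, 2))
    parser.add_argument("--file", type=str, required=True)
    parser.add_argument("--sd_switch", type=str, default="no", choices=("no", "yes"))
    parser.add_argument("--output_dir", type=str, default="./output")
    parser.add_argument("--lang", type=str, default="zh")
    return parser


# Only the two required arguments are supplied; the rest fall back to defaults.
args = build_parser().parse_args(["--stage", "1", "--file", "video.mp4"])
print(args.output_dir, args.lang, args.sd_switch)  # ./output zh no
```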

--------------------------------

### g4f (Free) Configuration

Source: https://github.com/modelscope/funclip/blob/main/_autodocs/configuration.md

Details the configuration for using g4f (free service). No API key is required, but the service is noted as unstable.

```plaintext
Service: gpt4free

Models: Any model available in g4f (gpt-3.5-turbo, gpt-4, etc.)

API Key: None required

Warning: Unstable, may timeout or fail. Retry recommended.
```
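
Because the service may time out or fail, requests are usually wrapped in a retry loop. The sketch below shows one way to do that with exponential backoff. It does not use any g4f API: `request_fn` is a stand-in for whatever call performs the request, and `call_with_retry` is a hypothetical helper.

```python
import time


def call_with_retry(request_fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, retrying with exponential backoff on any exception."""
    last_error = None
    for attempt in range(attempts):
        try:
            return request_fn()
        except Exception as error:  # broad on purpose: provider errors vary
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(f"request failed after {attempts} attempts") from last_error


# Simulated unstable service: fails twice, then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


print(call_with_retry(flaky_request, sleep=lambda seconds: None))  # ok
```

The `sleep` parameter is injectable so the delay can be skipped in tests, as the example does.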