### Check LibreOffice Installation

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Run a check for LibreOffice installation using the office document test script.

```bash
python examples/office_document_test.py --check-libreoffice --file dummy
```
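For a quick check outside the test script, the standard library can report whether a LibreOffice executable is on `PATH`. This is only a sketch: the helper name `find_libreoffice` and the candidate binary names (`libreoffice`, `soffice`) are assumptions, and it does not reproduce whatever the test script checks internally.

```python
import shutil
from typing import Optional, Sequence


def find_libreoffice(candidates: Sequence[str] = ("libreoffice", "soffice")) -> Optional[str]:
    """Return the path of the first candidate executable found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    location = find_libreoffice()
    print(f"LibreOffice found at {location}" if location else "LibreOffice not found on PATH")
```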

--------------------------------

### Install RAG-Anything from Source with uv

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

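Install uv, clone the repository, and sync dependencies with `uv sync`. If downloads time out, raise uv's HTTP timeout with `UV_HTTP_TIMEOUT=120`. Pass `--extra <name>` for individual optional features or `--all-extras` for all of them. Commands can be run inside the project environment with `uv run`.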

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

```bash
git clone https://github.com/HKUDS/RAG-Anything.git
```

```bash
cd RAG-Anything
```

```bash
uv sync
```

```bash
UV_HTTP_TIMEOUT=120 uv sync
```

```bash
uv run python examples/raganything_example.py --help
```

```bash
uv sync --extra image --extra text
```

```bash
uv sync --all-extras
```

--------------------------------

### Run End-to-End Document Processing Example

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Execute the main example script for end-to-end document processing with MinerU parser. Requires an API key and a document path.

```bash
python examples/raganything_example.py path/to/document.pdf --api-key YOUR_API_KEY --parser mineru
```

--------------------------------

### Verify MinerU Installation

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Check the installation and configuration of MinerU. The first command verifies the command-line tool, and the second checks programmatically.

```bash
mineru --version
```

```python
from raganything import RAGAnything; rag = RAGAnything(); print('✅ MinerU installed properly' if rag.check_parser_installation() else '❌ MinerU installation issue')
```

--------------------------------

### Install RAG-Anything from PyPI

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Install the RAG-Anything package using pip. Use the '[all]' extra for all optional features, or specify individual features like '[image]' or '[text]'.

```bash
pip install raganything
```

```bash
pip install 'raganything[all]'
```

```bash
pip install 'raganything[image]'
```

```bash
pip install 'raganything[text]'
```

```bash
pip install 'raganything[image,text]'
```

--------------------------------

### Run Direct Multimodal Content Processing Example

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Execute the script for direct multimodal content processing. Requires an API key.

```bash
python examples/modalprocessors_example.py --api-key YOUR_API_KEY
```

--------------------------------

### Check ReportLab Installation

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Run a check for ReportLab installation using the text format test script.

```bash
python examples/text_format_test.py --check-reportlab --file dummy
```
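A similar check can be made from Python without importing the package. The sketch below assumes only that the package is importable under its usual top-level name (`reportlab` for ReportLab, `PIL` for Pillow); the helper name `is_installed` is hypothetical and is not part of RAG-Anything.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("reportlab", "PIL"):
        print(f"{name} available: {is_installed(name)}")
```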

--------------------------------

### Comprehensive Context Configuration (Python)

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

An example of configuring RAGAnything for comprehensive context analysis. This setup uses a larger context window (`context_window=2`), page-based context mode (`context_mode="page"`), and allows up to 3000 context tokens. It includes both headers and captions and considers 'text', 'image', and 'table' content types.

```python
from raganything import RAGAnythingConfig

config = RAGAnythingConfig(
    context_window=2,
    context_mode="page",
    max_context_tokens=3000,
    include_headers=True,
    include_captions=True,
    context_filter_content_types=["text", "image", "table"]
)
```
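To make the page-based mode concrete, the following sketch selects neighbouring items by page distance. It is an illustration under stated assumptions, not the library's implementation: `page_context` is a hypothetical helper, and it assumes every item carries a `page_idx` field as in the MinerU content format.

```python
from typing import Any, Dict, List, Sequence


def page_context(
    items: List[Dict[str, Any]],
    index: int,
    window: int,
    content_types: Sequence[str] = ("text",),
) -> List[Dict[str, Any]]:
    """Return items of the allowed types within `window` pages of items[index]."""
    target_page = items[index]["page_idx"]
    return [
        item
        for position, item in enumerate(items)
        if position != index
        and item["type"] in content_types
        and abs(item["page_idx"] - target_page) <= window
    ]


content = [
    {"type": "text", "text": "Intro", "page_idx": 0},
    {"type": "text", "text": "Methods", "page_idx": 2},
    {"type": "image", "img_path": "images/figure1.jpg", "page_idx": 3},
    {"type": "text", "text": "Appendix", "page_idx": 7},
]

# Context for the image on page 3 with a two-page window on each side
neighbours = page_context(content, index=2, window=2)
print([item["text"] for item in neighbours])  # ['Methods']
```

Widening `content_types` to include `"image"` and `"table"` would let non-text neighbours contribute as well, which mirrors the intent of `context_filter_content_types`.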

--------------------------------

### Configure RAGAnything Pipeline

Source: https://context7.com/hkuds/rag-anything/llms.txt

Instantiate and configure the RAGAnything pipeline using RAGAnythingConfig. All configuration options can be set directly or will be read from environment variables by default. This example demonstrates setting storage paths, parser selection, modality toggles, batch processing options, context-aware settings, and path handling.

```python
from raganything import RAGAnythingConfig

config = RAGAnythingConfig(
    # Storage
    working_dir="./rag_storage",          # WORKING_DIR env var
    parser_output_dir="./output",         # OUTPUT_DIR env var

    # Parser selection
    parser="mineru",                      # PARSER env var: "mineru" | "docling" | "paddleocr"
    parse_method="auto",                  # PARSE_METHOD env var: "auto" | "ocr" | "txt"
    display_content_stats=True,           # DISPLAY_CONTENT_STATS env var

    # Modality toggles
    enable_image_processing=True,         # ENABLE_IMAGE_PROCESSING env var
    enable_table_processing=True,         # ENABLE_TABLE_PROCESSING env var
    enable_equation_processing=True,      # ENABLE_EQUATION_PROCESSING env var

    # Batch processing
    max_concurrent_files=1,               # MAX_CONCURRENT_FILES env var
    recursive_folder_processing=True,     # RECURSIVE_FOLDER_PROCESSING env var
    supported_file_extensions=[".pdf", ".docx", ".pptx", ".jpg", ".png", ".md"],

    # Context-aware multimodal analysis
    context_window=2,                     # CONTEXT_WINDOW env var (pages/chunks around item)
    context_mode="page",                  # CONTEXT_MODE env var: "page" | "chunk"
    max_context_tokens=3000,              # MAX_CONTEXT_TOKENS env var
    include_headers=True,                 # INCLUDE_HEADERS env var
    include_captions=True,                # INCLUDE_CAPTIONS env var
    context_filter_content_types=["text"],# CONTEXT_FILTER_CONTENT_TYPES env var
    content_format="minerU",              # CONTENT_FORMAT env var

    # Path handling
    use_full_path=False,                  # USE_FULL_PATH env var: store basename vs. full path
)

print(config.parser)           # "mineru"
print(config.context_window)   # 2

```
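The "explicit argument, else environment variable, else default" precedence described above can be sketched with a plain dataclass. Everything here is illustrative: `DemoConfig`, `_env_bool`, and the fallback values are assumptions and do not reflect how `RAGAnythingConfig` is implemented or what its defaults are.

```python
import os
from dataclasses import dataclass, field


def _env_bool(name: str, default: bool) -> bool:
    """Parse a boolean environment variable, falling back to `default` if unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass
class DemoConfig:
    """Hypothetical config whose defaults are read from the environment at creation."""

    parser: str = field(default_factory=lambda: os.environ.get("PARSER", "mineru"))
    context_window: int = field(
        default_factory=lambda: int(os.environ.get("CONTEXT_WINDOW", "1"))
    )
    include_headers: bool = field(
        default_factory=lambda: _env_bool("INCLUDE_HEADERS", True)
    )


os.environ["PARSER"] = "docling"
print(DemoConfig().parser)                 # docling (taken from the environment)
print(DemoConfig(parser="mineru").parser)  # mineru (explicit argument wins)
```

Using `default_factory` means the environment is read when the object is created rather than when the class is defined, so variables loaded later (for example from a `.env` file) are still picked up.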

--------------------------------

### Check PIL/Pillow Installation

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Run a check for PIL/Pillow installation using the image format test script.

```bash
python examples/image_format_test.py --check-pillow --file dummy
```

--------------------------------

### Install PaddleOCR Parser Extras

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Install the PaddleOCR parser extras for RAG-Anything using pip or uv. Note that paddlepaddle itself also needs to be installed separately.

```bash
pip install -e ".[paddleocr]"
# or
uv sync --extra paddleocr
```

--------------------------------

### Install RAG-Anything Batch Dependencies

Source: https://github.com/hkuds/rag-anything/blob/main/docs/batch_processing.md

Commands to install the core RAG-Anything package along with necessary dependencies for batch processing and OCR support.

```bash
pip install 'raganything[all]'
pip install tqdm
pip install 'raganything[paddleocr]'
```

--------------------------------

### Install Pandoc Backend Dependencies

Source: https://github.com/hkuds/rag-anything/blob/main/docs/enhanced_markdown.md

Installs Pandoc and wkhtmltopdf, which are optional dependencies for the Pandoc backend. Instructions are provided for Ubuntu/Debian, macOS using Homebrew, and Conda environments.

```bash
# Ubuntu/Debian:
sudo apt-get install pandoc wkhtmltopdf

# macOS:
brew install pandoc wkhtmltopdf

# Or using conda:
conda install -c conda-forge pandoc wkhtmltopdf
```

--------------------------------

### Run Office Document Parsing Test Example

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Execute the test script for parsing office documents using the MinerU parser. This test does not require an API key.

```bash
python examples/office_document_test.py --file path/to/document.docx
```

--------------------------------

### MinerU Content Format Example (JSON)

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

An example illustrating the MinerU format for content sources, which is a list of dictionaries. Each dictionary represents a content item and can include types like 'text' or 'image', along with relevant metadata such as text content, page index, image paths, and captions.

```json
[
    {
        "type": "text",
        "text": "Document content here...",
        "text_level": 1,
        "page_idx": 0
    },
    {
        "type": "image",
        "img_path": "images/figure1.jpg",
        "image_caption": ["Figure 1: Architecture"],
        "image_footnote": [],
        "page_idx": 1
    }
]
```
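Because the format is a plain list of dictionaries, it can be inspected with ordinary Python. The sketch below loads the sample above and summarizes it; the helper name `summarize` is hypothetical, and only the field names shown in the sample (`type`, `page_idx`, `image_caption`) are assumed.

```python
import json
from collections import Counter
from typing import Any, Dict, List

RAW = """
[
    {"type": "text", "text": "Document content here...", "text_level": 1, "page_idx": 0},
    {"type": "image", "img_path": "images/figure1.jpg",
     "image_caption": ["Figure 1: Architecture"], "image_footnote": [], "page_idx": 1}
]
"""


def summarize(items: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Count items per type and collect image captions and page indices."""
    return {
        "counts": dict(Counter(item["type"] for item in items)),
        "pages": sorted({item["page_idx"] for item in items}),
        "captions": [
            caption
            for item in items
            if item["type"] == "image"
            for caption in item.get("image_caption", [])
        ],
    }


summary = summarize(json.loads(RAW))
print(summary)
```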

--------------------------------

### Run Image Format Parsing Test Example

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Execute the test script for parsing image formats using the MinerU parser. This test does not require an API key.

```bash
python examples/image_format_test.py --file path/to/image.bmp
```

--------------------------------

### Install WeasyPrint System Dependencies

Source: https://github.com/hkuds/rag-anything/blob/main/docs/enhanced_markdown.md

Installs the necessary system-level dependencies for the WeasyPrint PDF generation backend on Ubuntu/Debian systems. This includes build tools, Python development headers, and Cairo/Pango libraries.

```bash
sudo apt-get install -y build-essential python3-dev python3-pip \
    python3-setuptools python3-wheel python3-cffi libcairo2 \
    libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf2.0-0 \
    libffi-dev shared-mime-info
```

--------------------------------

### Install System Dependencies for PDF Conversion

Source: https://github.com/hkuds/rag-anything/blob/main/docs/enhanced_markdown.md

Provides shell commands to install necessary system-level dependencies for WeasyPrint and Pandoc on Ubuntu/Debian systems.

```bash
# Ubuntu/Debian: Install system dependencies
sudo apt-get update
sudo apt-get install -y build-essential python3-dev libcairo2 \
    libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf2.0-0 \
    libffi-dev shared-mime-info

# Then reinstall WeasyPrint
pip install --force-reinstall weasyprint
```

```bash
# Check if Pandoc is installed
pandoc --version

# Install Pandoc (Ubuntu/Debian)
sudo apt-get install pandoc wkhtmltopdf
```

--------------------------------

### Run Text Format Parsing Test Example

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Execute the test script for parsing text formats using the MinerU parser. This test does not require an API key.

```bash
python examples/text_format_test.py --file path/to/document.md
```

--------------------------------

### Install RAG-Anything with Enhanced Markdown Dependencies

Source: https://github.com/hkuds/rag-anything/blob/main/docs/enhanced_markdown.md

Installs the RAG-Anything package with all optional dependencies, specifically including those required for enhanced markdown conversion like markdown, weasyprint, and pygments.

```bash
pip install 'raganything[all]'
pip install markdown weasyprint pygments
```

--------------------------------

### Configure BatchParser Settings

Source: https://github.com/hkuds/rag-anything/blob/main/docs/batch_processing.md

Examples for configuring the BatchParser to handle memory constraints, timeouts, and debugging requirements.

```python
import logging

from raganything.batch_parser import BatchParser

# Memory optimization: use fewer parallel workers
batch_parser = BatchParser(max_workers=2)

# Timeout adjustment
batch_parser = BatchParser(timeout_per_file=600)

# Skip installation check
batch_parser = BatchParser(skip_installation_check=True)

# Debug logging
logging.basicConfig(level=logging.DEBUG)
batch_parser = BatchParser(parser_type="mineru", max_workers=2)
```
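How a bounded worker pool and a per-file timeout interact can be sketched with `concurrent.futures`. This is not how `BatchParser` is implemented; `run_batch` and `fake_parse` are hypothetical names, and the sketch has two known simplifications: the timeout is measured from the moment each result is awaited, and a timed-out thread is not cancelled.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Dict, Iterable, Tuple


def run_batch(
    paths: Iterable[str],
    worker: Callable[[str], str],
    max_workers: int = 2,
    timeout_per_file: float = 5.0,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run `worker` over `paths` with a bounded pool; return (results, failures)."""
    results: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {path: pool.submit(worker, path) for path in paths}
        for path, future in futures.items():
            try:
                results[path] = future.result(timeout=timeout_per_file)
            except FutureTimeout:
                failures[path] = "timed out"
            except Exception as exc:  # record the error and keep going
                failures[path] = str(exc)
    return results, failures


def fake_parse(path: str) -> str:
    """Stand-in worker: pretend to parse a file and return an output path."""
    if not path.endswith(".pdf"):
        raise ValueError(f"unsupported file: {path}")
    return path + ".json"


done, failed = run_batch(["a.pdf", "notes.xyz"], fake_parse, max_workers=2)
print(done)
print(failed)
```

Lowering `max_workers` limits how many files are held in memory at once, while one failing file is recorded without stopping the rest of the batch.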

--------------------------------

### Execute RAG-Anything Application

Source: https://github.com/hkuds/rag-anything/blob/main/docs/offline_setup.md

Command to execute the RAG-Anything example script once the offline cache is properly configured.

```bash
uv run examples/raganything_example.py requirements.txt
```

--------------------------------

### Command Line Batch Interface

Source: https://github.com/hkuds/rag-anything/blob/main/docs/batch_processing.md

Examples of using the CLI to trigger batch processing, specify parsers, and perform dry runs.

```bash
python -m raganything.batch_parser examples/sample_docs/ --output ./output --workers 4
python -m raganything.batch_parser examples/sample_docs/ --parser paddleocr --method ocr
python -m raganything.batch_parser examples/sample_docs/ --output ./output --dry-run
```

--------------------------------

### Header Formatting Example (Markdown)

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

Illustrates how headers are formatted when the `include_headers=True` configuration option is enabled. Headers are presented using markdown-style prefixes to denote their level, such as '# Level 1 Header', '## Level 2 Header', and '### Level 3 Header'.

```markdown
# Level 1 Header
## Level 2 Header
### Level 3 Header
```
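The mapping from header level to prefix can be sketched as follows. The helpers `format_header` and `render_headers` are hypothetical, and the sketch assumes that a text item with a `text_level` field (as in the MinerU content format) is a header of that level.

```python
from typing import Any, Dict, List


def format_header(text: str, level: int) -> str:
    """Prefix `text` with `level` hash characters, markdown style."""
    if level < 1:
        raise ValueError("header level must be >= 1")
    return f"{'#' * level} {text}"


def render_headers(items: List[Dict[str, Any]]) -> List[str]:
    """Format every text item that declares a header level."""
    return [
        format_header(item["text"], item["text_level"])
        for item in items
        if item.get("type") == "text" and item.get("text_level")
    ]


sample = [
    {"type": "text", "text": "Level 1 Header", "text_level": 1},
    {"type": "text", "text": "Body paragraph"},
    {"type": "text", "text": "Level 2 Header", "text_level": 2},
]
print("\n".join(render_headers(sample)))
```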

--------------------------------

### Process Directories and Filter Files

Source: https://github.com/hkuds/rag-anything/blob/main/docs/batch_processing.md

Practical examples of processing entire directories recursively and filtering file lists based on supported extensions.

```python
from pathlib import Path

from raganything.batch_parser import BatchParser

# Process directory
batch_parser = BatchParser(max_workers=4)
directory_path = Path("./documents")
result = batch_parser.process_batch(file_paths=[str(directory_path)], output_dir="./processed", recursive=True)

# Filter files
all_files = ["doc1.pdf", "image.png", "spreadsheet.xlsx"]
supported_files = batch_parser.filter_supported_files(all_files)
result = batch_parser.process_batch(file_paths=supported_files, output_dir="./output")
```
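The filtering step amounts to a suffix check. The sketch below shows the idea with `pathlib`; `filter_by_extension` is a hypothetical helper, and the extension set is illustrative rather than the list `BatchParser` actually supports.

```python
from pathlib import Path
from typing import Iterable, List, Set

# Illustrative set only; the real parser decides which extensions it accepts.
EXAMPLE_EXTENSIONS: Set[str] = {".pdf", ".docx", ".pptx", ".jpg", ".png", ".md"}


def filter_by_extension(
    paths: Iterable[str], supported: Set[str] = EXAMPLE_EXTENSIONS
) -> List[str]:
    """Keep paths whose suffix, compared case-insensitively, is in `supported`."""
    return [path for path in paths if Path(path).suffix.lower() in supported]


print(filter_by_extension(["doc1.pdf", "image.PNG", "spreadsheet.xlsx", "README"]))
# ['doc1.pdf', 'image.PNG']
```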

--------------------------------

### Caption Integration Example (Text)

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

Shows the format for integrating image and table captions when `include_captions=True`. Captions are enclosed in square brackets, prefixed with 'Image:' or 'Table:', followed by the caption text.

```text
[Image: Figure 1 caption text]
[Table: Table 1 caption text]
```
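A small sketch of the bracketed caption format, assuming captions are stored in the `image_caption` and `table_caption` lists used elsewhere in this document. The helper `format_captions` is hypothetical and only mirrors the output shown above.

```python
from typing import Any, Dict, List

# Maps an item type to its display label and the field holding its captions.
CAPTION_FIELDS = {
    "image": ("Image", "image_caption"),
    "table": ("Table", "table_caption"),
}


def format_captions(item: Dict[str, Any]) -> List[str]:
    """Return one '[Label: caption]' string per caption; empty for other types."""
    if item.get("type") not in CAPTION_FIELDS:
        return []
    label, key = CAPTION_FIELDS[item["type"]]
    return [f"[{label}: {caption}]" for caption in item.get(key, [])]


print(format_captions({"type": "image", "image_caption": ["Figure 1 caption text"]}))
print(format_captions({"type": "table", "table_caption": ["Table 1 caption text"]}))
```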

--------------------------------

### Chunk-Based Analysis Configuration (Python)

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

An example demonstrating configuration for chunk-based context analysis. This configuration uses a context window of 5 items (`context_window=5`), specifies chunk-based context mode (`context_mode="chunk"`), and sets a maximum of 2000 context tokens. It excludes headers and captions and filters context to only include 'text' content types.

```python
from raganything import RAGAnythingConfig

config = RAGAnythingConfig(
    context_window=5,
    context_mode="chunk",
    max_context_tokens=2000,
    include_headers=False,
    include_captions=False,
    context_filter_content_types=["text"]
)
```
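In chunk mode the window counts neighbouring items rather than pages. The sketch below illustrates that idea together with a crude size limit; `chunk_context` and `truncate_words` are hypothetical helpers, and whitespace-separated words stand in for tokens, which a real tokenizer would count differently.

```python
from typing import Any, Dict, List


def chunk_context(
    items: List[Dict[str, Any]], index: int, window: int
) -> List[Dict[str, Any]]:
    """Return text items among the `window` items before and after items[index]."""
    start = max(0, index - window)
    neighbours = items[start:index] + items[index + 1 : index + 1 + window]
    return [item for item in neighbours if item.get("type") == "text"]


def truncate_words(text: str, max_words: int) -> str:
    """Keep at most `max_words` whitespace-separated words."""
    return " ".join(text.split()[:max_words])


items = [
    {"type": "text", "text": "alpha one"},
    {"type": "table", "table_body": "| a | b |"},
    {"type": "text", "text": "beta two"},
    {"type": "image", "img_path": "images/figure1.jpg"},
    {"type": "text", "text": "gamma three"},
    {"type": "text", "text": "delta four"},
]

context = chunk_context(items, index=3, window=2)
joined = " ".join(item["text"] for item in context)
print(truncate_words(joined, max_words=4))  # beta two gamma three
```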

--------------------------------

### Initialize RAGAnything with Custom LLM Functions

Source: https://context7.com/hkuds/rag-anything/llms.txt

Demonstrates initializing RAGAnything with custom language model and vision model functions, along with an embedding function. Supports passing arbitrary `lightrag_kwargs` for fine-tuning LightRAG behavior.

```python
import asyncio
from functools import partial
from raganything import RAGAnything, RAGAnythingConfig
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc

API_KEY = "sk-..."
BASE_URL = "https://api.openai.com/v1"

def llm_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    return openai_complete_if_cache(
        "gpt-4o-mini", prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        api_key=API_KEY, base_url=BASE_URL, **kwargs,
    )

def vision_func(prompt, system_prompt=None, history_messages=[],
                image_data=None, messages=None, **kwargs):
    if messages:
        return openai_complete_if_cache(
            "gpt-4o", "", messages=messages,
            api_key=API_KEY, base_url=BASE_URL, **kwargs,
        )
    if image_data:
        return openai_complete_if_cache(
            "gpt-4o", "",
            messages=[
                # Add the system message only when one is given, so the list never contains None
                *([{"role": "system", "content": system_prompt}] if system_prompt else []),
                {"role": "user", "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}},
                ]},
            ],
            api_key=API_KEY, base_url=BASE_URL, **kwargs,
        )
    return llm_func(prompt, system_prompt, history_messages, **kwargs)

embed_func = EmbeddingFunc(
    embedding_dim=3072, max_token_size=8192,
    func=partial(openai_embed.func, model="text-embedding-3-large",
                 api_key=API_KEY, base_url=BASE_URL),
)

config = RAGAnythingConfig(working_dir="./rag_storage", parser="mineru")

rag = RAGAnything(
    config=config,
    llm_model_func=llm_func,
    vision_model_func=vision_func,   # enables VLM-enhanced query automatically
    embedding_func=embed_func,
    # Pass any LightRAG kwargs:
    lightrag_kwargs={"max_parallel_insert": 4, "top_k": 60},
)
# No need to call initialize — done lazily on first use.
```

--------------------------------

### Initialize LightRAG with Environment Variables

Source: https://github.com/hkuds/rag-anything/blob/main/docs/offline_setup.md

Python implementation showing how to load environment variables before importing LightRAG to ensure the tiktoken cache path is correctly registered.

```python
import os
import sys
from pathlib import Path
from dotenv import load_dotenv

# Add project root directory to Python path
sys.path.insert(0, str(Path(__file__).parent.parent))

# Load environment variables FIRST - before any imports that use tiktoken
load_dotenv(dotenv_path=".env", override=False)

# Now import LightRAG
from lightrag import LightRAG
from lightrag.utils import logger
```

--------------------------------

### Initialize and Process Documents with RAGAnything

Source: https://github.com/hkuds/rag-anything/blob/main/docs/enhanced_markdown.md

Demonstrates how to initialize the RAGAnything system with specific configuration settings and trigger a complete document processing workflow.

```python
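import asyncio

from raganything import RAGAnything, RAGAnythingConfig

async def main():
    # Initialize the system; model and embedding functions are omitted here
    # for brevity and are shown in the initialization examples
    rag = RAGAnything(config=RAGAnythingConfig(
        working_dir="./storage",
        enable_image_processing=True,
    ))

    # Process the document end to end
    await rag.process_document_complete("document.pdf")

asyncio.run(main())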
```

--------------------------------

### RAGAnything Initialization

Source: https://context7.com/hkuds/rag-anything/llms.txt

Demonstrates how to initialize the RAGAnything class with custom language model functions, vision model functions, and embedding functions. It also shows how to pass additional keyword arguments to the underlying LightRAG instance.

Parameters:

- **config** (RAGAnythingConfig) - Configuration for RAGAnything, including working directory and parser.
- **llm_model_func** (callable) - A function to handle language model completions.
- **vision_model_func** (callable, optional) - A function to handle vision model completions. Enables VLM-enhanced query automatically if provided.
- **embedding_func** (EmbeddingFunc) - An instance of EmbeddingFunc for generating embeddings.
- **lightrag_kwargs** (dict, optional) - Arbitrary keyword arguments to be passed to the underlying LightRAG instance.

The accompanying example is identical to the code under "Initialize RAGAnything with Custom LLM Functions" above.

--------------------------------

### Load and Use Existing LightRAG Instance with RAGAnything

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

This snippet shows how to load a previously saved LightRAG instance and initialize RAGAnything with it. It includes setting up API keys, checking for existing instances, defining custom model functions for text and vision, and then performing a query and processing a new document.

```python
import asyncio
from functools import partial
from raganything import RAGAnything, RAGAnythingConfig
from lightrag import LightRAG
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.utils import EmbeddingFunc
import os

async def load_existing_lightrag():
    # Set up API configuration
    api_key = "your-api-key"
    base_url = "your-base-url"  # Optional

    # First, create or load existing LightRAG instance
    lightrag_working_dir = "./existing_lightrag_storage"

    # Check if previous LightRAG instance exists
    if os.path.exists(lightrag_working_dir) and os.listdir(lightrag_working_dir):
        print("✅ Found existing LightRAG instance, loading...")
    else:
        print("❌ No existing LightRAG instance found, will create new one")

    # Create/load LightRAG instance with your configuration
    lightrag_instance = LightRAG(
        working_dir=lightrag_working_dir,
        llm_model_func=lambda prompt, system_prompt=None, history_messages=[], **kwargs: openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key=api_key,
            base_url=base_url,
            **kwargs,
        ),
        embedding_func=EmbeddingFunc(
            embedding_dim=3072,
            max_token_size=8192,
            func=partial(
                openai_embed.func, 
                model="text-embedding-3-large",
                api_key=api_key,
                base_url=base_url,
            ),
        )
    )

    # Initialize storage (this will load existing data if available)
    await lightrag_instance.initialize_storages()
    await initialize_pipeline_status()

    # Define vision model function for image processing
    def vision_model_func(
        prompt, system_prompt=None, history_messages=[], image_data=None, messages=None, **kwargs
    ):
        # If messages format is provided (for multimodal VLM enhanced query), use it directly
        if messages:
            return openai_complete_if_cache(
                "gpt-4o",
                "",
                system_prompt=None,
                history_messages=[],
                messages=messages,
                api_key=api_key,
                base_url=base_url,
                **kwargs,
            )
        # Traditional single image format
        elif image_data:
            return openai_complete_if_cache(
                "gpt-4o",
                "",
                system_prompt=None,
                history_messages=[],
                messages=[
                    # Add the system message only when a system prompt is given,
                    # so the list never contains None
                    *(
                        [{"role": "system", "content": system_prompt}]
                        if system_prompt
                        else []
                    ),
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt},
                            {
                                "type": "image_url",
                                "image_url": {
                                    "url": f"data:image/jpeg;base64,{image_data}"
                                },
                            },
                        ],
                    },
                ],
                api_key=api_key,
                base_url=base_url,
                **kwargs,
            )
        # Pure text format
        else:
            return lightrag_instance.llm_model_func(prompt, system_prompt, history_messages, **kwargs)

    # Now use existing LightRAG instance to initialize RAGAnything
    rag = RAGAnything(
        lightrag=lightrag_instance,  # Pass existing LightRAG instance
        vision_model_func=vision_model_func,
        # Note: working_dir, llm_model_func, embedding_func, etc. are inherited from lightrag_instance
    )

    # Query existing knowledge base
    result = await rag.aquery(
        "What data has been processed in this LightRAG instance?",
        mode="hybrid"
    )
    print("Query result:", result)

    # Add new multimodal document to existing LightRAG instance
    await rag.process_document_complete(
        file_path="path/to/new/multimodal_document.pdf",
        output_dir="./output"
    )

if __name__ == "__main__":
    asyncio.run(load_existing_lightrag())

```

--------------------------------

### End-to-End Document Processing with RAGAnything

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

This script demonstrates the complete workflow of RAGAnything, from initialization to document processing and querying. It requires setting up API keys, configuring the RAGAnything instance with specific parsing and processing options, and then performing both text and multimodal queries.

```python
import asyncio
from functools import partial
from raganything import RAGAnything, RAGAnythingConfig
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc

async def main():
    # Set up API configuration
    api_key = "your-api-key"
    base_url = "your-base-url"  # Optional

    # Create RAGAnything configuration
    config = RAGAnythingConfig(
        working_dir="./rag_storage",
        parser="mineru",  # Parser selection: mineru, docling, or paddleocr
        parse_method="auto",  # Parse method: auto, ocr, or txt
        enable_image_processing=True,
        enable_table_processing=True,
        enable_equation_processing=True,
    )

    # Define LLM model function
    def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
        return openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key=api_key,
            base_url=base_url,
            **kwargs,
        )

    # Define vision model function for image processing
    def vision_model_func(
        prompt, system_prompt=None, history_messages=[], image_data=None, messages=None, **kwargs
    ):
        # If messages format is provided (for multimodal VLM enhanced query), use it directly
        if messages:
            return openai_complete_if_cache(
                "gpt-4o",
                "",
                system_prompt=None,
                history_messages=[],
                messages=messages,
                api_key=api_key,
                base_url=base_url,
                **kwargs,
            )
        # Traditional single image format
        elif image_data:
            return openai_complete_if_cache(
                "gpt-4o",
                "",
                system_prompt=None,
                history_messages=[],
                messages=[
                    # Add the system message only when a system prompt is given,
                    # so the list never contains None
                    *(
                        [{"role": "system", "content": system_prompt}]
                        if system_prompt
                        else []
                    ),
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt},
                            {
                                "type": "image_url",
                                "image_url": {
                                    "url": f"data:image/jpeg;base64,{image_data}"
                                },
                            },
                        ],
                    },
                ],
                api_key=api_key,
                base_url=base_url,
                **kwargs,
            )
        # Pure text format
        else:
            return llm_model_func(prompt, system_prompt, history_messages, **kwargs)

    # Define embedding function
    embedding_func = EmbeddingFunc(
        embedding_dim=3072,
        max_token_size=8192,
        func=partial(
            openai_embed.func, 
            model="text-embedding-3-large", 
            api_key=api_key,
            base_url=base_url,
        ),
    )

    # Initialize RAGAnything
    rag = RAGAnything(
        config=config,
        llm_model_func=llm_model_func,
        vision_model_func=vision_model_func,
        embedding_func=embedding_func,
    )

    # Process a document
    await rag.process_document_complete(
        file_path="path/to/your/document.pdf",
        output_dir="./output",
        parse_method="auto"
    )

    # Query the processed content
    # Pure text query - for basic knowledge base search
    text_result = await rag.aquery(
        "What are the main findings shown in the figures and tables?",
        mode="hybrid"
    )
    print("Text query result:", text_result)

    # Multimodal query with specific multimodal content
    multimodal_result = await rag.aquery_with_multimodal(
        "Explain this formula and its relevance to the document content",
        multimodal_content=[{
            "type": "equation",
            # Raw string, so that \f in \frac is not read as a form-feed escape
            "latex": r"P(d|q) = \frac{P(q|d) \cdot P(d)}{P(q)}",
            "equation_caption": "Document relevance probability"
        }],
        mode="hybrid"
    )
    print("Multimodal query result:", multimodal_result)

if __name__ == "__main__":
    asyncio.run(main())

```

--------------------------------

### Insert Content List and Query

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Demonstrates inserting a content list with text and table data, then querying the inserted content. If the list contains image items, give each `img_path` as an absolute path.

```python
import asyncio

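from raganything import RAGAnything

async def insert_content_list_example():
    # Construct RAGAnything as in the initialization examples (config,
    # llm_model_func, embedding_func); the arguments are omitted here
    rag = RAGAnything()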

    # Insert a content list with text and table
    content_list = [
        {
            "type": "text",
            "text": "This is the main research paper content.",
            "page_idx": 0
        },
        {
            "type": "table",
            "table_body": "| Section | Summary |\n|---------|---------|\n| Intro | Background |\n| Methods | Approach |",
            "table_caption": ["Research Sections"],
            "page_idx": 1
        }
    ]

    await rag.insert_content_list(
        content_list=content_list,
        file_path="research_paper.pdf",
        split_by_character_only=False,   # Optional text splitting mode
        doc_id=None,                     # Optional custom document ID (will be auto-generated if not provided)
        display_stats=True               # Show content statistics
    )

    # Query the inserted content
    result = await rag.aquery(
        "What are the key findings and performance metrics mentioned in the research?",
        mode="hybrid"
    )
    print("Query result:", result)

    # You can also insert multiple content lists with different document IDs
    another_content_list = [
        {
            "type": "text",
            "text": "This is content from another document.",
            "page_idx": 0  # Page number where this content appears
        },
        {
            "type": "table",
            "table_body": "| Feature | Value |\n|---------|-------|\n| Speed | Fast |\n| Accuracy | High |",
            "table_caption": ["Feature Comparison"],
            "page_idx": 1  # Page number where this table appears
        }
    ]

    await rag.insert_content_list(
        content_list=another_content_list,
        file_path="another_document.pdf",
        doc_id="custom-doc-id-123"  # Custom document ID
    )

if __name__ == "__main__":
    asyncio.run(insert_content_list_example())

```
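Before inserting a hand-built content list, it can help to check that each item carries the keys its type needs. The sketch below is a hypothetical helper, not part of RAG-Anything: `validate_content_list` and `REQUIRED_KEYS` are illustrative names, and the required keys are inferred from the item shapes used in this document's examples rather than from a published schema.

```python
import os

# Assumption: required keys per item type, inferred from the examples in this document.
REQUIRED_KEYS = {
    "text": ("text",),
    "table": ("table_body",),
    "image": ("img_path",),
    "equation": ("latex",),
}


def validate_content_list(content_list):
    """Return a list of problem descriptions; an empty list means nothing was flagged."""
    problems = []
    for i, item in enumerate(content_list):
        item_type = item.get("type")
        if item_type not in REQUIRED_KEYS:
            problems.append(f"item {i}: unknown type {item_type!r}")
            continue
        for key in REQUIRED_KEYS[item_type]:
            if key not in item:
                problems.append(f"item {i}: missing key {key!r}")
        # Image paths are expected to be absolute.
        if item_type == "image" and "img_path" in item and not os.path.isabs(item["img_path"]):
            problems.append(f"item {i}: img_path is not absolute")
    return problems


sample = [
    {"type": "text", "text": "Intro", "page_idx": 0},
    {"type": "image", "img_path": "figures/fig1.png", "page_idx": 1},
    {"type": "table", "table_caption": ["No body"], "page_idx": 2},
]
for problem in validate_content_list(sample):
    print(problem)
```

Running the check before `insert_content_list` surfaces relative image paths and missing table bodies early, instead of during processing.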

--------------------------------

### Initialize and Use BatchParser

Source: https://github.com/hkuds/rag-anything/blob/main/docs/batch_processing.md

Outlines the core BatchParser interface: initialization, listing and filtering supported files, and batch processing in synchronous or asynchronous form. The `...` in each signature stands for parameters omitted here.

```python
class BatchParser:
    def __init__(self, parser_type: str = "mineru", max_workers: int = 4, ...):
        """Initialize batch parser"""

    def get_supported_extensions(self) -> List[str]:
        """Get list of supported file extensions"""

    def filter_supported_files(self, file_paths: List[str], recursive: bool = True) -> List[str]:
        """Filter files to only supported types"""

    def process_batch(self, file_paths: List[str], output_dir: str, ...) -> BatchProcessingResult:
        """Process files in batch"""

    async def process_batch_async(self, file_paths: List[str], output_dir: str, ...) -> BatchProcessingResult:
        """Process files in batch asynchronously"""
```

--------------------------------

### Check Available Backends and Select Specific Backend

Source: https://github.com/hkuds/rag-anything/blob/main/docs/enhanced_markdown.md

Demonstrates how to check the availability of different PDF conversion backends (WeasyPrint, Pandoc) using `get_backend_info()` and how to explicitly select a backend during the conversion process.

```python
from raganything.enhanced_markdown import EnhancedMarkdownConverter

# Check available backends
converter = EnhancedMarkdownConverter()
backend_info = converter.get_backend_info()

print("Available backends:")
for backend, available in backend_info["available_backends"].items():
    status = "✅" if available else "❌"
    print(f"  {status} {backend}")

print(f"Recommended backend: {backend_info['recommended_backend']}")

# Use specific backend
converter.convert_file_to_pdf(
    input_path="document.md",
    output_path="document.pdf",
    method="weasyprint"  # or "pandoc", "pandoc_system", "auto"
)
```
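The dictionary returned by `get_backend_info()` can also drive the backend choice in code. The following is a minimal sketch under stated assumptions: `choose_backend` is a hypothetical helper, the sample dictionary is made-up data shaped like the output printed above, and the backend key names are assumed to match the `method` values.

```python
def choose_backend(backend_info, preferred=("weasyprint", "pandoc")):
    """Return the first preferred backend that is available, else the recommended one."""
    available = backend_info.get("available_backends", {})
    for name in preferred:
        if available.get(name):
            return name
    return backend_info.get("recommended_backend")


# Made-up sample shaped like the dictionary used above.
sample_info = {
    "available_backends": {"weasyprint": False, "pandoc": True},
    "recommended_backend": "pandoc",
}
print(choose_backend(sample_info))
```

The returned name could then be passed as `method=` to `convert_file_to_pdf`, with `"auto"` as a fallback when the helper returns `None`.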

--------------------------------

### Execute Batch Processing

Source: https://github.com/hkuds/rag-anything/blob/main/docs/batch_processing.md

Demonstrates how to initialize the BatchParser and process a list of files or directories with configurable concurrency and progress tracking.

```python
from raganything.batch_parser import BatchParser

batch_parser = BatchParser(
    parser_type="mineru",
    max_workers=4,
    show_progress=True,
    timeout_per_file=300,
    skip_installation_check=False
)

result = batch_parser.process_batch(
    file_paths=["doc1.pdf", "doc2.docx", "folder/"],
    output_dir="./batch_output",
    parse_method="auto",
    recursive=True
)

print(result.summary())
```
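Because `file_paths` may mix files and directories, it helps to picture how such a list expands before parsing. The sketch below uses only the standard library and is not the BatchParser implementation: `collect_files` is a hypothetical helper, and `ASSUMED_EXTENSIONS` is an illustrative set (the real list comes from `get_supported_extensions()`).

```python
import tempfile
from pathlib import Path

# Assumption: illustrative extension set only.
ASSUMED_EXTENSIONS = {".pdf", ".docx", ".md"}


def collect_files(paths, extensions=ASSUMED_EXTENSIONS, recursive=True):
    """Expand directories and keep only paths whose suffix is in `extensions`."""
    found = []
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            pattern = "**/*" if recursive else "*"
            candidates = sorted(p for p in path.glob(pattern) if p.is_file())
        else:
            candidates = [path]
        found.extend(str(p) for p in candidates if p.suffix.lower() in extensions)
    return found


# Build a small temporary tree and expand it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    for name in ("a.pdf", "notes.txt", "nested/b.docx"):
        (root / name).write_text("placeholder")
    names = [Path(p).name for p in collect_files([tmp])]
    print(names)
```

With `recursive=False`, only files directly inside each directory would be considered, which mirrors the intent of the `recursive` flag shown above.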

--------------------------------

### Process Image, Table, and Equation Modalities

Source: https://context7.com/hkuds/rag-anything/llms.txt

Demonstrates processing of image, table, and equation content types using their respective modal processors. Ensure LightRAG and necessary functions are initialized.

```python
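import asyncio

from lightrag import LightRAG
from raganything.modalprocessors import (
    ContextConfig,
    ContextExtractor,
    EquationModalProcessor,
    ImageModalProcessor,
    TableModalProcessor,
)

# vision_func, llm_func, full_content_list and lightrag_instance must be defined by the caller
async def direct_modal_processing(lightrag_instance: LightRAG):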
    ctx_config = ContextConfig(
        context_window=1, context_mode="page",
        max_context_tokens=2000, include_headers=True,
    )
    context_extractor = ContextExtractor(config=ctx_config,
                                         tokenizer=lightrag_instance.tokenizer)

    # --- Image ---
    img_processor = ImageModalProcessor(
        lightrag=lightrag_instance,
        modal_caption_func=vision_func,       # must accept image_data= kwarg
        context_extractor=context_extractor,
    )
    img_processor.set_content_source(full_content_list, content_format="minerU")

    caption, entity_info, _ = await img_processor.process_multimodal_content(
        modal_content={
            "img_path": "/abs/path/to/figure.png",
            "image_caption": ["Figure 2: Ablation Study"],
            "image_footnote": [],
            "page_idx": 5,
        },
        content_type="image",
        file_path="paper.pdf",
        entity_name="Ablation Study Figure",
        item_info={"page_idx": 5, "index": 12, "type": "image"},
    )
    print("Image entity:", entity_info["entity_name"])

    # --- Table ---
    tbl_processor = TableModalProcessor(
        lightrag=lightrag_instance, modal_caption_func=llm_func,
        context_extractor=context_extractor,
    )
    caption, entity_info, _ = await tbl_processor.process_multimodal_content(
        modal_content={
            "table_body": "| Method | F1 |\n|--------|----|\n| Ours | 0.94 |",
            "table_caption": ["Table 2: Results"],
            "page_idx": 7,
        },
        content_type="table",
        file_path="paper.pdf",
    )
    print("Table entity:", entity_info["entity_name"])

    # --- Equation ---
    eq_processor = EquationModalProcessor(
        lightrag=lightrag_instance, modal_caption_func=llm_func,
    )
    caption, entity_info, _ = await eq_processor.process_multimodal_content(
        modal_content={"latex": r"\mathcal{L} = -\sum y \log \hat{y}", "page_idx": 3},
        content_type="equation",
        file_path="paper.pdf",
    )
    print("Equation entity:", entity_info["entity_name"])

asyncio.run(direct_modal_processing(lightrag_instance))
```

--------------------------------

### High-Precision Context Configuration (Python)

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

An example of configuring RAGAnything for high-precision context analysis. This configuration uses a small context window (`context_window=1`), page-based context mode (`context_mode="page"`), and limits context tokens to 1000. It includes headers but excludes captions and filters context to only include 'text' content types.

```python
from raganything import RAGAnythingConfig

config = RAGAnythingConfig(
    context_window=1,
    context_mode="page",
    max_context_tokens=1000,
    include_headers=True,
    include_captions=False,
    context_filter_content_types=["text"]
)
```
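To make the effect of `context_window` and the content-type filter concrete, here is a rough sketch of page-based context selection. It is an illustration of the idea only, not the library's extraction logic: `page_context` is a hypothetical function, and token limiting, headers, and captions are left out.

```python
def page_context(content_list, page_idx, context_window=1, allowed_types=("text",)):
    """Collect text from items within `context_window` pages of `page_idx`."""
    low, high = page_idx - context_window, page_idx + context_window
    return [
        item["text"]
        for item in content_list
        if item.get("type") in allowed_types
        and "page_idx" in item
        and "text" in item
        and low <= item["page_idx"] <= high
    ]


content_list = [
    {"type": "text", "text": "Page 0 intro", "page_idx": 0},
    {"type": "text", "text": "Page 1 method", "page_idx": 1},
    {"type": "image", "img_path": "/abs/fig.png", "page_idx": 2},
    {"type": "text", "text": "Page 3 results", "page_idx": 3},
    {"type": "text", "text": "Page 5 appendix", "page_idx": 5},
]

# Context for the image on page 2 with a one-page window on each side.
print(page_context(content_list, page_idx=2))
```

A window of 1 keeps only the immediately neighboring pages, which is why a small window favors precision over breadth.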

--------------------------------

### Multimodal Query with RAGAnything.aquery_with_multimodal()

Source: https://context7.com/hkuds/rag-anything/llms.txt

Enrich queries with specific multimodal content like images, tables, or equations. The content is analyzed and appended to the query before retrieval. Requires `asyncio`.

```python
import asyncio

# `rag` is assumed to be an already initialized RAGAnything instance
async def multimodal_query():
    # Query with an image
    result = await rag.aquery_with_multimodal(
        query="How does this architecture compare to baselines in the document?",
        multimodal_content=[
            {
                "type": "image",
                "img_path": "/abs/path/to/architecture.png",
                "image_caption": ["Figure 3: Proposed Architecture"],
            }
        ],
        mode="hybrid",
    )
    print("Image query result:", result)

    # Query with a table
    result = await rag.aquery_with_multimodal(
        query="Which method in the document achieves similar accuracy?",
        multimodal_content=[
            {
                "type": "table",
                "table_data": "Method,Accuracy\nOurs,95.2%\nBaseline,87.3%",
                "table_caption": "Performance comparison",
            }
        ],
        mode="mix",
    )
    print("Table query result:", result)

    # Query with a LaTeX equation
    result = await rag.aquery_with_multimodal(
        query="Where is this formula used in the document?",
        multimodal_content=[
            {
                "type": "equation",
                "latex": r"P(d|q) = \frac{P(q|d) \cdot P(d)}{P(q)}",
                "equation_caption": "Bayesian relevance formula",
            }
        ],
        mode="hybrid",
    )
    print("Equation query result:", result)

asyncio.run(multimodal_query())
```
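The equation entry above uses a raw string (`r"..."`) for the LaTeX source. The short standalone snippet below shows why that matters in Python: in a normal string literal, `\f` is an escape sequence for the form-feed character, so `\frac` silently loses its backslash.

```python
plain = "\frac{a}{b}"   # "\f" is parsed as a single form-feed character
raw = r"\frac{a}{b}"    # raw string keeps the backslash and the letter f

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

The raw version is one character longer because it holds a backslash plus `f` where the plain version holds a single control character.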

--------------------------------

### Directly Insert Pre-parsed Content List into RAGAnything

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Use this method when you have a content list already structured, such as from external parsers. Ensure image paths are absolute. The `file_path` serves as a reference for citations.

```python
import asyncio
from functools import partial
from raganything import RAGAnything, RAGAnythingConfig
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc

async def insert_content_list_example():
    # Set up API configuration
    api_key = "your-api-key"
    base_url = "your-base-url"  # Optional

    # Create RAGAnything configuration
    config = RAGAnythingConfig(
        working_dir="./rag_storage",
        enable_image_processing=True,
        enable_table_processing=True,
        enable_equation_processing=True,
    )

    # Define model functions
    def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
        return openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key=api_key,
            base_url=base_url,
            **kwargs,
        )

    def vision_model_func(prompt, system_prompt=None, history_messages=[], image_data=None, messages=None, **kwargs):
        # If messages format is provided (for multimodal VLM enhanced query), use it directly
        if messages:
            return openai_complete_if_cache(
                "gpt-4o",
                "",
                system_prompt=None,
                history_messages=[],
                messages=messages,
                api_key=api_key,
                base_url=base_url,
                **kwargs,
            )
        # Traditional single image format
        elif image_data:
            # Add the system message only when a system prompt is given,
            # so the message list never contains None
            image_messages = []
            if system_prompt:
                image_messages.append({"role": "system", "content": system_prompt})
            image_messages.append(
                {
                    "role": "user",
                    "content": [
                        {"type": "text", "text": prompt},
                        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}},
                    ],
                }
            )
            return openai_complete_if_cache(
                "gpt-4o",
                "",
                system_prompt=None,
                history_messages=[],
                messages=image_messages,
                api_key=api_key,
                base_url=base_url,
                **kwargs,
            )
        # Pure text format
        else:
            return llm_model_func(prompt, system_prompt, history_messages, **kwargs)

    embedding_func = EmbeddingFunc(
        embedding_dim=3072,
        max_token_size=8192,
        func=partial(
            openai_embed.func, 
            model="text-embedding-3-large",
            api_key=api_key,
            base_url=base_url,
        ),
    )

    # Initialize RAGAnything
    rag = RAGAnything(
        config=config,
        llm_model_func=llm_model_func,
        vision_model_func=vision_model_func,
        embedding_func=embedding_func,
    )

    # Example: Pre-parsed content list from external source
    content_list = [
        {
            "type": "text",
            "text": "This is the introduction section of our research paper.",
            "page_idx": 0  # Page number where this content appears
        },
        {
            "type": "image",
            "img_path": "/absolute/path/to/figure1.jpg",  # IMPORTANT: Use absolute path
            "image_caption": ["Figure 1: System Architecture"],
            "image_footnote": ["Source: Authors' original design"],
            "page_idx": 1  # Page number where this image appears
        },
        {
            "type": "table",
            "table_body": "| Method | Accuracy | F1-Score |\n|--------|----------|----------|\n| Ours | 95.2% | 0.94 |\n| Baseline | 87.3% | 0.85 |",
            "table_caption": ["Table 1: Performance Comparison"],
            "table_footnote": ["Results on test dataset"],
            "page_idx": 2  # Page number where this table appears
        },
        {
            "type": "equation",
            # Raw string keeps the LaTeX backslashes intact ("\f" would otherwise become a form feed)
            "latex": r"P(d|q) = \frac{P(q|d) \cdot P(d)}{P(q)}",
            "text": "Document relevance probability formula",
            "page_idx": 3  # Page number where this equation appears
        },
        {
            "type": "text",
            "text": "In conclusion, our method demonstrates superior performance across all metrics.",
            "page_idx": 4  # Page number where this content appears
        }
    ]

    # Insert the content list directly
    await rag.insert_content_list(
        content_list=content_list,
        file_path="research_paper.pdf",  # Reference file name for citation
        split_by_character=None,         # Optional text splitting
    )

if __name__ == "__main__":
    asyncio.run(insert_content_list_example())
```

--------------------------------

### Configure RAGAnything Context Settings

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

Demonstrates how to define context extraction parameters using Python configuration objects and environment variables. These settings control window size, token limits, and content filtering.

```python
context_window: int = 1
context_mode: str = "page"
max_context_tokens: int = 2000
include_headers: bool = True
include_captions: bool = True
context_filter_content_types: List[str] = ["text"]
content_format: str = "minerU"
```

```bash
CONTEXT_WINDOW=2
CONTEXT_MODE=page
MAX_CONTEXT_TOKENS=3000
INCLUDE_HEADERS=true
INCLUDE_CAPTIONS=true
CONTEXT_FILTER_CONTENT_TYPES=text,image
CONTENT_FORMAT=minerU
```
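As a rough illustration of how the environment variables above map onto the typed fields, the sketch below parses an environment-like mapping into a dataclass. `ContextSettings`, `_as_bool`, and `settings_from_env` are hypothetical names for this example; RAG-Anything's own loading code may differ.

```python
import os
from dataclasses import dataclass, field
from typing import List, Mapping


@dataclass
class ContextSettings:
    """Hypothetical container mirroring the fields listed above."""
    context_window: int = 1
    context_mode: str = "page"
    max_context_tokens: int = 2000
    include_headers: bool = True
    include_captions: bool = True
    context_filter_content_types: List[str] = field(default_factory=lambda: ["text"])
    content_format: str = "minerU"


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def settings_from_env(env: Mapping[str, str] = os.environ) -> ContextSettings:
    """Build settings from an environment-like mapping, falling back to defaults."""
    d = ContextSettings()
    types = env.get("CONTEXT_FILTER_CONTENT_TYPES")
    return ContextSettings(
        context_window=int(env.get("CONTEXT_WINDOW", d.context_window)),
        context_mode=env.get("CONTEXT_MODE", d.context_mode),
        max_context_tokens=int(env.get("MAX_CONTEXT_TOKENS", d.max_context_tokens)),
        include_headers=_as_bool(env.get("INCLUDE_HEADERS", str(d.include_headers))),
        include_captions=_as_bool(env.get("INCLUDE_CAPTIONS", str(d.include_captions))),
        context_filter_content_types=(
            [t.strip() for t in types.split(",") if t.strip()]
            if types is not None
            else d.context_filter_content_types
        ),
        content_format=env.get("CONTENT_FORMAT", d.content_format),
    )


example_env = {
    "CONTEXT_WINDOW": "2",
    "MAX_CONTEXT_TOKENS": "3000",
    "INCLUDE_CAPTIONS": "false",
    "CONTEXT_FILTER_CONTENT_TYPES": "text,image",
}
settings = settings_from_env(example_env)
print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the parsing easy to test without touching the real environment.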

--------------------------------

### Initialize and Use RAGAnything with Context

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

Shows the instantiation of RAGAnything with a custom configuration and the execution of document processing. It also covers runtime updates to context settings and manual content source configuration.

```python
from raganything import RAGAnything, RAGAnythingConfig

config = RAGAnythingConfig(
    context_window=2,
    context_mode="page",
    max_context_tokens=3000,
    include_headers=True,
    include_captions=True,
    context_filter_content_types=["text", "image"],
    content_format="minerU"
)

rag_anything = RAGAnything(
    config=config,
    llm_model_func=your_llm_function,
    embedding_func=your_embedding_function
)

# Automatic processing (await needs an async context, for example inside an async function)
await rag_anything.process_document_complete("document.pdf")

# Manual updates
rag_anything.set_content_source_for_context(content_list, "minerU")
rag_anything.update_context_config(context_window=1, max_context_tokens=1500, include_captions=False)
```

--------------------------------

### Setting Content Source and Processing Multimodal Content (Python)

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

Demonstrates how to set a content source using a list of content items and then process multimodal content, such as an image, with associated metadata. This involves specifying the content type, file path, entity name, and item information for context-aware processing.

```python
processor.set_content_source(content_list, "minerU")

item_info = {
    "page_idx": 2,
    "index": 5,
    "type": "image"
}

result = await processor.process_multimodal_content(
    modal_content=image_data,
    content_type="image",
    file_path="document.pdf",
    entity_name="Architecture Diagram",
    item_info=item_info
)
```

--------------------------------

### Direct Modal Processor Integration

Source: https://github.com/hkuds/rag-anything/blob/main/docs/context_aware_processing.md

Illustrates how to manually initialize a ContextExtractor and inject it into a modal processor for targeted multimodal analysis.

```python
from raganything.modalprocessors import (
    ContextExtractor,
    ContextConfig,
    ImageModalProcessor
)

config = ContextConfig(
    context_window=1,
    context_mode="page",
    max_context_tokens=2000,
    include_headers=True,
    include_captions=True,
    filter_content_types=["text"]
)

context_extractor = ContextExtractor(config)
processor = ImageModalProcessor(lightrag, caption_func, context_extractor)
```

--------------------------------

### Configure Environment Variables for RAG-Anything

Source: https://github.com/hkuds/rag-anything/blob/main/README.md

Set environment variables in a .env file for RAG-Anything configuration. Includes API keys, output directory, and parser settings.

```bash
OPENAI_API_KEY=your_openai_api_key
OPENAI_BASE_URL=your_base_url  # Optional
OUTPUT_DIR=./output             # Default output directory for parsed documents
PARSER=mineru                   # Parser selection: mineru, docling, or paddleocr
PARSE_METHOD=auto              # Parse method: auto, ocr, or txt
```
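Since `PARSER` and `PARSE_METHOD` accept only a few values, a small check at startup can catch typos in the `.env` file. This is a hedged sketch: `read_parser_settings` is a hypothetical helper, the allowed values are taken from the comments above, and the fallback defaults (`mineru`, `auto`) are assumptions for illustration.

```python
ALLOWED_PARSERS = {"mineru", "docling", "paddleocr"}
ALLOWED_PARSE_METHODS = {"auto", "ocr", "txt"}


def read_parser_settings(env):
    """Return (parser, parse_method) from an environment-like mapping, validating both."""
    # Assumed fallbacks for this sketch when a variable is not set.
    parser = env.get("PARSER", "mineru").strip().lower()
    parse_method = env.get("PARSE_METHOD", "auto").strip().lower()
    if parser not in ALLOWED_PARSERS:
        raise ValueError(f"Unsupported PARSER: {parser!r}")
    if parse_method not in ALLOWED_PARSE_METHODS:
        raise ValueError(f"Unsupported PARSE_METHOD: {parse_method!r}")
    return parser, parse_method


print(read_parser_settings({"PARSER": "docling", "PARSE_METHOD": "ocr"}))
```

In practice the mapping would be `os.environ` after the `.env` file has been loaded.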