### Install API Dependencies and Run Locally

Source: https://mindee.github.io/doctr/project/python-doctr

Installs necessary dependencies for the docTR API template using Poetry and pip, then runs the API locally using uvicorn.

```bash
cd api/
pip install poetry
make lock
pip install -r requirements.txt
uvicorn --reload --workers 1 --host 0.0.0.0 --port=8002 --app-dir api/ app.main:app
```
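Scripts that call the API right after launching `uvicorn` may need to wait until the port accepts connections. The following is a minimal sketch using only the standard library. The helper name `wait_for_port` and its default timings are illustrative assumptions and are not part of the docTR API template.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (hypothetical usage): wait_for_port("localhost", 8002)
```

Port `8002` in the usage comment matches the `--port` value passed to `uvicorn` above; adjust it if the server is started on a different port.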

--------------------------------

### OCR Route Integration Example

Source: https://mindee.github.io/doctr/project/python-doctr

A Python example demonstrating how to send a request to the OCR route of the deployed docTR API.

````APIDOC
## OCR Route Integration Example

This example shows how to send a request to the OCR route of the docTR API using Python's `requests` library.

### Request Example

```python
import requests

params = {"det_arch": "db_resnet50", "reco_arch": "crnn_vgg16_bn"}

with open('/path/to/your/doc.jpg', 'rb') as f:
    files = [  # application/pdf, image/jpeg, image/png supported
        ("files", ("doc.jpg", f.read(), "image/jpeg")),
    ]
    response = requests.post("http://localhost:8002/ocr", params=params, files=files)
    print(response.json())
```

**Supported file types:** `application/pdf`, `image/jpeg`, `image/png`.

**API Endpoint:** `http://localhost:8002/ocr`
````

--------------------------------

### Run docTR Demo App Locally (Bash)

Source: https://mindee.github.io/doctr/project/python-doctr

Installs the required dependencies for the docTR demo app and then runs it locally using Streamlit. This allows you to interactively test the OCR models.

```bash
pip install -r demo/pt-requirements.txt
streamlit run demo/app.py
```

--------------------------------

### Install docTR with Optional Dependencies (Bash)

Source: https://mindee.github.io/doctr/project/python-doctr

Installs docTR along with optional dependencies for visualization, HTML, and contrib modules. Use this command if you need extended functionalities.

```bash
pip install "python-doctr[viz,html,contrib]"
```

--------------------------------

### Deploying the API Locally

Source: https://mindee.github.io/doctr/project/python-doctr

Steps to deploy the docTR API locally using FastAPI, including dependency installation and running the server.

````APIDOC
## Deploy your API locally

To deploy the docTR API locally, follow these steps:

### Install Dependencies
Navigate to the `api/` directory and install the required dependencies:

```bash
cd api/
pip install poetry
make lock
pip install -r requirements.txt
```

### Run the API Server
Start the API server using uvicorn:

```bash
uvicorn --reload --workers 1 --host 0.0.0.0 --port 8002 --app-dir api/ app.main:app
```

Alternatively, you can use Docker Compose:

```bash
PORT=8002 docker-compose up -d --build
```

Once running, access the API documentation at `http://localhost:8002/redoc`.
````

--------------------------------

### Install docTR from Source in Developer Mode (Bash)

Source: https://mindee.github.io/doctr/project/python-doctr

Installs docTR from its source repository in developer mode. This requires Git to clone the repository and allows for direct code modifications.

```bash
git clone https://github.com/mindee/doctr.git
pip install -e doctr/.
```

--------------------------------

### Install docTR Latest Release (Bash)

Source: https://mindee.github.io/doctr/project/python-doctr

Installs the latest release of the docTR package from PyPI using pip. This is the standard method for adding docTR to your Python environment.

```bash
pip install python-doctr
```

--------------------------------

### Run API using Docker Compose

Source: https://mindee.github.io/doctr/project/python-doctr

Deploys the docTR API locally using Docker Compose. This command builds the necessary Docker image and starts the services in detached mode.

```bash
PORT=8002 docker-compose up -d --build
```

--------------------------------

### Visualize OCR Results

Source: https://mindee.github.io/doctr/project/python-doctr

Shows how to visualize the OCR results interactively using the `show()` method of the result object. This requires `matplotlib` and `mplcursors` to be installed. It displays the detected text and bounding boxes on the document image.

```python
# Display the result (requires matplotlib & mplcursors to be installed)
result.show()
```

--------------------------------

### Send Request to docTR OCR API

Source: https://mindee.github.io/doctr/project/python-doctr

Example Python script that uses the `requests` library to send a POST request to the docTR OCR API endpoint. It passes the detection and recognition architectures as query parameters and uploads a document file.

```python
import requests

params = {"det_arch": "db_resnet50", "reco_arch": "crnn_vgg16_bn"}

with open('/path/to/your/doc.jpg', 'rb') as f:
    files = [  # application/pdf, image/jpeg, image/png supported
        ("files", ("doc.jpg", f.read(), "image/jpeg")),
    ]
# Port 8002 matches the local deployment commands (uvicorn --port 8002 / PORT=8002)
print(requests.post("http://localhost:8002/ocr", params=params, files=files).json())
```
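When uploading several documents, the content type of each entry has to match one of the supported types. The sketch below builds the `files` list from local paths using the standard-library `mimetypes` module. The helper name `build_files_payload` and the `SUPPORTED_TYPES` constant are assumptions introduced for illustration; only the three content types come from the example above.

```python
import mimetypes
from pathlib import Path

# Content types listed as supported in the request example above
SUPPORTED_TYPES = {"application/pdf", "image/jpeg", "image/png"}


def build_files_payload(paths):
    """Build a `files` list suitable for requests.post from local file paths."""
    files = []
    for path in map(Path, paths):
        # Guess the content type from the file extension
        content_type, _ = mimetypes.guess_type(path.name)
        if content_type not in SUPPORTED_TYPES:
            raise ValueError(f"Unsupported file type for {path.name}: {content_type}")
        files.append(("files", (path.name, path.read_bytes(), content_type)))
    return files
```

The returned list can be passed as the `files` argument of `requests.post` in place of the hand-written list. Rejecting unknown types before the request is a design choice of this sketch, not documented server behavior.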

--------------------------------

### Building Docker Images

Source: https://mindee.github.io/doctr/project/python-doctr

Instructions on how to build docTR Docker images locally, with options to specify Python and docTR versions.

````APIDOC
## Building Docker Images Locally

You can build docTR Docker images locally using the `docker build` command.

### Basic Build
```bash
docker build -t doctr .
```

### Custom Build with Arguments
To specify custom Python and docTR versions, use build arguments:

```bash
docker build -t doctr --build-arg FRAMEWORK=torch --build-arg PYTHON_VERSION=3.9.10 --build-arg DOCTR_VERSION=v0.7.0 .
```

**Build Arguments:**
- `FRAMEWORK`: Specify the framework (e.g., `torch`).
- `PYTHON_VERSION`: Specify the Python version.
- `DOCTR_VERSION`: Specify the docTR version.
````

--------------------------------

### Run docTR Docker Container with GPU Support (Bash)

Source: https://mindee.github.io/doctr/project/python-doctr

Launches a docTR Docker container with GPU support enabled. Ensure your Docker is configured for GPU usage and your CUDA version is compatible.

```bash
docker run -it --gpus all ghcr.io/mindee/doctr:torch-py3.9.18-2024-10 bash
```

--------------------------------

### Initialize OCR Predictor Model

Source: https://mindee.github.io/doctr/project/python-doctr

Initializes an OCR predictor model with specified text detection and recognition architectures. It allows for loading pretrained weights for immediate use. Dependencies include the doctr library.

```python
from doctr.models import ocr_predictor

model = ocr_predictor(det_arch='db_resnet50', reco_arch='crnn_vgg16_bn', pretrained=True)
```

--------------------------------

### Build docTR Docker Image Locally

Source: https://mindee.github.io/doctr/project/python-doctr

Builds a docTR Docker image locally. You can specify custom Python and docTR versions using build arguments like FRAMEWORK, PYTHON_VERSION, and DOCTR_VERSION.

```bash
docker build -t doctr .
```

```bash
docker build -t doctr --build-arg FRAMEWORK=torch --build-arg PYTHON_VERSION=3.9.10 --build-arg DOCTR_VERSION=v0.7.0 .
```
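If the build is driven from a Python script (for example in CI), the same command can be assembled from a mapping of build arguments. This is a sketch only: the function name `docker_build_command` is hypothetical, and the argument names and values are the ones shown in the command above.

```python
import shlex


def docker_build_command(tag, build_args=None, context="."):
    """Assemble a `docker build` argument list; `build_args` maps names to values."""
    cmd = ["docker", "build", "-t", tag]
    for name, value in (build_args or {}).items():
        cmd += ["--build-arg", f"{name}={value}"]
    cmd.append(context)  # build context comes last
    return cmd


cmd = docker_build_command(
    "doctr",
    {"FRAMEWORK": "torch", "PYTHON_VERSION": "3.9.10", "DOCTR_VERSION": "v0.7.0"},
)
# Print the shell-quoted command for inspection
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` without going through a shell, which avoids quoting issues.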

--------------------------------

### Run docTR Analysis Script

Source: https://mindee.github.io/doctr/project/python-doctr

Runs the document analysis script on a given PDF or image file. Pass `--help` for the full list of script arguments.

```bash
python scripts/analyze.py path/to/your/doc.pdf
```

--------------------------------

### Use KIE Predictor for Document Analysis (Python)

Source: https://mindee.github.io/doctr/project/python-doctr

Demonstrates how to use the KIE predictor with a detection and recognition model to analyze a PDF document. It shows how to load a PDF, perform analysis, and extract predictions for different classes.

```python
from doctr.io import DocumentFile
from doctr.models import kie_predictor

# Model
model = kie_predictor(det_arch='db_resnet50', reco_arch='crnn_vgg16_bn', pretrained=True)
# PDF
doc = DocumentFile.from_pdf("path/to/your/doc.pdf")
# Analyze
result = model(doc)

predictions = result.pages[0].predictions
for class_name in predictions.keys():
    list_predictions = predictions[class_name]
    for prediction in list_predictions:
        print(f"Prediction for {class_name}: {prediction}")
```

--------------------------------

### Load Documents from Various Sources

Source: https://mindee.github.io/doctr/project/python-doctr

Demonstrates how to load documents for OCR processing from different sources: PDFs, single images, multiple page images, and webpages. All loaders are exposed through `DocumentFile` in the `doctr.io` module. The `from_url` loader requires `weasyprint` to be installed.

```python
from doctr.io import DocumentFile

# PDF
pdf_doc = DocumentFile.from_pdf("path/to/your/doc.pdf")
# Image
single_img_doc = DocumentFile.from_images("path/to/your/img.jpg")
# Webpage (requires `weasyprint` to be installed)
webpage_doc = DocumentFile.from_url("https://www.yoursite.com")
# Multiple page images
multi_img_doc = DocumentFile.from_images(["path/to/page1.jpg", "path/to/page2.jpg"])
```
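When the source type is only known at runtime, the loader can be chosen from the input string. The sketch below returns the name of the `DocumentFile` constructor to call. The helper `pick_loader` and the `IMAGE_SUFFIXES` set are assumptions made for illustration; only `from_pdf`, `from_images`, and `from_url` come from the snippet above.

```python
from pathlib import Path
from urllib.parse import urlparse

# Assumption: image extensions handled by this sketch; extend as needed
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pick_loader(source: str) -> str:
    """Return the name of the DocumentFile constructor suited to `source`."""
    if urlparse(source).scheme in ("http", "https"):
        return "from_url"
    suffix = Path(source).suffix.lower()
    if suffix == ".pdf":
        return "from_pdf"
    if suffix in IMAGE_SUFFIXES:
        return "from_images"
    raise ValueError(f"No loader known for {source!r}")
```

With docTR installed, `getattr(DocumentFile, pick_loader(src))(src)` would then dispatch to the matching loader.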

--------------------------------

### Rebuild Document from OCR Predictions

Source: https://mindee.github.io/doctr/project/python-doctr

Demonstrates how to reconstruct the original document appearance from the OCR predictions. It synthesizes image data from the results, which can then be displayed using matplotlib. Dependencies include matplotlib.

```python
import matplotlib.pyplot as plt

synthetic_pages = result.synthesize()
plt.imshow(synthetic_pages[0]); plt.axis('off'); plt.show()
```

--------------------------------

### Perform OCR on a Document

Source: https://mindee.github.io/doctr/project/python-doctr

Combines document loading and OCR prediction to process a PDF file. It uses a pretrained OCR predictor and outputs the recognition results. Dependencies include doctr.io and doctr.models.

```python
from doctr.io import DocumentFile
from doctr.models import ocr_predictor

model = ocr_predictor(pretrained=True)
# PDF
doc = DocumentFile.from_pdf("path/to/your/doc.pdf")
# Analyze
result = model(doc)
```
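The returned `result` can be traversed to recover plain text. The sketch below assumes the nested structure that docTR documents for its output (pages containing blocks, blocks containing lines, lines containing words, each word exposing a `value` attribute). The helper name `document_lines` is hypothetical.

```python
def document_lines(result):
    """Collect the text of every line, grouped by page, from an OCR result."""
    pages = []
    for page in result.pages:
        lines = []
        for block in page.blocks:
            for line in block.lines:
                # Join the recognized words of one line with single spaces
                lines.append(" ".join(word.value for word in line.words))
        pages.append(lines)
    return pages
```

Calling `document_lines(result)` yields one list of strings per page, which is convenient for search or downstream text processing.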

--------------------------------

### docTR Project Citation

Source: https://mindee.github.io/doctr/project/python-doctr

BibTeX reference for citing the docTR project. This provides the necessary information for academic and research citations.

```bibtex
@misc{doctr2021,
    title={docTR: Document Text Recognition},
    author={Mindee},
    year={2021},
    publisher = {GitHub},
    howpublished = {\url{https://github.com/mindee/doctr}}
}
```

--------------------------------

### Export OCR Results to JSON

Source: https://mindee.github.io/doctr/project/python-doctr

Explains how to export the structured OCR results into a nested dictionary format, suitable for JSON serialization. This allows for easy integration with other systems or further processing.

```python
json_output = result.export()
```
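To persist the exported dictionary, it can be written with the standard `json` module. This is a minimal sketch, assuming the exported values are plain Python types that `json` can serialize. The function name `save_export` is hypothetical.

```python
import json


def save_export(exported: dict, path: str) -> None:
    """Write an exported OCR result to `path` as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps accented characters readable in the file
        json.dump(exported, fh, ensure_ascii=False, indent=2)
```

For example, `save_export(json_output, "result.json")` would store the output of `result.export()`; note that tuples are written as JSON arrays and come back as lists when the file is reloaded.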
