### Install DocClassifier from whl File

Source: https://github.com/docsaidlab/docclassifier/blob/main/README_tw.md

Install the generated wheel file for docclassifier-docsaid.

```bash
pip install dist/docclassifier_docsaid-*-py3-none-any.whl
```

--------------------------------

### Install Wheel Package

Source: https://github.com/docsaidlab/docclassifier/blob/main/README_tw.md

Install the `wheel` package, which supplies the `bdist_wheel` command that setuptools uses to build wheel files.

```bash
pip install wheel
```

--------------------------------

### Install docclassifier-docsaid via PyPI

Source: https://github.com/docsaidlab/docclassifier/blob/main/README_tw.md

Use this command to install the docclassifier-docsaid package from the Python Package Index.

```bash
pip install docclassifier-docsaid
```

--------------------------------

### Initialize and Use DocClassifier

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Demonstrates basic and advanced initialization of the DocClassifier, including loading an image, classifying it, and accessing model information. Ensure necessary libraries like OpenCV are installed.

```python
import cv2
import numpy as np
from docclassifier import DocClassifier, ModelType

# Basic initialization with default settings
classifier = DocClassifier()

# Advanced initialization with custom parameters
classifier = DocClassifier(
    model_type=ModelType.margin_based,  # Model architecture type
    model_cfg='20240326',               # Model configuration/version
    backend='cpu',                       # 'cpu' or 'cuda'
    gpu_id=0,                           # GPU device ID if using CUDA
    threshold=0.6,                      # Similarity threshold (default: 0.627)
    register_root='/path/to/custom/templates'  # Custom registration folder
)

# Load an image for classification
img = cv2.imread('document.jpg')

# Classify the document - returns (label, score) tuple
most_similar, max_score = classifier(img)
print(f'Document type: {most_similar}, Confidence: {max_score:.4f}')
# Output: Document type: Taiwan driver's license front, Confidence: 0.6116

# If no match above threshold, returns (None, 0.0)
if most_similar is None:
    print("No matching document type found above threshold")

# List available model configurations
available_models = classifier.list_models()
print(f'Available models: {available_models}')
# Output: Available models: ['20240326']

# Access the registered document bank
print(f'Registered documents: {list(classifier.bank.keys())}')

# Access current threshold
print(f'Current threshold: {classifier.threshold}')
```

--------------------------------

### Run DocClassifier Demo Script

Source: https://github.com/docsaidlab/docclassifier/blob/main/demo/README.md

Execute the `demo_ipcam.py` script from your terminal to start the DocClassifier web demo. This will generate a URL to access the demo in your web browser.

```bash
python demo_ipcam.py
```

--------------------------------

### Verify docclassifier Installation

Source: https://github.com/docsaidlab/docclassifier/blob/main/README_tw.md

Run this Python command to check if the docclassifier library is installed correctly and print its version.

```python
import docclassifier; print(docclassifier.__version__)
```
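
When the check has to run inside a script rather than an interactive shell, a guarded variant avoids an `ImportError` traceback. The following is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical, and the distribution name `docclassifier-docsaid` is taken from the PyPI install command in this document.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("docclassifier-docsaid")
if version is None:
    print("docclassifier-docsaid is not installed")
else:
    print(f"docclassifier-docsaid {version} is installed")
```

This reads the package metadata without importing the library, so it should not trigger any model download or heavy initialization.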

--------------------------------

### Perform Document Classification Inference

Source: https://github.com/docsaidlab/docclassifier/blob/main/README_tw.md

This example loads an image, initializes `DocClassifier`, and runs inference to find the most similar registered document. With the default threshold, the classifier returns `None` as the label when the best similarity score falls below it.

```python
import cv2
from skimage import io
from docclassifier import DocClassifier

img = io.imread('https://github.com/DocsaidLab/DocClassifier/blob/main/docs/test_driver.jpg?raw=true')
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)

model = DocClassifier()

most_similar, max_score = model(img)
print(f'most_similar: {most_similar}, max_score: {max_score:.4f}')
```

--------------------------------

### IP Camera Demo Integration with DocAligner and DocClassifier

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Integrate DocClassifier with IP camera feeds for real-time document processing. This example uses DocAligner for document detection and alignment, and Capybara for visualization.

```python
import capybara as cb
from docaligner import DocAligner  # Separate package for document alignment
from docclassifier import DocClassifier

# IP camera address (requires IP Camera mobile app)
IPADDR = '192.168.0.179'

# Initialize models
aligner = DocAligner()       # Detects and aligns documents in frame
classifier = DocClassifier(threshold=0.5)

def process_frame(img):
    """Process a single frame from IP camera."""
    # Detect and align document in the frame
    doc_info = aligner(img)

    if doc_info.has_doc_polygon:
        # Draw detected document boundary
        img = cb.draw_polygon(img, doc_info.doc_polygon, (0, 255, 0), 3)

        # Classify the aligned document image
        max_sim, max_score = classifier(doc_info.doc_flat_img)

        # Overlay classification result
        img = cb.draw_text(
            img,
            f'{max_sim} {max_score:.2f}',
            (30, 30),
            (0, 255, 0),
            36
        )

    return img

# Start web demo server
# Access via browser at workstation IP:5001
demo = cb.WebDemo(IPADDR, pipelines=[process_frame])
demo.run()
```

--------------------------------

### Model Training Configuration

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Example YAML configuration for training custom document classification models using PyTorch Lightning. Specifies model architecture, loss function, optimizer, and learning rate scheduler.

```yaml
# Example configuration: model/config/lcnet050_cosface_f256_r128_squeeze_lbn_imagenet.yaml

global_settings:
  image_size: [128, 128]

common:
  use_imagenet: true    # Use ImageNet-1K for training data diversity
  use_clip: true        # Align features with CLIP for stability
  preview_batch: 100

model:
  backbone:
    name: Backbone
    options:
      name: lcnet_050   # PP-LCNet 0.5x as feature extractor
      pretrained: true

  head:
    name: FeatureLearningSqueezeLBNHead  # Combined LayerNorm + BatchNorm
    options:
      in_dim: 256
      embed_dim: 256
      feature_map_size: 4
      squeeze_ratio: 0.25

  loss:
    name: CosFace       # Margin-based loss function
    embed_dim: 256
    options:
      s: 64.0           # Scale factor
      m: 0.4            # Margin parameter

optimizer:
  name: AdamW
  options:
    lr: 0.001
    weight_decay: 0.05

lr_scheduler:
  name: CosineAnnealingWarmRestarts
  options:
    T_0: 10
    T_mult: 2
```
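
To make the `s` and `m` options of the `CosFace` loss more concrete, here is a minimal pure-Python sketch of the large-margin cosine loss for a single sample. It is an illustration of the general CosFace formulation, not the project's training code; the function name `cosface_loss` and the toy cosine values are assumptions made for this example.

```python
import math


def cosface_loss(cosines, target, s=64.0, m=0.4):
    """Cross-entropy over CosFace logits for a single sample.

    cosines: cosine similarity between the embedding and each class weight.
    target:  index of the ground-truth class.
    """
    # Subtract the margin m from the target-class cosine only, then scale by s.
    logits = [s * (c - m) if i == target else s * c for i, c in enumerate(cosines)]
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


# Toy cosine similarities to three classes (invented for illustration).
cosines = [0.9, 0.3, -0.1]
print(f"with margin:    {cosface_loss(cosines, target=0):.6f}")
print(f"without margin: {cosface_loss(cosines, target=0, m=0.0):.6f}")
```

The margin makes the target class harder to satisfy, so the loss with `m=0.4` is never lower than the loss with `m=0.0` for the same cosines. This pushes embeddings of the same class closer together during training.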

--------------------------------

### Adjust Threshold for Document Classification

Source: https://github.com/docsaidlab/docclassifier/blob/main/README_tw.md

Pass a lower `threshold` when initializing `DocClassifier` so that images with lower similarity to the registered documents can still return a label instead of `None`.

```python
model = DocClassifier(
    threshold=0.6
)

# Re-run the inference
most_similar, max_score = model(img)
print(f'most_similar: {most_similar}, max_score: {max_score:.4f}')
```
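
The effect of the threshold can be sketched without loading the model. The example below reuses the score 0.6116 and the default threshold 0.627 quoted elsewhere in this document. The helper `apply_threshold` is hypothetical, and whether the library accepts a score exactly equal to the threshold is an assumption.

```python
from typing import Optional, Tuple


def apply_threshold(label: str, score: float, threshold: float) -> Tuple[Optional[str], float]:
    """Keep the best match only if its score clears the threshold."""
    # Assumption: a score exactly equal to the threshold is accepted.
    if score >= threshold:
        return label, score
    return None, 0.0


best_label, best_score = "Taiwan driver's license front", 0.6116

for threshold in (0.627, 0.6):
    print(threshold, apply_threshold(best_label, best_score, threshold))
```

Lowering the threshold trades fewer missed documents for a higher chance of false matches, so it is worth checking the change against images that should not match anything.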

--------------------------------

### Build DocClassifier Wheel Package

Source: https://github.com/docsaidlab/docclassifier/blob/main/README.md

Build the wheel file for the DocClassifier package after cloning the repository.

```bash
pip install wheel
cd DocClassifier
python setup.py bdist_wheel
```

--------------------------------

### Build whl File for DocClassifier

Source: https://github.com/docsaidlab/docclassifier/blob/main/README_tw.md

Navigate to the DocClassifier directory and build the wheel file for the package.

```bash
cd DocClassifier
python setup.py bdist_wheel
```

--------------------------------

### Initialize DocClassifier with Custom Registration

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Initialize DocClassifier with a custom registration folder and threshold. Labels are automatically generated from the folder structure.

```python
from docclassifier import DocClassifier

# Initialize with custom registration folder
classifier = DocClassifier(
    register_root='/path/to/register',
    threshold=0.5  # Lower threshold for more permissive matching
)

# Labels are generated from folder structure:
# - "id_card_front", "id_card_back", "passport_template", "drivers_license_template"
print(f'Registered labels: {list(classifier.bank.keys())}')

# Reload registration from a different folder
new_bank = classifier.get_register('/path/to/another/register')
print(f'New bank contains {len(new_bank)} templates')

# Supported image formats: .jpg, .jpeg, .png
# Images are automatically resized to 128x128 for feature extraction
```

--------------------------------

### DocClassifier Initialization and Classification

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Demonstrates how to initialize the DocClassifier with default or custom parameters and perform document classification.

```APIDOC
## DocClassifier Class

### Description
The main interface for document classification that wraps the underlying inference model. It provides methods for classifying document images, extracting features, and managing registered document templates. The classifier automatically downloads required models on first use.

### Method
`__call__` (invoke the instance directly, e.g. `classifier(img)`)

### Endpoint
N/A (Python Class Method)

### Parameters
#### Initialization Parameters
- **model_type** (ModelType) - Optional - Model architecture type (e.g., `ModelType.margin_based`).
- **model_cfg** (string) - Optional - Model configuration/version (e.g., '20240326').
- **backend** (string) - Optional - Inference backend ('cpu' or 'cuda'). Defaults to 'cpu'.
- **gpu_id** (int) - Optional - GPU device ID if using CUDA.
- **threshold** (float) - Optional - Similarity threshold for classification. Defaults to 0.627.
- **register_root** (string) - Optional - Path to a custom registration folder for document templates.

#### Classification Input (when calling the classifier instance)
- **img** (numpy.ndarray) - Required - The document image loaded using a library like OpenCV.

### Request Example
```python
import cv2
from docclassifier import DocClassifier, ModelType

# Basic initialization
classifier = DocClassifier()

# Advanced initialization
classifier = DocClassifier(
    model_type=ModelType.margin_based,
    model_cfg='20240326',
    backend='cpu',
    threshold=0.6,
    register_root='/path/to/custom/templates'
)

# Load an image
img = cv2.imread('document.jpg')

# Classify the document
most_similar, max_score = classifier(img)
print(f'Document type: {most_similar}, Confidence: {max_score:.4f}')

if most_similar is None:
    print("No matching document type found above threshold")
```

### Response
#### Success Response
- **most_similar** (string or None) - The label of the most similar document type found, or None if no match above the threshold.
- **max_score** (float) - The confidence score of the best match, ranging from 0.0 to 1.0.

#### Response Example
```json
{
  "most_similar": "Taiwan driver's license front",
  "max_score": 0.6116
}
```

#### No-Match Response Example (not an error; returned when no score clears the threshold)
```json
{
  "most_similar": null,
  "max_score": 0.0
}
```

### Additional Methods
- **list_models()**: Returns a list of available model configurations.
- **bank**: Accesses the registered document bank (dictionary).
- **threshold**: Gets or sets the current similarity threshold.
```
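
The response examples above are written as JSON, while the classifier call itself returns a Python `(label, score)` tuple. A minimal sketch of converting one into the other, assuming a hypothetical helper named `result_to_json`:

```python
import json
from typing import Optional


def result_to_json(most_similar: Optional[str], max_score: float) -> str:
    """Serialize a (label, score) pair into the JSON shape used in the examples above."""
    return json.dumps({"most_similar": most_similar, "max_score": max_score})


print(result_to_json("Taiwan driver's license front", 0.6116))
print(result_to_json(None, 0.0))
```

Python's `None` is serialized as JSON `null`, which matches the no-match example.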

--------------------------------

### Initialize MarginBasedInference Engine

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Initialize the MarginBasedInference engine with specified parameters. This class handles model loading, preprocessing, and feature comparison for direct control over the inference pipeline.

```python
import numpy as np
import cv2
from docclassifier import MarginBasedInference

# Initialize inference engine
inference = MarginBasedInference(
    gpu_id=0,
    backend='cpu',  # or 'cuda' for GPU
    model_cfg='20240326',
    threshold=0.627,
    register_root=None  # Uses default registration
)

# Available model configurations
print(f'Available configs: {list(inference.configs.keys())}')
# Output: ['20240326']

# Model configuration details
cfg = inference.configs['20240326']
print(f'Input size: {cfg["img_size_infer"]}')  # (128, 128)
print(f'Default threshold: {cfg["threshold"]}')  # 0.627 (FPR=0.01)

# Load and preprocess image
img = cv2.imread('document.jpg')

# Extract features
features = inference.extract_feature(img)
print(f'Feature vector shape: {features.shape}')  # (256,)

# Compare two feature vectors
feat1 = inference.extract_feature(cv2.imread('doc1.jpg'))
feat2 = inference.extract_feature(cv2.imread('doc2.jpg'))
similarity = inference.compare(feat1, feat2)
print(f'Similarity: {similarity:.4f}')

# Run classification
most_similar, max_score = inference(img)
print(f'Result: {most_similar} ({max_score:.4f})')
```

--------------------------------

### Set Indoor and ImageNet Dataset Paths

Source: https://github.com/docsaidlab/docclassifier/blob/main/model/README.md

Defines the root directories for the indoor scene recognition and ImageNet datasets. Modify these variables if your datasets are located elsewhere.

```python
INDOOR_ROOT = '/data/Dataset/indoor_scene_recognition/Images'
IMAGENET_ROOT = '/data/Dataset/ILSVRC2012/train'
```

--------------------------------

### Custom Registration

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Details on how to set up and manage custom registration folders for defining new document types.

```APIDOC
## Custom Registration

### Description
Create and manage custom document registration folders to define which document types the classifier can recognize. Images in the registration folder become the reference templates that input documents are compared against.

### Method
N/A (Configuration via `register_root` parameter during initialization)

### Endpoint
N/A (File System Operation)

### Parameters
- **register_root** (string) - Path to the custom directory containing reference document images.

### Process
1. Create a directory (e.g., `/path/to/custom/templates`).
2. Place representative images of each new document type within this directory.
3. The filename (without extension) can optionally serve as the document label, or labels can be managed programmatically.
4. Initialize `DocClassifier` with the `register_root` parameter pointing to this directory.

   ```python
   from docclassifier import DocClassifier

   classifier = DocClassifier(register_root='/path/to/custom/templates')
   ```

5. The classifier will automatically load these images as reference templates upon initialization.
6. New document images can then be classified against this custom bank of templates.

### Example Structure
```
/path/to/custom/templates/
├── ID_Card_Front.jpg
├── Passport_Page1.png
└── Drivers_License_Back.jpeg
```

In this example, the classifier would recognize 'ID_Card_Front', 'Passport_Page1', and 'Drivers_License_Back' as distinct document types if their similarity scores exceed the threshold.
```
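
Before pointing `register_root` at a folder, it can help to preview which files would be picked up and what labels their names imply. The sketch below uses only the standard library. The helper `preview_labels` is hypothetical, the extension list follows the formats named in this document (`.jpg`, `.jpeg`, `.png`), and the library's own label derivation may differ.

```python
import tempfile
from pathlib import Path

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def preview_labels(register_root):
    """Map each supported image under register_root to a label taken from its file stem."""
    root = Path(register_root)
    return {
        path.stem: path
        for path in sorted(root.rglob("*"))
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    }


# Build a throwaway folder mirroring the example structure (empty placeholder files).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("ID_Card_Front.jpg", "Passport_Page1.png", "Drivers_License_Back.jpeg", "notes.txt"):
        (Path(tmp) / name).touch()
    labels = sorted(preview_labels(tmp))

print(labels)
```

Files with other extensions, such as `notes.txt` here, are skipped by the preview.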

--------------------------------

### Clone DocClassifier Repository

Source: https://github.com/docsaidlab/docclassifier/blob/main/README_tw.md

Download the DocClassifier project files from GitHub using git clone.

```bash
git clone https://github.com/DocsaidLab/DocClassifier.git
```

--------------------------------

### Configure IP Camera Address

Source: https://github.com/docsaidlab/docclassifier/blob/main/demo/README.md

Modify the IPADDR variable in `demo_ipcam.py` to match your IP camera's network address. Ensure your workstation and mobile device are on the same network.

```python
IPADDR = '192.168.0.179'  # Change this to your IP camera address
```
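
A typo in the address is easy to miss until the demo fails to connect. As a minimal sketch, the value can be validated up front with the standard `ipaddress` module. The helper `is_valid_ip` is hypothetical, and whether the demo also accepts hostnames is not covered here.

```python
import ipaddress


def is_valid_ip(address: str) -> bool:
    """Return True if the string parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(address)
    except ValueError:
        return False
    return True


IPADDR = "192.168.0.179"
if not is_valid_ip(IPADDR):
    raise SystemExit(f"IPADDR {IPADDR!r} is not a valid IP address")
print(f"Using camera address {IPADDR}")
```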

--------------------------------

### Custom Document Registration

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Explains the process of creating and managing custom document registration folders. Images placed in these folders serve as reference templates for the classifier.

```python
import os
from docclassifier import DocClassifier
```

--------------------------------

### ONNX Model Inference with Capybara

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Loads an ONNX model using Capybara's `ONNXEngine` and performs inference on an input image. Demonstrates image preprocessing, including resizing, transposing, and normalization.

```python
# Using the ONNX model for inference
import capybara as cb
import numpy as np

# Load ONNX model
model = cb.ONNXEngine(
    'model.onnx',
    gpu_id=0,
    backend='cpu'  # or 'cuda'
)

# Prepare input (128x128 RGB image normalized to [0, 1])
img = cb.imread('document.jpg')
img = cb.imresize(img, size=(128, 128))
img = np.transpose(img, (2, 0, 1)).astype('float32')
img = img[None] / 255.0  # Add batch dimension and normalize

# Run inference
output = model(img=img)
features = output['feats'][0]  # 256-dimensional normalized features
print(f'Feature shape: {features.shape}')
```

--------------------------------

### Extract Features and Compare Documents

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Shows how to extract normalized 256-dimensional feature vectors from document images using `extract_feature`. It also demonstrates computing cosine similarity and building a custom document bank for classification.

```python
import numpy as np
from docclassifier import DocClassifier
import cv2

classifier = DocClassifier()

# Load document images
doc1 = cv2.imread('document1.jpg')
doc2 = cv2.imread('document2.jpg')

# Extract feature vectors (256-dimensional, normalized)
feat1 = classifier.extract_feature(doc1)
feat2 = classifier.extract_feature(doc2)

print(f'Feature shape: {feat1.shape}')  # Output: Feature shape: (256,)
print(f'Feature norm: {np.linalg.norm(feat1):.4f}')  # Output: ~1.0 (normalized)

# Compute similarity between two documents (same method used internally)
def compare_features(feat1, feat2):
    # Cosine similarity scaled to [0, 1] range
    return (np.dot(feat1, feat2) + 1) / 2

similarity = compare_features(feat1, feat2)
print(f'Similarity score: {similarity:.4f}')

# Build custom document bank
custom_bank = {}
template_images = ['id_card.jpg', 'passport.jpg', 'drivers_license.jpg']
labels = ['ID Card', 'Passport', 'Drivers License']

for img_path, label in zip(template_images, labels):
    img = cv2.imread(img_path)
    if img is not None:
        custom_bank[label] = classifier.extract_feature(img)

# Find best match from custom bank
query_img = cv2.imread('unknown_document.jpg')
query_feat = classifier.extract_feature(query_img)

best_match = None
best_score = 0.0
threshold = 0.6

for label, template_feat in custom_bank.items():
    score = compare_features(query_feat, template_feat)
    if score > threshold and score > best_score:
        best_match = label
        best_score = score

print(f'Best match: {best_match}, Score: {best_score:.4f}')
```
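
The `compare_features` helper above relies on the features being L2-normalized, so that a plain dot product equals the cosine similarity. The dependency-free sketch below shows the same mapping with explicit normalization. Toy 2-D vectors stand in for the 256-dimensional features, and the function names are chosen for this example only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def scaled_cosine(a, b):
    """Cosine similarity of two vectors, mapped from [-1, 1] to [0, 1]."""
    a, b = l2_normalize(a), l2_normalize(b)
    return (sum(x * y for x, y in zip(a, b)) + 1) / 2


print(f"same direction: {scaled_cosine([3.0, 4.0], [6.0, 8.0]):.4f}")
print(f"orthogonal:     {scaled_cosine([1.0, 0.0], [0.0, 2.0]):.4f}")
print(f"opposite:       {scaled_cosine([1.0, 0.0], [-5.0, 0.0]):.4f}")
```

Under this scaling, unrelated (orthogonal) features score about 0.5 rather than 0, which is worth keeping in mind when choosing a threshold.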

--------------------------------

### Export PyTorch Model to ONNX

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Converts a trained PyTorch classifier model to ONNX format using the `main_classifier_torch2onnx` function. The exported model includes metadata like input shape, FLOPs, and configuration details.

```python
from model.to_onnx import main_classifier_torch2onnx

# Export model to ONNX format
# Generates: lcnet050_cosface_f256_r128_squeeze_lbn_imagenet_finetune_20240326_fp32.onnx
main_classifier_torch2onnx('lcnet050_cosface_f256_r128_squeeze_lbn_imagenet_finetune')
```

--------------------------------

### Feature Extraction

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Explains how to extract 256-dimensional feature vectors from document images for custom similarity comparisons.

```APIDOC
## Feature Extraction

### Description
Extract normalized feature vectors from document images for custom similarity comparisons or building your own document bank. The feature vectors are 256-dimensional and L2-normalized for cosine similarity computation.

### Method
`extract_feature(img)`

### Endpoint
N/A (Python Class Method)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Input Image
- **img** (numpy.ndarray) - Required - The document image loaded using a library like OpenCV.

### Request Example
```python
import numpy as np
from docclassifier import DocClassifier
import cv2

classifier = DocClassifier()

# Load document images
doc1 = cv2.imread('document1.jpg')
doc2 = cv2.imread('document2.jpg')

# Extract feature vectors
feat1 = classifier.extract_feature(doc1)
feat2 = classifier.extract_feature(doc2)

print(f'Feature shape: {feat1.shape}')
print(f'Feature norm: {np.linalg.norm(feat1):.4f}')

# Compute similarity
def compare_features(feat1, feat2):
    return (np.dot(feat1, feat2) + 1) / 2

similarity = compare_features(feat1, feat2)
print(f'Similarity score: {similarity:.4f}')
```

### Response
#### Success Response
- **feature_vector** (numpy.ndarray) - A 256-dimensional, L2-normalized feature vector.

#### Response Example
```python
# feat1 will be a numpy array like:
# [ 0.0123 -0.4567 ... 0.7890]
# Shape: (256,)
# Norm: ~1.0
```

### Usage Example (Custom Document Bank)
```python
# ... (feature extraction code above) ...

custom_bank = {}
template_images = ['id_card.jpg', 'passport.jpg', 'drivers_license.jpg']
labels = ['ID Card', 'Passport', 'Drivers License']

for img_path, label in zip(template_images, labels):
    img = cv2.imread(img_path)
    if img is not None:
        custom_bank[label] = classifier.extract_feature(img)

query_img = cv2.imread('unknown_document.jpg')
query_feat = classifier.extract_feature(query_img)

best_match = None
best_score = 0.0
threshold = 0.6

for label, template_feat in custom_bank.items():
    score = compare_features(query_feat, template_feat)
    if score > threshold and score > best_score:
        best_match = label
        best_score = score

print(f'Best match: {best_match}, Score: {best_score:.4f}')
```
```

--------------------------------

### Find Threshold at Specific FPR Values

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Iterates through target False Positive Rate (FPR) values to find corresponding True Positive Rate (TPR) and threshold values. Useful for calibrating model performance.

```python
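import numpy as np

# fpr, tpr, thresholds: arrays such as those returned by sklearn.metrics.roc_curve(labels, scores)
fpr_targets = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
for target_fpr in fpr_targets:
    idx = np.argmin(np.abs(fpr - target_fpr))  # ROC point closest to the target FPR
    print(f'FPR={target_fpr:.0e}: TPR={tpr[idx]:.3f}, Threshold={thresholds[idx]:.4f}')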
```
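
The loop above expects `fpr`, `tpr`, and `thresholds` arrays to exist already. To show the whole idea end to end, here is a simplified, dependency-free sketch that computes ROC points from toy scores and then picks the point closest to a target FPR. The functions `roc_points` and `threshold_at_fpr` and the sample data are invented for illustration, and the output is not guaranteed to match `sklearn.metrics.roc_curve` in every detail.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr, thresholds) with one point per distinct score, highest first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative pair")
    fpr, tpr, thresholds = [], [], []
    for t in sorted(set(scores), reverse=True):
        # A pair is predicted "same class" when its score is >= t.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tpr.append(tp / positives)
        fpr.append(fp / negatives)
        thresholds.append(t)
    return fpr, tpr, thresholds


def threshold_at_fpr(fpr, tpr, thresholds, target_fpr):
    """Pick the ROC point whose FPR is closest to target_fpr (first one on ties)."""
    idx = min(range(len(fpr)), key=lambda i: abs(fpr[i] - target_fpr))
    return thresholds[idx], tpr[idx], fpr[idx]


# Toy similarity scores and same-class labels, invented for illustration.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

fpr, tpr, thresholds = roc_points(scores, labels)
for target in (0.0, 0.3, 1.0):
    thr, hit_rate, false_rate = threshold_at_fpr(fpr, tpr, thresholds, target)
    print(f"target FPR={target:.1f}: threshold={thr:.2f}, TPR={hit_rate:.3f}, FPR={false_rate:.3f}")
```

With only a handful of pairs the achievable FPR values are coarse, so very small targets such as 1e-5 need a correspondingly large number of negative pairs to be meaningful.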

--------------------------------

### Sequential Feature Embedding with Normalization

Source: https://github.com/docsaidlab/docclassifier/blob/main/README.md

Defines a sequential module for embedding features, incorporating Linear layers, Layer Normalization, and Batch Normalization. Use this for processing extracted features before classification.

```python
self.embed_feats = nn.Sequential(
    nn.Linear(in_dim_flatten, embed_dim, bias=False),
    nn.LayerNorm(embed_dim),
    nn.BatchNorm1d(embed_dim),
    nn.Linear(embed_dim, embed_dim, bias=False),
    nn.LayerNorm(embed_dim),
    nn.BatchNorm1d(embed_dim),
)
```

--------------------------------

### Document Classification with Lowered Threshold

Source: https://github.com/docsaidlab/docclassifier/blob/main/README.md

Perform document classification inference with a lowered threshold to potentially identify similar documents when the default threshold is too high. This can help in cases where the input image is not a perfect match but shares similarities with registered document types.

```python
model = DocClassifier(
    threshold=0.6
)

# Re-run the inference
most_similar, max_score = model(img)
print(f'most_similar: {most_similar}, max_score: {max_score:.4f}')
```

--------------------------------

### Benchmarking DocClassifier Performance

Source: https://context7.com/docsaidlab/docclassifier/llms.txt

Evaluate model performance using ROC curves and TPR/FPR metrics. This involves calculating pairwise similarities and binary labels from extracted features.

```python
import numpy as np
from itertools import combinations
from sklearn.metrics import roc_curve
from docclassifier import DocClassifier
import capybara as cb

# Initialize model
model = DocClassifier(model_cfg='20240326')

def calculate_combinations(embeddings, labels):
    """Calculate pairwise similarities and labels."""
    pairs = np.array(list(combinations(range(len(embeddings)), 2)))
    base_idx, target_idx = pairs[:, 0], pairs[:, 1]

    # Compute cosine similarity scaled to [0, 1]
    scores = np.sum(embeddings[base_idx] * embeddings[target_idx], axis=-1)
    scores = (scores + 1) / 2

    # Binary labels: 1 if same class, 0 if different
    pair_labels = np.where(labels[base_idx] == labels[target_idx], 1, 0)

    return scores, pair_labels

# Example: Evaluate on test dataset
test_images = ['doc1.jpg', 'doc2.jpg', 'doc3.jpg', 'doc4.jpg']
test_labels = np.array([0, 0, 1, 1])  # Document type labels

# Extract features
features = []
for img_path in test_images:
    img = cb.imread(img_path)
    feat = model.extract_feature(img)
    features.append(feat)

features = np.stack(features)
scores, labels = calculate_combinations(features, test_labels)

# Compute ROC curve
fpr, tpr, thresholds = roc_curve(labels, scores)
```

--------------------------------

### BibTeX Citation for DocClassifier

Source: https://github.com/docsaidlab/docclassifier/blob/main/README.md

Use this BibTeX entry to cite the DocClassifier GitHub repository in academic work. Ensure the URL and note fields are included.

```bibtex
@misc{lin2024docclassifier,
  author = {Kun-Hsiang Lin and Ze Yuan},
  title = {DocClassifier},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/DocsaidLab/DocClassifier},
  note = {GitHub repository}
}
```
