### Install PPOCRLabel

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Installs the PPOCRLabel package using pip. This is the recommended method for convenient startup.

```bash
pip install PPOCRLabel  # install
```

--------------------------------

### Install and Launch PPOCRLabel

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Install PaddlePaddle (CPU) and PPOCRLabel, then launch the application with different language and mode options. Includes building a standalone executable.

```bash
# Install PaddlePaddle (CPU)
python3 -m pip install paddlepaddle -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

# Install PPOCRLabel
pip install PPOCRLabel

# Launch in English UI, normal detection+recognition mode
PPOCRLabel --lang en

# Launch in KIE mode (detection + recognition + keyword extraction)
PPOCRLabel --lang en --kie True

# Launch via Python script (for development / custom models)
python PPOCRLabel.py --lang en

# Build standalone executable with PyInstaller
pyrcc5 -o libs/resources.py resources.qrc
pyinstaller -c PPOCRLabel.py \
  --collect-all paddleocr --collect-all pyclipper --collect-all imghdr \
  --collect-all skimage --collect-all imgaug --collect-all scipy.io \
  --collect-all lmdb --collect-all paddle \
  --hidden-import=pyqt5 \
  -p ./libs -p ./ -p ./data -p ./resources -F

# Run the result:
dist/PPOCRLabel --lang en
```

--------------------------------

### Install PPOCRLabel on Ubuntu

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Installs PPOCRLabel and trash-cli on Ubuntu. Consider setting QT_QPA_PLATFORM to wayland for Wayland environments.

```bash
pip3 install PPOCRLabel
pip3 install trash-cli
export QT_QPA_PLATFORM=wayland  # Consider adding it to the system environment variables to avoid entering it multiple times.
```

--------------------------------

### Build and Install PPOCRLabel Whl Package

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Builds and installs the PPOCRLabel package locally in editable mode after navigating to its directory.

```bash
cd ./PPOCRLabel
pip3 install -e .
```

--------------------------------

### Install PPOCRLabel on MacOS

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Installs PPOCRLabel and a specific version of opencv-contrib-python-headless on MacOS.

```bash
pip3 install PPOCRLabel
pip3 install opencv-contrib-python-headless==4.2.0.32
```

--------------------------------

### Install PaddlePaddle (CPU)

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Installs the PaddlePaddle library for CPU-only machines. Ensure pip is up-to-date before running.

```bash
pip3 install --upgrade pip

# If you only have cpu on your machine, please run the following command to install
python3 -m pip install paddlepaddle -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
```

--------------------------------

### Run PPOCRLabel with English UI

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Launches PPOCRLabel with the user interface set to English. By default, it starts with Chinese.

```bash
PPOCRLabel --lang en
```

--------------------------------

### Pyinstaller Build for PPOCRLabel

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Builds an executable for PPOCRLabel using Pyinstaller. Includes steps for installing Pyinstaller, regenerating resources, and packaging the application with specific dependencies.
```bash
cd ./PPOCRLabel

# Install pyinstaller
pip install pyinstaller

# Regenerate resources
pyrcc5 -o libs/resources.py resources.qrc

# Package the executable program
pyinstaller -c PPOCRLabel.py \
  --collect-all paddleocr --collect-all pyclipper --collect-all imghdr \
  --collect-all skimage --collect-all imgaug --collect-all scipy.io \
  --collect-all lmdb --collect-all paddle \
  --hidden-import=pyqt5 \
  -p ./libs -p ./ -p ./data -p ./resources -F

# Run the executable program in dist, Windows as an example
PPOCRLabel.exe --lang ch
```

--------------------------------

### Install Missing win32com Module

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

On Windows, if you encounter "No module named 'win32com'" when using table recognition, install the 'premailer' and 'pywin32' packages.

```bash
pip install premailer
pip install pywin32
```

--------------------------------

### Resolve Qt Platform Plugin Error

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

For Linux users experiencing 'qt.qpa.plugin: Could not load the Qt platform plugin "xcb"' errors, uninstall existing OpenCV, install the headless version, and set the QT_QPA_PLATFORM environment variable.

```bash
pip uninstall opencv-python
pip uninstall opencv-contrib-python
pip install opencv-python-headless
export QT_QPA_PLATFORM=wayland
```

--------------------------------

### Install Specific OpenCV Version

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

If you encounter 'objc[XXXXX]' errors on macOS because the installed OpenCV version is too high, install version 4.2.0.32.

```bash
pip install opencv-python==4.2.0.32
```

--------------------------------

### PPOCRLabel Application Entry Point

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Initializes the main application window with various configuration parameters. This Python script sets up the UI and loads OCR models.
```python
from PPOCRLabel import MainWindow
from PyQt5.QtWidgets import QApplication
import sys

app = QApplication(sys.argv)
window = MainWindow(
    lang="en",  # UI language: "en" or "ch"
    gpu=False,  # Use GPU if available
    img_list_natural_sort=True,  # Natural sort for file list
    bbox_auto_zoom_center=True,  # Auto-zoom single bbox images
    kie_mode=False,  # Enable KIE annotation mode
    default_filename="/data/images",  # Auto-open this folder on start
    default_predefined_class_file="data/predefined_classes.txt",
    default_save_dir="/data/labels",  # Override save directory
    det_model_name="PP-OCRv5_mobile_det",
    rec_model_name="PP-OCRv5_mobile_rec",
    det_model_dir=None,  # None = use downloaded default
    rec_model_dir=None,
    cls_model_dir=None,
    label_font_path="/fonts/NotoSans.ttf",  # Custom label font
    selected_shape_color=(255, 255, 0),  # Highlight color (R,G,B)
)
window.show()
sys.exit(app.exec_())
```

--------------------------------

### Run PPOCRLabel on Ubuntu

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Launches PPOCRLabel in normal mode for detection and recognition, or in KIE mode for detection, recognition, and keyword extraction.

```bash
PPOCRLabel  # [Normal mode] for [detection + recognition] labeling
PPOCRLabel --kie True  # [KIE mode] for [detection + recognition + keyword extraction] labeling
```

--------------------------------

### CLI Launch with Custom Models

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Launch PPOCRLabel via the command line, specifying custom model paths for detection, recognition, and classification. Also shows how to disable natural sort and customize label font/color.
```bash
# Use custom detection and recognition models
python PPOCRLabel.py \
  --lang en \
  --det_model_dir /models/my_det_model \
  --det_model_name my_det_model \
  --rec_model_dir /models/my_rec_model \
  --rec_model_name my_rec_model \
  --rec_char_dict_path /models/my_dict.txt \
  --cls_model_dir /models/my_cls_model

# Disable natural sort (use lexicographic order for file list)
python PPOCRLabel.py --lang en --img_list_natural_sort False

# Custom label font and selected-box highlight color
python PPOCRLabel.py \
  --lang en \
  --label_font_path /fonts/Arial.ttf \
  --selected_shape_color 255 0 128
```

--------------------------------

### Run PPOCRLabel with Custom Models via Command Line

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Use this command to launch PPOCRLabel and specify custom detection and recognition models. Ensure the model directories and names are correctly provided.

```bash
python PPOCRLabel.py --det_model_dir {your_det_model_dir} --det_model_name {your_det_model_name} --rec_model_dir {your_rec_model_dir} --rec_model_name {your_rec_model_name}
```

--------------------------------

### Run PPOCRLabel from Python Script

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Executes PPOCRLabel using a Python script after navigating to its directory. Supports normal and KIE modes.

```bash
cd ./PPOCRLabel  # Switch to the PPOCRLabel directory

# Select label mode and run
python PPOCRLabel.py  # [Normal mode] for [detection + recognition] labeling
python PPOCRLabel.py --kie True  # [KIE mode] for [detection + recognition + keyword extraction] labeling
```

--------------------------------

### Divide Dataset with PPOCRLabel

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Execute this command in the terminal to divide your dataset into training, validation, and test sets. Adjust the ratio and dataset path as needed.
```bash
cd ./PPOCRLabel
python gen_ocr_train_val_test.py --trainValTestRatio 6:2:2 --datasetRootPath ../train_data
```

--------------------------------

### Initialize PaddleOCR with Custom Models in Python

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

Instantiate PaddleOCR with custom text detection and recognition models by providing their names and directories. This is an alternative to command-line arguments for model configuration.

```python
from paddleocr import PaddleOCR, PPStructureV3

ocr = PaddleOCR(
    text_detection_model_name='{your_text_det_model_name}',
    text_detection_model_dir='{your_text_det_model_dir}',
    text_recognition_model_name='{your_text_rec_model_name}',
    text_recognition_model_dir='{your_text_rec_model_dir}',
)

table_ocr = PPStructureV3(
    layout_detection_model_name='{your_layout_det_model_name}',
    layout_detection_model_dir='{your_layout_det_model_dir}',
    chart_recognition_model_name='{your_chart_rec_model_name}',
    chart_recognition_model_dir='{your_chart_rec_model_dir}',
    region_detection_model_name='{your_region_det_model_name}',
    region_detection_model_dir='{your_region_det_model_dir}',
    # For detailed replacement of other models, refer to the instantiation of the PPStructure class below.
    # Simply replace the model path with the path to your own inference model.
)
```

--------------------------------

### Recompile Resources

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

If you encounter a 'Missing string id' error, you need to recompile the resources using the 'pyrcc5' command.

```bash
pyrcc5 -o libs/resources.py resources.qrc
```

--------------------------------

### Split OCR Datasets for Training

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Splits annotated detection and recognition datasets into train/val/test subsets. Copies images and generates split-specific label files. Allows custom ratios and paths.
```bash
# Basic usage: 60% train, 20% val, 20% test
python gen_ocr_train_val_test.py \
  --trainValTestRatio 6:2:2 \
  --datasetRootPath ../train_data
```

```bash
# Custom paths and label filenames
python gen_ocr_train_val_test.py \
  --trainValTestRatio 7:2:1 \
  --datasetRootPath /data/ocr_dataset \
  --detRootPath /data/ocr_dataset/det \
  --recRootPath /data/ocr_dataset/rec \
  --detLabelFileName Label.txt \
  --recLabelFileName rec_gt.txt \
  --recImageDirName crop_img
```

```text
# Output structure:
# /data/ocr_dataset/det/
# ├── train/ ← detection images (60%)
# ├── val/ ← detection images (20%)
# ├── test/ ← detection images (20%)
# ├── train.txt, val.txt, test.txt ← label files
# /data/ocr_dataset/rec/
# ├── train/, val/, test/
# └── train.txt, val.txt, test.txt
```

--------------------------------

### Export Table Labels in PubTabNet Format

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Combines bounding-box annotations with table structure information to export `gt.txt` in a PubTabNet-compatible JSON format. Requires table recognition results from .xlsx files.

```python
# Triggered via File > Export Table Label
window.exportJSON()
```

```text
# Output gt.txt format (one JSON object per line):
# {"filename": "table.jpg", "split": "train",
#  "imgid": 0, "html": {
#    "structure": {"tokens": ["", "", "", ...]},
#    "cells": [{"tokens": ["Revenue"], "bbox": [10, 5, 80, 20]}, ...]
# }}
```

--------------------------------

### MainWindow.savePPlabel

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Exports detection labels to `Label.txt` in the image folder. Each line is a tab-separated pair: the relative image path and its JSON label list.

```APIDOC
## MainWindow.savePPlabel: Export Detection Labels

Writes the confirmed annotation results to `Label.txt` in the image folder. Each line is a tab-separated pair: the relative image path and its JSON label list.
```python
# Manual export via File > Export Label Results
window.savePPlabel(mode="Manual")

# Automatic export (triggered internally after every 5 confirmed images)
window.savePPlabel(mode="Auto")

# Output format in Label.txt:
# folder/image.jpg [{"transcription": "Hello", "points": [[10,10],[100,10],[100,30],[10,30]], "difficult": false}, ...]
```
```

--------------------------------

### PPOCRLabel Batch Auto-Annotation

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Programmatically trigger batch auto-annotation for unchecked images in the current folder. Results are stored in internal dictionaries and auto-saved.

```python
# Triggered via the "Auto Recognition" button or programmatically:
# Set how many images to process (0 = all remaining)
window.autoRecognitionNum(0)  # process all unchecked images
window.autoRecognition()

# After completion, confirmed results are in window.PPlabel
```

--------------------------------

### MainWindow.exportJSON

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Exports table labels in PubTabNet format by combining bounding-box annotations with table structure information.

```APIDOC
## MainWindow.exportJSON: Export Table Labels (PubTabNet Format)

Combines bounding-box annotations from `Label.txt` with table structure from `.xlsx` files (produced by table recognition) and exports `gt.txt` in PubTabNet-compatible JSON format.

```python
# Triggered via File > Export Table Label
window.exportJSON()

# Output gt.txt format (one JSON object per line):
# {"filename": "table.jpg", "split": "train",
#  "imgid": 0, "html": {
#    "structure": {"tokens": ["", "", "", ...]},
#    "cells": [{"tokens": ["Revenue"], "bbox": [10, 5, 80, 20]}, ...]
# }}
```
```

--------------------------------

### MainWindow.saveRecResult

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Exports recognition training data by cropping annotated text regions and saving them along with ground-truth text.
```APIDOC
## MainWindow.saveRecResult: Export Recognition Training Data

Crops each annotated text region from its source image and saves cropped images to `crop_img/` and ground-truth text to `rec_gt.txt`. These files are ready for PP-OCR recognition model training.

```python
# Triggered via File > Export Recognition Results
window.saveRecResult()

# Output structure:
# /
# ├── crop_img/
# │   ├── image001_crop_0.jpg ← cropped text region
# │   ├── image001_crop_1.jpg
# │   └── ...
# └── rec_gt.txt ← tab-separated: crop_img/filename transcription
#
# Example line:
#   crop_img/image001_crop_0.jpg Hello World
```
```

--------------------------------

### Initialize AutoDialog for OCR

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Initializes and displays the AutoDialog for processing a list of images with OCR. This is the core call within the autoRecognition functionality.

```python
from libs.autoDialog import AutoDialog

dialog = AutoDialog(
    parent=window,
    ocr=window.ocr,  # PaddleOCR instance
    image_list=uncheckedList,  # Paths of unconfirmed images
    len_bar=len(uncheckedList),
)
dialog.popUp()  # Displays progress bar; blocks until done
```

--------------------------------

### MainWindow.reRecognition

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Re-runs the TextRecognition model on every bounding box of the current image, overwriting existing labels. Useful after adjusting detection boxes manually.

```APIDOC
## MainWindow.reRecognition: Re-recognize All Boxes

Re-runs the `TextRecognition` model on every bounding box of the **current image**, overwriting existing labels. Useful after adjusting detection boxes manually.
```python
# Triggered by Ctrl+Shift+R or the "Re-recognize" button
window.reRecognition()

# Internally crops each shape from the image and calls:
result = window.text_recognizer.predict(img_crop)[0]
# result keys: "rec_text" (str), "rec_score" (float)
# Example result: {"rec_text": "Hello World", "rec_score": 0.987}
```
```

--------------------------------

### Shape Object for Annotations

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Demonstrates the usage of the Shape class for creating and manipulating annotation boxes. Supports adding points, closing polygons, rotation, moving the shape, and modifying individual vertices. Includes deep copying and checking shape properties.

```python
import math

from libs.shape import Shape
from PyQt5.QtCore import QPointF

# Create a shape for a detected text region
shape = Shape(
    label="Invoice Number",
    key_cls="invoice_id",  # KIE class (only used in KIE mode)
    paintLabel=True,  # Draw label text on canvas
    paintIdx=True,  # Draw order index on canvas
)

# Add four corner points (in image pixel coordinates)
shape.addPoint(QPointF(50, 100))
shape.addPoint(QPointF(200, 100))
shape.addPoint(QPointF(200, 130))
shape.addPoint(QPointF(50, 130))
shape.close()  # Computes center, finalizes polygon

# Rotate shape by 15 degrees
shape.rotate(math.radians(15))

# Move entire shape
shape.moveBy(QPointF(10, 5))

# Move a single vertex (index 0 = top-left)
shape.moveVertexBy(0, QPointF(-5, -2))

# Deep copy
shape2 = shape.copy()
print(shape2.label)  # "Invoice Number"
print(shape2.isClosed())  # True
print(len(shape2))  # 4
```

--------------------------------

### Export Detection Labels to Label.txt

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Writes confirmed annotation results to Label.txt. Each line contains a relative image path and its corresponding JSON label list. Supports manual and automatic export modes.
```python
# Manual export via File > Export Label Results
window.savePPlabel(mode="Manual")
```

```python
# Automatic export (triggered internally after every 5 confirmed images)
window.savePPlabel(mode="Auto")
```

```text
# Output format in Label.txt:
# folder/image.jpg [{"transcription": "Hello", "points": [[10,10],[100,10],[100,30],[10,30]], "difficult": false}, ...]
```

--------------------------------

### gen_ocr_train_val_test.py

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

A script to split annotated detection and recognition datasets into train/val/test subsets.

```APIDOC
## gen_ocr_train_val_test.py: Dataset Splitting Script

Splits annotated detection and recognition datasets from PPOCRLabel into train/val/test subsets, copying images and writing split-specific label files.

```bash
# Basic usage: 60% train, 20% val, 20% test
python gen_ocr_train_val_test.py \
  --trainValTestRatio 6:2:2 \
  --datasetRootPath ../train_data

# Custom paths and label filenames
python gen_ocr_train_val_test.py \
  --trainValTestRatio 7:2:1 \
  --datasetRootPath /data/ocr_dataset \
  --detRootPath /data/ocr_dataset/det \
  --recRootPath /data/ocr_dataset/rec \
  --detLabelFileName Label.txt \
  --recLabelFileName rec_gt.txt \
  --recImageDirName crop_img

# Output structure:
# /data/ocr_dataset/det/
# ├── train/ ← detection images (60%)
# ├── val/ ← detection images (20%)
# ├── test/ ← detection images (20%)
# ├── train.txt, val.txt, test.txt ← label files
# /data/ocr_dataset/rec/
# ├── train/, val/, test/
# └── train.txt, val.txt, test.txt
```
```

--------------------------------

### Rebuild HTML from PP-Structure Label

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Reconstructs an HTML table string from a dictionary representing table annotation data. This utility is used for exporting table annotations and previewing table structures.
```python
from libs.utils import rebuild_html_from_ppstructure_label

label_info = {
    "html": {
        "structure": {
            "tokens": ["", "", "", "", "", "", "", ""]
        },
        "cells": [
            {"tokens": ["R", "e", "v", "e", "n", "u", "e"]},
            {"tokens": ["$", "1", "0", "0"]}
        ]
    }
}

html = rebuild_html_from_ppstructure_label(label_info)
print(html)
# Output: HTML table markup wrapping the cell texts "Revenue" and "$100"
```

--------------------------------

### Export Recognition Training Data

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Crops annotated text regions and saves them for recognition model training. Generates cropped images in `crop_img/` and ground truth text in `rec_gt.txt`.

```python
# Triggered via File > Export Recognition Results
window.saveRecResult()
```

```text
# Output structure:
# /
# ├── crop_img/
# │   ├── image001_crop_0.jpg ← cropped text region
# │   ├── image001_crop_1.jpg
# │   └── ...
# └── rec_gt.txt ← tab-separated: crop_img/filename transcription
#
# Example line:
#   crop_img/image001_crop_0.jpg Hello World
```

--------------------------------

### Crop and Deskew Text Region Utility

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

A utility function that extracts and rectifies arbitrarily-oriented text regions from an image using perspective transform. Returns a crop ready for recognition.

```python
import cv2
import numpy as np
from libs.utils import get_rotate_crop_image

img = cv2.imread("document.jpg")
```

--------------------------------

### Re-recognize All Boxes in Current Image

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Triggers a re-recognition of all bounding boxes in the current image using the TextRecognition model. Overwrites existing labels. Useful after manual adjustments to detection boxes.

```python
window.reRecognition()
```

```python
# Internally crops each shape from the image and calls:
result = window.text_recognizer.predict(img_crop)[0]
# result keys: "rec_text" (str), "rec_score" (float)
# Example result: {"rec_text": "Hello World", "rec_score": 0.987}
```

--------------------------------

### Natural Sort Utility

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Provides a natural sorting function for lists of strings, typically file paths. It ensures that alphanumeric sorting is human-friendly (e.g., 'img2.jpg' before 'img10.jpg').
```python
from libs.utils import natural_sort

images = ["img10.jpg", "img2.jpg", "img1.jpg", "img20.jpg"]
natural_sort(images, key=lambda x: x.lower())
print(images)
# ['img1.jpg', 'img2.jpg', 'img10.jpg', 'img20.jpg']

# Sort file paths preserving directory structure
paths = ["/data/batch2/img10.png", "/data/batch1/img2.png", "/data/batch1/img10.png"]
natural_sort(paths)
print(paths)
# ['/data/batch1/img2.png', '/data/batch1/img10.png', '/data/batch2/img10.png']
```

--------------------------------

### Reinstall Headless OpenCV for INTER_NEAREST Error

Source: https://github.com/pfcclab/ppocrlabel/blob/main/README.md

If you see "module 'cv2' has no attribute 'INTER_NEAREST'", remove all OpenCV packages and reinstall version 4.2.0.32 of the headless OpenCV.

```bash
pip install opencv-contrib-python-headless==4.2.0.32
```

--------------------------------

### Crop Rotated Text Image

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Uses get_rotate_crop_image to extract and straighten a text region defined by four corner points. The output is saved as a JPG file. Automatically rotates the image if the height/width ratio exceeds 1.5.

```python
points = np.array([
    [120, 45],  # top-left
    [310, 38],  # top-right
    [315, 72],  # bottom-right
    [118, 79],  # bottom-left
], dtype=np.float32)

cropped = get_rotate_crop_image(img, points)
# cropped: straightened text patch as np.ndarray (H, W, 3)
# Automatically rotated 90° if height/width ratio > 1.5

cv2.imwrite("cropped_text.jpg", cropped)
```

--------------------------------

### get_rotate_crop_image

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

A utility function to crop and deskew text regions from an image, returning a rectified crop suitable for recognition.

```APIDOC
## get_rotate_crop_image: Crop and Deskew Text Region

A utility in `libs/utils.py` that extracts an arbitrarily-oriented text region from an image using perspective transform (handles both clockwise and counter-clockwise quadrilaterals).
Returns a rectified crop ready for recognition.

```python
import cv2
import numpy as np
from libs.utils import get_rotate_crop_image

img = cv2.imread("document.jpg")
```
```

--------------------------------

### MainWindow.cellReRecognition

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Re-recognizes the selected bounding box only. For table cells, it first runs text detection on the cropped cell region, then recognition on each detected sub-box, concatenating the results.

```APIDOC
## MainWindow.cellReRecognition: Re-recognize a Single Cell

Re-recognizes the selected bounding box only. For table cells, it first runs text detection on the cropped cell region, then recognition on each detected sub-box, concatenating the results.

```python
# Select a shape on canvas, then trigger:
window.cellReRecognition()

# Pipeline inside:
img_crop = get_rotate_crop_image(img, np.array(box, np.float32))
det_res = window.text_detector.predict(img_crop)[0]
bboxes = det_res["dt_polys"].tolist()  # detected sub-boxes
texts = ""  # accumulated text of all sub-boxes
for _bbox in bboxes:
    patch = get_rotate_crop_image(img_crop, np.array(_bbox, np.float32))
    rec_res = window.text_recognizer.predict(patch)[0]
    texts += rec_res["rec_text"]
```
```

--------------------------------

### Re-recognize Single Cell Bounding Box

Source: https://context7.com/pfcclab/ppocrlabel/llms.txt

Re-recognizes a single selected bounding box. For table cells, it performs detection and then recognition on sub-boxes, concatenating results.

```python
window.cellReRecognition()
```

```python
# Pipeline inside:
img_crop = get_rotate_crop_image(img, np.array(box, np.float32))
det_res = window.text_detector.predict(img_crop)[0]
bboxes = det_res["dt_polys"].tolist()  # detected sub-boxes
texts = ""  # accumulated text of all sub-boxes
for _bbox in bboxes:
    patch = get_rotate_crop_image(img_crop, np.array(_bbox, np.float32))
    rec_res = window.text_recognizer.predict(patch)[0]
    texts += rec_res["rec_text"]
```
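The cell re-recognition pipeline described above depends on a live `window`, PaddleOCR models, and OpenCV, so it cannot be run in isolation. As a minimal sketch of its control flow only (detect sub-boxes, crop each one, recognize it, concatenate the text), the version below takes the three steps as plain callables. The names `recognize_cell`, `fake_detect`, `fake_crop`, and `fake_recognize` are hypothetical stand-ins invented for this illustration and are not part of PPOCRLabel; only the `rec_text` and `rec_score` result keys are taken from the snippets above.

```python
from typing import Callable, Dict, List

Box = List[List[float]]  # four [x, y] corner points of a quadrilateral


def recognize_cell(
    cell_image: object,
    detect: Callable[[object], List[Box]],
    crop: Callable[[object, Box], object],
    recognize: Callable[[object], Dict[str, object]],
) -> str:
    """Concatenate the recognized text of every sub-box detected in one cell."""
    texts = ""  # start empty so that `+=` is well defined
    for sub_box in detect(cell_image):
        patch = crop(cell_image, sub_box)
        texts += str(recognize(patch)["rec_text"])
    return texts


# Hypothetical stand-ins so the sketch runs without PaddleOCR or OpenCV.
def fake_detect(image: object) -> List[Box]:
    # Pretend two text regions were found inside the cell.
    return [
        [[0, 0], [40, 0], [40, 10], [0, 10]],
        [[45, 0], [90, 0], [90, 10], [45, 10]],
    ]


def fake_crop(image: object, box: Box) -> object:
    # A real crop would return pixels; here the box itself is passed through.
    return box


def fake_recognize(patch: object) -> Dict[str, object]:
    # Choose the text from the left edge of the box, mimicking a recognizer result.
    text = "Total" if patch[0][0] == 0 else "$100"
    return {"rec_text": text, "rec_score": 0.99}


result = recognize_cell("cell", fake_detect, fake_crop, fake_recognize)
print(result)  # Total$100
```

Replacing the stand-ins with thin wrappers around `text_detector.predict`, `get_rotate_crop_image`, and `text_recognizer.predict` should give the same flow as the pipeline snippet, although this has not been checked against the PPOCRLabel source. Note that the sub-box texts are joined without a separator, as in the snippet.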