### Install WinOCR and Start Web API

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Install the WinOCR library with API support and launch the web server. This allows you to send image files via HTTP POST requests for OCR processing.

```bash
pip install winocr[api]
winocr_serve
curl -X POST "http://localhost:8000/?lang=en" --data-binary @test.jpg
```

--------------------------------

### Define Optional Dependencies in setup.py

Source: https://github.com/github30/winocr/blob/main/_autodocs/configuration.md

Example of how optional dependencies for a Python package are defined using lists in the setup.py file. These lists are used for installing extra features.

```python
api = ["Pillow", "fastapi", "uvicorn"]
cv2 = ["opencv-python"]
all = api + cv2
```

--------------------------------

### Launch WinOCR Web Server

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/web-server.md

Install the necessary package and run the console script to start the web server. The server defaults to host 0.0.0.0 and port 8000 with CORS enabled.

```bash
pip install winocr[api]
winocr_serve
```

--------------------------------

### Install WinOCR with all extras

Source: https://github.com/github30/winocr/blob/main/README.md

Install WinOCR with all optional dependencies included. Use this for a full-featured installation.

```powershell
pip install winocr[all]
```

--------------------------------

### Launch FastAPI Web Server for OCR

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/utilities.md

This bash command installs the winocr library with API extras and starts the integrated FastAPI web server. The server listens for HTTP POST requests to perform OCR on uploaded images.
```bash
# Install with API extra
pip install winocr[api]

# Start server
winocr_serve
# Output: Uvicorn running on http://0.0.0.0:8000

# In another terminal, test with curl
curl -X POST "http://localhost:8000/?lang=en" --data-binary @test.jpg | jq '.text'
```

--------------------------------

### Web Server Deployment

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/quick-reference.md

Deploys a web API endpoint for OCR using FastAPI. Install with API dependencies using 'pip install winocr[api]'. Start the server with 'winocr_serve'.

```bash
# Install with API dependencies
pip install winocr[api]

# Start server
winocr_serve

# Call from another terminal
curl -X POST "http://localhost:8000/?lang=en" --data-binary @test.jpg
```

--------------------------------

### Set up local Colaboratory runtime

Source: https://github.com/github30/winocr/blob/main/README.md

Commands to set up a local Jupyter runtime for Google Colaboratory. This involves installing jupyterlab and jupyter_http_over_ws, enabling the server extension, and starting the Jupyter notebook server.

```powershell
pip install jupyterlab jupyter_http_over_ws
jupyter serverextension enable --py jupyter_http_over_ws
jupyter notebook --NotebookApp.allow_origin='https://colab.research.google.com' --ip=0.0.0.0 --port=8888 --NotebookApp.port_retries=0
```

--------------------------------

### Install WinOCR with API Support

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Command to install the WinOCR library with the necessary dependencies for running the HTTP service.

```bash
pip install winocr[api]
```

--------------------------------

### Install WinOCR with Optional Dependencies

Source: https://github.com/github30/winocr/blob/main/_autodocs/configuration.md

Command-line instructions for installing WinOCR using pip, including options for web API server support, OpenCV, or all optional dependencies.
```bash
# Basic installation (Windows media OCR only)
pip install winocr

# With web API server
pip install winocr[api]

# With OpenCV support
pip install winocr[cv2]

# With all optional dependencies
pip install winocr[all]
```

--------------------------------

### Install WinOCR with Dependencies

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Shows different ways to install the WinOCR library, including options for Pillow, OpenCV, or all dependencies. Choose the command that best suits your project's needs.

```bash
# Basic (PIL/OpenCV must be installed separately)
pip install winocr

# With Pillow (the api extra also installs fastapi and uvicorn)
pip install winocr[api]

# With OpenCV
pip install winocr[cv2]

# With all dependencies
pip install winocr[all]
```

--------------------------------

### Install WinOCR Package

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Command to install the core WinOCR Python package.

```bash
pip install winocr
```

--------------------------------

### Node.js Fetch Example for POST / Endpoint

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Perform OCR in a Node.js environment using the node-fetch library. This example reads an image file into a buffer and sends it to the POST / endpoint.

```javascript
const fs = require('fs');
// require() works with node-fetch v2; v3 is ESM-only.
const fetch = require('node-fetch');

// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
  const imageBuffer = fs.readFileSync('document.jpg');

  const response = await fetch('http://localhost:8000/?lang=en', {
    method: 'POST',
    body: imageBuffer
  });

  const result = await response.json();
  console.log(result.text);
}

main().catch(console.error);
```

--------------------------------

### Install OCR Language Packs on Windows

Source: https://github.com/github30/winocr/blob/main/README.md

Install Japanese and English OCR language packs on Windows Server using PowerShell. Run these commands as an Administrator.
```powershell
# Run as Administrator
Add-WindowsCapability -Online -Name "Language.OCR~~~en-US~0.0.1.0"
Add-WindowsCapability -Online -Name "Language.OCR~~~ja-JP~0.0.1.0"

# Search for installed languages
Get-WindowsCapability -Online -Name "Language.OCR*"
```

--------------------------------

### Install Windows OCR Language Packs

Source: https://github.com/github30/winocr/blob/main/_autodocs/configuration.md

Use PowerShell to install specific OCR language capabilities for Windows. Ensure you are running PowerShell as an administrator. The `Get-WindowsCapability` command can be used to check the installation status of any language pack.

```powershell
# Install English OCR
Add-WindowsCapability -Online -Name "Language.OCR~~~en-US~0.0.1.0"

# Install Japanese OCR
Add-WindowsCapability -Online -Name "Language.OCR~~~ja-JP~0.0.1.0"

# Check installation status
Get-WindowsCapability -Online -Name "Language.OCR*"
```

--------------------------------

### Install OCR Language Packs on Windows

Source: https://github.com/github30/winocr/blob/main/_autodocs/configuration.md

Use PowerShell to install or list available OCR language packs for the Windows OCR engine. Run these commands as an Administrator.

```powershell
# Run as Administrator
Add-WindowsCapability -Online -Name "Language.OCR~~~en-US~0.0.1.0"
Add-WindowsCapability -Online -Name "Language.OCR~~~ja-JP~0.0.1.0"
Add-WindowsCapability -Online -Name "Language.OCR~~~ko-KR~0.0.1.0"

# List available OCR language packs
Get-WindowsCapability -Online -Name "Language.OCR*"
```

--------------------------------

### Handling Language Pack Not Installed Error

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

A common scenario demonstrating how to catch an AssertionError when the language pack is missing and print the suggested installation command.
```python
import asyncio

import winocr
from PIL import Image


async def main():
    img = Image.open('japanese_text.jpg')
    try:
        result = await winocr.recognize_pil(img, 'ja')
    except AssertionError as e:
        print(f"Install language pack: {e}")
        # Output: Language.OCR~~~ja-JP~0.0.1.0


# recognize_pil is a coroutine, so it must be awaited inside an event loop
asyncio.run(main())
```

--------------------------------

### Recognize Image Text with Specified Language

Source: https://github.com/github30/winocr/blob/main/_autodocs/configuration.md

Example of using the `recognize_pil` function to perform OCR on an image with different language settings. Ensure the specified language pack is installed.

```python
import asyncio

import winocr
from PIL import Image


async def recognize_with_language(img_path, language):
    img = Image.open(img_path)
    result = await winocr.recognize_pil(img, language)
    return result.text


async def main():
    # English
    await recognize_with_language('english.jpg', 'en')

    # Japanese
    await recognize_with_language('japanese.jpg', 'ja')

    # Korean
    await recognize_with_language('korean.jpg', 'ko')


asyncio.run(main())
```

--------------------------------

### Programmatic Call to Start OCR Web Server

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/utilities.md

This Python code snippet shows how to programmatically start the winocr web server using a direct function call. Note that this call will block indefinitely while the server is running.

```python
import winocr

# This will start the server and block indefinitely
winocr.serve()
```

--------------------------------

### Response Examples

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Provides examples of successful OCR responses, including simple text, multiple lines, no recognized text, and rotated text.
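A client can walk a response of this shape with a few lines of Python. The following is a minimal sketch, assuming only the documented fields (`text`, `text_angle`, `lines[].words[].bounding_rect`); the helper name `word_boxes` and the inline sample payload are illustrative and not part of WinOCR.

```python
import json

# Hand-written sample in the documented response shape (illustrative only).
SAMPLE = '''
{
  "text": "Hello World",
  "text_angle": 0,
  "lines": [
    {
      "text": "Hello World",
      "words": [
        {"text": "Hello", "bounding_rect": {"x": 10, "y": 10, "width": 50, "height": 20}},
        {"text": "World", "bounding_rect": {"x": 70, "y": 10, "width": 50, "height": 20}}
      ]
    }
  ]
}
'''


def word_boxes(response: dict) -> list:
    """Flatten a response into (text, x, y, width, height) tuples."""
    boxes = []
    for line in response.get("lines", []):
        for word in line.get("words", []):
            rect = word["bounding_rect"]
            boxes.append((word["text"], rect["x"], rect["y"], rect["width"], rect["height"]))
    return boxes


result = json.loads(SAMPLE)
for text, x, y, w, h in word_boxes(result):
    print(f"{text!r} at ({x}, {y}), size {w}x{h}")
```

The same loop handles a response with no recognized text, where `lines` is an empty list and no boxes are produced.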
```APIDOC
### Response Examples

Example 1 - Simple text:
```json
{
  "text": "Hello World",
  "text_angle": 0,
  "lines": [
    {
      "text": "Hello World",
      "words": [
        {"text": "Hello", "bounding_rect": {"x": 10, "y": 10, "width": 50, "height": 20}},
        {"text": "World", "bounding_rect": {"x": 70, "y": 10, "width": 50, "height": 20}}
      ]
    }
  ]
}
```

Example 2 - Multiple lines:
```json
{
  "text": "First LineSecond Line",
  "text_angle": 0,
  "lines": [
    {
      "text": "First Line",
      "words": [
        {"text": "First", "bounding_rect": {"x": 10, "y": 10, "width": 40, "height": 20}},
        {"text": "Line", "bounding_rect": {"x": 55, "y": 10, "width": 35, "height": 20}}
      ]
    },
    {
      "text": "Second Line",
      "words": [
        {"text": "Second", "bounding_rect": {"x": 10, "y": 40, "width": 45, "height": 20}},
        {"text": "Line", "bounding_rect": {"x": 60, "y": 40, "width": 35, "height": 20}}
      ]
    }
  ]
}
```

Example 3 - No recognized text:
```json
{
  "text": "",
  "text_angle": 0,
  "lines": []
}
```

Example 4 - Rotated text (90 degrees):
```json
{
  "text": "Rotated Text",
  "text_angle": 90,
  "lines": [...]
}
```
```

--------------------------------

### Installing Missing Language Pack

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Use this PowerShell command to install a missing Windows language capability for OCR.

```powershell
# Run PowerShell as Administrator
Add-WindowsCapability -Online -Name "Language.OCR~~~ja-JP~0.0.1.0"

# Verify installation
Get-WindowsCapability -Online -Name "Language.OCR~~~ja-JP*"
```

--------------------------------

### Start WinOCR HTTP Service

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Command to launch the WinOCR web server, making OCR functionality available via HTTP.

```bash
winocr_serve
```

--------------------------------

### serve

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/quick-reference.md

Starts a FastAPI web server to provide OCR functionality via a web API endpoint.
```APIDOC
## serve

### Description
Starts a FastAPI web server that exposes OCR functionality as a web API endpoint. Use the `winocr_serve` command to start it.

### Signature
`serve()`

### Use Case
Web API endpoint.
```

--------------------------------

### JavaScript Fetch Example for POST / Endpoint

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Perform OCR using the browser's Fetch API. Examples show sending a file from an input element or a Blob object. The 'lang' parameter is used to set the OCR language.

```javascript
// Uses top-level await: run inside a module script or an async function.

// From file input
const fileInput = document.querySelector('input[type=file]');
const file = fileInput.files[0];

const fileResponse = await fetch('http://localhost:8000/?lang=en', {
  method: 'POST',
  body: file
});

const fileResult = await fileResponse.json();
console.log(fileResult.text);

// From blob (distinct names avoid redeclaring const in the same scope)
const blob = await fetch('https://example.com/image.jpg').then(r => r.blob());
const blobResponse = await fetch('http://localhost:8000/?lang=en', {
  method: 'POST',
  body: blob
});

const blobResult = await blobResponse.json();
console.log(blobResult.text);
```

--------------------------------

### Resolving Language Pack Not Installed

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

The PowerShell command to install a missing language pack, to be run as Administrator.

```powershell
Add-WindowsCapability -Online -Name "Language.OCR~~~ja-JP~0.0.1.0"
```

--------------------------------

### Python Urllib Example for POST / Endpoint

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Perform OCR using Python's built-in urllib library. This example demonstrates sending an image file and processing the JSON response to extract the recognized text.
```python
import urllib.request
import json

with open('document.jpg', 'rb') as f:
    req = urllib.request.Request(
        'http://localhost:8000/?lang=en',
        data=f.read(),
        method='POST'
    )

with urllib.request.urlopen(req) as response:
    result = json.loads(response.read())
    print(result['text'])
```

--------------------------------

### Using WinOCR as a Python Library

Source: https://github.com/github30/winocr/blob/main/_autodocs/module-structure.md

Example of how to import and use the `recognize_pil` function from the WinOCR library within an asynchronous Python context.

```python
import winocr

# Inside an async function, with `img` holding a PIL Image:
result = await winocr.recognize_pil(img, 'en')
```

--------------------------------

### cURL Examples for POST / Endpoint

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Examples of how to perform OCR using cURL. You can specify the language using the 'lang' query parameter or omit it to use the default (English). The image data is sent as binary in the request body.

```bash
# English OCR
curl -X POST "http://localhost:8000/?lang=en" \
  --data-binary @document.jpg

# Japanese OCR
curl -X POST "http://localhost:8000/?lang=ja" \
  --data-binary @document.jpg

# Default language (English)
curl -X POST "http://localhost:8000/" \
  --data-binary @document.jpg
```

--------------------------------

### Utility Functions

Source: https://github.com/github30/winocr/blob/main/_autodocs/module-structure.md

Provides utility functions for object serialization, coroutine conversion, and starting the web server.

```python
picklify(o: object) -> Union[list, dict, object]
to_coroutine(awaitable: Awaitable) -> Coroutine
serve() -> None
```

--------------------------------

### Check Installed OCR Language Capabilities (PowerShell)

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Lists installed and available OCR language capabilities on Windows using PowerShell. Helps verify if the necessary language packs are present.
```powershell
# List all installed OCR capabilities
Get-WindowsCapability -Online -Name "Language.OCR*" | Where-Object State -eq Installed

# List all available OCR capabilities
Get-WindowsCapability -Online -Name "Language.OCR*"

# Check specific language
Get-WindowsCapability -Online -Name "Language.OCR~~~ja-JP*"
```

--------------------------------

### JavaScript Fetch Example for OCR

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/web-server.md

Utilize the fetch API in JavaScript to send image data to the OCR endpoint. This example shows how to send a local file or a blob from a remote URL and process the JSON response.

```javascript
// Uses top-level await: run inside a module script or an async function.

// From file input
const file = document.querySelector('[type=file]').files[0];
const fileResponse = await fetch('http://localhost:8000/?lang=en', {
  method: 'POST',
  body: file
});
const fileResult = await fileResponse.json();
console.log(fileResult.text);

// From remote URL (distinct names avoid redeclaring const in the same scope)
const blob = await fetch('https://example.com/image.jpg').then(r => r.blob());
const blobResponse = await fetch('http://localhost:8000/?lang=en', {
  method: 'POST',
  body: blob
});
const blobResult = await blobResponse.json();
```

--------------------------------

### Microservice OCR Integration

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Example of how a microservice can call the OCR service using httpx. Assumes the OCR service is available at 'ocr-service:8000'.

```python
# In a service that needs OCR
import httpx


async def recognize_text(image_bytes, lang='en'):
    async with httpx.AsyncClient() as client:
        response = await client.post(
            # f-string so that `lang` is actually interpolated into the URL
            f'http://ocr-service:8000/?lang={lang}',
            content=image_bytes
        )
        return response.json()
```

--------------------------------

### Error Response: Internal Server Error (Language Pack Not Installed)

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

A 500 Internal Server Error indicates issues like a missing language pack.
The detail field provides the PowerShell command to install it.

```json
{
  "detail": "Language.OCR~~~ja-JP~0.0.1.0"
}
```

--------------------------------

### Asynchronous OCR from Raw Bytes

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/core-functions.md

Low-level asynchronous OCR from raw RGBA byte data. Requires explicit image dimensions (width, height). Ensure the language pack is installed; an AssertionError will provide installation instructions if not.

```python
import asyncio

import winocr
from PIL import Image

img = Image.open('document.jpg')
if img.mode != 'RGBA':
    img = img.convert('RGBA')


async def main():
    result = await winocr.recognize_bytes(img.tobytes(), img.width, img.height, 'en')
    print(result.text)


asyncio.run(main())
```

--------------------------------

### Check Language Pack Availability

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Safely checks if a language pack is installed for OCR without raising exceptions. Use this before attempting OCR to prevent errors.

```python
from winrt.windows.media.ocr import OcrEngine
from winrt.windows.globalization import Language


def is_language_available(lang_code: str) -> bool:
    """Check if language pack is installed without raising errors."""
    try:
        return OcrEngine.is_language_supported(Language(lang_code))
    except Exception:
        return False


async def recognize_safe(img, lang='en'):
    if not is_language_available(lang):
        raise ValueError(f"Language {lang} not installed")

    import winocr
    return await winocr.recognize_pil(img, lang)
```

--------------------------------

### Python Requests Example for POST / Endpoint

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Perform OCR by sending an image file or image bytes to the POST / endpoint using the Python requests library. The 'lang' query parameter can be used to specify the OCR language.
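To make the role of `lang` concrete, here is a standard-library sketch that only builds the request and does not send it. The helper name `build_ocr_request` and its `base_url` default are assumptions for illustration, not part of WinOCR; the URL mirrors the `http://localhost:8000/?lang=...` form used throughout these examples.

```python
import urllib.parse
import urllib.request


def build_ocr_request(image_bytes: bytes, lang: str = "en",
                      base_url: str = "http://localhost:8000/") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the raw image bytes."""
    # urlencode escapes the value, so unusual language tags stay URL-safe
    query = urllib.parse.urlencode({"lang": lang})
    return urllib.request.Request(f"{base_url}?{query}", data=image_bytes, method="POST")


req = build_ocr_request(b"raw image bytes", lang="ja")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the image travels as the raw request body rather than as a multipart form upload.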
```python
import requests

# From file
with open('document.jpg', 'rb') as f:
    response = requests.post(
        'http://localhost:8000/?lang=en',
        data=f.read()
    )

print(response.json())

# From bytes (image_bytes holds an encoded image, e.g. the contents of a JPEG file)
response = requests.post(
    'http://localhost:8000/?lang=ja',
    data=image_bytes
)
```

--------------------------------

### Error Propagation Example

Source: https://github.com/github30/winocr/blob/main/_autodocs/module-structure.md

Demonstrates how errors like unsupported languages, invalid images, or WinRT engine failures propagate to the caller. FastAPI handles web server errors by default.

```python
# Language not supported
recognize_pil(img, 'unknown')
# Raises: AssertionError("Language.OCR~~~unknown~0.0.1.0")

# Invalid image
recognize_pil(None, 'en')
# Raises: AttributeError or similar from PIL/WinRT

# WinRT errors
recognize_pil(img, 'en')  # If OCR engine fails
# Raises: WinRT exception (RuntimeError, etc.)
```

--------------------------------

### Python OCR with WinOCR

Source: https://github.com/github30/winocr/blob/main/README.md

Recognizes text from a list of PIL Image objects using the winocr library. Ensure the necessary language packs are installed.

```python
import winocr
from PIL import Image

# Top-level await: run in a notebook (e.g. Jupyter or Colab) or inside an async function.
images = [Image.open('testocr.png') for i in range(1000)]
[(await winocr.recognize_pil(img)).text for img in images]
```

--------------------------------

### Python Requests Example for OCR

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/web-server.md

Use the requests library to send an image file to the OCR endpoint and process the JSON response. The 'lang' parameter can be set to specify the OCR language.
```python
import requests

with open('test.jpg', 'rb') as f:
    image_bytes = f.read()

response = requests.post(
    'http://localhost:8000/?lang=en',
    data=image_bytes
)

result = response.json()
print(result['text'])
print(result['text_angle'])
```

--------------------------------

### Extract Word Coordinates and Text (Async)

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

An example demonstrating how to extract individual words along with their bounding box coordinates from an OCR result using the asynchronous function.

```python
import winocr
from PIL import Image


async def extract_words():
    img = Image.open('document.jpg')
    result = await winocr.recognize_pil(img, 'en')

    words_with_coords = []
    for line in result.lines:
        for word in line.words:
            rect = word.bounding_rect
            words_with_coords.append({
                'text': word.text,
                'x': rect.x,
                'y': rect.y,
                'width': rect.width,
                'height': rect.height
            })

    return words_with_coords
```

--------------------------------

### Send image to WinOCR API via curl

Source: https://github.com/github30/winocr/blob/main/README.md

Example of sending an image file to the WinOCR API using curl. The 'lang' parameter specifies the recognition language.

```bash
# Quote the URL so the shell does not treat "?" as a glob character
curl "localhost:8000?lang=ja" --data-binary @test.jpg
```

--------------------------------

### Graceful Degradation with Fallback Language

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Implements a fallback mechanism to try a secondary language if the preferred language pack is not installed. This ensures OCR can still proceed.
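A minimal sketch of the fallback idea, generalised to an ordered list of candidate languages. It assumes a recognizer coroutine that raises `AssertionError` when a language pack is missing, as these docs describe for `winocr.recognize_pil`. The names `recognize_first_available` and `fake_recognize` are hypothetical, and the recognizer is passed in as a parameter so the example runs without Windows.

```python
import asyncio


async def recognize_first_available(img, languages, recognize):
    """Return (lang, result) for the first language whose pack is installed.

    `recognize` is any coroutine function called as recognize(img, lang) that
    raises AssertionError when the language is unavailable.
    """
    for lang in languages:
        try:
            return lang, await recognize(img, lang)
        except AssertionError:
            continue  # pack missing, try the next candidate
    raise RuntimeError(f"None of {list(languages)} is installed")


async def fake_recognize(img, lang):
    # Stand-in for winocr.recognize_pil so the sketch runs anywhere:
    # it pretends that only English is installed.
    if lang != "en":
        raise AssertionError(lang)
    return {"text": "hello"}


lang, result = asyncio.run(recognize_first_available(None, ["ja", "en"], fake_recognize))
print(lang, result["text"])
```

On a Windows machine the stand-in could be swapped for `winocr.recognize_pil`, assuming it keeps the `(img, lang)` call shape shown elsewhere in these examples.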
```python
async def recognize_with_fallback(img, preferred_lang='en', fallback_lang='en'):
    """Try preferred language, fall back if not available."""
    import winocr

    try:
        return await winocr.recognize_pil(img, preferred_lang)
    except AssertionError:
        # Language not installed, try fallback
        try:
            return await winocr.recognize_pil(img, fallback_lang)
        except AssertionError:
            raise RuntimeError(f"Neither {preferred_lang} nor {fallback_lang} installed")
```

--------------------------------

### Handling Errors in Multiprocessing with WinOCR

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Example of using `concurrent.futures.ProcessPoolExecutor` with WinOCR. It uses the synchronous `recognize_pil_sync()` function so that results serialize properly across processes, and it handles potential exceptions inside the worker.

```python
import winocr
from PIL import Image
import concurrent.futures


def worker(img_path):
    try:
        img = Image.open(img_path)
        return winocr.recognize_pil_sync(img, 'en')
    except Exception as e:
        return {'error': str(e)}


# The __main__ guard is required on Windows, where worker processes are spawned
# and re-import this module.
if __name__ == '__main__':
    images = ['image1.jpg', 'image2.jpg', 'image3.jpg']
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = list(executor.map(worker, images))

    for result in results:
        if 'error' in result:
            print(f"Error: {result['error']}")
        else:
            print(result['text'])
```

--------------------------------

### Catching Image Processing Failures

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Handle exceptions that occur during image loading or processing, such as invalid formats or corrupted data. This example wraps the recognition process in a try-except block.
```python
import winocr
from PIL import Image
from io import BytesIO


async def safe_recognize(image_bytes, lang='en'):
    try:
        img = Image.open(BytesIO(image_bytes))
        result = await winocr.recognize_pil(img, lang)
        return result
    except Exception as e:
        print(f"OCR failed: {type(e).__name__}: {e}")
        return None


# In web server context (fragment: belongs inside an async request handler,
# where `request_body` and `lang` are already defined)
try:
    img = Image.open(BytesIO(request_body))
    result = await winocr.recognize_pil(img, lang)
except Exception as e:
    # Return error response
    return {"error": str(e), "status": 500}
```

--------------------------------

### Running WinOCR as a Command-Line Tool

Source: https://github.com/github30/winocr/blob/main/_autodocs/module-structure.md

Demonstrates how to execute the WinOCR server using the `winocr_serve` command. This command is defined in setup.py and calls the `serve` function from the `winocr` module.

```bash
winocr_serve
```

```python
entry_points={"console_scripts": ["winocr_serve = winocr:serve"]}
```

--------------------------------

### WinOCR Import Patterns

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/quick-reference.md

Demonstrates different ways to import the WinOCR library, including importing the entire module, specific functions, or with type hints for better code analysis.

```python
# Basic import - access all exported functions
import winocr

# Specific functions
from winocr import recognize_pil, recognize_cv2, recognize_pil_sync, recognize_cv2_sync

# With type hints (requires type stubs for WinRT)
import winocr
from PIL import Image
from winrt.windows.media.ocr import OcrResult


# An async function is annotated with the awaited result type, not Coroutine
async def process(img: Image.Image) -> OcrResult:
    return await winocr.recognize_pil(img)
```

--------------------------------

### Launch FastAPI Web Server with WinOCR

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/utilities.md

Use this snippet to launch a FastAPI web server that integrates WinOCR for image recognition. Customize the host and port by wrapping the `uvicorn.run` call.
```python
import uvicorn
from fastapi import FastAPI, Request, Response
from fastapi.middleware.cors import CORSMiddleware
from PIL import Image
from io import BytesIO
import winocr
import json

app = FastAPI()
app.add_middleware(CORSMiddleware, allow_origins=['*'], allow_credentials=True, allow_methods=['*'], allow_headers=['*'])


@app.post('/')
async def recognize(request: Request, lang: str = 'en'):
    result = await winocr.recognize_pil(Image.open(BytesIO(await request.body())), lang)
    return Response(json.dumps(winocr.picklify(result), indent=2, ensure_ascii=False), media_type='application/json')


# Custom host/port
uvicorn.run(app, host='127.0.0.1', port=9000)
```

--------------------------------

### Create PIL Image from File

Source: https://github.com/github30/winocr/blob/main/_autodocs/types.md

Shows how to open and load an image from a file path using the PIL.Image.open() method.

```python
from PIL import Image

# Open from file
img = Image.open('path/to/image.jpg')
```

--------------------------------

### Catching Language Not Supported Error

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Catch AssertionError when a required language pack is not installed. The error message contains the PowerShell command to install the language.

```python
import asyncio

import winocr
from PIL import Image


async def main():
    img = Image.open('document.jpg')
    try:
        result = await winocr.recognize_pil(img, 'ja')
    except AssertionError as e:
        print(f"Language not installed: {e}")
        # Install using: Add-WindowsCapability -Online -Name "Language.OCR~~~ja-JP~0.0.1.0"


# recognize_pil is a coroutine, so it must be awaited inside an event loop
asyncio.run(main())
```

--------------------------------

### serve

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/utilities.md

Launches a FastAPI web server for OCR via HTTP POST requests. The server listens on 0.0.0.0:8000 and accepts image uploads for text recognition.

```APIDOC
## serve

### Description
Launch a FastAPI web server for OCR via HTTP POST.
### Signature
```python
def serve()
```

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters Table
None (all configuration is hardcoded)

### Return Type
None (blocks indefinitely while server runs)

### Purpose
Start an HTTP server listening on `0.0.0.0:8000` (default uvicorn port) that accepts image uploads and performs OCR.

### CORS Configuration
- `allow_origins=['*']`: Accept from any origin
- `allow_credentials=True`: Allow credential headers
- `allow_methods=['*']`: Allow all HTTP methods
- `allow_headers=['*']`: Allow all request headers

### Endpoints
- `POST /`: Recognize text from image (see [web-server.md](web-server.md))

### Usage
As console entry point
```bash
# Install with API extra
pip install winocr[api]

# Start server
winocr_serve
# Output: Uvicorn running on http://0.0.0.0:8000

# In another terminal, test with curl
curl -X POST "http://localhost:8000/?lang=en" --data-binary @test.jpg | jq '.text'
```

### Usage
Programmatic call
```python
import winocr

# This will start the server and block indefinitely
winocr.serve()
```
```

--------------------------------

### Create PIL Image from Bytes

Source: https://github.com/github30/winocr/blob/main/_autodocs/types.md

Illustrates how to create a PIL Image object from raw image bytes using BytesIO.

```python
from io import BytesIO
from PIL import Image

# Create from bytes
img = Image.open(BytesIO(image_bytes))
```

--------------------------------

### WinOCR Server Launch

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

The command to launch the WinOCR server using uvicorn.

```python
uvicorn.run(app, host='0.0.0.0')
```

--------------------------------

### Correct Multiprocessing with Sync Wrapper

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Demonstrates the correct way to use `concurrent.futures.ProcessPoolExecutor` with WinOCR by using a synchronous wrapper function.
Avoids pickling errors with async results.

```python
import concurrent.futures

import winocr

# `images` is a list of PIL Image objects.
# On Windows, run this under an `if __name__ == '__main__':` guard.

# WRONG - pickling async results
try:
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # This will fail - can't pickle coroutines.
        # list() forces the results so the failure surfaces inside the try block.
        results = list(executor.map(winocr.recognize_pil, images))
except Exception as e:
    print(f"Error: {e}")

# CORRECT - use sync wrapper
with concurrent.futures.ProcessPoolExecutor() as executor:
    # recognize_pil_sync returns dict (JSON-serializable)
    results = list(executor.map(winocr.recognize_pil_sync, images))

print(results[0]['text'])
```

--------------------------------

### Status Codes and Errors

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Details the possible HTTP status codes, their meanings, and example error responses for common issues.

```APIDOC
### Status Codes and Errors

**200 OK**: Successfully recognized text

**400 Bad Request**:
- Empty request body
- Corrupted image data that PIL cannot open

Error response:
```json
{
  "detail": "Empty request body"
}
```

**422 Unprocessable Entity**:
- Invalid `lang` parameter type (not a string)
- Missing required parameters

Error response:
```json
{
  "detail": [
    {
      "type": "value_error",
      "loc": ["query", "lang"],
      "msg": "value is not a valid string",
      "input": 123
    }
  ]
}
```

**500 Internal Server Error**:
- Language pack not installed
- WinRT OCR engine failure
- Unhandled exception

Error response (language not installed):
```json
{
  "detail": "Language.OCR~~~ja-JP~0.0.1.0"
}
```

The error detail contains the PowerShell command to install the language pack.
```

--------------------------------

### OCR Result for Rotated Text

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Indicates the overall clockwise rotation of the text in degrees. This example shows a 90-degree rotation.

```json
{
  "text": "Rotated Text",
  "text_angle": 90,
  "lines": [...]
}
```

--------------------------------

### Package Entry Point Registration

Source: https://github.com/github30/winocr/blob/main/_autodocs/module-structure.md

Defines how the 'winocr' module is registered as a Python package and exposes a console script entry point for 'winocr_serve'.

```python
py_modules=["winocr"]
entry_points={"console_scripts": ["winocr_serve = winocr:serve"]}
```

--------------------------------

### Load PIL Image for OCR

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Demonstrates how to load an image file into a PIL Image object, which can then be used with WinOCR's PIL-based recognition functions.

```python
from PIL import Image

img = Image.open('document.jpg')
```

--------------------------------

### Recognize Multiple Languages

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/quick-reference.md

Asynchronously recognizes text from an image using multiple specified languages. Handles cases where a language might not be installed.

```python
import winocr
from PIL import Image
import asyncio


async def recognize_multiple(img_path, languages):
    img = Image.open(img_path)
    results = {}

    for lang in languages:
        try:
            result = await winocr.recognize_pil(img, lang)
            results[lang] = result.text
        except AssertionError:
            print(f"Language {lang} not installed")

    return results


# Usage
result = asyncio.run(recognize_multiple('image.jpg', ['en', 'ja', 'ko']))
```

--------------------------------

### POST /

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/web-server.md

Recognize text from an image using Windows OCR. The endpoint accepts binary image data and returns a JSON object containing the recognized text, its angle, and detailed line and word information.

```APIDOC
## POST /

### Description
Recognize text from an image using Windows OCR.

### Method
POST

### Endpoint
/

### Parameters
#### Query Parameters
- **lang** (str) - Optional - Language code for OCR (e.g., 'en', 'ja', 'ko'). Must correspond to an installed Windows language capability. Defaults to 'en'.

#### Request Body
- **Binary image data** (binary) - Required - JPEG, PNG, or other format supported by PIL.Image.open

### Request Example
```bash
curl -X POST "http://localhost:8000/?lang=ja" \
  --data-binary @test.jpg \
  -H "Content-Type: application/octet-stream"
```

### Response
#### Success Response (200)
- **text** (string) - Full recognized text from the entire image
- **text_angle** (int) - Rotation angle of the text in degrees
- **lines** (array) - List of detected text lines
- **lines[].text** (string) - Recognized text of the line
- **lines[].words** (array) - List of words in the line with bounding boxes
- **lines[].words[].text** (string) - Individual word text
- **lines[].words[].bounding_rect** (object) - Bounding box of the word in pixel coordinates
- **lines[].words[].bounding_rect.x** (int) - Left x-coordinate
- **lines[].words[].bounding_rect.y** (int) - Top y-coordinate
- **lines[].words[].bounding_rect.width** (int) - Width in pixels
- **lines[].words[].bounding_rect.height** (int) - Height in pixels

#### Response Example
```json
{
  "text": "string",
  "text_angle": "int",
  "lines": [
    {
      "text": "string",
      "words": [
        {
          "bounding_rect": {
            "x": "int",
            "y": "int",
            "width": "int",
            "height": "int"
          },
          "text": "string"
        }
      ]
    }
  ]
}
```

#### Error Response
- **422**: Invalid query parameters or missing required fields
- **500**: Internal server error (e.g., language not installed)
```

--------------------------------

### Asynchronous OCR from PIL Image

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/core-functions.md

Recognize text from a PIL Image asynchronously. Ensure the specified language is installed on your system. The function returns a coroutine that resolves to an OcrResult object.

```python
import winocr
from PIL import Image

async def main():
    img = Image.open('document.jpg')
    result = await winocr.recognize_pil(img, 'en')
    print(result.text)        # Full text from image
    print(result.text_angle)  # Rotation angle

# Run in async context
import asyncio
asyncio.run(main())
```

--------------------------------

### Optional Dependencies for CV2 and Web Server

Source: https://github.com/github30/winocr/blob/main/_autodocs/module-structure.md

Imports for optional functionalities like OpenCV image processing and the web server components. These are imported conditionally.

```python
# In recognize_cv2:
import cv2

# In serve:
import json
import uvicorn
from PIL import Image
from io import BytesIO
from fastapi import FastAPI, Request, Response
from fastapi.middleware.cors import CORSMiddleware
```

--------------------------------

### Load OpenCV Image for OCR

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Shows how to load an image file into an OpenCV image format, suitable for use with WinOCR's OpenCV-based recognition functions.

```python
import cv2

img = cv2.imread('document.jpg')
```

--------------------------------

### Synchronous OCR with OpenCV Image

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/core-functions.md

Use this function to synchronously recognize text from an OpenCV image. It blocks until the OCR process is complete and returns a dictionary. Ensure the specified language is installed.

```python
import winocr
import cv2

img = cv2.imread('testocr.png')
result = winocr.recognize_cv2_sync(img)
print(result['text'])
```

--------------------------------

### Asynchronous OCR from OpenCV Image

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/core-functions.md

Recognize text from an OpenCV image (numpy ndarray) asynchronously. The image is expected in BGR format and will be converted internally. Requires the specified language to be installed.

```python
import winocr
import cv2

async def main():
    img = cv2.imread('document.jpg')
    result = await winocr.recognize_cv2(img, 'ja')
    print(result.text)

import asyncio
asyncio.run(main())
```

--------------------------------

### Sync Path Data Flow: PIL Image to Dict

Source: https://github.com/github30/winocr/blob/main/_autodocs/module-structure.md

Details the data flow for the synchronous 'recognize_pil_sync' function, showing the conversion from a PIL Image to a JSON-compatible dictionary.

```text
PIL.Image
    ↓
[recognize_pil_sync]
    ├─ Call recognize_pil(img, lang)
    ├─ Wrap in to_coroutine()
    ├─ Run with asyncio.run()
    ↓
OcrResult (WinRT object)
    ↓
[picklify]
    └─ Recursively convert to dict
        ├─ Iterate attributes (skip private/methods)
        ├─ Recursively picklify nested objects
    ↓
dict (JSON-compatible)
    ├─ {'text': ..., 'text_angle': ...}
    ├─ {'lines': [...]}
    └─ Each word: {'text': ..., 'bounding_rect': {...}}
```

--------------------------------

### Synchronous OCR from PIL Image

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/core-functions.md

Synchronously recognize text from a PIL Image using a blocking wrapper. Returns a dictionary representation of the OCR result, suitable for synchronous workflows. Language must be installed.

```python
import winocr
from PIL import Image

img = Image.open('document.png')
result = winocr.recognize_pil_sync(img, 'en')
print(result['text'])  # Dictionary access
print(result['text_angle'])
```

--------------------------------

### picklify

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/utilities.md

Recursively serializes Windows Runtime objects to JSON-compatible dictionaries. It handles collections, class instances, and primitive types, converting them into a format suitable for JSON serialization or pickling across processes.

```APIDOC
## picklify

### Description
Recursively serialize Windows Runtime objects to JSON-compatible dictionaries.

### Signature
```python
def picklify(o: object) -> Union[list, dict, object]
```

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters Table
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| o | object | Yes | Any Python object, typically a WinRT OcrResult or nested structure |

### Return Type
dict, list, or primitive

### Purpose
Convert WinRT objects (which cannot be directly JSON serialized or pickled across processes) to plain Python dictionaries.

### Example
```python
import winocr
from PIL import Image
import json

async def main():
    img = Image.open('document.jpg')
    result = await winocr.recognize_pil(img, 'en')
    serialized = winocr.picklify(result)
    json_str = json.dumps(serialized, ensure_ascii=False)
    return serialized

import asyncio
data = asyncio.run(main())
```

### Output Structure Example
```json
{
  "text": "Full text from image",
  "text_angle": 0,
  "lines": [
    {
      "text": "Line 1 text",
      "words": [
        {
          "text": "Word1",
          "bounding_rect": {
            "x": 10,
            "y": 20,
            "width": 50,
            "height": 20
          }
        }
      ]
    }
  ]
}
```

### Limitations
- Circular references will cause infinite recursion
- Does not preserve type information beyond primitive types
- Private attributes (starting with `_`) are excluded
- Methods are excluded
```

--------------------------------

### Python Example for Processing OCR Response

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/web-server.md

Process the JSON response from the OCR endpoint in Python to access the full recognized text, text angle, and iterate through detected lines and words with their bounding boxes.

```python
import requests

# Read the image inside a context manager so the file handle is closed
with open('test.jpg', 'rb') as f:
    response = requests.post(
        'http://localhost:8000/?lang=en',
        data=f.read()
    )
result = response.json()

# Access full text
print(result['text'])

# Iterate through lines
for line in result['lines']:
    print(f"Line: {line['text']}")
    # Iterate through words with coordinates
    for word in line['words']:
        bbox = word['bounding_rect']
        print(f"  Word: {word['text']} at ({bbox['x']}, {bbox['y']})")
```

--------------------------------

### POST /

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

Performs optical character recognition on an uploaded image. Supports specifying the OCR language via a query parameter.

```APIDOC
## POST /

### Description
Perform optical character recognition on an uploaded image.

### Method
POST

### Endpoint
/

### Parameters
#### Query Parameters
- **lang** (string) - Optional - ISO language code. Defaults to 'en'. Must correspond to an installed Windows language pack.

### Request Body
Binary image data
- Format: JPEG, PNG, BMP, TIFF, or any format supported by PIL.Image.open()
- Size: No documented limit, but typical images <10MB recommended
- Dimensions: No minimum/maximum, but practical limit ~10000×10000 pixels

### Request Example
**cURL**:
```bash
# English OCR
curl -X POST "http://localhost:8000/?lang=en" \
  --data-binary @document.jpg
```

**Python (requests)**:
```python
import requests

with open('document.jpg', 'rb') as f:
    response = requests.post(
        'http://localhost:8000/?lang=en',
        data=f.read()
    )
print(response.json())
```

### Response
#### Success Response (200)
- **text** (string) - Full text extracted from the image.
- **text_angle** (integer) - The detected angle of the text in the image.
- **lines** (array) - An array of objects, where each object represents a line of text.
  - **text** (string) - The text content of the line.
  - **words** (array) - An array of objects, where each object represents a word in the line.
    - **text** (string) - The text content of the word.
    - **bounding_rect** (object) - The bounding box of the word.
      - **x** (integer) - The x-coordinate of the top-left corner.
      - **y** (integer) - The y-coordinate of the top-left corner.
      - **width** (integer) - The width of the bounding box.
      - **height** (integer) - The height of the bounding box.

#### Response Example
```json
{
  "text": "Full text from the image",
  "text_angle": 0,
  "lines": [
    {
      "text": "First line of text",
      "words": [
        {
          "text": "First",
          "bounding_rect": {
            "x": 10,
            "y": 15,
            "width": 45,
            "height": 20
          }
        },
        {
          "text": "line",
          "bounding_rect": {
            "x": 60,
            "y": 15,
            "width": 30,
            "height": 20
          }
        }
      ]
    }
  ]
}
```
```

--------------------------------

### WinOCR Endpoint Implementation

Source: https://github.com/github30/winocr/blob/main/_autodocs/endpoints.md

The core OCR recognition endpoint implementation within the winocr.py file.

```python
@app.post('/')
async def recognize(request: Request, lang: str = 'en'):
    result = await recognize_pil(Image.open(BytesIO(await request.body())), lang)
    return Response(json.dumps(picklify(result), indent=2, ensure_ascii=False), media_type='application/json')
```

--------------------------------

### picklify

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/quick-reference.md

Serializes a Python object into a dictionary, recursively handling native WinRT objects.

```APIDOC
## picklify

### Description
Serializes a Python object into a dictionary, recursively handling native WinRT objects. Useful for manual serialization.

### Signature
`picklify(obj)`

### Returns
dict (serialized object)

### Use Case
Manual WinRT object serialization.
```

--------------------------------

### Recognize Text (Sync PIL)

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Synchronously recognizes text from a PIL Image object. Returns a dictionary.

```APIDOC
## recognize_pil_sync

### Description
Synchronously recognizes text from a PIL Image object.
Returns a JSON-serializable dictionary.

### Method
`recognize_pil_sync(img, lang='en')`

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
import winocr
from PIL import Image

img = Image.open('document.jpg')
result = winocr.recognize_pil_sync(img, 'en')
print(result['text'])
```

### Response
#### Success Response
- **dict** - A dictionary containing the recognized text and its structure.

#### Response Example
```json
{
  "text": "Recognized text from the image."
}
```
```

--------------------------------

### Internal Helper: recognize_cv2 Signature

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/utilities.md

This is the base async implementation for OpenCV images. The input image must be in OpenCV format (BGR, shape=(height, width, 3)).

```python
def recognize_cv2(img: ndarray, lang: str = 'en') -> Awaitable
```

--------------------------------

### Enable Detailed Error Messages with Traceback

Source: https://github.com/github30/winocr/blob/main/_autodocs/errors.md

Prints a full stack trace and detailed error information when an exception occurs during OCR processing. Useful for debugging.

```python
import asyncio
import traceback

import winocr
from PIL import Image

async def main():
    # 'document.jpg' is a placeholder path; `await` must run inside a coroutine
    img = Image.open('document.jpg')
    try:
        result = await winocr.recognize_pil(img, 'unknown')
    except Exception as e:
        traceback.print_exc()  # Full stack trace
        print(f"\nError Type: {type(e).__name__}")
        print(f"Error Message: {str(e)}")

asyncio.run(main())
```

--------------------------------

### Async Path Data Flow: PIL Image to Text

Source: https://github.com/github30/winocr/blob/main/_autodocs/module-structure.md

Illustrates the data transformation steps from a PIL Image input to the final OcrResult object via the asynchronous 'recognize_pil' function.

```text
PIL.Image
    ↓
[recognize_pil]
    ├─ Convert to RGBA if needed: img.convert('RGBA')
    ├─ Get raw bytes: img.tobytes()
    ├─ Get dimensions: img.width, img.height
    ↓
[recognize_bytes]
    ├─ Create DataWriter
    ├─ Write bytes to buffer
    ├─ Create SoftwareBitmap with BitmapPixelFormat.RGBA8
    ├─ Get OcrEngine for language
    ├─ Call recognize_async(bitmap)
    ↓
IAsyncOperation
    ↓
[User awaits]
    ↓
OcrResult (WinRT object)
    ├─ .text: str
    ├─ .text_angle: int
    └─ .lines: List[OcrLine]
        └─ Each line has .text and .words
            └─ Each word has .text and .bounding_rect
```

--------------------------------

### Access OCR Result Text (Async)

Source: https://github.com/github30/winocr/blob/main/_autodocs/README.md

Demonstrates how to access the full recognized text from an image using the asynchronous recognize_pil function.

```python
# Fragment: must run inside an async function, with `img` a loaded PIL Image
result = await winocr.recognize_pil(img, 'en')
print(result.text)  # Full text
```

--------------------------------

### Convert WinRT Objects to JSON-Serializable Dictionaries

Source: https://github.com/github30/winocr/blob/main/_autodocs/api-reference/utilities.md

Use picklify to recursively serialize Windows Runtime objects into JSON-compatible Python dictionaries. This is necessary because WinRT objects cannot be directly serialized or pickled across processes. Ensure all necessary imports are present.

```python
import winocr
from PIL import Image
import json

async def main():
    img = Image.open('document.jpg')
    result = await winocr.recognize_pil(img, 'en')

    # result is WinRT OcrResult - can't json.dumps it directly
    # result.text works
    print(result.text)

    # Convert to dict for serialization
    serialized = winocr.picklify(result)
    json_str = json.dumps(serialized, ensure_ascii=False)
    return serialized

import asyncio
data = asyncio.run(main())
```
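
--------------------------------

### Sketch: Recursive Object-to-Dict Conversion

The recursive conversion described for picklify (skip private attributes and methods, recurse into nested objects and collections) can be illustrated without Windows. The following is a minimal sketch only, not WinOCR's actual implementation: `to_plain` is a hypothetical stand-in for picklify, and `types.SimpleNamespace` objects are assumed here as substitutes for the WinRT `OcrResult` tree.

```python
import json
from types import SimpleNamespace

def to_plain(obj):
    """Hypothetical stand-in for picklify: convert an object tree to dicts and lists."""
    # Primitives are already JSON-compatible
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    # Collections are converted element by element
    if isinstance(obj, (list, tuple)):
        return [to_plain(item) for item in obj]
    if isinstance(obj, dict):
        return {key: to_plain(value) for key, value in obj.items()}
    # Any other object: keep public, non-callable attributes only
    plain = {}
    for name in dir(obj):
        if name.startswith('_'):
            continue  # skip private attributes
        value = getattr(obj, name)
        if callable(value):
            continue  # skip methods
        plain[name] = to_plain(value)
    return plain

# Fake result tree mirroring the documented output structure (illustrative data)
rect = SimpleNamespace(x=10, y=20, width=50, height=20)
word = SimpleNamespace(text="Word1", bounding_rect=rect)
line = SimpleNamespace(text="Line 1 text", words=[word])
fake_result = SimpleNamespace(text="Full text from image", text_angle=0, lines=[line])

plain = to_plain(fake_result)
print(json.dumps(plain, indent=2, ensure_ascii=False))
```

As with the documented limitations of picklify, a sketch like this would recurse forever on circular references and keeps no type information beyond primitives.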