### Local Development Setup: Dependencies and Run

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Commands to install Python dependencies using Pipenv and to run the FastAPI application with auto-reloading for local development.

```bash
# 2. Install Python dependencies
pipenv sync

# 3. Run with auto-reload
pipenv run uvicorn main:app --reload
# Listening on http://127.0.0.1:8000
```

--------------------------------

### Local Development Setup: Tesseract Installation

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Instructions for installing Tesseract OCR and its development libraries on Ubuntu 20.04 or WSL 2. This step is crucial and must be completed before installing Python dependencies.

```bash
# 1. Install Tesseract (Ubuntu 20.04 / WSL 2); must be done BEFORE pipenv sync
sudo apt install tesseract-ocr libtesseract-dev libleptonica-dev pkg-config
# Verify: tesseract --version (must be >= 4.1.1)
```

--------------------------------

### Install Dependencies for Local Development

Source: https://github.com/devforth/imagetotext.app/blob/master/Readme.md

Installs necessary system dependencies for Tesseract OCR before running `pipenv sync` for local development.

```bash
apt install tesseract-ocr libtesseract-dev libleptonica-dev pkg-config
```

--------------------------------

### Run Local Development Server

Source: https://github.com/devforth/imagetotext.app/blob/master/Readme.md

Starts the development server with auto-reloading enabled using `uvicorn` and `pipenv`.

```bash
pipenv run uvicorn main:app --reload
```

--------------------------------

### Dockerfile Summary

Source: https://context7.com/devforth/imagetotext.app/llms.txt

A summary of the Dockerfile's actions, including base image, system package installation for Tesseract, dependency installation via Pipenv, and copying project files.

```bash
# Dockerfile summary (builds the image):
# FROM python:3.8
# RUN apt install tesseract-ocr libtesseract-dev libleptonica-dev ...
# RUN pipenv sync
# COPY . /code/
```

--------------------------------

### GET /about/

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Serves the informational about page for the ImageToText application.

```APIDOC
## GET /about/

### Description
Returns the informational about page rendered via the Jinja2 `about.html` template.

### Method
GET

### Endpoint
/about/

### Request Example
```bash
curl -i http://127.0.0.1:8000/about/
```

### Response Example
```
HTTP/1.1 200 OK
content-type: text/html; charset=utf-8
```
```

--------------------------------

### Docker Build and Run Commands

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Commands to build and start the Docker Compose service, and a command for running the application in development mode with hot-reloading.

```bash
# Build and start
docker-compose -f deploy/docker-compose.yml up -d --build

# Development mode with hot-reload (from repo root):
docker run --rm -p 8000:8000 \
  $(docker build -q .) \
  /bin/bash -c "cd /code/ && pipenv run uvicorn main:app --reload --host 0.0.0.0 --port 8000"
```

--------------------------------

### GET /

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Serves the main HTML page for the ImageToText web application, providing the user interface for image uploads and text interaction.

```APIDOC
## GET /

### Description
Returns the main application HTML page rendered via the Jinja2 `home.html` template. This is the entry point for the browser-based UI.

### Method
GET

### Endpoint
/

### Request Example
```bash
curl -i http://127.0.0.1:8000/
```

### Response Example
```
HTTP/1.1 200 OK
content-type: text/html; charset=utf-8
...
```
```

--------------------------------

### Local Development: Smoke-test OCR Endpoint

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Example cURL command to send a POST request with JSON data to the local OCR endpoint and verify the response format.

```bash
# 4. Smoke-test the OCR endpoint
curl -X POST http://127.0.0.1:8000/upload/ \
  -H 'Content-Type: application/json' \
  -d @testreq.json
# Returns: { "texts": [ { "text": "...", "left": N, "top": N, "width": N, "height": N }, ... ] }
```

--------------------------------

### Get Image Scale with Original Size Fallback

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Gets the image scale, first checking if natural size is already available, otherwise fetching it.

```javascript
getImgScale(base64) {
  var inst = this;
  if(this.naturalSize) return Promise.resolve(this.calcScale(this.naturalSize));
  return this.getOriginalSize(base64)
    .then(function(sizeOrig) {
      return inst.calcScale(sizeOrig);
    });
}
```

--------------------------------

### Get Text Content from Element

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Retrieves and trims the text content from the 'imgTexts' element.

```javascript
getTexts() {
  return imgTexts.textContent.trim();
}
```

--------------------------------

### Get Image Information from Local Storage

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Retrieves image information (base64, texts, copy mode) from local storage. Includes error handling for parsing JSON.

```javascript
getInfFromStorage() {
  try {
    var base64 = JSON.parse(localStorage.getItem('imgInf'));
  } catch (e) {
    console.log(e);
  }
  return base64;
}
```

--------------------------------

### Get Original Image Size

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Asynchronously retrieves the original width and height of an image from its base64 data.
```javascript
getOriginalSize(base64) {
  return new Promise(function(resolve) {
    var img = new Image();
    img.onload = function(readerEvent) {
      var origSize = {
        width: img.naturalWidth,
        height: img.naturalHeight,
      }
      // Note: inside this non-arrow handler, `this` is the Image element,
      // so naturalSize is stored on `img`, not on the app instance.
      this.naturalSize = origSize;
      resolve(origSize);
    }
    img.src = base64;
  })
}
```

--------------------------------

### Nginx Reverse Proxy Configuration

Source: https://github.com/devforth/imagetotext.app/blob/master/Readme.md

Configures Nginx to proxy requests to the imagetotext application running on port 8314. This setup includes SSL configuration and request header management.

```nginx
# imagetotext.app
server {
  server_name imagetotext.app;
  listen 80;
  listen 443 ssl;
  ssl_certificate /etc/ssl/cert/$CERT_NAME.crt;
  ssl_certificate_key /etc/ssl/cert/$CERT_NAME.key;
  charset utf-8;
  client_max_body_size 75M;

  location / {
    proxy_set_header Authorization "";
    proxy_connect_timeout 400s;
    proxy_read_timeout 400s;
    proxy_set_header X-Forwarded-For $$proxy_add_x_forwarded_for;
    proxy_set_header Host $$http_host;
    proxy_redirect off;
    proxy_buffers 128 8k;
    proxy_buffer_size 16k;
    proxy_pass http://127.0.0.1:8314;
    add_header Last-Modified $$date_gmt;
    etag off;
  }
}
```

--------------------------------

### Initialize Application and Load from Storage

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Initializes the application, attempting to load previous state from local storage. If data exists, it restores the layout; otherwise, it shows the initial layout.
```javascript
window.app = app;
window.copyText = function() {
  navigator.clipboard.writeText(app.getTexts());
};

var imgStorage = app.getInfFromStorage();
if(imgStorage && imgStorage.base64 && imgStorage.texts) {
  app.copyMode = 'mode' in imgStorage && imgStorage.mode;
  app.startUploadLayout(imgStorage.base64, app.copyMode)
    .endUploadLayout(imgStorage.texts);
} else {
  app.showInitLayout();
}
```

--------------------------------

### Serve Home Page

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Use this endpoint to retrieve the main application HTML page. It serves as the entry point for the browser-based UI.

```bash
curl -i http://127.0.0.1:8000/
# HTTP/1.1 200 OK
# content-type: text/html; charset=utf-8
#
# ...
```

--------------------------------

### Serve About Page

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Use this endpoint to retrieve the informational about page. It is rendered via the Jinja2 about.html template.

```bash
curl -i http://127.0.0.1:8000/about/
# HTTP/1.1 200 OK
# content-type: text/html; charset=utf-8
```

--------------------------------

### Create File Input Element

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Creates and returns an HTML input element of type 'file' with the class 'input-upload'.

```javascript
createInputElement() {
  var input = document.createElement('input');
  input.type = 'file';
  input.classList.add('input-upload');
  return input;
}
```

--------------------------------

### Initialize DOM Event Listener

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Sets up event listeners for the entire application once the DOM is fully loaded. It selects key DOM elements to manage the application's state and UI.
```javascript
document.addEventListener("DOMContentLoaded", () => {
  var rootApp = document.querySelector("#app");
  var container = rootApp.querySelector('#app > .container');
  var img = container.querySelector('.img-paste');
  var imgTexts = container.querySelector('.img-with-texts');
  var header = container.querySelector('.container-header');
  var upload = container.querySelector('.upload-wrapper');
  var imgFooter = container.querySelector('.img-with-text-footer');
  var dragging = container.querySelector('.dragging');
  var uploadSocial = container.querySelector('.upload-social-wrapper');
  var aboutLink = container.querySelector('.about-page');
  var duoButton = container.querySelector('.duo-button');
  var message = container.querySelector('.message');
```

--------------------------------

### Local Development: Test Tesseract Directly

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Command to test Tesseract OCR functionality directly from the command line, bypassing the FastAPI application.

```bash
# 5. Test Tesseract directly (bypassing the API)
tesseract /absolute/path/to/image_with_text.png stdout
```

--------------------------------

### Convert File to Base64 and Submit

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Reads a file, converts it to a base64 data URL, and then submits it to the server for processing.

```javascript
function convertOnBase64AndSubmit(file) {
  var reader = new FileReader();
  reader.onload = (ev) => {
    sendImageToServer(ev.target.result);
  }
  reader.readAsDataURL(file);
}
```

--------------------------------

### Toggle Copy Mode and Update UI

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Switches the copy mode between 'copy' and other types, updates local storage, and refreshes the UI buttons.
```javascript
toogleCopyMode(type) {
  this.copyMode = type;
  this.setInfToStorage();
  duoButton.innerHTML = this.copyModeHTML(type);
  this.replaceTextButton(type);
  return this;
}
```

--------------------------------

### Set Image Source and Load Handler

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Sets the image source using base64 data and defines an onload handler to show the image layout.

```javascript
setImgSrc() {
  this.imgData = this.base64.split(',')[1];
  var inst = this;
  img.onload = function() {
    inst.showImageLayout();
  }
  img.src = this.base64;
  return this;
}
```

--------------------------------

### Docker Compose for ImageToText Service

Source: https://github.com/devforth/imagetotext.app/blob/master/Readme.md

Defines a Docker Compose service for the imagetotext application. The `build:` value is missing from this excerpt; set it to the actual path to your repository.

```yaml
services:
  imagetotext:
    network_mode: host
    build:
    command: /bin/bash -c "cd /code/ && pipenv run uvicorn main:app --reload --host 0.0.0.0 --port 8314 --workers 6"
    restart: always
```

--------------------------------

### Test Tesseract OCR Directly

Source: https://github.com/devforth/imagetotext.app/blob/master/Readme.md

Executes the Tesseract OCR command-line tool to extract text from an image file. Replace `/home/ykorolikhin/Pictures/test_text.png` with the absolute path to your image.

```bash
# Replace the path with the absolute path to any image that contains text
tesseract /home/ykorolikhin/Pictures/test_text.png stdout
```

--------------------------------

### Window Reset Click Handler

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Global function to reset the application when the reset button is clicked.
```javascript
window.onResetClick = function() {
  app.reset();
};
```

--------------------------------

### API Request using JSON File

Source: https://github.com/devforth/imagetotext.app/blob/master/Readme.md

Sends a POST request to the `/upload/` endpoint using a JSON file as the request body. This is an alternative method for testing the API.

```bash
curl -d @testreq.json -H 'Content-Type: application/json' http://127.0.0.1:8000/upload/
```

--------------------------------

### Log Data to Console

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Logs provided data to the console, stringifying it if it's an object.

```javascript
logger(data, id) {
  console.log(id, typeof data === 'object' ? JSON.stringify(data) : data);
}
```

--------------------------------

### Layout Management for UI States

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Manages different visual layouts of the application, including showing/hiding image elements, initializing the UI, handling upload states, and restoring the initial view. It also includes logic to toggle copy mode.
```javascript
var layouts = {
  showImageLayout() {
    img.classList.remove('hide');
    imgTexts.classList.remove('hide');
    imgFooter.classList.remove('hide');
    return this;
  },
  hideImageLayout() {
    img.classList.add('hide');
    imgTexts.classList.add('hide');
    imgFooter.classList.add('hide');
    return this;
  },
  hideInitLayout() {
    header.classList.add('hide');
    upload.classList.add('hide');
    uploadSocial.classList.add('hide');
    aboutLink.classList.add('hide');
    return this;
  },
  showInitLayout() {
    header.classList.remove('hide');
    upload.classList.remove('hide');
    uploadSocial.classList.remove('hide');
    aboutLink.classList.remove('hide');
    return this;
  },
  startUploadLayout(base64, copyMode) {
    this.hideInitLayout().toogleCopyMode(copyMode || app.copyMode).setBase64(base64).setImgSrc().setScaleContainer().showLoader();
    return this;
  },
  endUploadLayout(texts) {
    this.setTexts(texts).insertText().hideLoader();
    return this;
  },
  restoreInitLayout() {
    this.clearState().hideImageLayout().hideLoader().setInfToStorage().showInitLayout();
    return this;
  },
};
```

--------------------------------

### Define Type Button HTML

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Returns the HTML for either a copy line button or a Google search button based on the provided type.

```javascript
defineTypeButton(type, height) {
  return type === 'copy' ? this.createCopyLineButtonHTML(height) : this.createGoogleSearchHTML(height);
}
```

--------------------------------

### Toggle Copy/Google Mode

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Switches the application's mode between 'copy' and 'google' search functionality, updating the UI accordingly.

```javascript
window.onToogleMode = function(e) {
  var btn = duoButton.querySelector('div');
  var curType = btn.dataset.type;
  var newType = curType === 'copy' ?
    'google' : 'copy';
  app.toogleCopyMode(newType);
}
```

--------------------------------

### Insert Text into UI

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Inserts the recognized texts into the UI. The excerpt is truncated; the visible part only shows the early exit when no texts are available.

```javascript
insertText() {
  if(!this.texts.length) return thi
```

--------------------------------

### Docker Compose for FastAPI Deployment

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Defines the Docker service for the FastAPI application, using Gunicorn with Uvicorn workers to serve the OCR API on port 8314. Includes network mode and build context.

```yaml
version: '3.5'
services:
  fastapi:
    network_mode: host
    build: ../
    command: >
      /bin/bash -c "cd /code/ && pipenv run gunicorn main:app -k uvicorn.workers.UvicornWorker --workers 1 --threads 2 --bind 0.0.0.0:8314"
    restart: always
```

--------------------------------

### Reset Application State

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Clears the displayed text, resets the image source, and restores the initial layout of the application.

```javascript
imgTexts.innerHTML = '';
img.src = '';
imgTexts.style.transform = '';
this.restoreInitLayout();
```

--------------------------------

### Handle Image Upload

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Triggers a file input click to allow users to upload an image. It then processes the selected file for conversion and submission.
```javascript
window.onUploadClick = function() {
  var input = app.removeInputElement().createInputElement();
  input.click();
  input.addEventListener('change', function(e) {
    var { files } = input;
    var file = Array.from(files)[0];
    convertOnBase64AndSubmit(file);
  }, {once: true});
  container.append(input);
}
```

--------------------------------

### FastAPI OCR Upload Route Handler

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Server-side implementation for OCR endpoint. Decodes Base64 images, uses Tesseract for text recognition, and returns detected text with bounding boxes. Handles very large images by increasing Pillow's MAX_IMAGE_PIXELS.

```python
from fastapi import FastAPI
from pydantic import BaseModel
from typing import List
import base64, io
from PIL import Image
from tesserocr import PyTessBaseAPI, RIL, iterate_level, OEM

app = FastAPI()
Image.MAX_IMAGE_PIXELS = 1_000_000_000  # allow very large images

class ImageModel(BaseModel):
    base64: str

class ImageItemResp(BaseModel):
    text: str
    width: int
    height: int
    left: int
    top: int

class ImageModalResp(BaseModel):
    texts: List[ImageItemResp]

@app.post('/upload/', response_model=ImageModalResp)
def upload(request: ImageModel):
    # Decode Base64 → bytes → PIL Image
    msg = base64.b64decode(request.base64)
    buf = io.BytesIO(msg)
    image = Image.open(buf)
    text_list = []
    with PyTessBaseAPI(oem=OEM.LSTM_ONLY) as api:
        api.SetImage(image)
        api.Recognize()
        api.SetVariable("save_blob_choices", "T")
        ri = api.GetIterator()
        level = RIL.TEXTLINE
        for r in iterate_level(ri, level):
            symbol = r.GetUTF8Text(level)
            bbox = r.BoundingBoxInternal(level)  # (left, top, right, bottom)
            text_list.append({
                "text": symbol,
                "left": bbox[0],
                "top": bbox[1],
                "width": bbox[2] - bbox[0],
                "height": bbox[3] - bbox[1],
            })
    return {"texts": text_list}
```

--------------------------------

### Copy Link to Clipboard

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Copies the current document URL to the clipboard and
provides user feedback.

```javascript
window.onCopyLink = function() {
  navigator.clipboard.writeText(document.URL);
  app.showMessage('Link copied to clipboard').hideMessageWithInterval();
};
```

--------------------------------

### Drag and Drop Event Listeners

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Sets up event listeners for drag and drop functionality to allow users to upload images by dragging them onto the application area.

```javascript
window.addEventListener('dragenter', function(e) {
  e.preventDefault();
  app.showDragging();
}, false);

window.addEventListener('dragleave', function(e) {
  e.preventDefault();
  if (e.fromElement === null) {
    app.hideDragging();
  }
}, false);

window.addEventListener("dragover",function(e) {
  e.preventDefault();
}, false);

window.addEventListener("drop", function(e) {
  e.preventDefault();
  e.stopPropagation();
  app.hideDragging();
  var files = Array.from(e.dataTransfer.files);
  convertOnBase64AndSubmit(files[0]);
}, false);
```

--------------------------------

### HTML Element Generation

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Contains utility functions for creating HTML elements dynamically, such as a loading spinner, buttons for copying text lines or performing Google searches, and the HTML structure for copy mode toggling.

```javascript
var html = {
  getLoaderElement() {
    var div = document.createElement('div');
    div.classList.add('lds-ring', 'loader', 'loader-texts');
    div.innerHTML = '
'; return div; }, createCopyLineButtonHTML(height) { //var id = Date.now() + '-' + height + '-' + Math.random() * 100; return ` `; }, createGoogleSearchHTML(height) { return ` `; }, copyModeHTML(type) { return type === 'copy' ? `
Copy
Google` : `Copy
Google
` },
createTextHTML({text, top, left, width,
```

--------------------------------

### Calculate Text Size Parameters

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Computes various height-related parameters for texts, including max, min, average, and popular heights, along with their rates.

```javascript
calcTextSizeParams() {
  var maxH = 0;
  var minH = Infinity;
  var sum = 0;
  this.texts.forEach(function(text) {
    if(text.height > maxH) maxH = text.height;
    if(text.height < minH) minH = text.height;
    sum += text.height;
  });
  sum -= maxH + minH;
  // Note: evaluates as (sum / this.texts.length) - 2 because of operator precedence
  var averageH = sum / this.texts.length - 2;
  // Note: sort() reorders this.texts in place; the most frequent height ends up last
  var popularH = this.texts.sort((a,b) =>
    this.texts.filter(v => v.height=== a.height).length
    - this.texts.filter(v => v.height === b.height).length
  )
  popularH = popularH[popularH.length - 1].height;
  var maxRate = maxH / popularH;
  var minRate = popularH / minH;
  return {
    maxH,
    averageH,
    popularH,
    maxRate,
    minRate,
    minH,
  }
}
```

--------------------------------

### Window Resize Handler

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Adjusts the application's layout and scaling when the browser window is resized.

```javascript
window.onresize = function(e) {
  app.setScaleContainer();
}
```

--------------------------------

### Set Image Information to Local Storage

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Saves image-related information (base64, texts, copy mode) to local storage. Includes error handling for storage operations.

```javascript
setInfToStorage() {
  try {
    localStorage.setItem('imgInf', JSON.stringify({
      base64: this.base64,
      texts: this.texts,
      mode: this.copyMode,
    }));
  } catch (e) {
    console.log(e);
  }
  return this;
}
```

--------------------------------

### Set Base64 Image Data in Application State

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Updates the 'base64' property in the application's state.
```javascript
setBase64(base64) {
  this.base64 = base64;
  return this;
}
```

--------------------------------

### API Request with JSON Payload

Source: https://github.com/devforth/imagetotext.app/blob/master/Readme.md

Sends a POST request to the `/upload/` endpoint with a JSON payload containing a base64 encoded string. This is useful for testing the API directly.

```bash
curl -d '{"base64":"baeldung"}' -H 'Content-Type: application/json' http://127.0.0.1:8000/upload/
```

--------------------------------

### Call REST API for Text Extraction

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Handles the asynchronous fetching of text from an image using a POST request to the '/upload/' endpoint. It expects a JSON response and rejects the promise if the response is not OK or the request itself fails.

```javascript
function callREST(img) {
  return new Promise(function(resolve, reject) {
    fetch('/upload/', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(img),
    }).then(response => {
      if(!response.ok) {
        reject('no text');
        return;
      }
      // Return the inner promise so JSON parse errors reach the catch below
      return response.json().then(json => {
        resolve(json)
      });
    }).catch(reject); // also reject on network or parse errors
  })
}
```

--------------------------------

### POST /upload/ - Image OCR Endpoint

Source: https://context7.com/devforth/imagetotext.app/llms.txt

This endpoint accepts a Base64 encoded image, performs OCR using Tesseract, and returns detected text along with bounding box information.

```APIDOC
## POST /upload/

### Description
This endpoint processes an image to extract text using Tesseract OCR. It decodes a Base64 encoded image, performs OCR, and returns a list of detected text lines with their bounding box coordinates.

### Method
POST

### Endpoint
/upload/

### Parameters
#### Request Body
- **base64** (string) - Required - Base64 encoded string of the image.

### Request Example
```json
{
  "base64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNkYAAAAAYAAjCB0C8AAAAASUVORK5CYII="
}
```

### Response
#### Success Response (200)
- **texts** (array) - A list of detected text items.
  - **text** (string) - The extracted text.
  - **left** (integer) - The x-coordinate of the bounding box.
  - **top** (integer) - The y-coordinate of the bounding box.
  - **width** (integer) - The width of the bounding box.
  - **height** (integer) - The height of the bounding box.

#### Response Example
```json
{
  "texts": [
    {
      "text": "Sample Text",
      "left": 10,
      "top": 20,
      "width": 100,
      "height": 30
    }
  ]
}
```
```

--------------------------------

### Send Image Data to Server

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Initiates the upload process by sending base64 encoded image data to the server via a REST call. Handles success and error responses.

```javascript
function sendImageToServer(base64) {
  app.startUploadLayout(base64);
  try {
    callREST({base64: app.imgData})
      .then(function(resp) {
        app.endUploadLayout(resp.texts);
      })
      .catch(() => {
        app.hideLoader();
      })
  } catch (e) {
    app.hideLoader();
  }
}
```

--------------------------------

### Frontend JavaScript API Call

Source: https://context7.com/devforth/imagetotext.app/llms.txt

JavaScript function to send an image to the /upload/ endpoint and handle the response.

```APIDOC
## Frontend JavaScript API Call

### Description
This JavaScript function `callREST` takes a Base64 encoded image object and sends it to the `/upload/` endpoint using a POST request. It returns a Promise that resolves with the JSON response from the server or rejects if an error occurs.

### Usage
```javascript
// img is an object like { base64: "..." }
callREST(img)
  .then(function(resp) {
    // resp.texts = [{ text, left, top, width, height }, ...]
    console.log(resp.texts);
  })
  .catch(function(error) {
    console.error("Error processing image:", error);
  });
```

### Function Definition
```javascript
function callREST(img) {
  return new Promise(function(resolve, reject) {
    fetch('/upload/', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(img),
    }).then(response => {
      if (!response.ok) { reject('no text'); return; }
      // Return the inner promise so JSON parse errors reach the catch below
      return response.json().then(json => resolve(json));
    }).catch(reject); // also reject on network or parse errors
  });
}
```
```

--------------------------------

### Format Text Line HTML

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Generates HTML for a formatted text line, including positioning, styling, and a button for copy mode.

```javascript
return

${text}

${inst.defineTypeButton(inst.copyMode, height)}
```

--------------------------------

### Handle Paste Event

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Processes pasted content, specifically looking for files and converting them to base64 for submission.

```javascript
document.onpaste = function(e) {
  app.reset();
  // Use the handler's own event argument rather than the global `event`
  var { items } = e.clipboardData || e.originalEvent.clipboardData;
  Array.from(items).forEach(function(item) {
    if (item.kind === 'file') {
      var blob = item.getAsFile();
      convertOnBase64AndSubmit(blob);
    }
  });
}
```

--------------------------------

### Perform Google Search

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Opens a new window to perform a Google search using the text content of a preceding sibling element.

```javascript
window.onGoogleSearch = function(e) {
  var text = e.previousElementSibling.textContent.trim();
  // Encode the query so spaces, '&', '#', etc. survive in the URL
  var newW = window.open('https://www.google.com/search?q=' + encodeURIComponent(text));
}
```

--------------------------------

### Extract Text from Image (POST /upload/)

Source: https://context7.com/devforth/imagetotext.app/llms.txt

This is the core OCR endpoint. It accepts a Base64-encoded image and returns detected text lines with bounding boxes. Ensure the request body is JSON with a 'base64' field.
```json
{ "base64": "" }
```

```json
{
  "texts": [
    { "text": "Hello World\n", "left": 42, "top": 100, "width": 210, "height": 28 }
  ]
}
```

```bash
curl -X POST http://127.0.0.1:8000/upload/ \
  -H 'Content-Type: application/json' \
  -d '{"base64":""}'
# Expected response:
# {
#   "texts": [
#     {"text": "Sample text line\n", "left": 10, "top": 5, "width": 300, "height": 22},
#     {"text": "Another line\n", "left": 10, "top": 32, "width": 200, "height": 22}
#   ]
# }
```

```bash
# testreq.json contains {"base64": ""}
curl -X POST http://127.0.0.1:8000/upload/ \
  -H 'Content-Type: application/json' \
  -d @testreq.json
```

```python
import base64, requests

# Load any image and encode it
with open("screenshot.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "http://127.0.0.1:8000/upload/",
    json={"base64": img_b64},
    timeout=60,
)
response.raise_for_status()
data = response.json()

for item in data["texts"]:
    print(f"[{item['top']},{item['left']}] {item['text'].strip()!r} "
          f"({item['width']}x{item['height']}px)")

# Example output:
# [5,10] 'Invoice #12345' (280x24px)
# [35,10] 'Date: 2024-01-15' (190x22px)
# [65,10] 'Total: $499.00' (175x22px)
```

--------------------------------

### POST /upload/

Source: https://context7.com/devforth/imagetotext.app/llms.txt

Extracts text from an image using OCR. Accepts a Base64-encoded image and returns structured text data with bounding boxes.

```APIDOC
## POST /upload/

### Description
Accepts a JSON body with a single `base64` field containing a Base64-encoded image (without the `data:image/...;base64,` prefix; just the raw Base64 string). Decodes the image, runs Tesseract LSTM OCR, and returns a list of detected text lines with their bounding boxes.

### Method
POST

### Endpoint
/upload/

### Parameters
#### Request Body
- **base64** (string) - Required - Base64-encoded image bytes.

### Request Example
#### JSON Body
```json
{ "base64": "" }
```

#### curl example using an inline Base64 string:
```bash
curl -X POST http://127.0.0.1:8000/upload/ \
  -H 'Content-Type: application/json' \
  -d '{"base64":""}'
```

#### curl example using the bundled `testreq.json` test file:
```bash
# testreq.json contains {"base64": ""}
curl -X POST http://127.0.0.1:8000/upload/ \
  -H 'Content-Type: application/json' \
  -d @testreq.json
```

### Response
#### Success Response (200)
- **texts** (array) - A list of detected text lines.
  - **text** (string) - The extracted text content of the line.
  - **left** (integer) - The x-coordinate of the bounding box.
  - **top** (integer) - The y-coordinate of the bounding box.
  - **width** (integer) - The width of the bounding box.
  - **height** (integer) - The height of the bounding box.

#### Response Example
```json
{
  "texts": [
    {"text": "Hello World\n", "left": 42, "top": 100, "width": 210, "height": 28}
  ]
}
```

#### Python client example:
```python
import base64, requests

# Load any image and encode it
with open("screenshot.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "http://127.0.0.1:8000/upload/",
    json={"base64": img_b64},
    timeout=60,
)
response.raise_for_status()
data = response.json()

for item in data["texts"]:
    print(f"[{item['top']},{item['left']}] {item['text'].strip()!r} "
          f"({item['width']}x{item['height']}px)")

# Example output:
# [5,10] 'Invoice #12345' (280x24px)
# [35,10] 'Date: 2024-01-15' (190x22px)
# [65,10] 'Total: $499.00' (175x22px)
```
```

--------------------------------

### Set Texts in Application State

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Updates the 'texts' property in the application's state.
```javascript
setTexts(texts) {
  this.texts = texts;
  return this;
}
```

--------------------------------

### UI Element Visibility Management

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Provides chainable methods that show or hide the loader, the dragging indicator, the text overlay on the image, and the message area.

```javascript
var hideAndShowElm = {
  showLoader() {
    // Remove any existing loader before inserting a fresh one
    this.hideLoader();
    this.isLoading = true;
    var imgWrap = container.querySelector('.content');
    imgWrap.insertAdjacentElement('afterbegin', this.getLoaderElement());
    return this;
  },
  hideLoader() {
    this.isLoading = false;
    var loader = container.querySelector('.loader');
    if(loader) loader.remove();
    return this;
  },
  showDragging() {
    dragging.classList.remove('hide');
    return this;
  },
  hideDragging() {
    dragging.classList.add('hide');
    return this;
  },
  showTextOnImg() {
    imgTexts.classList.remove('hide');
    return this;
  },
  hideTextOnImg() {
    imgTexts.classList.add('hide');
    return this;
  },
  showMessage(text) {
    message.classList.remove('hide');
    message.querySelector('.message-text').innerHTML = text.trim();
    return this;
  },
  hideMessage(text) {
    message.classList.add('hide');
    return this;
  },
};
```

--------------------------------

### Clear Application State

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Resets every non-function property of the application instance to `null`, except the properties listed in `includeProp` (by default only `root`), which are preserved.
```javascript
clearState(includeProp = ['root']) {
  var inst = this;
  Object.keys(inst).forEach(function(key) {
    // Keep methods and any property named in includeProp; null out the rest
    if(typeof inst[key] !== 'function' && !includeProp.includes(key)) {
      inst[key] = null;
    }
  });
  return this;
}
```

--------------------------------

### Replace Text Buttons in UI

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Iterates over every `.text.notformated` line and replaces its button with a new one of the given type (default `copy`), keeping the original button's height.

```javascript
replaceTextButton(type = 'copy') {
  var inst = this;
  imgTexts.querySelectorAll('.text.notformated').forEach(function(line) {
    var button = line.querySelector('.btn.text-btn');
    // Read the old button's height so the replacement matches it
    var height = parseInt(button.style.height, 10);
    button.remove();
    var newBtn = inst.defineTypeButton(type, height);
    line.insertAdjacentHTML('beforeend', newBtn);
  });
  return this;
}
```

--------------------------------

### Copy Single Line of Text

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Copies the trimmed text content of the previous sibling of the element passed as `e` to the clipboard, then shows a confirmation message that hides itself after a delay.

```javascript
window.onCopyLineClick = function(e) {
  var text = e.previousElementSibling.textContent.trim();
  navigator.clipboard.writeText(text);
  app.showMessage('Text copied to clipboard').hideMessageWithInterval();
}
```

--------------------------------

### Set Scale for Image Container

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Applies a CSS `scale()` transform of `1 / scale` to the `imgTexts` element, where `scale` is the calculated image scale. Nothing is applied when the scale is below 1.
```javascript
setScaleContainer() {
  this.getImgScale(this.base64)
    .then(function(scale) {
      // Only shrink the overlay when the image is at least as wide as the container
      if(scale < 1) return;
      imgTexts.style.transform = `scale(${(1 / scale).toFixed(6)})`;
    });
  return this;
}
```

--------------------------------

### Browser-Side API Call for OCR

Source: https://context7.com/devforth/imagetotext.app/llms.txt

JavaScript function that sends Base64 image data to the FastAPI backend and handles the JSON response. It returns a Promise that resolves with the parsed JSON, or rejects when the response status is not OK or the request itself fails.

```javascript
// Called inside the browser after reading a file or paste event
function callREST(img) {
  return new Promise(function(resolve, reject) {
    fetch('/upload/', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(img), // img = { base64: "..." }
    })
      .then(response => {
        if (!response.ok) {
          reject('no text');
          return;
        }
        return response.json().then(json => resolve(json));
      })
      // Also reject on network or JSON parsing errors so the Promise never hangs
      .catch(reject);
  });
}

// Usage: triggered after a file drop or paste
callREST({ base64: app.imgData })
  .then(function(resp) {
    // resp.texts = [{ text, left, top, width, height }, ...]
    app.endUploadLayout(resp.texts);
  })
  .catch(() => app.hideLoader());
```

--------------------------------

### Image Text Processing Logic

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Renders the OCR text lines over the image. A line whose height is flagged as an anomalous maximum or minimum has its height reset to `textParams.popularH`. The overlay is hidden while the lines are inserted, then shown again and the font sizes are adjusted.
```javascript
// Excerpt: body of a method on the application object
// Nothing to render when the only result is an empty line
if(this.texts.length === 1 && !this.texts[0].text.trim()) return this;
var inst = this;
this.setInfToStorage();
var textParams = this.calcTextSizeParams();
this.hideTextOnImg();
this.texts.forEach(function(text) {
  // Replace outlier heights with the reference height
  if(inst.chechIsAnomalyRate('max', text.height, textParams)) text.height = textParams.popularH;
  if(inst.chechIsAnomalyRate('min', text.height, textParams)) text.height = textParams.popularH;
  var html = inst.createTextHTML(text);
  imgTexts.insertAdjacentHTML('beforeend', html);
});
// Defer until the inserted lines are in the DOM
setTimeout(function() {
  inst.showTextOnImg();
  inst.adjustFontSize('.text.notformated', 'p');
}, 0);
return this;
```

--------------------------------

### Calculate Container Width

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Reads the computed `max-width` style of the container element, parses it to an integer, stores it in `this.containerWidth`, and returns it.

```javascript
calcContainerWidth() {
  this.containerWidth = window.parseInt(window.getComputedStyle(container)['max-width'], 10);
  return this.containerWidth;
}
```

--------------------------------

### Calculate Image Scale

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Computes the image scale as `size.width / containerWidth`, where `containerWidth` is the container's maximum width, stores it in `this.scale`, and returns it.

```javascript
calcScale(size) {
  var containerWidth = this.calcContainerWidth();
  var scale = size.width / containerWidth;
  this.scale = scale;
  return scale;
}
```

--------------------------------

### Check for Anomaly Rate

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Returns `true` when the rate for the given type (`max` or `min`) is at least `this.anomalyRate` and the text's height equals that type's extreme height (`maxH` or `minH`).
```javascript
chechIsAnomalyRate(type, textHeight, textParams) {
  // type is 'max' or 'min', selecting maxRate/maxH or minRate/minH
  return Boolean(textParams[`${type}Rate`] >= this.anomalyRate && textParams[`${type}H`] === textHeight);
}
```

--------------------------------

### Remove File Input Element

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Removes the file input element with the class `input-upload` if it exists within the container.

```javascript
removeInputElement() {
  var inputExist = container.querySelector('.input-upload');
  if(inputExist) inputExist.remove();
  return this;
}
```

--------------------------------

### Hide Message with Interval

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Hides the message after a delay using `setTimeout`. The delay `t` is in milliseconds and defaults to 2000.

```javascript
hideMessageWithInterval(t = 2000) {
  var inst = this;
  setTimeout(function() {
    inst.hideMessage();
  }, t);
  return this;
}
```

--------------------------------

### Adjust Font Size Based on Parent Width

Source: https://github.com/devforth/imagetotext.app/blob/master/templates/home.html

Scales each child's font size by the ratio of its parent's width to its own width so the text fills the parent. If the scaled size exceeds the parent's height by more than 50%, the font size is set to the parent's height instead.

```javascript
adjustFontSize(parentBlock, childBlock) {
  document.querySelectorAll(parentBlock).forEach(function(parent) {
    var needWidth = parent.clientWidth;
    var parentH = parent.clientHeight;
    var child = parent.querySelector(childBlock);
    var currentWidth = child.clientWidth;
    // Inline font size in px, e.g. "16px" -> 16
    var fs = parseFloat(child.style.fontSize);
    var rate = needWidth / currentWidth;
    var newFs = fs * rate;
    // Percentage by which the scaled font size exceeds the parent height
    var fsBigH = (newFs * 100 / parentH) - 100;
    child.style.fontSize = fsBigH > 50 ? parentH + 'px' : newFs + 'px';
  });
}
```
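--------------------------------

### Sketch: Font-Size Fitting Rule as a Pure Function

The arithmetic inside `adjustFontSize` can be easier to follow, and to test outside a browser, when it is separated from the DOM reads. The following is a minimal sketch under that assumption: `fitFontSize` and its parameter names are hypothetical and do not appear in `home.html`. It takes the four numbers that `adjustFontSize` reads from the elements and returns the font size in pixels.

```javascript
// Hypothetical helper, not part of home.html: the sizing rule from
// adjustFontSize expressed as a pure function of four numbers.
function fitFontSize(parentWidth, parentHeight, childWidth, currentFontSize) {
  // Scale the font by how much wider (or narrower) the parent is than the child.
  const rate = parentWidth / childWidth;
  const newFs = currentFontSize * rate;
  // Percentage by which the scaled font size exceeds the parent's height.
  const overshootPct = (newFs * 100 / parentHeight) - 100;
  // More than 50% over the parent height: fall back to the parent height.
  return overshootPct > 50 ? parentHeight : newFs;
}

console.log(fitFontSize(200, 20, 100, 10)); // 20: scaled size equals the parent height
console.log(fitFontSize(300, 20, 100, 12)); // 20: 36px would overshoot by 80%, so it is capped
console.log(fitFontSize(100, 20, 200, 16)); // 8: child is wider than the parent, so the font shrinks
```

The cap is strict: a scaled size exactly 50% above the parent height (for example 30px in a 20px-high parent) is kept, and only larger values are replaced by the parent height.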