### Install python-poppler from Git Source

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/installation.md

This method involves cloning the python-poppler repository and then installing it using pip. This process compiles the C++ bindings and installs the package. Ensure all prerequisites, including the poppler library, are installed prior to running these commands.

```bash
git clone https://github.com/cbrunet/python-poppler.git
cd python-poppler
pip install --use-pep517 .
```

--------------------------------

### Install python-poppler from PyPI

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/installation.md

This command installs the python-poppler package using pip. Ensure all system requirements, including the correct poppler library version, are met beforehand. It's recommended to perform this installation within a Python virtual environment.

```bash
pip install --use-pep517 python-poppler
```

--------------------------------

### Verify Poppler Version in Python

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/installation.md

A simple Python snippet to import the poppler library and print its version. This is used to verify that the installation was successful and that the correct version of the Poppler library is being used by the python-poppler bindings.

```python
import poppler
print(poppler.version())
```
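
Several snippets in this document quote a minimum Poppler version for a feature. The sketch below collects those numbers in one table and compares them against a version tuple. It assumes that `poppler.version()` returns a tuple of integers such as `(0, 89, 0)`; the `MIN_VERSIONS` table and the `supports` helper are hypothetical names and are not part of python-poppler.

```python
# Minimum Poppler versions quoted elsewhere in this document.
MIN_VERSIONS = {
    "logging_control": (0, 30, 0),
    "metadata_modification": (0, 46, 0),
    "text_boxes": (0, 63, 0),
    "image_format": (0, 65, 0),
    "destinations": (0, 74, 0),
    "font_info": (0, 89, 0),
}


def supports(feature, version):
    """Return True if `version` (a tuple of ints) meets the minimum for `feature`."""
    return tuple(version) >= MIN_VERSIONS[feature]


# With python-poppler installed, and assuming version() returns a tuple:
#   import poppler
#   print(supports("font_info", poppler.version()))
print(supports("font_info", (0, 88, 0)))   # False
print(supports("text_boxes", (0, 88, 0)))  # True
```

Tuples compare element by element, so no string parsing is needed as long as the version really is a tuple of integers.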

--------------------------------

### Compile Poppler from Source and Set Environment Variables

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/installation.md

Instructions for compiling a custom version of the Poppler library from its source code and setting environment variables to ensure python-poppler can find the compiled library. This is useful if a more recent version of Poppler is required than what is available in system repositories. It includes build configuration with CMake and setting PKG_CONFIG_PATH and LD_LIBRARY_PATH.

```bash
git clone https://gitlab.freedesktop.org/poppler/poppler.git
cd poppler
git checkout poppler-0.89.0
mkdir build
cd build
cmake \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX:PATH=/usr/local \
    -DENABLE_UNSTABLE_API_ABI_HEADERS=ON \
    -DBUILD_GTK_TESTS=OFF \
    -DBUILD_QT5_TESTS=OFF \
    -DBUILD_CPP_TESTS=OFF \
    -DENABLE_CPP=ON \
    -DENABLE_GLIB=OFF \
    -DENABLE_GOBJECT_INTROSPECTION=OFF \
    -DENABLE_GTK_DOC=OFF \
    -DENABLE_QT5=OFF \
    -DBUILD_SHARED_LIBS=ON \
    ..
sudo make install
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
```

--------------------------------

### Load and Render PDF Page in Python

Source: https://github.com/cbrunet/python-poppler/blob/master/README.md

Demonstrates loading a PDF document, accessing a specific page, extracting its text, and rendering the page into an image format using python-poppler. Requires the 'poppler' library to be installed.

```python
from poppler import load_from_file, PageRenderer

pdf_document = load_from_file("sample.pdf")
page_1 = pdf_document.create_page(0)
page_1_text = page_1.text()

renderer = PageRenderer()
image = renderer.render_page(page_1)
image_data = image.data
```

--------------------------------

### Get Font Information from PDF Page (Python)

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/usage.md

Demonstrates two ways to retrieve font information. The first snippet iterates over the fonts used on each page with `create_font_iterator`. The second reads the font name and size of an individual text box, which requires passing the `text_list_include_font` option to the `text_list` method.

```python
font_iterator = document.create_font_iterator()
for page, fonts in font_iterator:
    print(f"Fonts for page {page}")
    for font in fonts:
        print(f"- {font.name}")
```

```python
boxes = pdf_page.text_list(pdf_page.TextListOption.text_list_include_font)
box = boxes[0]

assert box.has_font_info
print(box.get_font_name())
print(box.get_font_size())
```

--------------------------------

### Get Named Destinations Map in Python

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Retrieves a map of named destinations from a PDF document. It iterates through each destination, printing its name, type, page number, coordinates, zoom level, and change status flags. This functionality is useful for navigating or understanding the structure of a PDF document.

```python
destinations = pdf_document.create_destination_map()

for name, destination in destinations.items():
    print(f"\nDestination: {name}")
    print(f"Type: {destination.type}")
    print(f"Page number: {destination.page_number}")

    # Destination coordinates and zoom
    print(f"Left: {destination.left}")
    print(f"Top: {destination.top}")
    print(f"Right: {destination.right}")
    print(f"Bottom: {destination.bottom}")
    print(f"Zoom: {destination.zoom}")

    # Check if destination values are set
    print(f"Is change left: {destination.is_change_left}")
    print(f"Is change top: {destination.is_change_top}")
    print(f"Is change zoom: {destination.is_change_zoom}")
```

--------------------------------

### Convert PDF Image to QImage (Python)

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/usage.md

Demonstrates converting a PDF page image into a Qt `QImage` object. This facilitates integration with Qt applications. The conversion requires a mapping between Poppler's `ImageFormat` and Qt's `QtGui.QImage.Format`.

```python
from poppler import ImageFormat

# Assuming 'image' is an Image object obtained from rendering a PDF page
# and 'QtGui' is imported from PyQt5 or PySide2

P2QFormat = {
    ImageFormat.invalid: QtGui.QImage.Format_Invalid,
    ImageFormat.argb32: QtGui.QImage.Format_ARGB32,
    ImageFormat.bgr24: QtGui.QImage.Format_BGR888,
    ImageFormat.gray8: QtGui.QImage.Format_Grayscale8,
    ImageFormat.mono: QtGui.QImage.Format_Mono,
    ImageFormat.rgb24: QtGui.QImage.Format_RGB888,
}

qimg = QtGui.QImage(image.data, image.width, image.height,
                    image.bytes_per_row,
                    P2QFormat[image.format])
```

--------------------------------

### Render PDF Page to Image (Python)

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/usage.md

Illustrates the process of converting a PDF page into an image format. It involves creating a `PageRenderer` object and then using its `render_page` method to obtain an `Image` object.

```python
from poppler import PageRenderer

# Assuming 'document' is a loaded Document object
page_number = 0  # Example page number (0-indexed)
pdf_page = document.create_page(page_number)

# Create a PageRenderer object
renderer = PageRenderer()

# Render the page to an Image object
image = renderer.render_page(pdf_page)
```

--------------------------------

### Load PDF Documents in Python

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Demonstrates loading PDF documents from files, byte data, or file-like objects using python-poppler. It also shows how to handle password-protected PDFs and access basic document properties like page count and encryption status. Dependencies include the poppler library.

```python
from poppler import load_from_file, load_from_data, load
from pathlib import Path

# Load from file path (string or Path object)
pdf_document = load_from_file("document.pdf")

# Load password-protected document
pdf_document = load_from_file("secure.pdf", owner_password="owner", user_password="user")

# Load from bytes
with open("document.pdf", "rb") as f:
    file_data = f.read()
pdf_document = load_from_data(file_data)

# Load using generic function (accepts str, Path, bytes, or file-like objects)
pdf_document = load("document.pdf", owner_password="owner")
pdf_document = load(Path("document.pdf"))

with open("document.pdf", "rb") as f:
    pdf_document = load(f)

# Check document properties
print(f"Pages: {pdf_document.pages}")
print(f"Encrypted: {pdf_document.is_encrypted()}")
print(f"Locked: {pdf_document.is_locked()}")
print(f"PDF Version: {pdf_document.pdf_version}")  # Returns tuple like (1, 5)

# Unlock a locked document
# (unlock() is assumed to return the new locking status, i.e. True if still locked)
if pdf_document.is_locked():
    still_locked = pdf_document.unlock("owner_pass", "user_pass")
    print(f"Successfully unlocked: {not still_locked}")
```

--------------------------------

### Manage PDF Document Destinations and Links

Source: https://context7.com/cbrunet/python-poppler/llms.txt

This Python snippet loads a PDF document as the starting point for working with its named destinations and document links. It requires the `poppler` library version 0.74.0 or later. The snippet stops after loading the document; the extraction logic for destinations and links is not included.

```python
from poppler import load_from_file, DestinationType

pdf_document = load_from_file("document.pdf")
```

--------------------------------

### Convert PDF Image to PIL Image (Python)

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/usage.md

Shows how to convert a PDF page image into a PIL (Pillow) `Image` object. This is useful for further image manipulation with the Pillow library. Note that a copy of the image data is unavoidable in this conversion.

```python
from PIL import Image, ImageTk

# Assuming 'image' is an Image object obtained from rendering a PDF page

pil_image = Image.frombytes(
    "RGBA",
    (image.width, image.height),
    image.data,
    "raw",
    str(image.format),
)
# tk_image = ImageTk.PhotoImage(pil_image) # Example for Tkinter
```
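
`Image.frombytes` with the `"raw"` decoder, as called above, treats the buffer as tightly packed rows. The source does not say whether `bytes_per_row` can ever exceed `width * bytes_per_pixel`, so the following is only a defensive sketch in plain Python: `strip_row_padding` is a hypothetical helper that drops any per-row padding before the bytes are handed to another library. The default of 4 bytes per pixel assumes a 32-bit format such as `argb32`.

```python
def strip_row_padding(data, width, height, bytes_per_row, bytes_per_pixel=4):
    """Return tightly packed pixel rows from a buffer whose rows may be padded."""
    row_len = width * bytes_per_pixel
    if bytes_per_row < row_len:
        raise ValueError("bytes_per_row is smaller than one row of pixels")
    # The last row does not need trailing padding to be present.
    needed = bytes_per_row * (height - 1) + row_len if height > 0 else 0
    if len(data) < needed:
        raise ValueError("buffer is too short for the given dimensions")
    if bytes_per_row == row_len:
        # Already tightly packed: nothing to strip.
        return bytes(data[:row_len * height])
    return b"".join(
        bytes(data[r * bytes_per_row:r * bytes_per_row + row_len])
        for r in range(height)
    )


# Synthetic 1x2 image with 2 padding bytes ("xx") at the end of each row.
padded = b"ABCDxxEFGHxx"
packed = strip_row_padding(padded, width=1, height=2, bytes_per_row=6)
print(packed)  # b'ABCDEFGH'
```

With a real rendered image the call would look like `strip_row_padding(image.data, image.width, image.height, image.bytes_per_row)`.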

--------------------------------

### Create and Use Rectangles in Python

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Illustrates the creation and manipulation of Rectangle objects for defining regions or bounding boxes within a PDF page. Rectangles can be created with specific coordinates and dimensions, and their properties can be accessed. They are also used to extract text from specific areas of a page.

```python
from poppler import Rectangle

# Create rectangle (x, y, width, height)
rect = Rectangle(100.0, 150.0, 200.0, 300.0)

# Access coordinates
print(f"X: {rect.x}")
print(f"Y: {rect.y}")
print(f"Width: {rect.width}")
print(f"Height: {rect.height}")

# Get as tuple
coords = rect.as_tuple()  # (x, y, width, height)

# Create empty rectangle
empty_rect = Rectangle(0.0, 0.0, 0.0, 0.0)

# Rectangles are used for text extraction regions
# (assuming 'pdf_document' is a loaded Document object)
page = pdf_document.create_page(0)
region_text = page.text(rect=rect)
```
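
Bounding boxes are often known by their corners rather than by origin and size. As a small sketch, the hypothetical helper `corners_to_xywh` below converts `(left, top, right, bottom)` into the `(x, y, width, height)` order that the `Rectangle` constructor above expects. It is plain Python and does not depend on python-poppler.

```python
def corners_to_xywh(left, top, right, bottom):
    """Convert corner coordinates to (x, y, width, height) order."""
    if right < left or bottom < top:
        raise ValueError("right/bottom must not be smaller than left/top")
    return (left, top, right - left, bottom - top)


x, y, w, h = corners_to_xywh(100.0, 150.0, 300.0, 450.0)
print((x, y, w, h))  # (100.0, 150.0, 200.0, 300.0)

# With python-poppler installed:
#   rect = Rectangle(x, y, w, h)
```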

--------------------------------

### Render PDF Page to Image

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Renders a PDF page to raw image data. Supports customizable resolution, rendering hints (like antialiasing), paper color, image format, rotation, and specific region rendering. The output can be accessed as raw bytes or converted to NumPy arrays or PIL Images.

```python
from poppler import load_from_file, PageRenderer, RenderHint, ImageFormat, Rotation
import numpy as np
from PIL import Image as PILImage

pdf_document = load_from_file("document.pdf")
page = pdf_document.create_page(0)

# Create renderer with default settings
renderer = PageRenderer()

# Check if rendering is supported
if not PageRenderer.can_render():
    raise RuntimeError("Poppler compiled without rendering support")

# Configure rendering options: toggle hints one at a time...
renderer.set_render_hint(RenderHint.antialiasing, True)
renderer.set_render_hint(RenderHint.text_antialiasing, True)
# ...or assign the combined flags in a single step
renderer.render_hints = RenderHint.antialiasing | RenderHint.text_antialiasing

# Set paper color (default is white)
renderer.paper_color = (255, 255, 255)

# Set image format (requires poppler >= 0.65.0)
renderer.image_format = ImageFormat.argb32

# Render page at 150 DPI
image = renderer.render_page(page, xres=150.0, yres=150.0)

# Render with rotation
image = renderer.render_page(page, xres=72.0, yres=72.0, rotate=Rotation.rotate_90)

# Render specific region (x, y, width, height in pixels)
image = renderer.render_page(page, xres=72.0, yres=72.0, x=0, y=0, w=400, h=600)

# Access image data
print(f"Image size: {image.width}x{image.height}")
print(f"Format: {image.format}")
print(f"Bytes per row: {image.bytes_per_row}")
print(f"Valid: {image.is_valid}")

# Get raw image bytes
image_bytes = image.data

# Save image to file (the second argument is the output file format)
image.save("output.png", "png", dpi=150)

# Convert to numpy array (zero-copy)
array = np.array(image.memoryview(), copy=False)
print(f"Array shape: {array.shape}")

# Convert to PIL Image
pil_image = PILImage.frombytes(
    "RGBA",
    (image.width, image.height),
    image.data,
    "raw",
    str(image.format)
)
pil_image.save("output_pil.png")
```
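
The region arguments to `render_page` above are given in pixels, while PDF page sizes are usually expressed in points (1/72 inch). A minimal sketch of the conversion follows, assuming the same resolution is used for `xres` and `yres`; `points_to_pixels` and `region_in_pixels` are hypothetical helper names, not python-poppler functions.

```python
def points_to_pixels(value_pt, dpi):
    """Convert a length in PDF points (1/72 inch) to whole pixels at `dpi`."""
    return round(value_pt * dpi / 72.0)


def region_in_pixels(x_pt, y_pt, w_pt, h_pt, dpi):
    """Convert an (x, y, width, height) region from points to pixels."""
    return tuple(points_to_pixels(v, dpi) for v in (x_pt, y_pt, w_pt, h_pt))


# A US Letter page (612 x 792 points) rendered at 150 DPI.
x, y, w, h = region_in_pixels(0, 0, 612, 792, dpi=150)
print((x, y, w, h))  # (0, 0, 1275, 1650)

# With python-poppler installed:
#   image = renderer.render_page(page, xres=150.0, yres=150.0, x=x, y=y, w=w, h=h)
```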

--------------------------------

### Access PDF Page Properties and Layout

Source: https://context7.com/cbrunet/python-poppler/llms.txt

This Python code retrieves and prints various properties of each page within a PDF document, including its label, orientation, duration, and dimensions for different page boxes (media, crop, bleed, trim, art). It also accesses page transition effects if present. Finally, it fetches and displays the document's overall page layout and mode. Dependencies include the `poppler` library and its `PageBox`, `PageLayout`, and `PageMode` enums.

```python
from poppler import load_from_file, PageBox, PageLayout, PageMode

pdf_document = load_from_file("document.pdf")

for page_index in range(pdf_document.pages):
    page = pdf_document.create_page(page_index)

    print(f"\n--- Page {page_index} ---")
    print(f"Label: {page.label}")
    print(f"Orientation: {page.orientation}")
    print(f"Duration: {page.duration}")

    media_box = page.page_rect(PageBox.media_box)
    crop_box = page.page_rect(PageBox.crop_box)
    bleed_box = page.page_rect(PageBox.bleed_box)
    trim_box = page.page_rect(PageBox.trim_box)
    art_box = page.page_rect(PageBox.art_box)

    print(f"Media box: {media_box.as_tuple()}")
    print(f"Crop box: {crop_box.as_tuple()}")

    transition = page.transition()
    if transition:
        print(f"Transition type: {transition.type}")
        print(f"Transition duration: {transition.duration}")
        print(f"Alignment: {transition.alignment}")
        print(f"Direction: {transition.direction}")
        print(f"Angle: {transition.angle}")
        print(f"Scale: {transition.scale}")
        print(f"Rectangular: {transition.is_rectangular}")

# Document-level layout and mode (PageLayout and PageMode enum values)
print(f"Page layout: {pdf_document.page_layout}")
print(f"Page mode: {pdf_document.page_mode}")
```

--------------------------------

### Manage PDF Document Permissions

Source: https://context7.com/cbrunet/python-poppler/llms.txt

This Python script checks various permissions of a PDF document, such as the ability to print, modify, copy text, add annotations, fill forms, extract content for accessibility, assemble the document, and perform high-resolution printing. It requires the `poppler` library and optionally the owner password for protected documents. The output provides a clear indication of which permissions are granted or denied.

```python
from poppler import load_from_file, Permission

pdf_document = load_from_file("document.pdf", owner_password="owner")

can_print = pdf_document.has_permission(Permission.print)
can_modify = pdf_document.has_permission(Permission.change)
can_copy = pdf_document.has_permission(Permission.copy)
can_annotate = pdf_document.has_permission(Permission.add_notes)
can_fill_forms = pdf_document.has_permission(Permission.fill_forms)
can_extract = pdf_document.has_permission(Permission.accessibility)
can_assemble = pdf_document.has_permission(Permission.assemble)
can_print_hires = pdf_document.has_permission(Permission.print_high_resolution)

print(f"Print: {can_print}")
print(f"Modify: {can_modify}")
print(f"Copy text: {can_copy}")
print(f"Add annotations: {can_annotate}")
print(f"Fill forms: {can_fill_forms}")
print(f"Extract for accessibility: {can_extract}")
print(f"Assemble document: {can_assemble}")
print(f"High-resolution print: {can_print_hires}")

all_permissions = [
    ("Print", Permission.print),
    ("Modify", Permission.change),
    ("Copy", Permission.copy),
    ("Annotate", Permission.add_notes),
    ("Fill Forms", Permission.fill_forms),
    ("Accessibility", Permission.accessibility),
    ("Assemble", Permission.assemble),
    ("High-Res Print", Permission.print_high_resolution),
]

print("\nPermissions summary:")
for name, perm in all_permissions:
    status = "✓" if pdf_document.has_permission(perm) else "✗"
    print(f"  {status} {name}")
```

--------------------------------

### Manage PDF Document Metadata in Python

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Illustrates how to read and modify standard and custom metadata for PDF documents using python-poppler. This includes author, title, dates, and user-defined fields. It also covers saving modified documents and accessing document IDs. Requires poppler version 0.46.0 or later for metadata modification.

```python
from poppler import load_from_file
from datetime import datetime

pdf_document = load_from_file("document.pdf", owner_password="owner")

# Read standard metadata properties
print(f"Title: {pdf_document.title}")
print(f"Author: {pdf_document.author}")
print(f"Creator: {pdf_document.creator}")
print(f"Producer: {pdf_document.producer}")
print(f"Subject: {pdf_document.subject}")
print(f"Keywords: {pdf_document.keywords}")
print(f"Creation Date: {pdf_document.creation_date}")
print(f"Modification Date: {pdf_document.modification_date}")

# Get all metadata as dictionary
infos = pdf_document.infos()
for key, value in infos.items():
    print(f"{key}: {value}")

# Modify metadata (requires poppler >= 0.46.0)
pdf_document.author = "Charles Brunet"
pdf_document.title = "Sample Document"
pdf_document.creation_date = datetime(2024, 1, 1, 12, 0, 0)
pdf_document.keywords = "python, pdf, poppler"

# Set custom metadata keys
pdf_document.set_info_key("CustomField", "Custom Value")
pdf_document.set_info_date("CustomDate", datetime.now())

# Save modified document
pdf_document.save("modified_document.pdf")

# Save a copy without modifications
pdf_document.save_a_copy("copy_document.pdf")

# Get PDF ID
pdf_id = pdf_document.pdf_id
print(f"Permanent ID: {pdf_id.permanent_id}")
print(f"Update ID: {pdf_id.update_id}")
```

--------------------------------

### Navigate PDF Table of Contents

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Provides functionality to access and traverse the table of contents (TOC) structure of a PDF document. It allows retrieving the root of the TOC and recursively printing its items, including their titles and open/closed status. Child items can also be accessed directly.

```python
from poppler import load_from_file

pdf_document = load_from_file("document.pdf")

# Get table of contents
toc = pdf_document.create_toc()

if toc:
    # Get root item
    root = toc.root

    def print_toc_item(item, level=0):
        """Recursively print TOC structure"""
        indent = "  " * level
        open_status = "[open]" if item.is_open else "[closed]"
        print(f"{indent}{item.title} {open_status}")

        # Iterate through children
        for child in item:
            print_toc_item(child, level + 1)

    # Print entire TOC
    print_toc_item(root)

    # Access children directly
    children = root.children()
    for child in children:
        print(f"TOC Item: {child.title}")
        print(f"Is Open: {child.is_open}")
else:
    print("Document has no table of contents")
```
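
When the outline is needed as data rather than printed output, the same recursion can yield `(level, title)` pairs. The sketch below relies only on the `title` attribute and `children()` method used above. `FakeItem` is a hypothetical stand-in so the example runs without a PDF; with python-poppler installed, `toc.root` would be passed instead.

```python
def flatten_toc(item, level=0):
    """Yield (level, title) pairs for `item` and its descendants, depth-first."""
    yield level, item.title
    for child in item.children():
        yield from flatten_toc(child, level + 1)


class FakeItem:
    """Hypothetical stand-in mimicking the TOC item interface used above."""

    def __init__(self, title, children=()):
        self.title = title
        self._children = list(children)

    def children(self):
        return self._children


root = FakeItem("", [
    FakeItem("Chapter 1", [FakeItem("Section 1.1")]),
    FakeItem("Chapter 2"),
])
entries = list(flatten_toc(root))
print(entries)
# [(0, ''), (1, 'Chapter 1'), (2, 'Section 1.1'), (1, 'Chapter 2')]
```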

--------------------------------

### Convert PDF Image to NumPy Array (Python)

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/usage.md

Explains how to convert a PDF page image into a NumPy array using the buffer protocol via `memoryview`. This allows direct access and modification of image data without copying, enabling efficient array operations. Changes to the NumPy array directly affect the image data.

```python
import numpy

# Assuming 'image' is an Image object obtained from rendering a PDF page

a = numpy.array(image.memoryview(), copy=False)
print(a[0, 0, 0])
print(image.data[0])  # Value of the first byte of the image

a[0, 0, 0] = 0
print(image.data[0])  # It is now 0
```

--------------------------------

### Enable/Disable Poppler Logging (Python)

Source: https://github.com/cbrunet/python-poppler/blob/master/docs/usage.md

Provides methods to control the logging output of the Poppler library. You can disable all error messages by calling `enable_logging(False)` and re-enable them by calling `enable_logging(True)`.

```python
import poppler

# disable logging
poppler.enable_logging(False)

# enable logging to stderr again
poppler.enable_logging(True)
```

--------------------------------

### Inspect PDF Fonts

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Retrieves information about fonts used in a PDF document. It can fetch all fonts at once or iterate through them page by page. Information includes font name, type, embedding status, and subset status. Supports mapping FontType enum to human-readable names.

```python
from poppler import load_from_file, FontType

pdf_document = load_from_file("document.pdf")

# Get all fonts at once
fonts = pdf_document.fonts()
for font in fonts:
    print(f"Name: {font.name}")
    print(f"Type: {font.type}")
    print(f"Embedded: {font.is_embedded}")
    print(f"Subset: {font.is_subset}")
    print(f"File: {font.file}")

# Human-readable names for FontType values
FONT_TYPE_NAMES = {
    FontType.unknown: "Unknown",
    FontType.type1: "Type 1",
    FontType.type1c: "Type 1C",
    FontType.type1c_ot: "Type 1C OpenType",
    FontType.type3: "Type 3",
    FontType.truetype: "TrueType",
    FontType.truetype_ot: "TrueType OpenType",
    FontType.cid_type0: "CID Type 0",
    FontType.cid_type0c: "CID Type 0C",
    FontType.cid_type0c_ot: "CID Type 0C OpenType",
    FontType.cid_truetype: "CID TrueType",
    FontType.cid_truetype_ot: "CID TrueType OpenType",
}

# Iterate through fonts page by page
font_iterator = pdf_document.create_font_iterator(start_page=0)
for page_num, page_fonts in font_iterator:
    print(f"\nFonts on page {page_num}:")
    for font in page_fonts:
        font_type_name = FONT_TYPE_NAMES.get(font.type, "Unknown")
        embed_status = "embedded" if font.is_embedded else "not embedded"
        subset_status = "(subset)" if font.is_subset else "(full)"

        print(f"  - {font.name} [{font_type_name}] {embed_status} {subset_status}")

# Check current page of iterator
print(f"Current page: {font_iterator.current_page}")
print(f"Has next: {font_iterator.has_next}")
```

--------------------------------

### Extract Embedded Files from PDF

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Starting point for extracting files embedded within a PDF document. This snippet only loads the PDF; the retrieval of attached files is covered in "Extract Embedded Files from PDF Document".

```python
from poppler import load_from_file

pdf_document = load_from_file("document.pdf")

# Further code to access and extract embedded files would go here.
```

--------------------------------

### Extract and Search Text from PDF Pages in Python

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Details text extraction from specific areas or the entire page of a PDF using python-poppler. It also covers searching for text and retrieving detailed text box information, including bounding boxes and font details. Requires poppler version 0.63.0 or later for detailed text boxes and 0.89.0 for font information.

```python
from poppler import load_from_file, Rectangle, CaseSensitivity, SearchDirection

pdf_document = load_from_file("document.pdf")
page = pdf_document.create_page(0)  # Get first page (0-indexed)

# Extract all text from page
full_text = page.text()
print(full_text)

# Extract text from specific rectangle area
rect = Rectangle(100.0, 100.0, 300.0, 400.0)
region_text = page.text(rect)
print(region_text)

# Extract text with layout mode
from poppler import TextLayout
text_with_layout = page.text(layout_mode=TextLayout.physical_layout)

# Get detailed text boxes with positions (requires poppler >= 0.63.0)
text_boxes = page.text_list()
for text_box in text_boxes:
    print(f"Text: {text_box.text}")
    print(f"Bounding box: {text_box.bbox.as_tuple()}")
    print(f"Has space after: {text_box.has_space_after}")

    # Get character-level bounding boxes
    for i in range(len(text_box.text)):
        char_bbox = text_box.char_bbox(i)
        print(f"  Char '{text_box.text[i]}' at {char_bbox.as_tuple()}")

# Get font information from text boxes (requires poppler >= 0.89.0)
text_boxes = page.text_list(page.TextListOption.text_list_include_font)
box = text_boxes[0]
if box.has_font_info:
    print(f"Font: {box.get_font_name()}")
    print(f"Size: {box.get_font_size()}")
    print(f"Writing mode: {box.get_wmode()}")
```
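
Text boxes arrive as individual words with bounding boxes, so rebuilding lines is left to the caller. One possible approach is sketched below: group boxes whose vertical positions are within a small tolerance. It works on plain `(text, (x, y, width, height))` tuples, which is the shape assumed for `text_box.text` and `text_box.bbox.as_tuple()`. The `group_into_lines` helper and the tolerance of 2.0 are illustrative choices, not part of python-poppler.

```python
def group_into_lines(boxes, tolerance=2.0):
    """Group (text, (x, y, w, h)) tuples into lines of text, top to bottom."""
    lines = []  # each entry: [reference_y, [(x, text), ...]]
    for text, (x, y, _w, _h) in sorted(boxes, key=lambda b: (b[1][1], b[1][0])):
        if lines and abs(lines[-1][0] - y) <= tolerance:
            # Close enough vertically: same line as the previous box.
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])
    # Within each line, order words from left to right.
    return [" ".join(t for _x, t in sorted(words)) for _y, words in lines]


# Synthetic boxes; with python-poppler these would come from:
#   [(b.text, b.bbox.as_tuple()) for b in page.text_list()]
sample = [
    ("world", (60.0, 100.5, 30.0, 10.0)),
    ("Hello", (10.0, 100.0, 40.0, 10.0)),
    ("Next", (10.0, 120.0, 30.0, 10.0)),
]
print(group_into_lines(sample))  # ['Hello world', 'Next']
```

The sketch assumes that `y` grows downward and that lines are roughly horizontal; rotated or multi-column text would need a different grouping rule.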

--------------------------------

### Extract Embedded Files from PDF Document

Source: https://context7.com/cbrunet/python-poppler/llms.txt

This Python snippet demonstrates how to check if a PDF document contains embedded files, retrieve a list of these files, and extract their data. It iterates through each embedded file, printing its metadata such as name, description, MIME type, size, checksum, dates, and validity. The extracted file data is then saved to disk. This functionality requires the `poppler` library.

```python
from poppler import load_from_file

pdf_document = load_from_file("document.pdf")

if pdf_document.has_embedded_files():
    embedded_files = pdf_document.embedded_files()

    for embedded_file in embedded_files:
        print(f"Name: {embedded_file.name}")
        print(f"Description: {embedded_file.description}")
        print(f"MIME type: {embedded_file.mime_type}")
        print(f"Size: {embedded_file.size} bytes")
        print(f"Checksum: {embedded_file.checksum}")
        print(f"Creation date: {embedded_file.creation_date}")
        print(f"Modification date: {embedded_file.modification_date}")
        print(f"Valid: {embedded_file.is_valid}")

        file_data = embedded_file.data

        with open(f"extracted_{embedded_file.name}", "wb") as f:
            f.write(file_data)
else:
    print("No embedded files found")
```
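
The snippet above builds the output path directly from `embedded_file.name`. Because that name comes from the PDF itself, it may contain directory components. A cautious sketch using only the standard library is shown below; `safe_output_name` is a hypothetical helper that keeps just the final path component under both Windows and POSIX separator rules.

```python
from pathlib import PurePosixPath, PureWindowsPath


def safe_output_name(name, prefix="extracted_", fallback="unnamed"):
    """Return prefix + the last path component of `name`, without directories."""
    # PureWindowsPath splits on both "\\" and "/"; PurePosixPath is a second pass.
    base = PurePosixPath(PureWindowsPath(name).name).name
    if base in ("", ".", ".."):
        base = fallback
    return prefix + base


print(safe_output_name("report.txt"))          # extracted_report.txt
print(safe_output_name("../../etc/passwd"))    # extracted_passwd
print(safe_output_name("C:\\temp\\data.csv"))  # extracted_data.csv
```

In the loop above, `open(safe_output_name(embedded_file.name), "wb")` would then replace the f-string path.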

--------------------------------

### Search Text in PDF Page

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Searches for a specific text string within a PDF page. It takes the search term, a rectangle to limit the search area, a search direction, and case sensitivity as input. It returns the rectangle where the text was found or None if not found.

```python
from poppler import Rectangle, SearchDirection, CaseSensitivity

search_rect = Rectangle(0.0, 0.0, 0.0, 0.0)
found_rect = page.search(
    "searchterm",
    search_rect,
    SearchDirection.from_top,
    CaseSensitivity.case_sensitive
)
if found_rect is not None:
    print(f"Found at: {found_rect.as_tuple()}")
else:
    print("Not found")
```

--------------------------------

### Suppress PDF Error Logging in Python

Source: https://context7.com/cbrunet/python-poppler/llms.txt

Demonstrates how to disable and re-enable logging for poppler errors. This is useful for preventing noisy stderr output when processing potentially problematic PDFs. Ensure poppler version is 0.30.0 or higher. Errors are suppressed when logging is disabled.

```python
from poppler import load_from_file, enable_logging

# Disable error logging (suppresses stderr output)
enable_logging(False)

# Load document that might have errors
pdf_document = load_from_file("problematic.pdf")
page = pdf_document.create_page(0)
text = page.text()  # No error messages printed

# Re-enable logging
enable_logging(True)

# Now errors will be printed to stderr again
```
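
If an exception is raised between the two calls, logging stays disabled. One way to guard against that is a small context manager, sketched below. `logging_disabled` is a hypothetical helper that takes the toggle function as an argument so the example runs without python-poppler; it always re-enables logging on exit, which assumes logging was enabled beforehand.

```python
from contextlib import contextmanager


@contextmanager
def logging_disabled(enable_logging):
    """Call enable_logging(False) on entry and enable_logging(True) on exit."""
    enable_logging(False)
    try:
        yield
    finally:
        # Runs even if the body raises.
        enable_logging(True)


# Stand-in that records calls instead of touching Poppler.
calls = []
with logging_disabled(calls.append):
    calls.append("work")
print(calls)  # [False, 'work', True]

# With python-poppler installed:
#   from poppler import enable_logging
#   with logging_disabled(enable_logging):
#       text = page.text()
```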
