### Python Development Environment Setup and Testing

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/README.md

Sets up a Python virtual environment, installs development dependencies, and runs unit and integration tests. Requires Python 3 and pip. Outputs test results.

```bash
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -e ".[dev]"

pytest -q          # unit suite
pytest tests/e2e   # integration tests (requires live API + fixtures)
```

--------------------------------

### Install PDFDancer Python Client

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/README.md

Instructions for installing the PDFDancer Python client using pip, including an editable install option for local development. Requires Python 3.10+.

```bash
pip install pdfdancer-client-python

# Editable install for local development
pip install -e .
```

--------------------------------

### Python PDFDancer Client Initialization Examples

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Shows different ways to initialize the PDFDancer client. This includes opening an existing PDF using a token from an environment variable or explicitly, and creating a new blank PDF with specified page size, orientation, and initial page count. Assumes PDFDancer, PageSize, and Orientation are imported.
```python
from pdfdancer import PDFDancer
from pdfdancer.models import PageSize, Orientation

# Open existing PDF with token from env var PDFDANCER_TOKEN
pdf = PDFDancer.open(pdf_data="document.pdf")

# Open with explicit token
pdf = PDFDancer.open(pdf_data="document.pdf", token="your-token")

# Create new blank PDF
pdf = PDFDancer.new(
    page_size=PageSize.A4,
    orientation=Orientation.PORTRAIT,
    initial_page_count=5
)
```

--------------------------------

### Python Distribution Artifacts Build

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/README.md

Builds distribution artifacts (e.g., wheels, sdists) for the Python package. Requires the 'build' package to be installed. Outputs files to the 'dist/' directory.

```bash
python -m build
```

--------------------------------

### Create New Blank PDF with PDFDancer Python

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/README.md

Example demonstrating how to create a new blank PDF document using `PDFDancer.new()`. Shows how to add text paragraphs and images with specific styling and positioning.

```python
from pathlib import Path
from pdfdancer import Color, PDFDancer, StandardFonts

with PDFDancer.new(token="your-api-token") as pdf:
    pdf.new_paragraph() \
        .text("Quarterly Summary") \
        .font(StandardFonts.TIMES_BOLD, 18) \
        .color(Color(10, 10, 80)) \
        .line_spacing(1.2) \
        .at(page_index=0, x=72, y=730) \
        .add()

    pdf.new_image() \
        .from_file(Path("logo.png")) \
        .at(page=0, x=420, y=710) \
        .add()

    pdf.save("summary.pdf")
```

--------------------------------

### Edit Existing PDF with PDFDancer Python

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/README.md

Quick start guide demonstrating how to open an existing PDF, locate and edit paragraphs, add new paragraphs with precise positioning, and save the modified PDF. Uses `PDFDancer.open()` and includes optional token and base URL configuration.
```python
from pathlib import Path
from pdfdancer import Color, PDFDancer, StandardFonts

with PDFDancer.open(
    pdf_data=Path("input.pdf"),
    token="your-api-token",  # optional when PDFDANCER_TOKEN is set
    base_url="https://api.pdfdancer.com",
) as pdf:
    # Locate and update an existing paragraph
    heading = pdf.page(0).select_paragraphs_starting_with("Executive Summary")[0]
    heading.move_to(72, 680)
    with heading.edit() as editor:
        editor.replace("Overview")

    # Add a new paragraph with precise placement
    pdf.new_paragraph() \
        .text("Generated with PDFDancer") \
        .font(StandardFonts.HELVETICA, 12) \
        .color(Color(70, 70, 70)) \
        .line_spacing(1.4) \
        .at(page_index=0, x=72, y=520) \
        .add()

    # Persist the modified document
    pdf.save("output.pdf")
    # or keep it in memory
    pdf_bytes = pdf.get_bytes()
```

--------------------------------

### Manage PDF Pages with PDF Dancer

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Demonstrates various page manipulation operations using the PDF Dancer library, including retrieving all pages, accessing specific pages, getting page properties like size and orientation, deleting pages, moving pages to different positions, and adding new blank pages. Requires the 'pdfdancer' library.
```python
from pdfdancer import PDFDancer, PageSize, Orientation

with PDFDancer.open("multi_page.pdf") as pdf:
    # Get all pages
    pages = pdf.pages()
    print(f"Total pages: {len(pages)}")

    # Access specific page
    first_page = pdf.page(0)
    print(f"Page size: {first_page.size}")
    print(f"Page orientation: {first_page.page_orientation}")

    # Get page properties
    page_size = first_page.page_size
    if page_size:
        print(f"Width: {page_size.width}, Height: {page_size.height}")

    # Delete a page
    page_to_remove = pdf.page(2)
    page_to_remove.delete()

    # Move page to different position
    page_to_move = pdf.page(3)
    page_to_move.move_to(target_page_index=1)

    # Alternative: move page by index
    pdf.move_page(from_page_index=4, to_page_index=0)

    # Add new blank page
    new_page = pdf.new_page()
    print(f"Added page at index {new_page.position.page_index}")

    # Select all elements on a page
    elements = first_page.select_elements()
    print(f"Page has {len(elements)} elements")

    pdf.save("reorganized.pdf")
```

--------------------------------

### Work with PDF Form Fields using PDF Dancer

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Provides examples of how to interact with AcroForm fields in a PDF document using the PDF Dancer library. It covers selecting all fields, fields by name or on specific pages, editing field values, and selecting fields at given coordinates. Depends on the 'pdfdancer' library.
```python
from pdfdancer import PDFDancer

with PDFDancer.open("application_form.pdf") as pdf:
    # Select all form fields
    all_fields = pdf.select_form_fields()
    for field in all_fields:
        print(f"Field: {field.name} = {field.value}")

    # Select form fields by name
    name_fields = pdf.select_form_fields_by_name("applicant_name")
    if name_fields:
        with name_fields[0].edit() as editor:
            editor.value("John Doe")

    # Select form fields on specific page
    page = pdf.page(0)
    page_fields = page.select_form_fields()

    # Select form fields by name on specific page
    signature_fields = page.select_form_fields_by_name("signature")
    if signature_fields:
        signature_fields[0].edit().value("Signed by Jane Smith").apply()

    # Select form fields at coordinates
    fields_at_point = page.select_form_fields_at(x=150, y=300)

    # Update multiple fields
    email_field = pdf.select_form_fields_by_name("email")[0]
    phone_field = pdf.select_form_fields_by_name("phone")[0]

    with email_field.edit() as e:
        e.value("john@example.com")

    with phone_field.edit() as e:
        e.value("555-0123")

    pdf.save("filled_form.pdf")
```

--------------------------------

### Delete PDF Objects with pdfdancer-client-python

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Shows how to remove various elements from a PDF document, including paragraphs, images, and entire pages. It includes examples of deleting specific elements by selection criteria and by coordinates.
```python
from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    # Delete specific paragraphs
    paragraphs = pdf.page(0).select_paragraphs_starting_with("DRAFT")
    for para in paragraphs:
        para.delete()

    # Delete images at specific coordinates
    page = pdf.page(1)
    images = page.select_images_at(x=50, y=100)
    for img in images:
        img.delete()

    # Delete entire page
    page_to_remove = pdf.page(2)
    page_to_remove.delete()

    # Delete all images on a page below certain x coordinate
    all_images = page.select_images()
    for image in all_images:
        if image.position.x() is not None and image.position.x() < 100:
            image.delete()

    pdf.save("cleaned.pdf")
```

--------------------------------

### Select Paragraphs from PDF - Python

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Enables the selection of paragraph objects from a PDF document based on various criteria. This includes selecting all paragraphs, paragraphs on a specific page, paragraphs starting with a given text, paragraphs matching a regular expression, or paragraphs located at specific coordinates. Each selected paragraph object provides access to its text content, page index, and precise position.
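
Before the library example, it can help to check locally what a selection pattern would match. The following is a minimal sketch using Python's standard `re` module with the same date pattern that appears in the example below. It assumes the service's regex dialect and matching semantics resemble Python's, which this document does not confirm, and `find_dates` is a hypothetical helper, not part of pdfdancer.

```python
import re

# Same pattern string as the select_paragraphs_matching example.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def find_dates(text: str) -> list[str]:
    """Return every date-shaped substring in text (hypothetical helper)."""
    return DATE_PATTERN.findall(text)


sample = "Invoice #42 issued 2024-03-15, due 2024-04-14."
print(find_dates(sample))  # ['2024-03-15', '2024-04-14']
```

The pattern only checks the shape of the string, so a value such as `2024-99-99` would also match; validate the date separately if that matters.
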
```python
from pdfdancer import PDFDancer

with PDFDancer.open("report.pdf") as pdf:
    # Select all paragraphs in document
    all_paragraphs = pdf.select_paragraphs()
    for para in all_paragraphs:
        print(f"Page {para.page_index}: {para.object_ref().text}")

    # Select paragraphs on specific page
    page = pdf.page(0)
    page_paragraphs = page.select_paragraphs()

    # Select paragraphs starting with specific text
    invoices = page.select_paragraphs_starting_with("Invoice #")
    for invoice in invoices:
        print(f"Found invoice at x={invoice.position.x()}, y={invoice.position.y()}")

    # Select paragraphs matching regex pattern
    dates = page.select_paragraphs_matching(r"\d{4}-\d{2}-\d{2}")

    # Select paragraphs at specific coordinates
    paragraphs_at_point = page.select_paragraphs_at(x=100, y=500)
```

--------------------------------

### PDFDancer Initialization

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Demonstrates how to initialize the PDFDancer client, either by opening an existing PDF document or creating a new blank one.

```APIDOC
## PDFDancer Initialization

### Description
Initialize the PDFDancer client by opening an existing PDF file or creating a new blank PDF with specified properties.

### Method
`PDFDancer.open()` or `PDFDancer.new()`

### Parameters
#### `PDFDancer.open()`
- **pdf_data** (str) - Required - Path to the PDF file or PDF data.
- **token** (str) - Optional - Authentication token for the PDFDancer API.

#### `PDFDancer.new()`
- **page_size** (PageSize enum) - Required - The size of the pages in the new PDF (e.g., `PageSize.A4`).
- **orientation** (Orientation enum) - Required - The orientation of the pages (e.g., `Orientation.PORTRAIT`).
- **initial_page_count** (int) - Optional - The number of initial blank pages to create.
### Request Example
```python
from pdfdancer import PDFDancer, PageSize, Orientation

# Open existing PDF with token from env var PDFDANCER_TOKEN
pdf = PDFDancer.open(pdf_data="document.pdf")

# Open with explicit token
pdf = PDFDancer.open(pdf_data="document.pdf", token="your-token")

# Create new blank PDF
pdf = PDFDancer.new(
    page_size=PageSize.A4,
    orientation=Orientation.PORTRAIT,
    initial_page_count=5
)
```

### Response
#### Success Response (200)
- **pdf** (PDFDancer object) - An instance of the PDFDancer client.

#### Response Example
(No direct response body, returns a client object)
```

--------------------------------

### Python PDFDancer Client Building New Content Patterns

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Demonstrates the fluent builder pattern in the PDFDancer client for adding new content like paragraphs and images to a PDF. Shows how to specify text, font, color, and position for paragraphs, and how to add images from files. Assumes PDFDancer, Font, Color, Position are imported.

```python
from pdfdancer import PDFDancer
from pdfdancer.models import Font, Color, Position

# Assume pdf is an initialized PDFDancer object
# pdf = PDFDancer.open(...) or PDFDancer.new(...)
page = pdf.page(0)  # Assuming page 0 exists

# Add paragraph to document
# (backslashes continue the chain; without them this is a syntax error)
pdf.new_paragraph() \
    .from_string("Hello World") \
    .with_font(Font("Helvetica", 12)) \
    .with_color(Color(255, 0, 0)) \
    .at_page_coordinates(0, 100, 200) \
    .add()

# Add paragraph to specific page
page.new_paragraph() \
    .from_string("Page-specific text") \
    .with_font(Font("Arial", 14)) \
    .at_coordinates(50, 50) \
    .add()

# Add image
pdf.new_image() \
    .from_file("logo.png") \
    .with_width(100) \
    .at_page_coordinates(0, 50, 50) \
    .add()
```

--------------------------------

### Python Release Publishing

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/README.md

Publishes new releases of the pdfdancer-client-python library.
Assumes the 'release.py' script is available and configured. This action typically requires authentication and network access.

```bash
python release.py
```

--------------------------------

### Create New Blank PDF Document - Python

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Creates a new, blank PDF document. Users can specify page size, orientation, and the initial number of pages. The library provides constants for standard page sizes and orientations. Content can be added using a fluent builder pattern, allowing precise control over text, font, color, and positioning. The resulting PDF can be saved to a file or obtained as bytes. Custom page sizes can also be defined.

```python
from pdfdancer import PDFDancer, PageSize, Orientation, Color, StandardFonts

# Create blank PDF with default settings (A4, Portrait, 1 page)
with PDFDancer.new() as pdf:
    pdf.new_paragraph() \
        .text("Hello World") \
        .font(StandardFonts.HELVETICA, 12) \
        .color(Color(0, 0, 0)) \
        .at(page_index=0, x=100, y=700) \
        .add()
    pdf.save("blank.pdf")

# Create with custom page size and multiple pages
with PDFDancer.new(
    token="your-token",
    page_size=PageSize.LETTER,
    orientation=Orientation.LANDSCAPE,
    initial_page_count=5
) as pdf:
    # Add content to first page
    page = pdf.page(0)
    page.new_paragraph() \
        .text("Page 1 Content") \
        .font(StandardFonts.TIMES_BOLD, 18) \
        .at(x=72, y=500) \
        .add()

# Create custom page size
custom_size = PageSize(name="CUSTOM", width=800, height=600)
pdf_custom = PDFDancer.new(page_size=custom_size)
```

--------------------------------

### Python PDF Manipulation with PDFDancer Client

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Demonstrates core PDF manipulation tasks using the PDFDancer Python client. Includes opening existing PDFs, creating new ones, selecting and manipulating objects like paragraphs and images, and using the builder pattern for new content.
Assumes PDFDancer, PageSize, Orientation, Position, Font, and Color are imported.

```python
from pdfdancer import PDFDancer
from pdfdancer.models import PageSize, Orientation, Position, Font, Color

# Open existing PDF
pdf = PDFDancer.open(pdf_data="document.pdf")

# Create new blank PDF
pdf = PDFDancer.new(page_size=PageSize.A4, orientation=Orientation.PORTRAIT)

# Select and manipulate objects
paragraphs = pdf.select_paragraphs()
paragraphs[0].delete()

images = pdf.select_images()
images[0].move(Position.at_page_coordinates(0, 100, 200))

# Page-level operations
page = pdf.page(0)
page_paragraphs = page.select_paragraphs()
page.delete()

# Builder pattern for new content
# (backslashes continue the chain; without them this is a syntax error)
pdf.new_paragraph() \
    .from_string("Text content") \
    .with_font(Font("Arial", 12)) \
    .at_page_coordinates(0, 100, 200) \
    .add()

# Save modified PDF
pdf.save("output.pdf")
```

--------------------------------

### Building New Content API

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Details how to use the builder pattern to construct and add new content elements like paragraphs and images to a PDF document.

```APIDOC
## Building New Content API

### Description
Utilize the builder pattern to fluently construct and add new paragraphs and images to the PDF document or specific pages.

### Method
Builder methods on `PDFDancer` and `Page` objects.

### Endpoint
N/A (Client-side library methods)

### Parameters
#### Adding Content (using builders)
- **`pdf.new_paragraph()`** or **`page.new_paragraph()`**: Starts a new paragraph builder.
- **`from_string(text: str)`**: Sets the text content for the paragraph.
- **`with_font(font: Font)`**: Specifies the font for the paragraph.
- **`with_color(color: Color)`**: Sets the color for the paragraph.
- **`at_page_coordinates(page_index: int, x: float, y: float)`**: Sets the absolute position on a specific page.
- **`at_coordinates(x: float, y: float)`**: Sets the position on the current page.
- **`add()`**: Adds the constructed paragraph to the PDF.

- **`pdf.new_image()`**: Starts a new image builder.
- **`from_file(filepath: str)`**: Specifies the image file to use.
- **`with_width(width: int)`**: Sets the width of the image.
- **`with_height(height: int)`**: Sets the height of the image.
- **`at_page_coordinates(page_index: int, x: float, y: float)`**: Sets the absolute position on a specific page.
- **`at_coordinates(x: float, y: float)`**: Sets the position on the current page.
- **`add()`**: Adds the constructed image to the PDF.

### Request Example
```python
from pdfdancer import PDFDancer, Font, Color, PageSize, Orientation

pdf = PDFDancer.new(PageSize.A4, Orientation.PORTRAIT)
page = pdf.page(0)  # Get the first page

# Add paragraph to the document (at page 0)
# (backslashes continue the chain; without them this is a syntax error)
pdf.new_paragraph() \
    .from_string("Hello World") \
    .with_font(Font("Helvetica", 12)) \
    .with_color(Color(255, 0, 0)) \
    .at_page_coordinates(0, 100, 200) \
    .add()

# Add paragraph to a specific page
page.new_paragraph() \
    .from_string("Page-specific text") \
    .with_font(Font("Arial", 14)) \
    .at_coordinates(50, 50) \
    .add()

# Add image to the document (at page 0)
pdf.new_image() \
    .from_file("logo.png") \
    .with_width(100) \
    .at_page_coordinates(0, 50, 50) \
    .add()
```

### Response
#### Success Response (200)
- **Returns the builder object**: Allows chaining of methods.

#### Response Example
(No direct response body, operations add content to the PDF object in memory)
```

--------------------------------

### Create and Use Page Sizes in PDFDancer

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Demonstrates creating page sizes from standard names, lists, custom dimensions, dictionaries, and using them when creating new PDFs. It also shows how to convert a page size object back into a dictionary.
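
The standard-size constants elsewhere in this document are annotated in points (A4 as 595 x 842), so custom widths and heights are presumably given in points too. Before the library example, here is a minimal sketch of deriving point dimensions from millimetres, using the standard definition of 72 points per inch. `mm_to_points` is a hypothetical helper, not part of pdfdancer.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # one PDF point is 1/72 inch


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PDF points (hypothetical helper)."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


# ISO A4 is 210 mm x 297 mm.
a4 = (round(mm_to_points(210)), round(mm_to_points(297)))
print(a4)  # (595, 842)
```

The rounded values could then be passed as `width` and `height` when constructing a custom page size, assuming the constructor accepts points as the annotated constants suggest.
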
```python
from pdfdancer import PageSize, PDFDancer

# Create from standard name
letter_size = PageSize.from_name("LETTER")
a4_size = PageSize.from_name("A4")

# List all standard sizes
standard_sizes = PageSize.standard_names()
print(f"Available: {standard_sizes}")

# Create custom page size
custom = PageSize(name="CUSTOM", width=800, height=600)

# Create from dictionary
size_dict = {"width": 600, "height": 800, "name": "Custom"}
custom2 = PageSize.from_dict(size_dict)

# Use in new PDF creation
with PDFDancer.new(page_size=PageSize.LETTER) as pdf:
    pass

with PDFDancer.new(page_size="A4") as pdf:
    pass

with PDFDancer.new(page_size={"width": 500, "height": 700}) as pdf:
    pass

# Convert to dictionary
page_dict = custom.to_dict()
# Returns: {"name": "CUSTOM", "width": 800.0, "height": 600.0}
```

--------------------------------

### Add New Paragraphs using Builder Pattern in pdfdancer-client-python

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Illustrates how to create and add new paragraph content to PDF pages using a fluent builder interface. This includes setting font, color, line spacing, and absolute positioning on specific pages.
```python
from pdfdancer import PDFDancer, Color, Font, StandardFonts

with PDFDancer.open("template.pdf") as pdf:
    # Add paragraph to document with absolute page coordinates
    pdf.new_paragraph() \
        .text("This is a new paragraph") \
        .font(StandardFonts.HELVETICA, 12) \
        .color(Color(0, 0, 0)) \
        .line_spacing(1.2) \
        .at(page_index=0, x=100, y=500) \
        .add()

    # Add paragraph to specific page (shorter syntax)
    page = pdf.page(0)
    page.new_paragraph() \
        .text("Page-specific paragraph") \
        .font(StandardFonts.TIMES_BOLD, 14) \
        .color(Color(255, 0, 0, alpha=200)) \
        .at(x=72, y=400) \
        .add()

    # Add multi-line text with custom font
    pdf.new_paragraph() \
        .text("First line\nSecond line\nThird line") \
        .font("Courier", 10) \
        .line_spacing(1.5) \
        .at(page_index=1, x=50, y=700) \
        .add()

    # Use custom TTF font file
    pdf.new_paragraph() \
        .text("Custom font text") \
        .font_file("path/to/custom.ttf", 12) \
        .at(page_index=0, x=200, y=300) \
        .add()

    pdf.save("augmented.pdf")
```

--------------------------------

### Defining Positions and Coordinates in PDF

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Creates position objects for precise element placement and selection using page coordinates, bounding boxes, and text matching patterns. Supports defining positions by page, specific coordinates, text content, element names, and manual bounding rectangles. Allows for modifying positions and retrieving their coordinates. Requires `pdfdancer`.
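
The examples in this document place a heading at `y=730` on a page and describe a negative `move_y` as moving "down", which is consistent with PDF's default coordinate system (origin at the bottom-left corner, y increasing upward). Assuming that convention holds here, which this document does not state outright, the sketch below converts a distance measured down from the top edge into a bottom-up `y`. `y_from_top` is a hypothetical helper, not part of pdfdancer.

```python
A4_HEIGHT_PT = 842  # A4 height in points, as annotated for PageSize.A4


def y_from_top(offset_from_top: float, page_height: float = A4_HEIGHT_PT) -> float:
    """Convert a top-down offset into a bottom-up y coordinate (hypothetical helper)."""
    if not 0 <= offset_from_top <= page_height:
        raise ValueError("offset must lie within the page")
    return page_height - offset_from_top


print(y_from_top(72))  # 770: one inch below the top edge of an A4 page
```
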
```python
from pdfdancer import Position, Point, PositionMode, ShapeType, BoundingRect

# Create position for entire page (all elements on page 0)
page_position = Position.at_page(page_index=0)
page_position.mode = PositionMode.CONTAINS

# Create position at specific coordinates
point_position = Position.at_page_coordinates(page_index=0, x=100, y=500)

# Create position with text matching
text_position = Position.at_page(0)
text_position.with_text_starts("Invoice")

# Create position by name (for form fields)
name_position = Position.by_name("field_name")

# Manual position construction with bounding box
custom_position = Position()
custom_position.page_index = 1
custom_position.shape = ShapeType.RECT
custom_position.mode = PositionMode.INTERSECT
custom_position.bounding_rect = BoundingRect(x=50, y=50, width=100, height=100)

# Move position offset
point_position.move_x(10)  # Move 10 points right
point_position.move_y(-5)  # Move 5 points down

# Get coordinates
x = point_position.x()  # Returns 110
y = point_position.y()  # Returns 495

# Position with regex pattern for text matching
pattern_position = Position.at_page(0)
pattern_position.text_pattern = r"\d{4}-\d{2}-\d{2}"
```

--------------------------------

### Python PDFDancer Client Selection and Manipulation Patterns

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Illustrates how to select and manipulate elements within a PDF using the PDFDancer client. Covers document-level and page-level selections for paragraphs, images, and form fields, including pattern-based selections. Also demonstrates object manipulation like deleting, moving, and modifying content. Assumes PDFDancer, Position are imported.

```python
from pdfdancer import PDFDancer
from pdfdancer.models import Position

# Assume pdf is an initialized PDFDancer object
# pdf = PDFDancer.open(...) or PDFDancer.new(...)
# Document-level selections
paragraphs = pdf.select_paragraphs()
images = pdf.select_images()
form_fields = pdf.select_form_fields_by_name("fieldName")

# Page-level selections
page = pdf.page(0)
page_paragraphs = page.select_paragraphs_starting_with("Invoice")
page_images = page.select_images_at(100, 200)

# Object manipulation
# (each call is an independent illustration, not a sequence to run on the same object)
if paragraphs:
    paragraphs[0].delete()
    paragraphs[0].move(Position.at_page_coordinates(1, 50, 50))
    paragraphs[0].modify("New text content")

# Pattern-based selections
page_paragraphs_matching = page.select_paragraphs_matching(r"\d{4}-\d{2}-\d{2}")
text_lines = page.select_text_lines_matching(r"Total: \$\d+")
```

--------------------------------

### Configuring Page Sizes for New PDFs

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Specifies page dimensions for new PDF creation using standard sizes or custom measurements. Provides predefined constants for common page sizes like A4, Letter, Legal, Tabloid, A3, and A5. Requires `pdfdancer`.

```python
from pdfdancer import PageSize, PDFDancer

# Use standard page sizes
a4 = PageSize.A4            # 595 x 842 points
letter = PageSize.LETTER    # 612 x 792 points
legal = PageSize.LEGAL      # 612 x 1008 points
tabloid = PageSize.TABLOID  # 792 x 1224 points
a3 = PageSize.A3            # 842 x 1191 points
a5 = PageSize.A5            # 420 x 595 points
```

--------------------------------

### Add Images to PDF Pages with PDF Dancer

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Demonstrates how to add image content to PDF pages using the PDF Dancer library. It covers adding images from files and byte streams, specifying position and dimensions, and utilizes a builder pattern for fluent API calls. Dependencies include the 'pdfdancer' library and 'pathlib'.
```python
from pathlib import Path
from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    # Add image from file, setting only the width
    pdf.new_image() \
        .from_file(Path("logo.png")) \
        .with_width(100) \
        .at_page_coordinates(page=0, x=450, y=700) \
        .add()

    # Add image from bytes
    with open("stamp.jpg", "rb") as f:
        image_bytes = f.read()

    pdf.new_image() \
        .from_bytes(image_bytes, format="jpeg") \
        .with_width(80) \
        .at_page_coordinates(page=0, x=50, y=50) \
        .add()

    # Add image with specific dimensions
    pdf.new_image() \
        .from_file("banner.png") \
        .with_width(200) \
        .with_height(50) \
        .at_page_coordinates(page=1, x=100, y=750) \
        .add()

    pdf.save("with_images.pdf")
```

--------------------------------

### Page Operations in Python

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Illustrates common page manipulation tasks. This includes retrieving all pages, accessing a specific page by index, and deleting a page. These operations are crucial for modifying PDF structures.

```python
# Page operations
pages = pdf.pages()
page = pdf.page(0)
page.delete()
```

--------------------------------

### Work with Forms and Layout in PDFDancer Python

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/README.md

Demonstrates advanced PDF manipulation, including inspecting document structure, updating form fields by name, and deleting or moving elements like images based on their coordinates. Uses `PDFDancer.open()` and object-specific methods like `edit()`, `delete()`, and `move_to()`.
```python
from pdfdancer import PDFDancer

with PDFDancer.open("contract.pdf") as pdf:
    # Inspect global document structure
    pages = pdf.pages()
    print("Total pages:", len(pages))

    # Update form fields
    signature = pdf.select_form_fields_by_name("signature")[0]
    signature.edit().value("Signed by Jane Doe").apply()

    # Trim or move content at specific coordinates
    images = pdf.page(1).select_images()
    for image in images:
        x = image.position.x()
        if x is not None and x < 100:
            image.delete()
```

--------------------------------

### Document Operations

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Provides methods for performing document-level operations, such as retrieving the PDF content as bytes.

```APIDOC
## Document Operations

### Description
Perform high-level operations on the entire PDF document, such as saving the modified content.

### Method
Methods on `PDFDancer` object.

### Endpoint
N/A (Client-side library methods)

### Parameters
#### Get PDF Bytes
- **`pdf.get_bytes()`**: Retrieves the current state of the PDF document as bytes.

### Request Example
```python
from pdfdancer import PDFDancer

pdf = PDFDancer.open("document.pdf")
# ... perform manipulations ...

# Get PDF bytes
pdf_bytes = pdf.get_bytes()

# To save the PDF to a file:
with open("output.pdf", "wb") as f:
    f.write(pdf_bytes)
```

### Response
#### Success Response (200)
- **pdf_bytes** (bytes) - The binary content of the PDF document.

#### Response Example
```python
# Example of what pdf_bytes might contain (binary data)
b'\x25\x50\x44\x46\x2d\x31\x2e\x37\x0a...'
```
```

--------------------------------

### Selection and Manipulation API

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Covers how to select various elements (paragraphs, images, form fields) from a PDF document and perform manipulation operations on them.
```APIDOC
## Selection and Manipulation API

### Description
Select and manipulate elements such as paragraphs, images, and form fields within a PDF document. Operations can be performed at the document level or page level.

### Method
Various methods on `PDFDancer` and `Page` objects.

### Endpoint
N/A (Client-side library methods)

### Parameters
#### Document-level Selections (on `PDFDancer` object)
- **select_paragraphs()**: Returns all paragraphs in the document.
- **select_images()**: Returns all images in the document.
- **select_form_fields_by_name(name: str)**: Returns form fields by their name.

#### Page-level Selections (on `Page` object obtained via `pdf.page(index)`)
- **select_paragraphs_starting_with(prefix: str)**: Selects paragraphs beginning with a specific string.
- **select_images_at(x: float, y: float)**: Selects images located at specific coordinates.
- **select_paragraphs_matching(regex: str)**: Selects paragraphs matching a regular expression.
- **select_text_lines_matching(regex: str)**: Selects text lines matching a regular expression.

#### Object Manipulation (on selected objects like `ParagraphObject`, `ImageObject`)
- **delete()**: Deletes the selected object.
- **move(position: Position)**: Moves the selected object to a new position.
- **modify(content: str)**: Modifies the content of a paragraph.
### Request Example
```python
from pdfdancer import PDFDancer, Position

pdf = PDFDancer.open("document.pdf")

# Document-level selections
paragraphs = pdf.select_paragraphs()
images = pdf.select_images()
form_fields = pdf.select_form_fields_by_name("fieldName")

# Page-level selections
page = pdf.page(0)
page_paragraphs = page.select_paragraphs_starting_with("Invoice")
page_images = page.select_images_at(100, 200)

# Object manipulation
# (each call is an independent illustration, not a sequence to run on the same object)
paragraphs[0].delete()
paragraphs[0].move(Position.at_page_coordinates(1, 50, 50))
paragraphs[0].modify("New text content")

# Pattern-based selections
paragraphs = page.select_paragraphs_matching(r"\d{4}-\d{2}-\d{2}")
text_lines = page.select_text_lines_matching(r"Total: \$\d+")
```

### Response
#### Success Response (200)
- **List of objects**: Returns a list of selected objects (e.g., `ParagraphObject`, `ImageObject`) or performs the manipulation in place.

#### Response Example
(No direct response body, operations modify the PDF object in memory)
```

--------------------------------

### Open Existing PDF Document - Python

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Opens an existing PDF document for manipulation by creating a session with the PDFDancer service. It can accept file paths, bytes, or file-like objects as input. Authentication can be handled via the PDFDANCER_TOKEN environment variable or by explicitly providing a token, base URL, and timeout. The function returns a PDFDancer object that allows further operations like selecting elements or saving the modified document.
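
Before the library example, the two authentication routes described above can be pictured as a simple lookup. The sketch below is not the library's implementation: `resolve_token` is a hypothetical helper, and the precedence it uses (an explicit argument wins over the `PDFDANCER_TOKEN` environment variable) is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_token(explicit: Optional[str] = None,
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Pick an API token: explicit argument first, then PDFDANCER_TOKEN (assumed order)."""
    env = os.environ if env is None else env
    token = explicit or env.get("PDFDANCER_TOKEN")
    if not token:
        raise ValueError("No token: pass token=... or set PDFDANCER_TOKEN")
    return token


print(resolve_token("abc", env={"PDFDANCER_TOKEN": "from-env"}))  # abc
print(resolve_token(env={"PDFDANCER_TOKEN": "from-env"}))         # from-env
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.
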
```python
from pathlib import Path
from pdfdancer import PDFDancer

# Open with token from PDFDANCER_TOKEN environment variable
with PDFDancer.open(pdf_data="invoice.pdf") as pdf:
    paragraphs = pdf.select_paragraphs()
    print(f"Found {len(paragraphs)} paragraphs")

# Open with explicit token
with PDFDancer.open(
    pdf_data=Path("contract.pdf"),
    token="your-api-token",
    base_url="https://api.pdfdancer.com",
    timeout=30.0
) as pdf:
    pages = pdf.pages()
    print(f"Document has {len(pages)} pages")
    pdf.save("output.pdf")

# Open from bytes
with open("document.pdf", "rb") as f:
    pdf_bytes = f.read()

with PDFDancer.open(pdf_data=pdf_bytes) as pdf:
    modified_bytes = pdf.get_bytes()
```

--------------------------------

### Save PDF to File in Python

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Demonstrates how to save the current PDF document to a specified file path using the pdf.save() method. This is a fundamental operation for persisting PDF content.

```python
# Save to file
pdf.save("output.pdf")
```

--------------------------------

### Handle PDFDancer Exceptions

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Illustrates robust error handling for PDFDancer operations by catching specific exceptions like ValidationException, FontNotFoundException, HttpClientException, SessionException, and the base PdfDancerException. It shows how to access detailed error information.
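
The order of the handlers matters: Python tries `except` clauses from top to bottom, so a base-class handler placed first would shadow the more specific ones. Before the library example, here is a minimal sketch of that rule using stand-in classes. `DemoBaseError` and `DemoValidationError` are hypothetical and only mirror the base/subclass relationship the description implies; they are not pdfdancer's classes.

```python
class DemoBaseError(Exception):
    """Stand-in for a library-wide base exception (hypothetical)."""


class DemoValidationError(DemoBaseError):
    """Stand-in for a more specific subclass (hypothetical)."""


def classify(exc: Exception) -> str:
    """Raise exc and report which handler caught it."""
    try:
        raise exc
    except DemoValidationError:  # most specific first
        return "validation"
    except DemoBaseError:        # base class last
        return "base"
    except Exception:
        return "other"


print(classify(DemoValidationError("empty text")))  # validation
print(classify(DemoBaseError("session lost")))      # base
```
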
```python
from pdfdancer import (
    PDFDancer,
    PdfDancerException,
    ValidationException,
    FontNotFoundException,
    HttpClientException,
    SessionException
)

try:
    # Attempt to open PDF without token
    with PDFDancer.open("file.pdf") as pdf:
        pdf.save("output.pdf")
except ValidationException as e:
    # Input validation errors (missing token, empty file, invalid coordinates)
    print(f"Validation error: {e}")
except FontNotFoundException as e:
    # Requested font not available on service
    print(f"Font not found: {e}")
except HttpClientException as e:
    # HTTP transport errors or server errors
    print(f"HTTP error: {e}")
    # Compare against None: some HTTP response objects are falsy for error statuses
    if hasattr(e, 'response') and e.response is not None:
        print(f"Status code: {e.response.status_code}")
except SessionException as e:
    # Session creation or lifecycle failures
    print(f"Session error: {e}")
except PdfDancerException as e:
    # Base exception for all PDFDancer errors
    print(f"PDFDancer error: {e}")
    if hasattr(e, 'cause') and e.cause:
        print(f"Caused by: {e.cause}")

# Validate inputs before operations
try:
    with PDFDancer.open("doc.pdf") as pdf:
        pdf.new_paragraph() \
            .text("") \
            .font("Helvetica", 12) \
            .at(page_index=0, x=100, y=700) \
            .add()
except ValidationException as e:
    print(f"Text cannot be empty: {e}")
```

--------------------------------

### Save and Export PDFs with PDFDancer

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Shows how to save a modified PDF document to a file path given as a string or as a `Path` object, or to retrieve the PDF content as raw bytes for further processing or transmission.
```python
from pathlib import Path
from pdfdancer import PDFDancer

with PDFDancer.open("input.pdf") as pdf:
    # Modify content
    paragraphs = pdf.select_paragraphs()
    if paragraphs:
        paragraphs[0].delete()

    # Save to file
    pdf.save("output.pdf")

    # Save to specific path
    output_path = Path("/tmp/modified_doc.pdf")
    pdf.save(output_path)

    # Get as bytes for further processing
    pdf_bytes = pdf.get_bytes()

    # Send bytes to another service
    # response = requests.post("https://api.example.com/upload",
    #                          files={"file": pdf_bytes})

    # Write bytes manually
    with open("manual_save.pdf", "wb") as f:
        f.write(pdf_bytes)

    # Return from API endpoint
    # return Response(pdf_bytes, mimetype="application/pdf")
```

--------------------------------

### Managing Fonts in PDF Documents

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Registers custom fonts (TTF from file or bytes) and queries available fonts for use in text operations. Supports using standard PDF fonts and finding fonts matching a pattern. Requires `pdfdancer` and `pathlib`. Custom fonts are registered and then can be applied to text elements.
```python
from pathlib import Path
from pdfdancer import PDFDancer, StandardFonts

with PDFDancer.open("document.pdf") as pdf:
    # Use standard PDF fonts (always available)
    print("Standard fonts:")
    for font in StandardFonts:
        print(f" {font.value}")

    # Register custom TTF font
    font_name = pdf.register_font(Path("custom_font.ttf"))
    print(f"Registered font: {font_name}")

    # Register from bytes
    with open("another_font.ttf", "rb") as f:
        font_bytes = f.read()
    font_name2 = pdf.register_font(font_bytes)

    # Find available fonts matching pattern
    helvetica_fonts = pdf.find_fonts("Helvetica", 12)
    for font in helvetica_fonts:
        print(f"Found: {font.name} at {font.size}pt")

    # Use registered custom font in paragraph
    pdf.new_paragraph() \
        .text("Text in custom font") \
        .font(font_name, 14) \
        .at(page_index=0, x=100, y=500) \
        .add()

    pdf.save("custom_fonts.pdf")
```

--------------------------------

### Python PDFDancer Client Document Operations

Source: https://github.com/menschmachine/pdfdancer-client-python/blob/main/CLAUDE.md

Shows how to retrieve the modified PDF content as bytes using the PDFDancer Python client. This is typically the final step after performing various manipulations on the PDF document.

```Python
from pdfdancer import PDFDancer

# Assume pdf is an initialized PDFDancer object and manipulations have been performed
# pdf = PDFDancer.open(...) or PDFDancer.new(...)

# Get PDF bytes
pdf_bytes = pdf.get_bytes()
```

--------------------------------

### Select and Manipulate Images in PDF with PDF Dancer

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Illustrates how to locate and modify existing images within a PDF document using the PDF Dancer library. This includes selecting all images, images on specific pages or at certain coordinates, moving images, and deleting images based on their position. Requires the 'pdfdancer' library.
```python
from pdfdancer import PDFDancer

with PDFDancer.open("brochure.pdf") as pdf:
    # Select all images in document
    all_images = pdf.select_images()
    print(f"Found {len(all_images)} images")

    # Select images on specific page
    page = pdf.page(0)
    page_images = page.select_images()

    # Select images at specific coordinates
    images_at_point = page.select_images_at(x=200, y=400)

    # Move image to new position
    if page_images:
        first_image = page_images[0]
        first_image.move_to(x=100, y=650)

        # Get image position info
        x = first_image.position.x()
        y = first_image.position.y()
        print(f"Image at ({x}, {y}) on page {first_image.page_index}")

    # Delete images in specific region
    for img in page_images:
        x = img.position.x()
        if x is not None and x > 400:
            img.delete()

    pdf.save("images_modified.pdf")
```

--------------------------------

### Defining Colors and Styling in PDF Elements

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Defines colors with RGB and alpha values for text and visual elements in PDF documents. Supports creating standard RGB colors, colors with transparency, and validates color ranges. These colors can then be applied to text elements within paragraphs. Requires `pdfdancer`.
```python
from pdfdancer import Color, PDFDancer, StandardFonts

# Create RGB colors
black = Color(0, 0, 0)
red = Color(255, 0, 0)
blue = Color(0, 0, 255)
custom = Color(70, 130, 180)  # Steel blue

# Create color with alpha transparency
semi_transparent = Color(255, 0, 0, alpha=128)
fully_opaque = Color(0, 255, 0, alpha=255)

# Colors validate range (0-255)
try:
    invalid = Color(300, 0, 0)  # Raises ValueError
except ValueError as e:
    print(f"Invalid color: {e}")

# Use colors in paragraphs
with PDFDancer.open("doc.pdf") as pdf:
    pdf.new_paragraph() \
        .text("Red text") \
        .font(StandardFonts.HELVETICA_BOLD, 14) \
        .color(Color(255, 0, 0)) \
        .at(page_index=0, x=100, y=700) \
        .add()

    pdf.new_paragraph() \
        .text("Blue text with transparency") \
        .font(StandardFonts.TIMES_ITALIC, 12) \
        .color(Color(0, 0, 255, alpha=180)) \
        .at(page_index=0, x=100, y=650) \
        .add()
```

--------------------------------

### Modify Paragraphs in PDF with pdfdancer-client-python

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Demonstrates how to edit paragraph text, font properties, color, and position in a PDF document using the fluent editor API with context managers. It covers simple text replacement, formatting changes, and moving paragraphs.
```python
from pdfdancer import PDFDancer, Color

with PDFDancer.open("document.pdf") as pdf:
    # Find and modify paragraph text
    heading = pdf.page(0).select_paragraphs_starting_with("Summary")[0]

    # Simple text replacement
    with heading.edit() as editor:
        editor.replace("Executive Summary")

    # Replace text and change formatting
    with heading.edit() as editor:
        editor.replace("Updated Summary") \
            .font("Helvetica-Bold", 16) \
            .color(Color(255, 0, 0)) \
            .line_spacing(1.5)

    # Move paragraph to new position
    with heading.edit() as editor:
        editor.replace("Moved Summary") \
            .move_to(x=72, y=700)

    # Or use direct methods
    heading.move_to(x=100, y=600)

    pdf.save("modified.pdf")
```

--------------------------------

### Manipulating Vector Paths in PDF Pages

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Selects and manipulates vector path objects like lines, shapes, and drawings in PDF pages. Supports selecting all paths, paths on specific pages, paths at certain coordinates, retrieving bounding boxes, moving paths, and deleting paths. Requires the `pdfdancer` library.
```python
from pdfdancer import PDFDancer

with PDFDancer.open("diagram.pdf") as pdf:
    # Select all paths in document
    all_paths = pdf.select_paths()
    print(f"Found {len(all_paths)} vector paths")

    # Select paths on specific page
    page = pdf.page(0)
    page_paths = page.select_paths()

    # Select paths at specific coordinates
    paths_at_point = page.select_paths_at(x=200, y=300)

    # Get path bounding box information
    for path in page_paths:
        bbox = path.bounding_box
        if bbox:
            print(f"Path at ({bbox.x}, {bbox.y}) size {bbox.width}x{bbox.height}")

    # Move path to new position
    if page_paths:
        page_paths[0].move_to(x=150, y=400)

    # Delete specific paths
    for path in paths_at_point:
        path.delete()

    pdf.save("modified_diagram.pdf")
```

--------------------------------

### Work with Text Lines in PDF with pdfdancer-client-python

Source: https://context7.com/menschmachine/pdfdancer-client-python/llms.txt

Details how to select and modify individual text line objects within paragraphs for precise text manipulation. It covers selecting lines by content, patterns, and coordinates, as well as editing their content.

```python
from pdfdancer import PDFDancer

with PDFDancer.open("invoice.pdf") as pdf:
    # Select all text lines on a page
    page = pdf.page(0)
    text_lines = page.select_text_lines()
    for line in text_lines:
        print(f"Line: {line.object_ref().text}")

    # Select text lines starting with specific text
    total_lines = page.select_text_lines_starting_with("Total:")

    # Select text lines matching pattern
    amounts = page.select_text_lines_matching(r"\$\d+\.\d{2}")
    for amount in amounts:
        print(f"Found amount: {amount.object_ref().text}")

    # Select text lines at coordinates
    lines_at_point = page.select_text_lines_at(x=100, y=500)

    # Modify text line content
    if total_lines:
        with total_lines[0].edit() as editor:
            editor.replace("Total: $9,999.99")

    pdf.save("updated_invoice.pdf")
```
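The patterns passed to `select_text_lines_matching` in the text line example look like ordinary regular expressions. The source does not say which regex dialect or matching mode the service uses, so the following is only a local sketch using Python's standard `re` module with search semantics (an assumption). It needs no API token. The sample lines, the helper `matching_lines`, and the comma-aware pattern are hypothetical and not part of pdfdancer.

```python
import re

# Pattern copied from the text line example; behaviour on the service may differ.
AMOUNT = re.compile(r"\$\d+\.\d{2}")

# Hypothetical variant that also accepts thousands separators such as "$9,999.99".
AMOUNT_WITH_COMMAS = re.compile(r"\$\d+(?:,\d{3})*\.\d{2}")

# Made-up lines standing in for text extracted from an invoice.
sample_lines = [
    "Subtotal: $120.50",
    "Total: $9,999.99",
    "Invoice date: 2024-01-31",
]


def matching_lines(pattern, lines):
    """Return the lines in which the pattern occurs anywhere (re.search)."""
    return [line for line in lines if pattern.search(line)]


plain = matching_lines(AMOUNT, sample_lines)
with_commas = matching_lines(AMOUNT_WITH_COMMAS, sample_lines)

print(plain)
print(with_commas)
```

Under these assumptions the original pattern skips amounts written with thousands separators, including the replacement text `Total: $9,999.99` used in the example, so it can be worth trying a pattern against representative strings locally before sending it to the service.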