======================== CODE SNIPPETS ========================
TITLE: Nutrient DWS Client Development Setup
DESCRIPTION: Instructions for setting up the development environment for the Nutrient DWS Client Python library, including cloning the repository, installing in development mode, and running type checking and linting.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_7

LANGUAGE: bash
CODE:
```
# Clone the repository
git clone https://github.com/PSPDFKit/nutrient-dws-client-python.git
cd nutrient-dws-client-python

# Install in development mode
pip install -e ".[dev]"

# Run type checking
mypy src/

# Run linting
ruff check src/

# Run formatting
ruff format src/
```

----------------------------------------

TITLE: Setup Python Development Environment for nutrient-dws
DESCRIPTION: Instructions for forking, cloning, creating a virtual environment, and installing development dependencies for the nutrient-dws Python client library, ensuring a ready-to-develop setup.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md#_snippet_0

LANGUAGE: bash
CODE:
```
git clone https://github.com/YOUR_USERNAME/nutrient-dws-client-python.git
cd nutrient-dws-client-python
```

LANGUAGE: bash
CODE:
```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

LANGUAGE: bash
CODE:
```
pip install -e ".[dev]"
pre-commit install
```

----------------------------------------

TITLE: Quick Start: Initialize NutrientClient
DESCRIPTION: Demonstrates the basic initialization of the NutrientClient with an API key. This is the first step to interacting with the Nutrient DWS API.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_2

LANGUAGE: Python
CODE:
```
from nutrient_dws import NutrientClient

client = NutrientClient(api_key='your_api_key')
```

----------------------------------------

TITLE: Direct Methods: Document Processing Examples
DESCRIPTION: Provides examples of using direct async methods of the NutrientClient for common document processing tasks such as conversion, text extraction, watermarking, and merging.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_3

LANGUAGE: Python
CODE:
```
import asyncio
from nutrient_dws import NutrientClient

async def main():
    client = NutrientClient(api_key='your_api_key')

    # Convert a document
    pdf_result = await client.convert('document.docx', 'pdf')

    # Extract text
    text_result = await client.extract_text('document.pdf')

    # Add a watermark
    watermarked_doc = await client.watermark_text('document.pdf', 'CONFIDENTIAL')

    # Merge multiple documents
    merged_pdf = await client.merge(['doc1.pdf', 'doc2.pdf', 'doc3.pdf'])

asyncio.run(main())
```

----------------------------------------

TITLE: Set Up Nutrient DWS Client for Development
DESCRIPTION: Provides instructions for setting up the Nutrient DWS client project for local development, including cloning the repository, installing dependencies in development mode, and running initial checks.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_16

LANGUAGE: bash
CODE:
```
# Clone the repository
git clone https://github.com/jdrhyne/nutrient-dws-client-python.git
cd nutrient-dws-client-python

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .

# Run type checking
mypy src tests
```

----------------------------------------

TITLE: Install Nutrient DWS Python Client
DESCRIPTION: Installs the Nutrient DWS Python client library using pip.
This is the primary method for adding the library to your Python environment.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_0

LANGUAGE: Bash
CODE:
```
pip install nutrient-dws
```

----------------------------------------

TITLE: Implementing Multi-Step Workflows with Builder API in Python
DESCRIPTION: This example illustrates the use of the fluent Builder API in the `nutrient-dws-client-python` library. It demonstrates how to chain multiple processing steps, such as converting a DOCX to PDF and rotating it, into a single API call. The example also includes basic error handling for `APIError`.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md#_snippet_5

LANGUAGE: python
CODE:
```
from nutrient_dws import APIError

# Assumes `client` is an already-initialized NutrientClient.
# User Story: Convert a DOCX to PDF and rotate it (Builder version)
try:
    client.build(input_file="path/to/document.docx") \
        .add_step(tool="rotate-pages", options={"degrees": 90}) \
        .execute(output_path="path/to/final_document.pdf")

    print("Workflow complete. File saved to path/to/final_document.pdf")

except APIError as e:
    print(f"An API error occurred: Status {e.status_code}, Response: {e.response_body}")
```

----------------------------------------

TITLE: Example Python Unit Test with Pytest
DESCRIPTION: A basic example of a unit test function written in Python using the pytest framework, demonstrating how to instantiate a client and assert expected behavior for new features.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md#_snippet_3

LANGUAGE: python
CODE:
```
from nutrient_dws import NutrientClient

def test_new_feature():
    """Test description."""
    client = NutrientClient(api_key="test-key")
    # `new_feature` and `expected_value` are placeholders for the feature under test
    result = client.new_feature()
    assert result == expected_value
```

----------------------------------------

TITLE: Install Nutrient DWS Python Client
DESCRIPTION: Installs the Nutrient Document Web Services (DWS) Python client library using pip, the standard package installer for Python.
This command fetches the latest version of the library from PyPI.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_NOTES.md#_snippet_0

LANGUAGE: bash
CODE:
```
pip install nutrient-dws
```

----------------------------------------

TITLE: Verify PyPI Package Installation
DESCRIPTION: Installs a specific version of the `nutrient-dws` package from PyPI using pip. This command is used to verify that the package has been successfully published and is installable.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_PROCESS.md#_snippet_2

LANGUAGE: Shell
CODE:
```
pip install nutrient-dws==1.0.x
```

----------------------------------------

TITLE: Quick Start with Nutrient DWS Python Client
DESCRIPTION: Demonstrates how to initialize the Nutrient DWS client and perform various document processing operations. It showcases both the Direct API for single operations and the Builder API for complex, multi-step workflows, including automatic Office document conversion and merging of different document types.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_NOTES.md#_snippet_1

LANGUAGE: python
CODE:
```
from nutrient_dws import NutrientClient

# Initialize client
client = NutrientClient(api_key="your-api-key")

# Direct API - Single operation
client.rotate_pages("document.pdf", output_path="rotated.pdf", degrees=90)

# Convert Office document to PDF (automatic!)
client.convert_to_pdf("report.docx", output_path="report.pdf")

# Builder API - Complex workflow
client.build(input_file="scan.pdf") \
    .add_step("ocr-pdf", {"language": "english"}) \
    .add_step("watermark-pdf", {"text": "CONFIDENTIAL"}) \
    .add_step("flatten-annotations") \
    .execute(output_path="processed.pdf")

# Merge PDFs and Office documents together
client.merge_pdfs([
    "chapter1.pdf",
    "chapter2.docx",
    "appendix.xlsx"
], output_path="complete_document.pdf")
```

----------------------------------------

TITLE: Displaying Repository Metrics with README Badges (Markdown)
DESCRIPTION: This markdown snippet provides examples of common badges used in GitHub README files to display project information such as Python version, test coverage, license, and PyPI package version. These badges help convey key project stats at a glance and improve repository visibility.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/GITHUB_ABOUT.md#_snippet_0

LANGUAGE: markdown
CODE:
```
![Python](https://img.shields.io/badge/python-3.8+-blue.svg)
![Coverage](https://img.shields.io/badge/coverage-92%25-brightgreen.svg)
![License](https://img.shields.io/badge/license-MIT-green.svg)
![PyPI](https://img.shields.io/pypi/v/nutrient-dws.svg)
```

----------------------------------------

TITLE: Complex Multi-step Workflow
DESCRIPTION: A complex workflow example that adds specific pages from a PDF, merges another PDF, applies OCR, adds a watermark, creates redactions, and outputs the result in PDF/A format with optimization. Includes a progress callback.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_70

LANGUAGE: python
CODE:
```
from nutrient_dws.builder.constant import BuildActions

# Assumes `client` is an initialized NutrientClient and that this code
# runs inside an async function (it uses `await`).
def progress_callback(current: int, total: int) -> None:
    print(f'Processing step {current} of {total}')

result = await (client
    .workflow()
    .add_file_part('document.pdf', {'pages': {'start': 0, 'end': 5}})
    .add_file_part('appendix.pdf')
    .apply_actions([
        BuildActions.ocr({'language': 'english'}),
        BuildActions.watermark_text('CONFIDENTIAL'),
        BuildActions.create_redactions_preset('email-address', 'apply')
    ])
    .output_pdfa({
        'level': 'pdfa-2b',
        'optimize': {
            'mrcCompression': True
        }
    })
    .execute(on_progress=progress_callback))
```

----------------------------------------

TITLE: Using Direct API for Document Conversion and Rotation in Python
DESCRIPTION: This example demonstrates the direct API design pattern for the `nutrient-dws-client-python` library. It shows how to convert a DOCX file to PDF and then rotate the pages of the resulting PDF using separate, direct method calls. It highlights the use of keyword-only arguments for tool-specific parameters and handling of in-memory byte streams.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md#_snippet_3

LANGUAGE: python
CODE:
```
# User Story: Convert a DOCX to PDF and rotate it.

# Step 1: Convert DOCX to PDF
pdf_bytes = client.convert_to_pdf(
    input_file="path/to/document.docx"
)

# Step 2: Rotate the newly created PDF from memory
client.rotate_pages(
    input_file=pdf_bytes,
    output_path="path/to/rotated_document.pdf",
    degrees=90  # keyword-only argument
)

print("File saved to path/to/rotated_document.pdf")
```

----------------------------------------

TITLE: Nutrient DWS Python Client Builder API Reference
DESCRIPTION: This section describes the fluent interface of the Builder API for multi-step document processing workflows.
It outlines the core methods used to construct and execute a workflow, including starting a build, adding processing steps, executing the workflow, and setting output options.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md#_snippet_4

LANGUAGE: APIDOC
CODE:
```
client.build(input_file)
  - Starts a multi-step workflow.
  - Parameters:
    - input_file: The initial input file for the workflow (str, Path, bytes, or file-like object).
  - Returns: A builder object for chaining.

.add_step(tool, options=None)
  - Adds a processing step to the current workflow.
  - Parameters:
    - tool: The name of the processing tool (e.g., "rotate-pages").
    - options: Optional dictionary of tool-specific parameters.
  - Returns: The builder object for chaining.

.execute(output_path=None)
  - Executes the defined workflow.
  - Parameters:
    - output_path: Optional path to save the final output file. If not provided, returns bytes.
  - Returns: Bytes of the processed file if output_path is None, otherwise None.

.set_output_options(**options)
  - Sets global output metadata or optimization options for the workflow.
  - Parameters:
    - options: Keyword arguments for output settings.
  - Returns: The builder object for chaining.
```

----------------------------------------

TITLE: Basic Document Conversion (DOCX to PDF)
DESCRIPTION: Converts a DOCX document to PDF format using the workflow client. This is a fundamental example of adding a file part and specifying the output format.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_66

LANGUAGE: python
CODE:
```
result = await (client
    .workflow()
    .add_file_part('document.docx')
    .output_pdf()
    .execute())
```

----------------------------------------

TITLE: Builder API: Build Complex Document Processing Pipeline
DESCRIPTION: Illustrates the use of the Builder API to construct a complex, chained document processing workflow.
This example combines OCR, page rotation, watermarking, annotation flattening, and output options like metadata and optimization into a single execution.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_8

LANGUAGE: python
CODE:
```
# Complex document processing pipeline
result = client.build(input_file="raw-scan.pdf") \
    .add_step("ocr-pdf", {"language": "en"}) \
    .add_step("rotate-pages", {"degrees": -90, "page_indexes": [0]}) \
    .add_step("watermark-pdf", {
        "text": "PROCESSED",
        "opacity": 0.3,
        "position": "top-right"
    }) \
    .add_step("flatten-annotations") \
    .set_output_options(
        metadata={"title": "Processed Document", "author": "DWS Client"},
        optimize=True
    ) \
    .execute(output_path="final.pdf")
```

----------------------------------------

TITLE: Nutrient DWS Python Client Error Handling
DESCRIPTION: Demonstrates how to handle various exceptions raised by the Nutrient DWS Python client, including validation, authentication, API, and network errors. Provides examples for accessing error messages and details.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_33

LANGUAGE: python
CODE:
```
from nutrient_dws import (
    NutrientError,
    ValidationError,
    APIError,
    AuthenticationError,
    NetworkError
)

try:
    result = await client.convert('file.docx', 'pdf')
except ValidationError as error:
    # Invalid input parameters
    print(f'Invalid input: {error.message} - Details: {error.details}')
except AuthenticationError as error:
    # Authentication failed
    print(f'Auth error: {error.message} - Status: {error.status_code}')
except APIError as error:
    # API returned an error
    print(f'API error: {error.message} - Status: {error.status_code} - Details: {error.details}')
except NetworkError as error:
    # Network request failed
    print(f'Network error: {error.message} - Details: {error.details}')
```

----------------------------------------

TITLE: Get Account Information
DESCRIPTION: Retrieves account information associated with the current API key. The returned data includes subscription details.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_3

LANGUAGE: python
CODE:
```
account_info = await client.get_account_info()

# Access subscription information
print(account_info['subscriptionType'])
```

----------------------------------------

TITLE: Configure Output Options and Page Labels with Python Builder API
DESCRIPTION: This example demonstrates how to use the Nutrient DWS Python client's Builder API to set document metadata and page labels during the document processing workflow. It shows combining a processing step with output configuration.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SUPPORTED_OPERATIONS.md#_snippet_10

LANGUAGE: python
CODE:
```
client.build("document.pdf") \
    .add_step("rotate-pages", {"degrees": 90}) \
    .set_output_options(metadata={"title": "My Document"}) \
    .set_page_labels([{"pages": {"start": 0}, "label": "Chapter 1"}]) \
    .execute("output.pdf")
```

----------------------------------------

TITLE: Error Handling in Workflows
DESCRIPTION: Provides an example of error handling for workflow execution. It includes a try-except block to catch unexpected errors and iterates through specific workflow errors reported in the result.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_72

LANGUAGE: python
CODE:
```
try:
    result = await (client
        .workflow()
        .add_file_part('document.pdf')
        .output_pdf()
        .execute())

    if not result['success']:
        # Handle workflow errors
        for error in result.get('errors', []):
            print(f"Step {error['step']}: {error['error']['message']}")
except Exception as error:
    # Handle unexpected errors
    print(f'Workflow execution failed: {error}')
```

----------------------------------------

TITLE: Set Page Labels and Rotate Pages with Python Builder API
DESCRIPTION: This example demonstrates how to use the Nutrient DWS Python client's Builder API to rotate pages and set custom page labels for a PDF document. It initializes a build process, adds a rotation step, defines page label ranges, and executes the operation to an output file.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SUPPORTED_OPERATIONS.md#_snippet_7

LANGUAGE: python
CODE:
```
client.build(input_file="document.pdf") \
    .add_step("rotate-pages", {"degrees": 90}) \
    .set_page_labels([
        {"pages": {"start": 0, "end": 2}, "label": "Introduction"},
        {"pages": {"start": 3}, "label": "Content"}
    ]) \
    .execute(output_path="labeled_document.pdf")
```

----------------------------------------

TITLE: NutrientClient Constructor Parameters
DESCRIPTION: Details the parameters for initializing the NutrientClient, including the API key, optional base URL, and request timeout.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_2

LANGUAGE: APIDOC
CODE:
```
NutrientClient(api_key: str | Callable[[], Awaitable[str] | str], base_url: str | None = None, timeout: int | None = None)

Parameters:
- `api_key` (required): Your API key string or async function returning a token
- `base_url` (optional): Custom API base URL (defaults to `https://api.nutrient.io`)
- `timeout` (optional): Request timeout in milliseconds
```

----------------------------------------

TITLE: Run Tests for Nutrient DWS Client
DESCRIPTION: Details how to execute tests for the Nutrient DWS client project, covering running all tests, generating coverage reports, and executing specific test files.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_17

LANGUAGE: bash
CODE:
```
# Run all tests
pytest

# Run with coverage
pytest --cov=nutrient --cov-report=html

# Run specific test file
pytest tests/unit/test_client.py
```

----------------------------------------

TITLE: NutrientClient Initialization with API Key
DESCRIPTION: Initializes the NutrientClient by providing your API key directly. This is the most straightforward way to authenticate with the Nutrient DWS service.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_0

LANGUAGE: python
CODE:
```
from nutrient_dws import NutrientClient

client = NutrientClient(api_key='your_api_key')
```

----------------------------------------

TITLE: Running Nutrient DWS Client Tests
DESCRIPTION: Provides commands for running tests using pytest, including running all tests, generating coverage reports, and executing specific test suites (unit and integration).

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_6

LANGUAGE: bash
CODE:
```
# Run all tests
python -m pytest

# Run with coverage report
python -m pytest --cov=nutrient_dws --cov-report=html

# Run only unit tests
python -m pytest tests/unit/

# Run integration tests (requires API key)
NUTRIENT_API_KEY=your_key python -m pytest tests/test_integration.py
```

----------------------------------------

TITLE: Create Nutrient DWS Python Client Workflow
DESCRIPTION: Illustrates different methods for creating a document processing workflow using the Nutrient DWS Python Client. Shows how to initialize a workflow from a client instance, override timeouts, and create a workflow independently with an API key.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_34

LANGUAGE: python
CODE:
```
# Creating Workflow from a client
workflow = client.workflow()

# Override the client timeout
workflow = client.workflow(60000)

# Create a workflow without a client
from nutrient_dws.builder.builder import StagedWorkflowBuilder
workflow = StagedWorkflowBuilder({
    'apiKey': 'your-api-key'
})
```

----------------------------------------

TITLE: Initialize Nutrient DWS Python Client
DESCRIPTION: Demonstrates how to initialize the `NutrientClient` class, providing an API key and an optional timeout. It also shows support for context managers for automatic resource management.
An `AuthenticationError` is raised if the API key is invalid.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md#_snippet_0

LANGUAGE: python
CODE:
```
from nutrient_dws import NutrientClient, AuthenticationError

# API key from parameter (takes precedence) or NUTRIENT_API_KEY env var
client = NutrientClient(api_key="YOUR_DWS_API_KEY", timeout=300)

# Context manager support
with NutrientClient() as client:
    result = client.convert_to_pdf("document.docx")
```

----------------------------------------

TITLE: Workflow System: Document Processing Pipeline
DESCRIPTION: Illustrates how to use the fluent builder pattern (workflow system) to construct and execute complex document processing pipelines. It shows adding files, applying actions like watermarking, and setting output options.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_4

LANGUAGE: Python
CODE:
```
import asyncio
from nutrient_dws import NutrientClient
from nutrient_dws.builder.constant import BuildActions

async def main():
    client = NutrientClient(api_key='your_api_key')

    result = await (client
        .workflow()
        .add_file_part('document.pdf')
        .add_file_part('appendix.pdf')
        .apply_action(BuildActions.watermark_text('CONFIDENTIAL', {
            'opacity': 0.5,
            'fontSize': 48
        }))
        .output_pdf({
            'optimize': {
                'mrcCompression': True,
                'imageOptimizationQuality': 2
            }
        })
        .execute())

asyncio.run(main())
```

----------------------------------------

TITLE: Nutrient DWS Python Client API Reference
DESCRIPTION: Comprehensive reference for the `NutrientClient` class and its associated APIs, including initialization, Direct API methods for single operations, and the Builder API for constructing multi-step workflows. It outlines the core components and their interactions with the Nutrient DWS API endpoints.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md#_snippet_1

LANGUAGE: APIDOC
CODE:
```
NutrientClient Class:
  __init__(api_key: str, timeout: int = 300)
    - Description: Initializes the Nutrient DWS client for API interactions.
    - Parameters:
      - api_key (str): Your DWS API key. This parameter takes precedence over the NUTRIENT_API_KEY environment variable.
      - timeout (int): The maximum time in seconds to wait for API calls to complete (default: 300).
    - Error Handling: Raises `AuthenticationError` if the provided API key is invalid upon the first API call.

  Context Manager Support:
    - Description: The `NutrientClient` supports Python's context manager protocol, ensuring proper resource cleanup.
    - Usage:
      with NutrientClient() as client:
          # Perform API operations within this block
          result = client.convert_to_pdf("document.docx")

Direct API Methods (on NutrientClient):
  - Description: A collection of methods directly accessible on the `NutrientClient` object, each corresponding to a specific document processing tool. These methods abstract the `POST /process/{tool}` endpoint.
  - Signature: `client.tool_name(input_file: Union[str, bytes, IO], ..., output_path: Optional[str] = None)`
    - `input_file`: The input document, specified as a file path, bytes, or a file-like object.
    - `output_path` (Optional[str]): If provided, the processed file will be saved to this path. If `None`, the processed file's bytes are returned.
  - Example Methods:
    - `client.rotate_pages(input_file: str, degrees: int, output_path: Optional[str] = None)`
      - Description: Rotates pages of an input document by a specified degree.
      - Parameters:
        - `input_file` (str): Path to the input document (e.g., 'path/to/doc.pdf').
        - `degrees` (int): The rotation angle (e.g., 90, 180, 270).
    - `client.convert_to_pdf(input_file: str, output_path: Optional[str] = None)`
      - Description: Converts an input document to PDF format.
      - Parameters:
        - `input_file` (str): Path to the input document (e.g., 'document.docx').

Builder API (BuildAPIWrapper):
  - Description: A separate class, instantiated via `client.build()`, providing a fluent, chainable interface for composing and executing complex, multi-step document processing workflows. It abstracts the `POST /build` endpoint.
  - Instantiation: `client.build(input_file: Union[str, bytes, IO]) -> BuildAPIWrapper`
    - Parameters:
      - `input_file`: The initial input document for the workflow.
  - Methods on `BuildAPIWrapper`:
    - `add_step(tool: str, options: Dict) -> BuildAPIWrapper`
      - Description: Adds a processing step to the current workflow chain.
      - Parameters:
        - `tool` (str): The name of the tool to apply (e.g., 'rotate-pages', 'ocr-pdf').
        - `options` (Dict): A dictionary containing options specific to the chosen tool.
    - `execute(output_path: Optional[str] = None) -> Union[bytes, None]`
      - Description: Compiles the chained workflow into a `multipart/form-data` request and sends it to the `/build` endpoint.
      - Parameters:
        - `output_path` (Optional[str]): Path to save the processed file. If `None`, the processed file's bytes are returned.
      - Returns: Processed file bytes if `output_path` is `None`, otherwise `None`.
  - Usage Example:
    client.build(input_file='doc.docx')\
      .add_step(tool='rotate-pages', options={'degrees': 90})\
      .add_step(tool='ocr-pdf', options={})\
      .execute(output_path='processed_doc.pdf')
```

----------------------------------------

TITLE: NutrientClient Initialization with Async Token Provider
DESCRIPTION: Initializes the NutrientClient using an asynchronous token provider function. This is useful for fetching tokens securely from a remote source.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_1

LANGUAGE: python
CODE:
```
import httpx
from nutrient_dws import NutrientClient

async def get_token():
    async with httpx.AsyncClient() as http_client:
        # Placeholder URL (assumption): httpx needs an absolute URL unless
        # the client is created with base_url; replace with your own endpoint.
        response = await http_client.get('https://your-server.example.com/api/get-nutrient-token')
        data = response.json()
        return data['token']

client = NutrientClient(api_key=get_token)
```

----------------------------------------

TITLE: Handle Various File Input Types in Python
DESCRIPTION: Illustrates different methods for providing input files to the Nutrient DWS client, including file paths (string or Path object), raw bytes, file-like objects, and URLs for supported operations.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_10

LANGUAGE: python
CODE:
```
from pathlib import Path

# File path (string or Path object)
client.convert_to_pdf("document.docx")
client.convert_to_pdf(Path("document.docx"))

# Bytes
with open("document.docx", "rb") as f:
    file_bytes = f.read()
client.convert_to_pdf(file_bytes)

# File-like object
with open("document.docx", "rb") as f:
    client.convert_to_pdf(f)

# URL (for supported operations)
client.import_from_url("https://example.com/document.pdf")
```

----------------------------------------

TITLE: Apply Actions to Workflow
DESCRIPTION: Shows how to apply single or multiple actions to a workflow using the Nutrient DWS Python client. Supports actions like watermarking and OCR, with options for customization.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_36

LANGUAGE: python
CODE:
```
workflow.apply_action(BuildActions.watermark_text('CONFIDENTIAL', {
    'opacity': 0.5,
    'fontSize': 48
}))
```

LANGUAGE: python
CODE:
```
workflow.apply_action(
    BuildActions.watermark_text('CONFIDENTIAL', {
        'opacity': 0.3,
        'rotation': 45
    })
)
workflow.apply_action(BuildActions.ocr('english'))
```

LANGUAGE: python
CODE:
```
workflow.apply_actions([
    BuildActions.watermark_text('DRAFT', {'opacity': 0.5}),
    BuildActions.ocr('english'),
    BuildActions.flatten()
])
```

----------------------------------------

TITLE: Create and Push Git Release Tag
DESCRIPTION: Creates a new Git tag for the specified release version and immediately pushes it to the remote origin. This action typically triggers automated release workflows.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_PROCESS.md#_snippet_1

LANGUAGE: Shell
CODE:
```
git tag v1.0.x && git push origin v1.0.x
```

----------------------------------------

TITLE: Staged Workflow Builder
DESCRIPTION: Demonstrates building a workflow dynamically using a staged approach. Allows for conditional addition of file parts and application of actions, and setting the output format based on variables.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_71

LANGUAGE: python
CODE:
```
# Create a staged workflow
workflow = client.workflow()

# Add parts
workflow.add_file_part('document.pdf')

# Conditionally add more parts
if include_appendix:
    workflow.add_file_part('appendix.pdf')

# Conditionally apply actions
if needs_watermark:
    workflow.apply_action(BuildActions.watermark_text('CONFIDENTIAL'))

# Set output format based on user preference
if output_format == 'pdf':
    workflow.output_pdf()
elif output_format == 'docx':
    workflow.output_office('docx')
else:
    workflow.output_image('png')

# Execute the workflow
result = await workflow.execute()
```

----------------------------------------

TITLE: Git Commands for Tagging and Pushing Releases
DESCRIPTION: Commands used to tag a new version release in Git and push the tags to the remote repository as part of the official release process.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md#_snippet_4

LANGUAGE: bash
CODE:
```
git tag v0.1.0
```

LANGUAGE: bash
CODE:
```
git push --tags
```

----------------------------------------

TITLE: Run Tests, Type Checking, and Linting for Python Project
DESCRIPTION: Commands to execute unit tests, perform static type analysis, and run linting checks during the development workflow to ensure code quality and adherence to coding standards.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md#_snippet_2

LANGUAGE: bash
CODE:
```
pytest
```

LANGUAGE: bash
CODE:
```
mypy src/
```

LANGUAGE: bash
CODE:
```
ruff check src/ tests/
```

----------------------------------------

TITLE: Integrate with Coding Agents
DESCRIPTION: Commands to add code rules for various coding agents to help them understand and utilize the Nutrient DWS Python Client library effectively. This helps reduce hallucinated APIs and encourages correct feature usage.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_1

LANGUAGE: Bash
CODE:
```
# Adding code rule to Claude Code
dws-add-claude-code-rule
```

LANGUAGE: Bash
CODE:
```
# Adding code rule to GitHub Copilot
dws-add-github-copilot-rule
```

LANGUAGE: Bash
CODE:
```
# Adding code rule to Junie (JetBrains)
dws-add-junie-rule
```

LANGUAGE: Bash
CODE:
```
# Adding code rule to Cursor
dws-add-cursor-rule
```

LANGUAGE: Bash
CODE:
```
# Adding code rule to Windsurf
dws-add-windsurf-rule
```

----------------------------------------

TITLE: Apply Annotations from Instant JSON
DESCRIPTION: Applies annotations to the document from a specified Instant JSON file.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_49

LANGUAGE: python
CODE:
```
# Apply annotations from Instant JSON file
workflow.apply_action(BuildActions.apply_instant_json('/path/to/annotations.json'))
```

----------------------------------------

TITLE: apply_action Method
DESCRIPTION: Applies a single action to the entire workflow, such as watermarking or OCR.

SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_42

LANGUAGE: APIDOC
CODE:
```
apply_action(action: BuildAction) -> WorkflowWithActionsStage

Applies a single action to the workflow.

Parameters:
  action: The action to apply to the workflow.

Returns:
  The workflow builder instance for method chaining.
```

----------------------------------------

TITLE: Commit Release Preparation Changes
DESCRIPTION: Commits the version and changelog updates, preparing the repository for a new release tag. This commit message follows a conventional commit format.
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_PROCESS.md#_snippet_0 LANGUAGE: Shell CODE: ``` git commit -m "chore: prepare release v1.0.x" ``` ---------------------------------------- TITLE: Git Commands for Feature Branching and Committing DESCRIPTION: Standard Git commands for creating a new feature branch, staging changes, and committing them with a conventional commit message, adhering to project contribution guidelines. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md#_snippet_1 LANGUAGE: bash CODE: ``` git checkout -b feature/your-feature-name ``` LANGUAGE: bash CODE: ``` git add . ``` LANGUAGE: bash CODE: ``` git commit -m "feat: add new feature" ``` ---------------------------------------- TITLE: Output PDF/UA Format DESCRIPTION: Configures the output to be in PDF/UA format, ensuring universal accessibility. Supports metadata, password protection, and optimization. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_57 LANGUAGE: APIDOC CODE: ``` output_pdfua(options?) Sets the output format to PDF/UA (Universal Accessibility). Parameters: - options: dict[str, Any] | None - Additional options for PDF/UA output (optional): - metadata: dict[str, Any] - Document metadata properties like title, author. - labels: list[dict[str, Any]] - Custom labels to add to the document for organization and categorization. - user_password: str - Password required to open the document. When set, the PDF will be encrypted. - owner_password: str - Password required to modify the document. Provides additional security beyond the user password. - user_permissions: list[str] - List of permissions granted to users who open the document with the user password. Options include: "printing", "modification", "content-copying", "annotation", "form-filling", etc. 
- optimize: dict[str, Any] - PDF optimization settings to reduce file size and improve performance. - mrcCompression: bool - When True, applies Mixed Raster Content compression to reduce file size. - imageOptimizationQuality: int - Controls the quality of image optimization (1-5, where 1 is highest quality). Returns: WorkflowWithOutputStage - The workflow builder instance for method chaining. Example: ```python # Set output format to PDF/UA with default options workflow.output_pdfua() # Set output format to PDF/UA with specific options workflow.output_pdfua({ 'metadata': { 'title': 'Accessible Document', 'author': 'Document System' }, 'optimize': { 'mrcCompression': True, 'imageOptimizationQuality': 3 } }) ``` ``` ---------------------------------------- TITLE: apply_actions Method DESCRIPTION: Applies a list of actions to the workflow, enabling batch processing of document modifications. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_43 LANGUAGE: APIDOC CODE: ``` apply_actions(actions: list[BuildAction]) -> WorkflowWithActionsStage Applies multiple actions to the workflow. Parameters: actions: A list of actions to apply to the workflow. Returns: The workflow builder instance for method chaining. ``` ---------------------------------------- TITLE: Output Document as Office File DESCRIPTION: Sets the output format to an Office document format (DOCX, XLSX, PPTX). Allows conversion to Word, Excel, or PowerPoint formats. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_59 LANGUAGE: python CODE: ``` workflow.output_office('docx') workflow.output_office('xlsx') workflow.output_office('pptx') ``` ---------------------------------------- TITLE: Output PDF/A Format DESCRIPTION: Configures the output to be in PDF/A format, suitable for long-term archival. Supports various conformance levels and optimization options. 
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_56 LANGUAGE: APIDOC CODE: ``` output_pdfa(options?) Sets the output format to PDF/A (archival PDF). Parameters: - options: dict[str, Any] | None - Additional options for PDF/A output (optional): - conformance: str - The PDF/A conformance level to target. Options include 'pdfa-1b', 'pdfa-1a', 'pdfa-2b', 'pdfa-2a', 'pdfa-3b', 'pdfa-3a'. Different levels have different requirements for long-term archiving. - vectorization: bool - When True, attempts to convert raster content to vector graphics where possible, improving quality and reducing file size. - rasterization: bool - When True, converts vector graphics to raster images, which can help with compatibility in some cases. - metadata: dict[str, Any] - Document metadata properties like title, author. - labels: list[dict[str, Any]] - Custom labels to add to the document for organization and categorization. - user_password: str - Password required to open the document. When set, the PDF will be encrypted. - owner_password: str - Password required to modify the document. Provides additional security beyond the user password. - user_permissions: list[str] - List of permissions granted to users who open the document with the user password. Options include: "printing", "modification", "content-copying", "annotation", "form-filling", etc. - optimize: dict[str, Any] - PDF optimization settings to reduce file size and improve performance. - mrcCompression: bool - When True, applies Mixed Raster Content compression to reduce file size. - imageOptimizationQuality: int - Controls the quality of image optimization (1-5, where 1 is highest quality). Returns: WorkflowWithOutputStage - The workflow builder instance for method chaining. 
Example: ```python # Set output format to PDF/A with default options workflow.output_pdfa() # Set output format to PDF/A with specific options workflow.output_pdfa({ 'conformance': 'pdfa-2b', 'vectorization': True, 'metadata': { 'title': 'Archive Document', 'author': 'Document System' }, 'optimize': { 'mrcCompression': True } }) ``` ``` ---------------------------------------- TITLE: Implement Error Handling for Nutrient DWS Client in Python DESCRIPTION: Shows how to catch and handle specific exceptions provided by the Nutrient DWS client library, such as authentication, validation, API, timeout, and file processing errors, to build robust applications. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_11 LANGUAGE: python CODE: ``` from nutrient_dws import ( NutrientError, AuthenticationError, APIError, ValidationError, TimeoutError, FileProcessingError ) try: client.convert_to_pdf("document.docx") except AuthenticationError: print("Invalid API key") except ValidationError as e: print(f"Invalid parameters: {e.errors}") except APIError as e: print(f"API error: {e.status_code} - {e.message}") except TimeoutError: print("Request timed out") except FileProcessingError as e: print(f"File processing failed: {e}") ``` ---------------------------------------- TITLE: Nutrient DWS Client Available Operations API Reference DESCRIPTION: Comprehensive reference for the core PDF manipulation, enhancement, and security operations available through the Nutrient DWS client library, including their purpose and general functionality. Also details the Builder API for chaining operations. 
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_14 LANGUAGE: APIDOC CODE: ``` PDF Manipulation Operations: - merge_pdfs: Merge multiple PDFs into one - rotate_pages: Rotate PDF pages (all or specific pages) - flatten_annotations: Flatten form fields and annotations PDF Enhancement Operations: - ocr_pdf: Add searchable text layer (English and German) - watermark_pdf: Add text or image watermarks PDF Security Operations: - apply_redactions: Apply existing redaction annotations Builder API: The Builder API allows chaining multiple operations. Example Usage: client.build(input_file="document.pdf") \ .add_step("rotate-pages", {"degrees": 90}) \ .add_step("ocr-pdf", {"language": "english"}) \ .add_step("watermark-pdf", {"text": "DRAFT", "width": 200, "height": 100}) \ .execute(output_path="processed.pdf") ``` ---------------------------------------- TITLE: Output Document as HTML DESCRIPTION: Sets the output format to HTML. Supports 'page' layout for preserving original structure or 'reflow' for a continuous text flow. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_60 LANGUAGE: python CODE: ``` workflow.output_html('page') ``` ---------------------------------------- TITLE: Output Document as Image DESCRIPTION: Sets the output format to an image format (PNG, JPEG, WEBP). Supports various options for resolution, page selection, and quality. 
SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_58 LANGUAGE: python CODE: ``` workflow.output_image('png', {'dpi': 300}) workflow.output_image('jpeg', { 'dpi': 300, 'pages': {'start': 1, 'end': 3} }) workflow.output_image('webp', { 'width': 1200, 'height': 800, 'dpi': 150 }) ``` ---------------------------------------- TITLE: NutrientClient Direct API Methods DESCRIPTION: Comprehensive documentation for the `NutrientClient` instance methods, covering various document processing functionalities such as conversion, annotation flattening, page rotation, OCR, watermarking, redaction, and merging, with support for implicit Office document conversion. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SUPPORTED_OPERATIONS.md#_snippet_1 LANGUAGE: APIDOC CODE: ``` convert_to_pdf(input_file, output_path=None) - Converts Office documents to PDF format using implicit conversion. - Parameters: - input_file: Office document (DOCX, XLSX, PPTX) - output_path: Optional path to save output - Example: # Convert DOCX to PDF client.convert_to_pdf("document.docx", "document.pdf") # Convert and get bytes pdf_bytes = client.convert_to_pdf("spreadsheet.xlsx") - Note: HTML files are not currently supported. flatten_annotations(input_file, output_path=None) - Flattens all annotations and form fields in a PDF, converting them to static page content. - Parameters: - input_file: PDF or Office document - output_path: Optional path to save output - Example: client.flatten_annotations("document.pdf", "flattened.pdf") # Works with Office docs too! client.flatten_annotations("form.docx", "flattened.pdf") rotate_pages(input_file, output_path=None, degrees=0, page_indexes=None) - Rotates pages in a PDF or converts Office document to PDF and rotates. 
- Parameters: - input_file: PDF or Office document - output_path: Optional output path - degrees: Rotation angle (90, 180, 270, or -90) - page_indexes: Optional list of page indexes to rotate (0-based) - Example: # Rotate all pages 90 degrees client.rotate_pages("document.pdf", "rotated.pdf", degrees=90) # Works with Office documents too! client.rotate_pages("presentation.pptx", "rotated.pdf", degrees=180) # Rotate specific pages client.rotate_pages("document.pdf", "rotated.pdf", degrees=180, page_indexes=[0, 2]) ocr_pdf(input_file, output_path=None, language="english") - Applies OCR to make a PDF searchable. Converts Office documents to PDF first if needed. - Parameters: - input_file: PDF or Office document - output_path: Optional output path - language: OCR language - supported values: "english" or "eng", "deu" or "german" - Example: client.ocr_pdf("scanned.pdf", "searchable.pdf", language="english") # Convert DOCX to searchable PDF client.ocr_pdf("document.docx", "searchable.pdf", language="eng") watermark_pdf(input_file, output_path=None, text=None, image_url=None, width=200, height=100, opacity=1.0, position="center") - Adds a watermark to all pages of a PDF. Converts Office documents to PDF first if needed. - Parameters: - input_file: PDF or Office document - output_path: Optional output path - text: Text for watermark (either text or image_url required) - image_url: URL of image for watermark - width: Width in points (required) - height: Height in points (required) - opacity: Opacity from 0.0 to 1.0 - position: One of: "top-left", "top-center", "top-right", "center", "bottom-left", "bottom-center", "bottom-right" - Example: # Text watermark client.watermark_pdf( "document.pdf", "watermarked.pdf", text="CONFIDENTIAL", width=300, height=150, opacity=0.5, position="center" ) apply_redactions(input_file, output_path=None) - Applies redaction annotations to permanently remove content. Converts Office documents to PDF first if needed. 
- Parameters: - input_file: PDF or Office document with redaction annotations - output_path: Optional output path - Example: client.apply_redactions("document_with_redactions.pdf", "redacted.pdf") merge_pdfs(input_files, output_path=None) - Merges multiple files into one PDF. Automatically converts Office documents to PDF before merging. - Parameters: - input_files: List of files to merge (PDFs and/or Office documents) - output_path: Optional output path - Example: # Merge PDFs only client.merge_pdfs( ["document1.pdf", "document2.pdf", "document3.pdf"], "merged.pdf" ) # Mix PDFs and Office documents - they'll be converted automatically! client.merge_pdfs( ["report.pdf", "spreadsheet.xlsx", "presentation.pptx"], "combined.pdf" ) ``` ---------------------------------------- TITLE: add_new_page Method DESCRIPTION: Adds a new blank page to the workflow, with options for page layout and actions. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_40 LANGUAGE: APIDOC CODE: ``` add_new_page(options: NewPagePartOptions | None = None, actions: list[BuildAction] | None = None) -> WorkflowWithPartsStage Adds a new blank page to the workflow. Parameters: options: Additional options for the new page, such as page size, orientation, etc. (optional) actions: Actions to apply to the new page (optional) Returns: The workflow builder instance for method chaining. ``` ---------------------------------------- TITLE: PSPDFKit Builder API Output Configuration DESCRIPTION: This section describes the methods available for configuring output options when using the PSPDFKit Builder API, such as setting metadata and page labels. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SUPPORTED_OPERATIONS.md#_snippet_9 LANGUAGE: APIDOC CODE: ``` set_output_options() - Description: Configures general output settings for the document, including metadata and optimization. 
- Parameters: - metadata: Dictionary, document metadata (e.g., {"title": "My Document"}). - optimization: Dictionary, optimization settings (details not provided). set_page_labels() - Description: Assigns custom labels to specific page ranges within the output document. - Parameters: - labels: Array of objects, each with "pages" (start, end) and "label" (string). - pages: Object, defines the page range. - start: Integer, starting page index (0-based). - end: Optional, Integer, ending page index (inclusive). - label: String, the label for the page range. ``` ---------------------------------------- TITLE: Chain Multiple Operations using Builder API in Python DESCRIPTION: Demonstrates how to chain multiple PDF processing operations using the Nutrient DWS client's Builder API, allowing for sequential application of transformations like rotation, OCR, and watermarking. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md#_snippet_15 LANGUAGE: python CODE: ``` client.build(input_file="document.pdf") \ .add_step("rotate-pages", {"degrees": 90}) \ .add_step("ocr-pdf", {"language": "english"}) \ .add_step("watermark-pdf", {"text": "DRAFT", "width": 200, "height": 100}) \ .execute(output_path="processed.pdf") ``` ---------------------------------------- TITLE: BuildActions.ocr Method DESCRIPTION: Details the BuildActions.ocr method for creating an OCR action in PSPDFKit workflows. It specifies the language(s) for text extraction from documents. SOURCE: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md#_snippet_37 LANGUAGE: APIDOC CODE: ``` BuildActions.ocr(language: str | list[str]) Creates an OCR (Optical Character Recognition) action to extract text from images or scanned documents. Parameters: language: Language(s) for OCR. Can be a single language or a list of languages. Example: workflow.apply_action(BuildActions.ocr('english')) ```
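----------------------------------------
TITLE: Normalize OCR Language Input (illustrative sketch)
DESCRIPTION: The OCR entries above document the language values "english" or "eng" and "german" or "deu" for `ocr_pdf`, and state that `BuildActions.ocr` accepts a single language or a list. The following is a minimal sketch, not part of the library, of how caller input might be validated before it reaches either call. The helper name `normalize_ocr_languages`, the alias table name, and the choice of the long names as the canonical form are assumptions made for illustration; whether `BuildActions.ocr` accepts exactly the same set of values is not confirmed by the entries above.
LANGUAGE: python
CODE:
```python
# Hypothetical helper (NOT part of nutrient_dws): validate and normalize
# OCR language input before handing it to the client.
# The alias table mirrors the values listed for ocr_pdf:
# "english" or "eng", "german" or "deu".
OCR_LANGUAGE_ALIASES = {
    "english": "english",
    "eng": "english",
    "german": "german",
    "deu": "german",
}


def normalize_ocr_languages(language):
    """Return a de-duplicated list of canonical OCR language names.

    Accepts a single string or a list of strings, mirroring the
    `str | list[str]` parameter documented for BuildActions.ocr.
    Raises ValueError for any value outside the documented set.
    """
    values = [language] if isinstance(language, str) else list(language)
    normalized = []
    for value in values:
        key = value.strip().lower()
        if key not in OCR_LANGUAGE_ALIASES:
            raise ValueError(f"Unsupported OCR language: {value!r}")
        canonical = OCR_LANGUAGE_ALIASES[key]
        # Preserve first-seen order while dropping duplicates.
        if canonical not in normalized:
            normalized.append(canonical)
    return normalized
```

Under these assumptions, the returned list could be passed on as `workflow.apply_action(BuildActions.ocr(normalize_ocr_languages(['eng', 'deu'])))`. Failing early with a `ValueError` keeps an unsupported language from surfacing later as an API-side validation error.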