### Nutrient DWS Client Development Setup

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Instructions for setting up the development environment for the Nutrient DWS Client Python library, including cloning the repository, installing in development mode, and running type checking and linting.

```bash
# Clone the repository
git clone https://github.com/PSPDFKit/nutrient-dws-client-python.git
cd nutrient-dws-client-python

# Install in development mode
pip install -e ".[dev]"

# Run type checking
mypy src/

# Run linting
ruff check src/

# Run formatting
ruff format src/
```

--------------------------------

### Setup Python Development Environment for nutrient-dws

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md

Instructions for forking, cloning, creating a virtual environment, and installing development dependencies for the nutrient-dws Python client library, ensuring a ready-to-develop setup.

```bash
git clone https://github.com/YOUR_USERNAME/nutrient-dws-client-python.git
cd nutrient-dws-client-python
```

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

```bash
pip install -e ".[dev]"
pre-commit install
```

--------------------------------

### Quick Start: Initialize NutrientClient

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Demonstrates the basic initialization of the NutrientClient with an API key. This is the first step to interacting with the Nutrient DWS API.

```Python
from nutrient_dws import NutrientClient

client = NutrientClient(api_key='your_api_key')
```

--------------------------------

### Direct Methods: Document Processing Examples

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Provides examples of using direct async methods of the NutrientClient for common document processing tasks such as conversion, text extraction, watermarking, and merging.

```Python
import asyncio
from nutrient_dws import NutrientClient

async def main():
    client = NutrientClient(api_key='your_api_key')

    # Convert a document
    pdf_result = await client.convert('document.docx', 'pdf')

    # Extract text
    text_result = await client.extract_text('document.pdf')

    # Add a watermark
    watermarked_doc = await client.watermark_text('document.pdf', 'CONFIDENTIAL')

    # Merge multiple documents
    merged_pdf = await client.merge(['doc1.pdf', 'doc2.pdf', 'doc3.pdf'])

asyncio.run(main())
```

--------------------------------

### Set Up Nutrient DWS Client for Development

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Provides instructions for setting up the Nutrient DWS client project for local development, including cloning the repository, installing dependencies in development mode, and running initial checks.

```bash
# Clone the repository
git clone https://github.com/jdrhyne/nutrient-dws-client-python.git
cd nutrient-dws-client-python

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .

# Run type checking
mypy src tests
```

--------------------------------

### Install Nutrient DWS Python Client

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Installs the Nutrient DWS Python client library using pip. This is the primary method for adding the library to your Python environment.

```Bash
pip install nutrient-dws
```

--------------------------------

### Implementing Multi-Step Workflows with Builder API in Python

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md

This example illustrates the use of the fluent Builder API in the `nutrient-dws-client-python` library. It demonstrates how to chain multiple processing steps, such as converting a DOCX to PDF and rotating it, into a single API call. The example also includes basic error handling for `APIError`.

```python
from nutrient_dws import APIError

# User Story: Convert a DOCX to PDF and rotate it (Builder version)
try:
    client.build(input_file="path/to/document.docx") \
        .add_step(tool="rotate-pages", options={"degrees": 90}) \
        .execute(output_path="path/to/final_document.pdf")

    print("Workflow complete. File saved to path/to/final_document.pdf")

except APIError as e:
    print(f"An API error occurred: Status {e.status_code}, Response: {e.response_body}")
```

--------------------------------

### Example Python Unit Test with Pytest

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md

A basic example of a unit test function written in Python using the pytest framework, demonstrating how to instantiate a client and assert expected behavior for new features.

```python
def test_new_feature():
    """Test description."""
    client = NutrientClient(api_key="test-key")
    result = client.new_feature()
    assert result == expected_value
```

--------------------------------

### Install Nutrient DWS Python Client

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_NOTES.md

Installs the Nutrient Document Web Services (DWS) Python client library using pip, the standard package installer for Python. This command fetches the latest version of the library from PyPI.

```bash
pip install nutrient-dws
```

--------------------------------

### Verify PyPI Package Installation

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_PROCESS.md

Installs a specific version of the `nutrient-dws` package from PyPI using pip. This command is used to verify that the package has been successfully published and is installable.

```Shell
pip install nutrient-dws==1.0.x
```

--------------------------------

### Quick Start with Nutrient DWS Python Client

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_NOTES.md

Demonstrates how to initialize the Nutrient DWS client and perform various document processing operations. It showcases both the Direct API for single operations and the Builder API for complex, multi-step workflows, including automatic Office document conversion and merging of different document types.

```python
from nutrient_dws import NutrientClient

# Initialize client
client = NutrientClient(api_key="your-api-key")

# Direct API - Single operation
client.rotate_pages("document.pdf", output_path="rotated.pdf", degrees=90)

# Convert Office document to PDF (automatic!)
client.convert_to_pdf("report.docx", output_path="report.pdf")

# Builder API - Complex workflow
client.build(input_file="scan.pdf") \
    .add_step("ocr-pdf", {"language": "english"}) \
    .add_step("watermark-pdf", {"text": "CONFIDENTIAL"}) \
    .add_step("flatten-annotations") \
    .execute(output_path="processed.pdf")

# Merge PDFs and Office documents together
client.merge_pdfs([
    "chapter1.pdf",
    "chapter2.docx",
    "appendix.xlsx"
], output_path="complete_document.pdf")
```

--------------------------------

### Displaying Repository Metrics with README Badges (Markdown)

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/GITHUB_ABOUT.md

This markdown snippet provides examples of common badges used in GitHub README files to display project information such as Python version, test coverage, license, and PyPI package version. These badges help convey key project stats at a glance and improve repository visibility.

```markdown
![Python](https://img.shields.io/badge/python-3.8+-blue.svg)
![Coverage](https://img.shields.io/badge/coverage-92%25-brightgreen.svg)
![License](https://img.shields.io/badge/license-MIT-green.svg)
![PyPI](https://img.shields.io/pypi/v/nutrient-dws.svg)
```

--------------------------------

### Complex Multi-step Workflow

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

A complex workflow example that adds specific pages from a PDF, merges another PDF, applies OCR, adds a watermark, creates redactions, and outputs the result in PDF/A format with optimization. Includes a progress callback.

```python
def progress_callback(current: int, total: int) -> None:
    print(f'Processing step {current} of {total}')

result = await (client
    .workflow()
    .add_file_part('document.pdf', {'pages': {'start': 0, 'end': 5}})
    .add_file_part('appendix.pdf')
    .apply_actions([
        BuildActions.ocr({'language': 'english'}),
        BuildActions.watermark_text('CONFIDENTIAL'),
        BuildActions.create_redactions_preset('email-address', 'apply')
    ])
    .output_pdfa({
        'level': 'pdfa-2b',
        'optimize': {
            'mrcCompression': True
        }
    })
    .execute(on_progress=progress_callback))
```

--------------------------------

### Using Direct API for Document Conversion and Rotation in Python

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md

This example demonstrates the direct API design pattern for the `nutrient-dws-client-python` library. It shows how to convert a DOCX file to PDF and then rotate the pages of the resulting PDF using separate, direct method calls. It highlights the use of keyword-only arguments for tool-specific parameters and handling of in-memory byte streams.

```python
# User Story: Convert a DOCX to PDF and rotate it.

# Step 1: Convert DOCX to PDF
pdf_bytes = client.convert_to_pdf(
    input_file="path/to/document.docx"
)

# Step 2: Rotate the newly created PDF from memory
client.rotate_pages(
    input_file=pdf_bytes,
    output_path="path/to/rotated_document.pdf",
    degrees=90  # keyword-only argument
)

print("File saved to path/to/rotated_document.pdf")
```

--------------------------------

### Nutrient DWS Python Client Builder API Reference

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md

This section describes the fluent interface of the Builder API for multi-step document processing workflows. It outlines the core methods used to construct and execute a workflow, including starting a build, adding processing steps, executing the workflow, and setting output options.

```APIDOC
client.build(input_file)
  - Starts a multi-step workflow.
  - Parameters:
    - input_file: The initial input file for the workflow (str, Path, bytes, or file-like object).
  - Returns: A builder object for chaining.

.add_step(tool, options=None)
  - Adds a processing step to the current workflow.
  - Parameters:
    - tool: The name of the processing tool (e.g., "rotate-pages").
    - options: Optional dictionary of tool-specific parameters.
  - Returns: The builder object for chaining.

.execute(output_path=None)
  - Executes the defined workflow.
  - Parameters:
    - output_path: Optional path to save the final output file. If not provided, returns bytes.
  - Returns: Bytes of the processed file if output_path is None, otherwise None.

.set_output_options(**options)
  - Sets global output metadata or optimization options for the workflow.
  - Parameters:
    - options: Keyword arguments for output settings.
  - Returns: The builder object for chaining.
```

--------------------------------

### Basic Document Conversion (DOCX to PDF)

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Converts a DOCX document to PDF format using the workflow client.
This is a fundamental example of adding a file part and specifying the output format.

```python
result = await (client
    .workflow()
    .add_file_part('document.docx')
    .output_pdf()
    .execute())
```

--------------------------------

### Builder API: Build Complex Document Processing Pipeline

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Illustrates the use of the Builder API to construct a complex, chained document processing workflow. This example combines OCR, page rotation, watermarking, annotation flattening, and output options like metadata and optimization into a single execution.

```python
# Complex document processing pipeline
result = client.build(input_file="raw-scan.pdf") \
    .add_step("ocr-pdf", {"language": "en"}) \
    .add_step("rotate-pages", {"degrees": -90, "page_indexes": [0]}) \
    .add_step("watermark-pdf", {
        "text": "PROCESSED",
        "opacity": 0.3,
        "position": "top-right"
    }) \
    .add_step("flatten-annotations") \
    .set_output_options(
        metadata={"title": "Processed Document", "author": "DWS Client"},
        optimize=True
    ) \
    .execute(output_path="final.pdf")
```

--------------------------------

### Nutrient DWS Python Client Error Handling

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Demonstrates how to handle various exceptions raised by the Nutrient DWS Python client, including validation, authentication, API, and network errors. Provides examples for accessing error messages and details.

```python
from nutrient_dws import (
    NutrientError,
    ValidationError,
    APIError,
    AuthenticationError,
    NetworkError
)

try:
    result = await client.convert('file.docx', 'pdf')
except ValidationError as error:
    # Invalid input parameters
    print(f'Invalid input: {error.message} - Details: {error.details}')
except AuthenticationError as error:
    # Authentication failed
    print(f'Auth error: {error.message} - Status: {error.status_code}')
except APIError as error:
    # API returned an error
    print(f'API error: {error.message} - Status: {error.status_code} - Details: {error.details}')
except NetworkError as error:
    # Network request failed
    print(f'Network error: {error.message} - Details: {error.details}')
```

--------------------------------

### Get Account Information

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Retrieves account information associated with the current API key. The returned data includes subscription details.

```python
account_info = await client.get_account_info()

# Access subscription information
print(account_info['subscriptionType'])
```

--------------------------------

### Configure Output Options and Page Labels with Python Builder API

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SUPPORTED_OPERATIONS.md

This example demonstrates how to use the Nutrient DWS Python client's Builder API to set document metadata and page labels during the document processing workflow. It shows how to combine a processing step with output configuration.

```python
client.build("document.pdf") \
    .add_step("rotate-pages", {"degrees": 90}) \
    .set_output_options(metadata={"title": "My Document"}) \
    .set_page_labels([{"pages": {"start": 0}, "label": "Chapter 1"}]) \
    .execute("output.pdf")
```

--------------------------------

### Error Handling in Workflows

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Provides a robust example of error handling for workflow execution. It includes a try-except block to catch unexpected errors and iterates through specific workflow errors reported in the result.

```python
try:
    result = await (client
        .workflow()
        .add_file_part('document.pdf')
        .output_pdf()
        .execute())

    if not result['success']:
        # Handle workflow errors
        for error in result.get('errors', []):
            print(f"Step {error['step']}: {error['error']['message']}")
except Exception as error:
    # Handle unexpected errors
    print(f'Workflow execution failed: {error}')
```

--------------------------------

### Set Page Labels and Rotate Pages with Python Builder API

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SUPPORTED_OPERATIONS.md

This example demonstrates how to use the Nutrient DWS Python client's Builder API to rotate pages and set custom page labels for a PDF document. It initializes a build process, adds a rotation step, defines page label ranges, and executes the operation to an output file.

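The label ranges passed to `set_page_labels` in the example that follows can also be generated from a list of chapter start pages. A minimal sketch, assuming `end` is inclusive (as the 0-2 / 3 split in that example suggests); `label_ranges` is a hypothetical helper written for illustration and is not part of `nutrient_dws`.

```python
from __future__ import annotations

from typing import Any


def label_ranges(chapters: list[tuple[int, str]]) -> list[dict[str, Any]]:
    """Hypothetical helper: turn (start_page, label) pairs into label ranges."""
    ordered = sorted(chapters)
    ranges: list[dict[str, Any]] = []
    for i, (start, label) in enumerate(ordered):
        pages = {"start": start}
        if i + 1 < len(ordered):
            # Assumption: `end` is inclusive, so stop one page before the next start.
            pages["end"] = ordered[i + 1][0] - 1
        ranges.append({"pages": pages, "label": label})
    return ranges


labels = label_ranges([(0, "Introduction"), (3, "Content")])
print(labels)
```

The last range is left open-ended (no `end` key), matching the shape of the final entry in the example.
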
```python
client.build(input_file="document.pdf") \
    .add_step("rotate-pages", {"degrees": 90}) \
    .set_page_labels([
        {"pages": {"start": 0, "end": 2}, "label": "Introduction"},
        {"pages": {"start": 3}, "label": "Content"}
    ]) \
    .execute(output_path="labeled_document.pdf")
```

--------------------------------

### NutrientClient Constructor Parameters

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Details the parameters for initializing the NutrientClient, including the API key, optional base URL, and request timeout.

```APIDOC
NutrientClient(api_key: str | Callable[[], Awaitable[str] | str], base_url: str | None = None, timeout: int | None = None)

Parameters:
- `api_key` (required): Your API key string or async function returning a token
- `base_url` (optional): Custom API base URL (defaults to `https://api.nutrient.io`)
- `timeout` (optional): Request timeout in milliseconds
```

--------------------------------

### Run Tests for Nutrient DWS Client

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Details how to execute tests for the Nutrient DWS client project, covering running all tests, generating coverage reports, and executing specific test files.

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=nutrient --cov-report=html

# Run specific test file
pytest tests/unit/test_client.py
```

--------------------------------

### NutrientClient Initialization with API Key

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Initializes the NutrientClient by providing your API key directly. This is the most straightforward way to authenticate with the Nutrient DWS service.

```python
from nutrient_dws import NutrientClient

client = NutrientClient(api_key='your_api_key')
```

--------------------------------

### Running Nutrient DWS Client Tests

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Provides commands for running tests using pytest, including running all tests, generating coverage reports, and executing specific test suites (unit and integration).

```bash
# Run all tests
python -m pytest

# Run with coverage report
python -m pytest --cov=nutrient_dws --cov-report=html

# Run only unit tests
python -m pytest tests/unit/

# Run integration tests (requires API key)
NUTRIENT_API_KEY=your_key python -m pytest tests/test_integration.py
```

--------------------------------

### Create Nutrient DWS Python Client Workflow

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Illustrates different methods for creating a document processing workflow using the Nutrient DWS Python Client. Shows how to initialize a workflow from a client instance, override timeouts, and create a workflow independently with an API key.

```python
# Creating Workflow from a client
workflow = client.workflow()

# Override the client timeout
workflow = client.workflow(60000)

# Create a workflow without a client
from nutrient_dws.builder.builder import StagedWorkflowBuilder
workflow = StagedWorkflowBuilder({
    'apiKey': 'your-api-key'
})
```

--------------------------------

### Initialize Nutrient DWS Python Client

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md

Demonstrates how to initialize the `NutrientClient` class, providing an API key and an optional timeout. It also shows support for context managers for automatic resource management. The `AuthenticationError` is raised if the API key is invalid.

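The initialization code that follows notes that an explicit `api_key` takes precedence over the `NUTRIENT_API_KEY` environment variable. A minimal sketch of that lookup order, using only the standard library; `resolve_api_key` is a hypothetical helper for illustration and does not claim to reproduce the library's internals.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: explicit key first, then NUTRIENT_API_KEY."""
    key = api_key or os.environ.get("NUTRIENT_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set NUTRIENT_API_KEY")
    return key


os.environ["NUTRIENT_API_KEY"] = "key-from-env"  # demo value only
print(resolve_api_key("key-from-arg"))  # explicit argument wins
print(resolve_api_key())                # falls back to the environment
```
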
```python
from nutrient_dws import NutrientClient, AuthenticationError

# API key from parameter (takes precedence) or NUTRIENT_API_KEY env var
client = NutrientClient(api_key="YOUR_DWS_API_KEY", timeout=300)

# Context manager support
with NutrientClient() as client:
    result = client.convert_to_pdf("document.docx")
```

--------------------------------

### Workflow System: Document Processing Pipeline

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Illustrates how to use the fluent builder pattern (workflow system) to construct and execute complex document processing pipelines. It shows adding files, applying actions like watermarking, and setting output options.

```Python
import asyncio

from nutrient_dws import NutrientClient
from nutrient_dws.builder.constant import BuildActions

async def main():
    client = NutrientClient(api_key='your_api_key')

    result = await (client
        .workflow()
        .add_file_part('document.pdf')
        .add_file_part('appendix.pdf')
        .apply_action(BuildActions.watermark_text('CONFIDENTIAL', {
            'opacity': 0.5,
            'fontSize': 48
        }))
        .output_pdf({
            'optimize': {
                'mrcCompression': True,
                'imageOptimizationQuality': 2
            }
        })
        .execute())

asyncio.run(main())
```

--------------------------------

### Nutrient DWS Python Client API Reference

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SPECIFICATION.md

Comprehensive reference for the `NutrientClient` class and its associated APIs, including initialization, Direct API methods for single operations, and the Builder API for constructing multi-step workflows. It outlines the core components and their interactions with the Nutrient DWS API endpoints.

```APIDOC
NutrientClient Class:
  __init__(api_key: str, timeout: int = 300)
    - Description: Initializes the Nutrient DWS client for API interactions.
    - Parameters:
      - api_key (str): Your DWS API key. This parameter takes precedence over the NUTRIENT_API_KEY environment variable.
      - timeout (int): The maximum time in seconds to wait for API calls to complete (default: 300).
    - Error Handling: Raises `AuthenticationError` if the provided API key is invalid upon the first API call.

  Context Manager Support:
    - Description: The `NutrientClient` supports Python's context manager protocol, ensuring proper resource cleanup.
    - Usage:
        with NutrientClient() as client:
            # Perform API operations within this block
            result = client.convert_to_pdf("document.docx")

  Direct API Methods (on NutrientClient):
    - Description: A collection of static methods directly accessible on the `NutrientClient` object, each corresponding to a specific document processing tool. These methods abstract the `POST /process/{tool}` endpoint.
    - Signature: `client.tool_name(input_file: Union[str, bytes, IO], ..., output_path: Optional[str] = None)`
      - `input_file`: The input document, specified as a file path, bytes, or a file-like object.
      - `output_path` (Optional[str]): If provided, the processed file will be saved to this path. If `None`, the processed file's bytes are returned.
    - Example Methods:
      - `client.rotate_pages(input_file: str, degrees: int, output_path: Optional[str] = None)`
        - Description: Rotates pages of an input document by a specified degree.
        - Parameters:
          - `input_file` (str): Path to the input document (e.g., 'path/to/doc.pdf').
          - `degrees` (int): The rotation angle (e.g., 90, 180, 270).
      - `client.convert_to_pdf(input_file: str, output_path: Optional[str] = None)`
        - Description: Converts an input document to PDF format.
        - Parameters:
          - `input_file` (str): Path to the input document (e.g., 'document.docx').

  Builder API (BuildAPIWrapper):
    - Description: A separate class, instantiated via `client.build()`, providing a fluent, chainable interface for composing and executing complex, multi-step document processing workflows. It abstracts the `POST /build` endpoint.
    - Instantiation: `client.build(input_file: Union[str, bytes, IO]) -> BuildAPIWrapper`
      - Parameters:
        - `input_file`: The initial input document for the workflow.
    - Methods on `BuildAPIWrapper`:
      - `add_step(tool: str, options: Dict) -> BuildAPIWrapper`
        - Description: Adds a processing step to the current workflow chain.
        - Parameters:
          - `tool` (str): The name of the tool to apply (e.g., 'rotate-pages', 'ocr-pdf').
          - `options` (Dict): A dictionary containing options specific to the chosen tool.
      - `execute(output_path: Optional[str] = None) -> Union[bytes, None]`
        - Description: Compiles the chained workflow into a `multipart/form-data` request and sends it to the `/build` endpoint.
        - Parameters:
          - `output_path` (Optional[str]): Path to save the processed file. If `None`, the processed file's bytes are returned.
        - Returns: Processed file bytes if `output_path` is `None`, otherwise `None`.
    - Usage Example:
        client.build(input_file='doc.docx')\
            .add_step(tool='rotate-pages', options={'degrees': 90})\
            .add_step(tool='ocr-pdf', options={})\
            .execute(output_path='processed_doc.pdf')
```

--------------------------------

### NutrientClient Initialization with Async Token Provider

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Initializes the NutrientClient using an asynchronous token provider function. This is useful for fetching tokens securely from a remote source.

```python
import httpx
from nutrient_dws import NutrientClient

async def get_token():
    async with httpx.AsyncClient() as http_client:
        # httpx needs an absolute URL here unless the client is created with base_url
        response = await http_client.get('/api/get-nutrient-token')
        data = response.json()
        return data['token']

client = NutrientClient(api_key=get_token)
```

--------------------------------

### Handle Various File Input Types in Python

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Illustrates different methods for providing input files to the Nutrient DWS client, including file paths (string or Path object), raw bytes, file-like objects, and URLs for supported operations.

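The input kinds listed here (path string, `Path`, bytes, file-like object) can be thought of as all reducing to raw bytes before upload. The sketch below shows one way to do that normalization with the standard library; `read_input` is a hypothetical helper for illustration and makes no claim about how `nutrient_dws` handles inputs internally.

```python
import io
import tempfile
from pathlib import Path
from typing import BinaryIO, Union

FileInput = Union[str, Path, bytes, BinaryIO]


def read_input(file_input: FileInput) -> bytes:
    """Hypothetical helper: reduce any supported input kind to raw bytes."""
    if isinstance(file_input, bytes):
        return file_input
    if isinstance(file_input, (str, Path)):
        return Path(file_input).read_bytes()
    if hasattr(file_input, "read"):
        return file_input.read()
    raise TypeError(f"Unsupported input type: {type(file_input).__name__}")


# Demo with a temporary file standing in for a real document.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "document.docx"
    path.write_bytes(b"demo")
    results = [
        read_input(str(path)),           # path as string
        read_input(path),                # Path object
        read_input(b"demo"),             # raw bytes
        read_input(io.BytesIO(b"demo")), # file-like object
    ]
print(results)
```
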
```python
from pathlib import Path

# File path (string or Path object)
client.convert_to_pdf("document.docx")
client.convert_to_pdf(Path("document.docx"))

# Bytes
with open("document.docx", "rb") as f:
    file_bytes = f.read()
client.convert_to_pdf(file_bytes)

# File-like object
with open("document.docx", "rb") as f:
    client.convert_to_pdf(f)

# URL (for supported operations)
client.import_from_url("https://example.com/document.pdf")
```

--------------------------------

### Apply Actions to Workflow

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Shows how to apply single or multiple actions to a workflow using the Nutrient DWS Python client. Supports actions like watermarking and OCR, with options for customization.

```python
workflow.apply_action(BuildActions.watermark_text('CONFIDENTIAL', {
    'opacity': 0.5,
    'fontSize': 48
}))
```

```python
workflow.apply_action(
    BuildActions.watermark_text('CONFIDENTIAL', {
        'opacity': 0.3,
        'rotation': 45
    })
)
workflow.apply_action(BuildActions.ocr('english'))
```

```python
workflow.apply_actions([
    BuildActions.watermark_text('DRAFT', {'opacity': 0.5}),
    BuildActions.ocr('english'),
    BuildActions.flatten()
])
```

--------------------------------

### Create and Push Git Release Tag

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_PROCESS.md

Creates a new Git tag for the specified release version and immediately pushes it to the remote origin. This action typically triggers automated release workflows.

```Shell
git tag v1.0.x && git push origin v1.0.x
```

--------------------------------

### Staged Workflow Builder

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Demonstrates building a workflow dynamically using a staged approach. Allows for conditional addition of file parts and application of actions, and setting the output format based on variables.

```python
# Create a staged workflow
workflow = client.workflow()

# Add parts
workflow.add_file_part('document.pdf')

# Conditionally add more parts
if include_appendix:
    workflow.add_file_part('appendix.pdf')

# Conditionally apply actions
if needs_watermark:
    workflow.apply_action(BuildActions.watermark_text('CONFIDENTIAL'))

# Set output format based on user preference
if output_format == 'pdf':
    workflow.output_pdf()
elif output_format == 'docx':
    workflow.output_office('docx')
else:
    workflow.output_image('png')

# Execute the workflow
result = await workflow.execute()
```

--------------------------------

### Git Commands for Tagging and Pushing Releases

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md

Commands used to tag a new version release in Git and push the tags to the remote repository as part of the official release process.

```bash
git tag v0.1.0
```

```bash
git push --tags
```

--------------------------------

### Run Tests, Type Checking, and Linting for Python Project

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md

Commands to execute unit tests, perform static type analysis, and run linting checks during the development workflow to ensure code quality and adherence to coding standards.

```bash
pytest
```

```bash
mypy src/
```

```bash
ruff check src/ tests/
```

--------------------------------

### Integrate with Coding Agents

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Commands to add code rules for various coding agents to help them understand and utilize the Nutrient DWS Python Client library effectively. This prevents hallucination and ensures proper feature usage.

```Bash
# Adding code rule to Claude Code
dws-add-claude-code-rule
```

```Bash
# Adding code rule to GitHub Copilot
dws-add-github-copilot-rule
```

```Bash
# Adding code rule to Junie (Jetbrains)
dws-add-junie-rule
```

```Bash
# Adding code rule to Cursor
dws-add-cursor-rule
```

```Bash
# Adding code rule to Windsurf
dws-add-windsurf-rule
```

--------------------------------

### Apply Annotations from Instant JSON

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Applies annotations to the document from a specified Instant JSON file.

```python
# Apply annotations from Instant JSON file
workflow.apply_action(BuildActions.apply_instant_json('/path/to/annotations.json'))
```

--------------------------------

### apply_action Method

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Applies a single action to the entire workflow, such as watermarking or OCR.

```APIDOC
apply_action(action: BuildAction) -> WorkflowWithActionsStage

Applies a single action to the workflow.

Parameters:
  action: The action to apply to the workflow.

Returns:
  The workflow builder instance for method chaining.
```

--------------------------------

### Commit Release Preparation Changes

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/RELEASE_PROCESS.md

Commits the version and changelog updates, preparing the repository for a new release tag. This commit message follows a conventional commit format.

```Shell
git commit -m "chore: prepare release v1.0.x"
```

--------------------------------

### Git Commands for Feature Branching and Committing

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/CONTRIBUTING.md

Standard Git commands for creating a new feature branch, staging changes, and committing them with a conventional commit message, adhering to project contribution guidelines.

```bash
git checkout -b feature/your-feature-name
```

```bash
git add .
```

```bash
git commit -m "feat: add new feature"
```

--------------------------------

### Output PDF/UA Format

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Configures the output to be in PDF/UA format, ensuring universal accessibility. Supports metadata, password protection, and optimization.

```APIDOC
output_pdfua(options?)

Sets the output format to PDF/UA (Universal Accessibility).

Parameters:
- options: dict[str, Any] | None - Additional options for PDF/UA output (optional):
  - metadata: dict[str, Any] - Document metadata properties like title, author.
  - labels: list[dict[str, Any]] - Custom labels to add to the document for organization and categorization.
  - user_password: str - Password required to open the document. When set, the PDF will be encrypted.
  - owner_password: str - Password required to modify the document. Provides additional security beyond the user password.
  - user_permissions: list[str] - List of permissions granted to users who open the document with the user password. Options include: "printing", "modification", "content-copying", "annotation", "form-filling", etc.
  - optimize: dict[str, Any] - PDF optimization settings to reduce file size and improve performance.
    - mrcCompression: bool - When True, applies Mixed Raster Content compression to reduce file size.
    - imageOptimizationQuality: int - Controls the quality of image optimization (1-5, where 1 is highest quality).

Returns: WorkflowWithOutputStage - The workflow builder instance for method chaining.


Example:
```python
# Set output format to PDF/UA with default options
workflow.output_pdfua()

# Set output format to PDF/UA with specific options
workflow.output_pdfua({
    'metadata': {
        'title': 'Accessible Document',
        'author': 'Document System'
    },
    'optimize': {
        'mrcCompression': True,
        'imageOptimizationQuality': 3
    }
})
```
```

--------------------------------

### apply_actions Method

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Applies a list of actions to the workflow, enabling batch processing of document modifications.

```APIDOC
apply_actions(actions: list[BuildAction]) -> WorkflowWithActionsStage

Applies multiple actions to the workflow.

Parameters:
  actions: A list of actions to apply to the workflow.

Returns:
  The workflow builder instance for method chaining.
```

--------------------------------

### Output Document as Office File

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Sets the output format to an Office document format (DOCX, XLSX, PPTX). Allows conversion to Word, Excel, or PowerPoint formats.

```python
workflow.output_office('docx')
workflow.output_office('xlsx')
workflow.output_office('pptx')
```

--------------------------------

### Output PDF/A Format

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Configures the output to be in PDF/A format, suitable for long-term archival. Supports various conformance levels and optimization options.

```APIDOC
output_pdfa(options?)

Sets the output format to PDF/A (archival PDF).

Parameters:
- options: dict[str, Any] | None - Additional options for PDF/A output (optional):
  - conformance: str - The PDF/A conformance level to target. Options include 'pdfa-1b', 'pdfa-1a', 'pdfa-2b', 'pdfa-2a', 'pdfa-3b', 'pdfa-3a'. Different levels have different requirements for long-term archiving.
  - vectorization: bool - When True, attempts to convert raster content to vector graphics where possible, improving quality and reducing file size.
  - rasterization: bool - When True, converts vector graphics to raster images, which can help with compatibility in some cases.
  - metadata: dict[str, Any] - Document metadata properties like title, author.
  - labels: list[dict[str, Any]] - Custom labels to add to the document for organization and categorization.
  - user_password: str - Password required to open the document. When set, the PDF will be encrypted.
  - owner_password: str - Password required to modify the document. Provides additional security beyond the user password.
  - user_permissions: list[str] - List of permissions granted to users who open the document with the user password. Options include: "printing", "modification", "content-copying", "annotation", "form-filling", etc.
  - optimize: dict[str, Any] - PDF optimization settings to reduce file size and improve performance.
    - mrcCompression: bool - When True, applies Mixed Raster Content compression to reduce file size.
    - imageOptimizationQuality: int - Controls the quality of image optimization (1-5, where 1 is highest quality).

Returns:
  WorkflowWithOutputStage - The workflow builder instance for method chaining.

Example:
```python
# Set output format to PDF/A with default options
workflow.output_pdfa()

# Set output format to PDF/A with specific options
workflow.output_pdfa({
    'conformance': 'pdfa-2b',
    'vectorization': True,
    'metadata': {
        'title': 'Archive Document',
        'author': 'Document System'
    },
    'optimize': {
        'mrcCompression': True
    }
})
```
```

--------------------------------

### Implement Error Handling for Nutrient DWS Client in Python

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Shows how to catch and handle specific exceptions provided by the Nutrient DWS client library, such as authentication, validation, API, timeout, and file processing errors, to build robust applications.

```python
from nutrient_dws import (
    NutrientClient,
    NutrientError,
    AuthenticationError,
    APIError,
    ValidationError,
    TimeoutError,  # Note: shadows Python's built-in TimeoutError in this module
    FileProcessingError
)

# Client initialized as in the Quick Start example
client = NutrientClient(api_key='your_api_key')

try:
    client.convert_to_pdf("document.docx")
except AuthenticationError:
    print("Invalid API key")
except ValidationError as e:
    print(f"Invalid parameters: {e.errors}")
except APIError as e:
    print(f"API error: {e.status_code} - {e.message}")
except TimeoutError:
    print("Request timed out")
except FileProcessingError as e:
    print(f"File processing failed: {e}")
```

--------------------------------

### Nutrient DWS Client Available Operations API Reference

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Comprehensive reference for the core PDF manipulation, enhancement, and security operations available through the Nutrient DWS client library, including their purpose and general functionality. Also details the Builder API for chaining operations.
```APIDOC
PDF Manipulation Operations:
- merge_pdfs: Merge multiple PDFs into one
- rotate_pages: Rotate PDF pages (all or specific pages)
- flatten_annotations: Flatten form fields and annotations

PDF Enhancement Operations:
- ocr_pdf: Add searchable text layer (English and German)
- watermark_pdf: Add text or image watermarks

PDF Security Operations:
- apply_redactions: Apply existing redaction annotations

Builder API:
  The Builder API allows chaining multiple operations.
  Example Usage:
    client.build(input_file="document.pdf") \
      .add_step("rotate-pages", {"degrees": 90}) \
      .add_step("ocr-pdf", {"language": "english"}) \
      .add_step("watermark-pdf", {"text": "DRAFT", "width": 200, "height": 100}) \
      .execute(output_path="processed.pdf")
```

--------------------------------

### Output Document as HTML

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Sets the output format to HTML. Supports 'page' layout for preserving original structure or 'reflow' for a continuous text flow.

```python
workflow.output_html('page')
```

--------------------------------

### Output Document as Image

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Sets the output format to an image format (PNG, JPEG, WEBP). Supports various options for resolution, page selection, and quality.
```python
workflow.output_image('png', {'dpi': 300})

workflow.output_image('jpeg', {
    'dpi': 300,
    'pages': {'start': 1, 'end': 3}
})

workflow.output_image('webp', {
    'width': 1200,
    'height': 800,
    'dpi': 150
})
```

--------------------------------

### NutrientClient Direct API Methods

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SUPPORTED_OPERATIONS.md

Comprehensive documentation for the `NutrientClient` instance methods, covering various document processing functionalities such as conversion, annotation flattening, page rotation, OCR, watermarking, redaction, and merging, with support for implicit Office document conversion.

```APIDOC
convert_to_pdf(input_file, output_path=None)
  - Converts Office documents to PDF format using implicit conversion.
  - Parameters:
    - input_file: Office document (DOCX, XLSX, PPTX)
    - output_path: Optional path to save output
  - Example:
    # Convert DOCX to PDF
    client.convert_to_pdf("document.docx", "document.pdf")

    # Convert and get bytes
    pdf_bytes = client.convert_to_pdf("spreadsheet.xlsx")
  - Note: HTML files are not currently supported.

flatten_annotations(input_file, output_path=None)
  - Flattens all annotations and form fields in a PDF, converting them to static page content.
  - Parameters:
    - input_file: PDF or Office document
    - output_path: Optional path to save output
  - Example:
    client.flatten_annotations("document.pdf", "flattened.pdf")

    # Works with Office docs too!
    client.flatten_annotations("form.docx", "flattened.pdf")

rotate_pages(input_file, output_path=None, degrees=0, page_indexes=None)
  - Rotates pages in a PDF or converts Office document to PDF and rotates.
  - Parameters:
    - input_file: PDF or Office document
    - output_path: Optional output path
    - degrees: Rotation angle (90, 180, 270, or -90)
    - page_indexes: Optional list of page indexes to rotate (0-based)
  - Example:
    # Rotate all pages 90 degrees
    client.rotate_pages("document.pdf", "rotated.pdf", degrees=90)

    # Works with Office documents too!
    client.rotate_pages("presentation.pptx", "rotated.pdf", degrees=180)

    # Rotate specific pages
    client.rotate_pages("document.pdf", "rotated.pdf", degrees=180, page_indexes=[0, 2])

ocr_pdf(input_file, output_path=None, language="english")
  - Applies OCR to make a PDF searchable. Converts Office documents to PDF first if needed.
  - Parameters:
    - input_file: PDF or Office document
    - output_path: Optional output path
    - language: OCR language - supported values: "english" or "eng", "deu" or "german"
  - Example:
    client.ocr_pdf("scanned.pdf", "searchable.pdf", language="english")

    # Convert DOCX to searchable PDF
    client.ocr_pdf("document.docx", "searchable.pdf", language="eng")

watermark_pdf(input_file, output_path=None, text=None, image_url=None, width=200, height=100, opacity=1.0, position="center")
  - Adds a watermark to all pages of a PDF. Converts Office documents to PDF first if needed.
  - Parameters:
    - input_file: PDF or Office document
    - output_path: Optional output path
    - text: Text for watermark (either text or image_url required)
    - image_url: URL of image for watermark
    - width: Width in points (required)
    - height: Height in points (required)
    - opacity: Opacity from 0.0 to 1.0
    - position: One of: "top-left", "top-center", "top-right", "center", "bottom-left", "bottom-center", "bottom-right"
  - Example:
    # Text watermark
    client.watermark_pdf(
        "document.pdf",
        "watermarked.pdf",
        text="CONFIDENTIAL",
        width=300,
        height=150,
        opacity=0.5,
        position="center"
    )

apply_redactions(input_file, output_path=None)
  - Applies redaction annotations to permanently remove content. Converts Office documents to PDF first if needed.
  - Parameters:
    - input_file: PDF or Office document with redaction annotations
    - output_path: Optional output path
  - Example:
    client.apply_redactions("document_with_redactions.pdf", "redacted.pdf")

merge_pdfs(input_files, output_path=None)
  - Merges multiple files into one PDF. Automatically converts Office documents to PDF before merging.
  - Parameters:
    - input_files: List of files to merge (PDFs and/or Office documents)
    - output_path: Optional output path
  - Example:
    # Merge PDFs only
    client.merge_pdfs(
        ["document1.pdf", "document2.pdf", "document3.pdf"],
        "merged.pdf"
    )

    # Mix PDFs and Office documents - they'll be converted automatically!
    client.merge_pdfs(
        ["report.pdf", "spreadsheet.xlsx", "presentation.pptx"],
        "combined.pdf"
    )
```

--------------------------------

### add_new_page Method

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/src/nutrient_dws_scripts/LLM_DOC.md

Adds a new blank page to the workflow, with options for page layout and actions.

```APIDOC
add_new_page(options: NewPagePartOptions | None = None, actions: list[BuildAction] | None = None) -> WorkflowWithPartsStage

Adds a new blank page to the workflow.

Parameters:
  options: Additional options for the new page, such as page size, orientation, etc. (optional)
  actions: Actions to apply to the new page (optional)

Returns:
  The workflow builder instance for method chaining.
```

--------------------------------

### PSPDFKit Builder API Output Configuration

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/SUPPORTED_OPERATIONS.md

This section describes the methods available for configuring output options when using the PSPDFKit Builder API, such as setting metadata and page labels.

```APIDOC
set_output_options()
  - Description: Configures general output settings for the document, including metadata and optimization.
  - Parameters:
    - metadata: Dictionary, document metadata (e.g., {"title": "My Document"}).
    - optimization: Dictionary, optimization settings (details not provided).

set_page_labels()
  - Description: Assigns custom labels to specific page ranges within the output document.
  - Parameters:
    - labels: Array of objects, each with "pages" (start, end) and "label" (string).
      - pages: Object, defines the page range.
        - start: Integer, starting page index (0-based).
        - end: Optional, Integer, ending page index (inclusive).
      - label: String, the label for the page range.
```

--------------------------------

### Chain Multiple Operations using Builder API in Python

Source: https://github.com/pspdfkit/nutrient-dws-client-python/blob/main/README.md

Demonstrates how to chain multiple PDF processing operations using the Nutrient DWS client's Builder API, allowing for sequential application of transformations like rotation, OCR, and watermarking.

```python
client.build(input_file="document.pdf") \
    .add_step("rotate-pages", {"degrees": 90}) \
    .add_step("ocr-pdf", {"language": "english"}) \
    .add_step("watermark-pdf", {"text": "DRAFT", "width": 200, "height": 100}) \
    .execute(output_path="processed.pdf")
```
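
The chaining pattern above works because each `add_step` call returns the builder itself. The following is a minimal, self-contained sketch of that idea, not the library's implementation: `StepChain` and its `describe` method are hypothetical names introduced only for illustration, and nothing here calls the Nutrient DWS API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StepChain:
    """Hypothetical stand-in for a fluent builder (not the library's class)."""

    input_file: str
    steps: List[Dict[str, Any]] = field(default_factory=list)

    def add_step(self, tool: str, options: Optional[Dict[str, Any]] = None) -> "StepChain":
        # Record the step, then return self so further calls can be chained.
        self.steps.append({"tool": tool, "options": dict(options or {})})
        return self

    def describe(self) -> List[str]:
        # List the tool names in the order they were added.
        return [step["tool"] for step in self.steps]


# Same step names and options as the README example, recorded locally only.
chain = (
    StepChain(input_file="document.pdf")
    .add_step("rotate-pages", {"degrees": 90})
    .add_step("ocr-pdf", {"language": "english"})
    .add_step("watermark-pdf", {"text": "DRAFT", "width": 200, "height": 100})
)
print(chain.describe())
```

Wrapping the chain in parentheses avoids the trailing backslashes used in the README example; either style is valid Python.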