### Example Layout File Format

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/layout-format.md

This example demonstrates the structure of a tab-delimited layout file used by PyFixWidth. It includes a title, comments, and field definitions with width, converter type, and field name.

```text
employees
# records on workers and their salaries
6	int	employee_id
15	str	job_title
8	float	salary
# negative values denote fields to skip when reading data
-3	str	blank
10	date	hire_date
```

--------------------------------

### Execute pyfixwidth CLI

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/cli.md

Demonstrates how to run the pyfixwidth tool using either the installed script or the Python module syntax. It processes one or more input files based on a provided layout schema.

```bash
pyfixwidth example/data.layout example/data1.txt
python -m fixwidth example/data.layout example/data1.txt
```

--------------------------------

### Install pyfixwidth via pip

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/index.md

Installs the pyfixwidth package using the Python package manager.

```bash
pip install pyfixwidth
```

--------------------------------

### Command-Line Interface for Fixed-Width File Conversion

Source: https://context7.com/poliquin/pyfixwidth/llms.txt

Shows various ways to use the pyfixwidth command-line interface to convert fixed-width files to delimited formats. Includes options for specifying delimiters, output files, error handling, and running as a module.

```bash
# Basic usage - output to stdout
pyfixwidth example/data.layout example/data1.txt

# Or run as a module
python -m fixwidth example/data.layout example/data1.txt example/data2.txt

# Output:
# employee_id\tjob_title\tsalary\thire_date
# 100001\tCEO\t15000.0\t1995-08-23
# 100002\tProgrammer\t8500.0\t2002-11-10
# 100003\tData Scientist\t10000.0\t2005-07-01
# ...

# Write to file with comma delimiter, ignoring type errors
pyfixwidth -d ',' -o output.csv -i example/data.layout example/data1.txt

# Skip blank lines and suppress warning logs
pyfixwidth -s --nolog example/data.layout example/data1.txt
```

--------------------------------

### Parse fixed-width files via CLI

Source: https://github.com/poliquin/pyfixwidth/blob/master/README.md

Demonstrates how to use the pyfixwidth command-line tool to parse a data file using a specified layout file.

```bash
python -m fixwidth example/data.layout example/data1.txt example/data2.txt

# Or using the installed command:
pyfixwidth example/data.layout example/data1.txt example/data2.txt
```

--------------------------------

### Parse layout and data files using Python API

Source: https://github.com/poliquin/pyfixwidth/blob/master/README.md

Shows how to load a layout file and parse a corresponding data file into Python objects using the library's primary API functions.

```python
from fixwidth import read_file_format, parse_file

title, layout = read_file_format('example/data.layout')
print(title)

rows = parse_file('example/data1.txt', spec=layout, type_errors='ignore')
for row in rows:
    print('Salary for {} is {}'.format(row['employee_id'], row['salary']))
```

--------------------------------

### Skip Record Padding in Python

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/cookbook.md

Illustrates how to use negative widths in a layout schema to consume and discard specific bytes within a fixed-width record.

```python
layout = [
    (6, 'int', 'employee_id'),
    (-2, 'str', 'padding'),
    (10, 'date', 'hire_date'),
]
```

--------------------------------

### Use DictReader for stream processing

Source: https://github.com/poliquin/pyfixwidth/blob/master/README.md

Demonstrates using the DictReader class to process binary file objects, either by referencing a layout file or providing a list of tuples directly.
```python
import fixwidth

# Using a layout file
with open('example/data1.txt', 'rb') as fh:
    reader = fixwidth.DictReader(
        fh,
        fieldinfo='example/data.layout',
        skip_blank_lines=True,
    )
    first_row = next(reader)
    print(first_row['job_title'])

# Using a list of tuples
layout = [
    (6, 'int', 'employee_id'),
    (15, 'str', 'job_title'),
    (8, 'float', 'salary'),
    (-3, 'str', 'blank'),
    (10, 'date', 'hire_date'),
]

with open('example/data1.txt', 'rb') as fh:
    reader = fixwidth.DictReader(fh, layout)
    print(next(reader))
```

--------------------------------

### read_file_format

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Loads a tab-delimited layout file and returns the title and a list of FieldInfo objects representing the specification.

````APIDOC
## `read_file_format(path)`

### Description
Loads a tab-delimited layout file and returns `(title, spec)`, where `spec` is a list of `FieldInfo(width, datatype, name)` values.

### Method
GET (conceptual, as it reads from a file)

### Endpoint
N/A (Function call)

### Parameters
#### Path Parameters
- **path** (str) - Required - The file path to the layout file.

### Request Example
```python
from fixwidth import read_file_format

title, spec = read_file_format('example/data.layout')
print(title)
```

### Response
#### Success Response (200)
- **title** (str) - The title of the layout.
- **spec** (list[FieldInfo]) - A list of FieldInfo objects, where each FieldInfo contains width, datatype, and name.
````

--------------------------------

### Register and Use Custom Type Converters in Python

Source: https://context7.com/poliquin/pyfixwidth/llms.txt

Demonstrates how to register custom type converters for specific data formats (e.g., uppercase strings, currency) and then use these custom types within a layout definition for parsing fixed-width data streams.
```python
from fixwidth import register_type, parse_lines
from io import BytesIO

# Register a custom converter for uppercase strings
@register_type('uppercase')
def convert_uppercase(value):
    return value.strip().upper()

# Register a custom converter for currency values
@register_type('currency')
def convert_currency(value):
    # Remove $ and commas, convert to float
    cleaned = value.strip().replace('$', '').replace(',', '')
    return float(cleaned) if cleaned else None

# Use custom types in layouts
layout = [
    (10, 'uppercase', 'name'),
    (12, 'currency', 'amount')
]

# The name occupies the first 10 bytes, the amount the next 12
data = BytesIO(b'john smith $1,234.56  \n')
rows = parse_lines(data, layout)
row = next(rows)
print(f"Name: {row['name']}, Amount: {row['amount']}")
# Output: Name: JOHN SMITH, Amount: 1234.56
```

--------------------------------

### Layout File Format Definition

Source: https://context7.com/poliquin/pyfixwidth/llms.txt

Illustrates the structure and syntax of a layout file used by pyfixwidth. This includes defining the overall title, field specifications (width, type, name), and handling comments and negative widths for skipping data.

```text
employees
# records on workers and their salaries
6\tint\temployee_id
15\tstr\tjob_title
8\tfloat\tsalary
# negative values denote fields to skip when reading data
-3\tstr\tblank
10\tdate\thire_date
```

--------------------------------

### Read file format layout

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Loads a tab-delimited layout file and returns the title and a list of FieldInfo specifications. The returned spec can be passed to the parsing functions, which also accept an inline list of layout tuples.

```python
from fixwidth import read_file_format

title, spec = read_file_format('example/data.layout')
print(title)
```

--------------------------------

### Parse a File with an Inline Layout in Python

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/cookbook.md

Shows how to parse a fixed-width file by defining a layout schema as a list of tuples.
Each tuple specifies the field width, data type, and field name.

```python
from fixwidth import parse_file

layout = [
    (6, 'int', 'employee_id'),
    (15, 'str', 'job_title'),
    (8, 'float', 'salary'),
    (-3, 'str', 'blank'),
    (10, 'date', 'hire_date'),
]

for row in parse_file('example/data1.txt', spec=layout):
    print(row['employee_id'], row['hire_date'])
```

--------------------------------

### Parse fixed-width binary files

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Opens a binary file and yields parsed rows as OrderedDict objects based on a provided specification. Supports error handling and blank line skipping configurations.

```python
from fixwidth import parse_file, read_file_format

_, spec = read_file_format('example/data.layout')
for row in parse_file('example/data1.txt', spec=spec, type_errors='ignore'):
    print(row['employee_id'], row['salary'])
```

--------------------------------

### Configure Encoding for File Parsing in Python

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/cookbook.md

Demonstrates how to specify character encoding when parsing files that contain non-ASCII characters, ensuring correct data interpretation.

```python
rows = parse_file('records.txt', spec=layout, encoding='utf-8')
```

--------------------------------

### Register custom converters

Source: https://github.com/poliquin/pyfixwidth/blob/master/README.md

Shows how to extend the library's functionality by registering a custom converter function using the @register_type decorator.

```python
from fixwidth.converters import register_type

@register_type('uppercase')
def convert_uppercase(value):
    return value.strip().upper()
```

--------------------------------

### DictReader for Fixed-Width Binary Files

Source: https://context7.com/poliquin/pyfixwidth/llms.txt

Provides a DictReader interface for binary file objects, similar to Python's csv.DictReader. It accepts either a path to a layout file or an inline layout definition.
The file object must be opened in binary read mode.

```python
import fixwidth

# Using a layout file path
with open('example/data1.txt', 'rb') as fh:
    reader = fixwidth.DictReader(
        fh,
        fieldinfo='example/data.layout',
        skip_blank_lines=True
    )
    for row in reader:
        print(f"{row['job_title'].strip()}: hired {row['hire_date']}")
# Output:
# CEO: hired 1995-08-23
# Programmer: hired 2002-11-10
# Data Scientist: hired 2005-07-01

# Using an inline layout definition
layout = [
    (6, 'int', 'employee_id'),
    (15, 'str', 'job_title'),
    (8, 'float', 'salary'),
    (-3, 'str', 'blank'),  # Negative width skips bytes
    (10, 'date', 'hire_date'),
]

with open('example/data1.txt', 'rb') as fh:
    reader = fixwidth.DictReader(fh, layout)
    first_row = next(reader)
    print(f"Line {reader.line_num}: {first_row}")
# Output:
# Line 1: OrderedDict([('employee_id', 100001), ('job_title', 'CEO'), ...])
```

--------------------------------

### parse_file

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Opens a file in binary mode and yields parsed rows as OrderedDict objects.

````APIDOC
## `parse_file(path, spec, ...)`

### Description
Opens a file in binary mode and yields parsed rows as `OrderedDict` objects. Allows customization of encoding, type error handling, and skipping blank lines.

### Method
GET (conceptual, as it reads from a file)

### Endpoint
N/A (Function call)

### Parameters
#### Path Parameters
- **path** (str) - Required - The file path to the data file.
- **spec** (list[FieldInfo] or str) - Required - The field specification, either as a list of FieldInfo objects or a path to a layout file.

#### Query Parameters
- **encoding** (str) - Optional - The encoding to use for reading the file. Defaults to 'ascii'.
- **type_errors** (str) - Optional - How to handle type conversion errors. Options: 'raise' (default), 'ignore'.
- **skip_blank_lines** (bool) - Optional - Whether to skip blank lines. Defaults to False.

### Request Example
```python
from fixwidth import parse_file, read_file_format

_, spec = read_file_format('example/data.layout')
for row in parse_file('example/data1.txt', spec=spec, type_errors='ignore'):
    print(row['employee_id'], row['salary'])
```

### Response
#### Success Response (200)
- **Yields** (OrderedDict) - Each yielded item is an OrderedDict representing a parsed row.
````

--------------------------------

### Load Fixed-Width Layout Specification

Source: https://context7.com/poliquin/pyfixwidth/llms.txt

Loads a tab-delimited layout file to define the structure of fixed-width records. It returns the layout title and a list of FieldInfo objects, each specifying width, datatype, and name for a field.

```python
from fixwidth import read_file_format

# Load a layout file from disk
title, spec = read_file_format('example/data.layout')
print(f"Layout title: {title}")
# Output: Layout title: employees

# The spec contains field definitions
for field in spec:
    print(f"Field: {field.name}, Width: {field.width}, Type: {field.datatype}")
# Output:
# Field: employee_id, Width: 6, Type: int
# Field: job_title, Width: 15, Type: str
# Field: salary, Width: 8, Type: float
# Field: blank, Width: -3, Type: str
# Field: hire_date, Width: 10, Type: date
```

--------------------------------

### Parse fixed-width files via CLI

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/index.md

Uses the pyfixwidth module to process fixed-width text files based on a provided layout file. The output is printed to the console as tab-separated values.

```bash
python -m fixwidth example/data.layout example/data1.txt example/data2.txt
```

--------------------------------

### Parse Fixed-Width Data File

Source: https://context7.com/poliquin/pyfixwidth/llms.txt

Opens and parses a fixed-width data file from disk, yielding rows as OrderedDict objects. It requires a layout specification and supports options for encoding and error handling during type conversion.
```python
from fixwidth import read_file_format, parse_file

# Load the layout specification
_, spec = read_file_format('example/data.layout')

# Parse a data file, ignoring type conversion errors
for row in parse_file('example/data1.txt', spec=spec, type_errors='ignore'):
    print(f"Employee {row['employee_id']}: {row['job_title']} - ${row['salary']}")
# Output:
# Employee 100001: CEO - $15000.0
# Employee 100002: Programmer - $8500.0
# Employee 100003: Data Scientist - $10000.0

# Parse with explicit encoding for non-ASCII files
for row in parse_file('records.txt', spec=spec, encoding='utf-8', skip_blank_lines=True):
    print(row)
```

--------------------------------

### Register a Custom Converter in Python

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/cookbook.md

Demonstrates how to define and register a custom data type converter using the @register_type decorator. This allows for post-processing of field values, such as converting strings to uppercase, during the parsing process.

```python
from fixwidth import register_type, parse_lines
from io import BytesIO

@register_type('uppercase')
def convert_uppercase(value):
    return value.strip().upper()

layout = [(5, 'uppercase', 'name')]
rows = parse_lines(BytesIO(b'alice\n'), layout)
print(next(rows))
```

--------------------------------

### DictReader

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Provides a csv.DictReader-like interface for binary file objects.

````APIDOC
## `DictReader(fileobj, fieldinfo, ...)`

### Description
Provides a `csv.DictReader`-like interface for binary file objects. Allows reading fixed-width data with a familiar dictionary-based access.

### Method
GET (conceptual, as it reads from a file object)

### Endpoint
N/A (Class instantiation)

### Parameters
#### Path Parameters
- **fileobj** (binary file object) - Required - The file object opened in binary read mode.
- **fieldinfo** (str or list[FieldInfo]) - Required - The field specification, either as a path to a layout file or a sequence of layout tuples.

#### Query Parameters
- **encoding** (str) - Optional - The encoding to use for reading the file. Defaults to 'ascii'.
- **type_errors** (str) - Optional - How to handle type conversion errors. Options: 'raise' (default), 'ignore'.
- **skip_blank_lines** (bool) - Optional - Whether to skip blank lines. Defaults to False.

### Request Example
```python
import fixwidth

with open('example/data1.txt', 'rb') as fh:
    reader = fixwidth.DictReader(fh, 'example/data.layout')
    first = next(reader)
    print(first['job_title'])
```

### Response
#### Success Response (200)
- **Instance** (DictReader) - An iterator yielding rows as dictionaries.
- **line_num** (int) - Increments as rows are consumed.
````

--------------------------------

### DictReader interface for binary files

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Provides a csv.DictReader-like interface for binary file objects. Requires the file to be opened in binary read mode.

```python
import fixwidth

with open('example/data1.txt', 'rb') as fh:
    reader = fixwidth.DictReader(fh, 'example/data.layout')
    first = next(reader)
    print(first['job_title'])
```

--------------------------------

### Parse Fixed-Width File with pyfixwidth

Source: https://github.com/poliquin/pyfixwidth/blob/master/AGENTS.md

Demonstrates typical usage of pyfixwidth to read a file format specification and then parse a fixed-width data file. It iterates through the parsed rows and prints them. Requires the 'fixwidth' library and specified layout and data files.
```python
from fixwidth import read_file_format, parse_file

_, spec = read_file_format('example/data.layout')
for row in parse_file('example/data1.txt', spec=spec):
    print(row)
```

--------------------------------

### Python Representation of Layout

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/layout-format.md

This snippet shows how to represent a PyFixWidth layout directly in Python code using a list of tuples. Each tuple defines a field's width, converter type, and name.

```python
layout = [
    (6, 'int', 'employee_id'),
    (15, 'str', 'job_title'),
    (8, 'float', 'salary'),
    (-3, 'str', 'blank'),
    (10, 'date', 'hire_date'),
]
```

--------------------------------

### parse_lines

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Parses an iterable of binary lines, useful for streams and in-memory data.

````APIDOC
## `parse_lines(lines, spec, ...)`

### Description
Parses an iterable of binary lines. This is useful for tests, streams, and in-memory data. Allows customization of encoding, type error handling, and skipping blank lines.

### Method
POST (conceptual, as it processes provided data)

### Endpoint
N/A (Function call)

### Parameters
#### Path Parameters
- **lines** (iterable[bytes]) - Required - An iterable of binary lines to parse.
- **spec** (list[FieldInfo] or str) - Required - The field specification, either as a list of FieldInfo objects or a path to a layout file.

#### Query Parameters
- **encoding** (str) - Optional - The encoding to use for reading the lines. Defaults to 'utf-8'.
- **type_errors** (str) - Optional - How to handle type conversion errors. Options: 'raise' (default), 'ignore'.
- **skip_blank_lines** (bool) - Optional - Whether to skip blank lines. Defaults to False.

### Request Example
```python
from io import BytesIO
from fixwidth import parse_lines

layout = [(2, 'int', 'row_id'), (5, 'str', 'name')]
rows = parse_lines(BytesIO(b'01Bob  \n02Susan\n'), layout)
print(next(rows))
```

### Response
#### Success Response (200)
- **Yields** (OrderedDict) - Each yielded item is an OrderedDict representing a parsed row.
````

--------------------------------

### register_type

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Registers a custom converter function in the global converter registry.

````APIDOC
## `register_type(name)`

### Description
Registers a custom converter function in the global converter registry, allowing it to be used in layout specifications.

### Method
POST (conceptual, as it registers a function)

### Endpoint
N/A (Function call)

### Parameters
#### Path Parameters
- **name** (str) - Required - The name to register the custom type under.

### Request Example
```python
from fixwidth import register_type

@register_type('uppercase')
def convert_uppercase(value):
    return value.strip().upper()
```

### Response
#### Success Response (200)
- **None** - The function modifies the global registry.
````

--------------------------------

### Parse Fixed-Width Lines from Iterable

Source: https://context7.com/poliquin/pyfixwidth/llms.txt

Parses an iterable of binary lines, suitable for streams or in-memory data. It takes a layout definition and handles type conversion errors. This function is useful for testing and processing data from sources like network sockets.
```python
from io import BytesIO
from fixwidth import parse_lines

# Define an inline layout: (width, datatype, name)
layout = [
    (2, 'int', 'row_id'),
    (5, 'str', 'name')
]

# Parse binary data from a BytesIO stream
data = BytesIO(b'01Bob  \n02Susan\n03Alice\n')
rows = parse_lines(data, layout)
for row in rows:
    print(f"ID: {row['row_id']}, Name: {row['name'].strip()}")
# Output:
# ID: 1, Name: Bob
# ID: 2, Name: Susan
# ID: 3, Name: Alice

# Handle invalid values gracefully
bad_data = BytesIO(b'xxAlice\n03Bob  \n')
rows = parse_lines(bad_data, layout, type_errors='ignore')
for row in rows:
    print(row)
# Output:
# OrderedDict([('row_id', None), ('name', 'Alice')])
# OrderedDict([('row_id', 3), ('name', 'Bob  ')])
```

--------------------------------

### Parse binary lines iterable

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Parses an iterable of binary lines, useful for handling streams, in-memory data, or unit testing. Accepts a list of layout tuples to define the structure.

```python
from io import BytesIO
from fixwidth import parse_lines

layout = [(2, 'int', 'row_id'), (5, 'str', 'name')]
rows = parse_lines(BytesIO(b'01Bob  \n02Susan\n'), layout)
print(next(rows))
```

--------------------------------

### Ignore Parsing Errors in Python

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/cookbook.md

Explains how to handle malformed data by setting the type_errors parameter to 'ignore'. Invalid field values are returned as None instead of raising an exception.

```python
from fixwidth import parse_lines
from io import BytesIO

layout = [(2, 'int', 'row_id'), (5, 'str', 'name')]
rows = parse_lines(BytesIO(b'xxAlice\n03Bob  \n'), layout, type_errors='ignore')
for row in rows:
    print(row)
```

--------------------------------

### Register Custom Type Converter

Source: https://context7.com/poliquin/pyfixwidth/llms.txt

Registers a custom data type converter function globally.
This allows you to define and use your own type conversions within layout specifications, extending pyfixwidth's capabilities.

```python
from fixwidth import register_type, parse_lines
from io import BytesIO
```

--------------------------------

### Register custom type converter

Source: https://github.com/poliquin/pyfixwidth/blob/master/docs/python-api.md

Registers a custom converter function in the global registry using a decorator. Once registered, the type name can be used within layout specifications.

```python
from fixwidth import register_type

@register_type('uppercase')
def convert_uppercase(value):
    return value.strip().upper()
```
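
--------------------------------

### Illustrative sketch: layout tuples and a converter registry

The snippets above rely on two ideas: a layout of `(width, datatype, name)` tuples in which negative widths mark skipped bytes, and a decorator that adds named converters to a registry. The following stand-alone sketch shows one way those ideas could fit together. It does not import or reproduce pyfixwidth; `CONVERTERS`, `register_converter`, and `slice_record` are hypothetical names invented for this illustration, and the library's actual internals may differ.

```python
from collections import OrderedDict

# Hypothetical registry for this sketch only; pyfixwidth maintains its own.
CONVERTERS = {
    'int': lambda text: int(text.strip()),
    'float': lambda text: float(text.strip()),
    'str': lambda text: text,
}


def register_converter(name):
    """Decorator that stores a converter function under `name` (illustrative)."""
    def decorator(func):
        CONVERTERS[name] = func
        return func
    return decorator


@register_converter('uppercase')
def convert_uppercase(value):
    return value.strip().upper()


def slice_record(line, layout, encoding='ascii'):
    """Split one fixed-width record (bytes) using (width, datatype, name) tuples."""
    text = line.rstrip(b'\r\n').decode(encoding)
    row = OrderedDict()
    offset = 0
    for width, datatype, name in layout:
        size = abs(width)
        if width > 0:
            # Positive width: convert the slice and keep it under the field name.
            row[name] = CONVERTERS[datatype](text[offset:offset + size])
        # Negative width: the bytes are consumed but no field is emitted.
        offset += size
    return row


layout = [
    (6, 'int', 'employee_id'),
    (-2, 'str', 'padding'),
    (10, 'uppercase', 'job_title'),
]

record = slice_record(b'100001  programmer\n', layout)
print(dict(record))
# Output: {'employee_id': 100001, 'job_title': 'PROGRAMMER'}
```

Advancing the offset by the absolute width for every field, skipped or not, is what keeps later fields aligned with their byte positions in the record.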