### Command-Line Interface Usage

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Provides examples of using the unidiff CLI tool to analyze diffs from standard input or files, supporting various version control systems.

```bash
git diff | unidiff
unidiff -f changes.diff
git diff | unidiff --show-diff
hg diff | unidiff
svn diff | unidiff
```

--------------------------------

### Load Unified Diff from Local File - Python

Source: https://github.com/matiasb/python-unidiff/blob/master/README.rst

Shows how to load unified diff data from a local file using `PatchSet.from_filename`. It includes an example of specifying the file encoding. An alternative method using `codecs.open` for more control over file handling is also presented.

```python
from unidiff import PatchSet

patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8')
print(patch)
```

```python
import codecs
from unidiff import PatchSet

with codecs.open('tests/samples/bzr.diff', 'r', encoding='utf-8') as diff:
    patch = PatchSet(diff)

print(patch)
```

--------------------------------

### Command-Line Usage of Unidiff - Shell

Source: https://github.com/matiasb/python-unidiff/blob/master/README.rst

Shows how to use the `unidiff` command-line tool to display a summary of diff data piped from standard input, such as `git diff`. This provides a quick overview of changes, additions, and deletions.

```shell
$ git diff | unidiff
Summary
-------
README.md: +6 additions, -0 deletions

1 modified file(s), 0 added file(s), 0 removed file(s)
Total: 6 addition(s), 0 deletion(s)
```

--------------------------------

### Parse Diff from File-like Object with Unidiff

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Demonstrates how to initialize a PatchSet object from an open file handle. This method allows for reading diff content and extracting aggregated statistics like total additions and deletions.
```python
from unidiff import PatchSet

with open('changes.diff', 'r', encoding='utf-8') as diff_file:
    patch = PatchSet(diff_file)

print(f"Total files changed: {len(patch)}")
print(f"Total additions: {patch.added}")
print(f"Total deletions: {patch.removed}")
```

--------------------------------

### Efficiently Load Diff Metadata - Python

Source: https://github.com/matiasb/python-unidiff/blob/master/README.rst

Demonstrates how to load only the metadata from a diff file using `PatchSet.from_filename` with the `metadata_only=True` argument. This can improve parsing efficiency when the full diff content is not required.

```python
from unidiff import PatchSet

patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8', metadata_only=True)
```

--------------------------------

### Load Diff from File Path with Unidiff

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Shows how to use the from_filename class method to load diff data directly from a file path. It includes handling for custom encodings, error replacement, and specific newline character settings.

```python
from unidiff import PatchSet

patch = PatchSet.from_filename('tests/samples/git.diff', encoding='utf-8')

# Handle encoding errors and line endings
patch = PatchSet.from_filename(
    'windows.diff',
    encoding='utf-8',
    errors='replace',
    newline='\n'
)

for patched_file in patch:
    print(f"{patched_file.path}: +{patched_file.added}, -{patched_file.removed}")
```

--------------------------------

### Instantiate PatchSet from Iterable - Python

Source: https://github.com/matiasb/python-unidiff/blob/master/README.rst

Illustrates how to create a `PatchSet` object from an iterable, such as a list of strings obtained from reading a file line by line. This method is suitable when the diff content is already in memory.
```python
from unidiff import PatchSet

with open('tests/samples/bzr.diff', 'r') as diff:
    data = diff.readlines()

patch = PatchSet(data)
print(patch)
```

--------------------------------

### Reconstruct and Validate Diff Output

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Demonstrates how to parse a diff file into a PatchSet object and convert it back into a string. This ensures round-trip integrity of the diff data.

```python
from unidiff import PatchSet

patch = PatchSet.from_filename('tests/samples/git.diff', encoding='utf-8')

diff_output = str(patch)
print(diff_output[:500])

reparsed = PatchSet.from_string(str(patch))
assert str(patch) == str(reparsed)
print("Round-trip validation: PASSED")
```

--------------------------------

### Parse Unified Diff from URL - Python

Source: https://github.com/matiasb/python-unidiff/blob/master/README.rst

Demonstrates how to parse unified diff data directly from a URL using the `PatchSet` class. It handles fetching the diff content and specifying the character encoding for accurate parsing. This is useful for processing diffs from web sources like GitHub pull requests.

```python
import urllib.request
from unidiff import PatchSet

diff = urllib.request.urlopen('https://github.com/matiasb/python-unidiff/pull/3.diff')
encoding = diff.headers.get_charsets()[0]
patch = PatchSet(diff, encoding=encoding)
print(patch)
```

--------------------------------

### Perform Fast Metadata-only Parsing with Unidiff

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Demonstrates the use of the metadata_only=True flag to improve parsing performance. This mode is ideal for gathering statistics when the full line-by-line content of the diff is not required.
```python
from unidiff import PatchSet

# Metadata-only parsing (faster, no line content)
patch_meta = PatchSet.from_filename(
    'large.diff',
    encoding='utf-8',
    metadata_only=True
)

print(f"Total additions: {patch_meta.added}")
print(f"Total deletions: {patch_meta.removed}")
```

--------------------------------

### Accessing Hunk Details

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Shows how to iterate over hunks within a patched file to retrieve line positions, lengths, section headers, and validation status.

```python
from unidiff import PatchSet

patch = PatchSet.from_filename('tests/samples/sample0.diff', encoding='utf-8')

for patched_file in patch:
    print(f"\nFile: {patched_file.path}")
    for i, hunk in enumerate(patched_file):
        print(f"\n Hunk {i + 1}:")
        print(f" Source: line {hunk.source_start}, {hunk.source_length} lines")
        print(f" Target: line {hunk.target_start}, {hunk.target_length} lines")
        print(f" Added: {hunk.added}, Removed: {hunk.removed}")
        if hunk.section_header:
            print(f" Section: {hunk.section_header}")
        if hunk.is_valid():
            print(" Valid: YES")
```

--------------------------------

### Handle Malformed Diffs with UnidiffParseError

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Illustrates how to catch UnidiffParseError exceptions when dealing with invalid or truncated diff content to ensure application stability.

```python
from unidiff import PatchSet
from unidiff.errors import UnidiffParseError

def safe_parse_diff(diff_content):
    try:
        return PatchSet.from_string(diff_content)
    except UnidiffParseError as e:
        print(f"Warning: Could not parse diff - {e}")
        return None
```

--------------------------------

### Accessing PatchedFile Metadata

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Demonstrates how to iterate through a PatchSet to access file-level information such as path, change status (added, removed, modified), and binary/rename status.
```python
from unidiff import PatchSet

patch = PatchSet.from_filename('tests/samples/git.diff', encoding='utf-8')

for patched_file in patch:
    print(f"\nFile: {patched_file.path}")
    print(f" Source: {patched_file.source_file}")
    print(f" Target: {patched_file.target_file}")
    print(f" Added lines: {patched_file.added}")
    print(f" Removed lines: {patched_file.removed}")
    print(f" Number of hunks: {len(patched_file)}")

    if patched_file.is_added_file:
        print(" Status: NEW FILE")
    elif patched_file.is_removed_file:
        print(" Status: DELETED")
    elif patched_file.is_modified_file:
        print(" Status: MODIFIED")

    if patched_file.is_binary_file:
        print(" Type: BINARY")
    if patched_file.is_rename:
        print(f" Renamed from: {patched_file.source_file}")
```

--------------------------------

### Parse Diff from String with Unidiff

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Illustrates how to parse diff content directly from a string or bytes object. This is useful for processing diffs retrieved from APIs or databases without needing a physical file.

```python
from unidiff import PatchSet

# The hunk header lengths must match the body: 2 source lines, 3 target lines
diff_string = """diff --git a/example.py b/example.py\n--- a/example.py\n+++ b/example.py\n@@ -1,2 +1,3 @@\n def hello():\n- print(\"Hello\")\n+ print(\"Hello, World!\")\n+ return True\n"""

patch = PatchSet.from_string(diff_string)
print(f"Files changed: {len(patch)}")

# Parse from bytes
diff_bytes = diff_string.encode('utf-8')
patch = PatchSet.from_string(diff_bytes, encoding='utf-8')
```

--------------------------------

### Parse Remote Diff from URL

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Shows how to fetch a diff from a remote HTTP source and parse it directly into a PatchSet object. This is useful for analyzing pull requests or remote patches.
```python
import urllib.request
from unidiff import PatchSet

url = 'https://github.com/matiasb/python-unidiff/pull/3.diff'
diff = urllib.request.urlopen(url)
encoding = diff.headers.get_charsets()[0]
patch = PatchSet(diff, encoding=encoding)

for f in patch:
    status = "new" if f.is_added_file else "modified" if f.is_modified_file else "deleted"
    print(f" [{status}] {f.path}: +{f.added}, -{f.removed}")
```

--------------------------------

### Filtering Lines by Origin

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Uses source_lines() and target_lines() generators to separate lines based on their origin (source vs target file version) and access content as lists.

```python
from unidiff import PatchSet

patch = PatchSet.from_filename('tests/samples/git.diff', encoding='utf-8')
modified_file = patch.modified_files[0]
hunk = modified_file[0]

for line in hunk.source_lines():
    print(f" {line.source_line_no}: {line.value.rstrip()}")

for line in hunk.target_lines():
    print(f" {line.target_line_no}: {line.value.rstrip()}")

source_content = hunk.source
target_content = hunk.target
```

--------------------------------

### PatchedFile - Access Individual File Changes

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Represents a single file within a diff, providing metadata about its status (added, removed, modified) and access to its associated hunks.

```APIDOC
## PatchedFile Object

### Description
Represents a single file in a diff. Provides properties to determine the file's change status and access its hunks.

### Properties
- **path** (string) - The file path with VCS prefixes stripped.
- **is_added_file** (boolean) - True if the file was added.
- **is_removed_file** (boolean) - True if the file was removed.
- **is_modified_file** (boolean) - True if the file was modified.
- **is_binary_file** (boolean) - True if the file is binary.
- **is_rename** (boolean) - True if the file was renamed.
- **added** (int) - Number of added lines.
- **removed** (int) - Number of removed lines.
```

--------------------------------

### Accessing Individual Line Changes

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Iterates through lines in a hunk to determine their type (added, removed, context) and retrieve specific line content and line numbers.

```python
from unidiff import PatchSet, LINE_TYPE_ADDED, LINE_TYPE_REMOVED, LINE_TYPE_CONTEXT

patch = PatchSet.from_filename('tests/samples/git.diff', encoding='utf-8')
modified_file = patch.modified_files[0]
hunk = modified_file[0]

for line in hunk:
    if line.is_added:
        prefix = "+"
        line_info = f"target line {line.target_line_no}"
    elif line.is_removed:
        prefix = "-"
        line_info = f"source line {line.source_line_no}"
    elif line.is_context:
        prefix = " "
        line_info = f"src:{line.source_line_no} tgt:{line.target_line_no}"
    else:
        prefix = line.line_type
        line_info = "marker"
    print(f"[{line_info:20}] {prefix}{line.value.rstrip()}")
```

--------------------------------

### Line - Access Individual Line Changes

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Provides detailed information about a single line in a diff, including its type and content.

```APIDOC
## Line Object

### Description
Represents a single line in the diff, including its type (added, removed, context), content, and line numbers.

### Properties
- **value** (string) - The content of the line.
- **is_added** (boolean) - True if the line is an addition.
- **is_removed** (boolean) - True if the line is a removal.
- **is_context** (boolean) - True if the line is context.
- **source_line_no** (int) - Line number in the source file.
- **target_line_no** (int) - Line number in the target file.
```

--------------------------------

### Hunk - Access Changed Blocks

Source: https://context7.com/matiasb/python-unidiff/llms.txt

Represents a contiguous block of changes within a file, including line positions and section headers.
```APIDOC
## Hunk Object

### Description
A Hunk represents a contiguous block of changes within a file. It provides source/target line positions, lengths, and section headers.

### Properties
- **source_start** (int) - Starting line number in the source file.
- **source_length** (int) - Number of lines in the source hunk.
- **target_start** (int) - Starting line number in the target file.
- **target_length** (int) - Number of lines in the target hunk.
- **section_header** (string) - The context header for the hunk.

### Methods
- **is_valid()** (boolean) - Validates hunk consistency.
```
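
The Hunk properties listed above correspond to the fields of a unified-diff hunk header of the form `@@ -start,length +start,length @@ section`. The following standard-library sketch shows that mapping and one plausible consistency check on the hunk body. It is illustrative only and is not unidiff's implementation: `HunkHeader`, `parse_hunk_header`, and `body_matches_header` are hypothetical names introduced for this example, and the check shown is an assumption about what "hunk consistency" means, not a description of what `is_valid()` does internally.

```python
import re
from typing import List, NamedTuple, Optional

# Unified-diff hunk header: "@@ -start[,length] +start[,length] @@ [section]"
HUNK_HEADER = re.compile(
    r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$"
)


class HunkHeader(NamedTuple):
    """Hypothetical container mirroring the Hunk properties listed above."""
    source_start: int
    source_length: int
    target_start: int
    target_length: int
    section_header: str


def parse_hunk_header(line: str) -> Optional[HunkHeader]:
    """Return the header fields, or None if the line is not a hunk header."""
    match = HUNK_HEADER.match(line.rstrip("\n"))
    if match is None:
        return None
    src_start, src_len, tgt_start, tgt_len, section = match.groups()
    # An omitted length means the range covers exactly one line.
    return HunkHeader(
        int(src_start),
        int(src_len) if src_len is not None else 1,
        int(tgt_start),
        int(tgt_len) if tgt_len is not None else 1,
        section,
    )


def body_matches_header(header: HunkHeader, body: List[str]) -> bool:
    """Assumed consistency rule: body line counts agree with header lengths."""
    # Context (" ") and removed ("-") lines belong to the source side.
    source_count = sum(1 for line in body if line[:1] in (" ", "-"))
    # Context (" ") and added ("+") lines belong to the target side.
    target_count = sum(1 for line in body if line[:1] in (" ", "+"))
    return (source_count == header.source_length
            and target_count == header.target_length)


header = parse_hunk_header("@@ -1,2 +1,3 @@ def hello():")
body = [
    " def hello():",
    '-    print("Hello")',
    '+    print("Hello, World!")',
    "+    return True",
]
print(header)
print(body_matches_header(header, body))
```

In practice, `PatchSet` does this parsing for you. The sketch is only meant to make clear why a header whose lengths disagree with the hunk body is treated as malformed.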