### Import olefile Module in Python Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst This code snippet demonstrates how to import the olefile library into a Python application. Ensure the olefile package is installed before using this import statement. This is a prerequisite for using any other olefile functionalities. ```python import olefile ``` -------------------------------- ### Handle Malformed OLE Files with Python Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst This example illustrates how to configure `olefile.OleFileIO` to be more strict about OLE file conformance using the `raise_defects` option. It also shows how to access the `parsing_issues` attribute to review any non-fatal issues detected during parsing. ```python import olefile try: ole = olefile.OleFileIO('myfile.doc', raise_defects=olefile.DEFECT_INCORRECT) print('Non-fatal issues raised during parsing:') if ole.parsing_issues: for exctype, msg in ole.parsing_issues: print('- %s: %s' % (exctype.__name__, msg)) else: print('None') except Exception as e: print(f'An error occurred: {e}') ``` -------------------------------- ### Get Size of OLE Stream or Storage with olefile Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Illustrates how to retrieve the size in bytes of a given stream or storage within an OLE file using the `get_size` method. ```python s = ole.get_size('WordDocument') ``` -------------------------------- ### Get Timestamps of OLE Stream or Storage with olefile Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Demonstrates retrieving the creation and modification timestamps of an OLE stream or storage using `getctime` and `getmtime`. These timestamps are UTC aware but may not always be present. ```python c = ole.getctime('WordDocument') m = ole.getmtime('WordDocument') c = ole.root.getctime() m = ole.root.getmtime() ``` -------------------------------- ### Check Existence of Stream or Storage in OLE File with olefile Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Provides an example of using the `exists` method to determine if a specific stream or storage is present within an OLE file. This method is case-insensitive. ```python if ole.exists('worddocument'): print("This is a Word document.") if ole.exists('macros/vba'): print("This document seems to contain VBA macros.") ``` -------------------------------- ### Get Type of OLE Entry (Stream/Storage) using olefile Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Shows how to determine the type of an OLE entry (stream, storage, or root) using the `get_type` method. It returns predefined constants or `False` if the path does not exist. ```python t = ole.get_type('WordDocument') ``` -------------------------------- ### Get CLSID Information from OLE Files in Python Source: https://context7.com/decalage2/olefile/llms.txt Explains how to retrieve the Class Identifier (CLSID) for the root entry and specific storages within an OLE file using the olefile library. It includes an example of checking if a file is a Microsoft Word document by comparing its root CLSID to a predefined constant. ```python import olefile with olefile.OleFileIO("document.doc") as ole: # Get CLSID of root entry root_clsid = ole.getclsid("") print(f"Root CLSID: {root_clsid}") # Check if file is a Word document if root_clsid == olefile.WORD_CLSID: print("This is a Microsoft Word document") # Get CLSID of specific storage if ole.exists("Macros"): macros_clsid = ole.getclsid("Macros") print(f"Macros CLSID: {macros_clsid}") ``` -------------------------------- ### Get OLE File Stream Timestamps in Python Source: https://context7.com/decalage2/olefile/llms.txt Demonstrates how to retrieve modification and creation timestamps for streams within an OLE file using the olefile library. It shows how to check for the existence of these timestamps and compare them to calculate the time difference between file creation and modification. ```python import olefile with olefile.OleFileIO("document.doc") as ole: # Get modification time of a stream mtime = ole.getmtime("WordDocument") if mtime: print(f"Last modified: {mtime}") print(f"Year: {mtime.year}, Month: {mtime.month}, Day: {mtime.day}") else: print("No modification time available") # Get creation time ctime = ole.getctime("WordDocument") if ctime: print(f"Created: {ctime}") else: print("No creation time available") # Compare timestamps if mtime and ctime: time_diff = mtime - ctime print(f"File was modified {time_diff.days} days after creation") ``` -------------------------------- ### Read OLE File Stream Contents in Python Source: https://context7.com/decalage2/olefile/llms.txt Demonstrates how to open and read data from streams within an OLE file using the olefile library. It covers reading the entire stream, reading in chunks, getting stream size, extracting streams to files, and accessing nested streams. ```python import olefile with olefile.OleFileIO("document.doc") as ole: # Open stream and read all contents stream = ole.openstream("WordDocument") data = stream.read() print(f"Stream size: {len(data)} bytes") # Read stream in chunks stream = ole.openstream("WordDocument") chunk_size = 4096 while True: chunk = stream.read(chunk_size) if not chunk: break # Process chunk print(f"Read {len(chunk)} bytes") # Get stream size before reading size = ole.get_size("WordDocument") print(f"WordDocument size: {size} bytes") # Extract stream to file with ole.openstream("WordDocument") as stream: with open("extracted_content.bin", "wb") as output: output.write(stream.read()) # Using list syntax for nested paths stream = ole.openstream(["Macros", "VBA", "Module1"]) vba_code = stream.read() ``` -------------------------------- ### olefile Path Syntaxes and Encoding Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Explains the two allowed syntaxes for stream and storage paths (list of strings and slash-separated strings) and discusses encoding considerations for Python 2 and Python 3. ```APIDOC ## olefile Path Syntaxes and Encoding ### Description This section details the two supported syntaxes for specifying stream and storage paths within OLE files: a list of strings and a single string with slashes. It also covers encoding rules for stream and storage names, particularly the differences between Python 2 and Python 3. ### Path Syntaxes 1. **List of Strings**: An array where each element is a part of the path (storage or stream name). Example: `['Macros', 'VBA', 'ThisDocument']`. 2. **Slash-Separated String**: A single string with '/' as a delimiter, similar to Unix paths. Example: `'Macros/VBA/ThisDocument'`. * Note: This syntax may fail if stream/storage names contain slashes, though this is not allowed by Microsoft specifications. Both syntaxes are case-insensitive. ### Encoding * **Stream and Storage Names**: Stored in Unicode format in OLE files. * **Python 2.x**: Paths are handled as byte strings. Uses **UTF-8 encoding** by default. To use Unicode, set `path_encoding=None` when creating `OleFileIO`. Prior to v0.42, Latin-1 was used. * **Python 3.x**: Paths are handled as Unicode strings by default, without explicit encoding. ``` -------------------------------- ### Use OleFile.py as a Script Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Demonstrates using the 'olefile.py' script from the command-line to display OLE file structure and metadata. Options like '-c' for stream checking and '-d' for verbose debugging information are available. ```shell olefile.py myfile.doc olefile.py myfile.doc -c olefile.py myfile.doc -d olefile.py myfile.doc -l debug ``` -------------------------------- ### Open OLE File from Disk using Python Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst This code illustrates how to open an OLE file from the disk using the `olefile.OleFileIO` class. It shows both the traditional instantiation and the recommended context manager approach using a 'with' statement, which ensures the file is properly closed. The file path is passed as an argument. ```python import olefile # Traditional way ole = olefile.OleFileIO('myfile.doc') # Recommended way using context manager with olefile.OleFileIO('myfile.doc') as ole: # perform all operations on the ole object ``` -------------------------------- ### List Streams and Storages in OLE File using olefile Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Shows how to retrieve a list of all streams and storages within an OLE file using the `listdir` method. It also illustrates how to optionally include or exclude storages and streams from the output. ```python print(ole.listdir()) ole.listdir(streams=False, storages=True) ``` -------------------------------- ### Convert Path Formats in olefile Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Demonstrates the conversion between list-based paths and slash-separated string paths for OLE streams and storages. This is useful for adapting to different syntax requirements or improving readability. ```python slash_path = '/'.join(list_path) list_path = slash_path.split('/') ``` -------------------------------- ### olefile.OleFileIO.listdir Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Retrieves a list of all streams and storages within the OLE file. Options are available to include or exclude streams and storages. ```APIDOC ## GET /olefile/listdir ### Description Retrieves a list of all streams and storages contained within the OLE file. Each entry is represented as a list of strings corresponding to its path from the root. ### Method GET ### Endpoint /olefile/listdir ### Parameters #### Query Parameters * **streams** (boolean) - Optional - If True (default), includes streams in the output. If False, excludes streams. * **storages** (boolean) - Optional - If True (default), includes storages in the output. If False, excludes storages. ### Request Example ``` GET /olefile/listdir?streams=true&storages=true ``` ### Response #### Success Response (200) - **list** (list) - A list where each element represents a stream or storage path. Example: `[['\x01CompObj'], ['Macros', 'VBA', 'ThisDocument']]`. #### Response Example ```json { "streams_and_storages": [ ["\x01CompObj"], ["\x05DocumentSummaryInformation"], ["\x05SummaryInformation"], ["1Table"], ["Macros", "PROJECT"], ["Macros", "PROJECTwm"], ["Macros", "VBA", "Module1"], ["Macros", "VBA", "ThisDocument"], ["Macros", "VBA", "_VBA_PROJECT"], ["Macros", "VBA", "dir"], ["ObjectPool"], ["WordDocument"] ] } ``` ``` -------------------------------- ### Initialize OleFileIO with Different Defect Handling Modes Source: https://context7.com/decalage2/olefile/llms.txt Demonstrates initializing an OleFileIO object with various levels of strictness for handling parsing defects. This allows developers to choose the appropriate error handling strategy based on their application's needs, from strict validation to more lenient processing. ```python import olefile # DEFECT_FATAL: Stop on any fatal error (default, recommended for most apps) ole_strict = olefile.OleFileIO("document.doc", raise_defects=olefile.DEFECT_FATAL) # DEFECT_INCORRECT: Stop on any format violation (for security applications) ole_secure = olefile.OleFileIO("document.doc", raise_defects=olefile.DEFECT_INCORRECT) # DEFECT_POTENTIAL: More lenient, continues on potential issues ole_lenient = olefile.OleFileIO("document.doc", raise_defects=olefile.DEFECT_POTENTIAL) try: # Check for parsing issues that didn't raise exceptions if ole_lenient.parsing_issues: print("Parsing issues found:") for issue_type, message in ole_lenient.parsing_issues: print(f" {issue_type.__name__}: {message}") # Process file normally streams = ole_lenient.listdir() except olefile.OleFileError as e: print(f"OLE file error: {e}") except olefile.NotOleFileError as e: print(f"Not a valid OLE file: {e}") finally: ole_lenient.close() ``` -------------------------------- ### olefile.OleFileIO Stream/Storage Metadata Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Provides methods to retrieve the size, type, creation timestamp, and modification timestamp for streams and storages. ```APIDOC ## GET /olefile/metadata ### Description Retrieves metadata (size, type, timestamps) for a given stream or storage. Note that timestamp information may not always be available. ### Method GET ### Endpoint /olefile/metadata ### Parameters #### Query Parameters * **path** (string) - Required - The case-insensitive path to the stream or storage. ### Request Example ``` GET /olefile/metadata?path=WordDocument ``` ### Response #### Success Response (200) - **path** (string) - The requested path. - **size** (integer) - The size of the stream/storage in bytes. Returns `False` if the path does not exist. - **type** (string) - The type of the entry. Possible values: `STGTY_STREAM`, `STGTY_STORAGE`, `STGTY_ROOT`, or `False` if the path does not exist. - **creation_time** (string) - The creation timestamp (UTC) as an ISO format string, or `None` if not available. - **modification_time** (string) - The modification timestamp (UTC) as an ISO format string, or `None` if not available. #### Response Example ```json { "path": "WordDocument", "size": 10240, "type": "STGTY_STREAM", "creation_time": "2023-01-15T10:30:00Z", "modification_time": "2023-10-20T14:45:15Z" } ``` ### Special Case: Root Storage Metadata To get the creation and modification timestamps for the root storage, access the `root` attribute of the `OleFileIO` object directly: ```python # Assuming 'ole' is an instance of OleFileIO root_creation_time = ole.root.getctime() root_modification_time = ole.root.getmtime() ``` #### Root Metadata Response Example (Conceptual) ```json { "type": "STGTY_ROOT", "creation_time": "2023-01-15T10:00:00Z", "modification_time": "2023-10-20T14:00:00Z" } ``` ``` -------------------------------- ### olefile.OleFileIO.openstream Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Opens a specified stream as a file-like object, allowing data to be read from it. The stream data is loaded into memory. ```APIDOC ## GET /olefile/openstream ### Description Opens a specified stream within the OLE file and returns it as a file-like object (an instance of `olefile.OleStream` based on `io.BytesIO`). The stream's content is read into memory. ### Method GET ### Endpoint /olefile/openstream ### Parameters #### Query Parameters * **path** (string) - Required - The case-insensitive path to the stream to open (e.g., 'Pictures', 'WordDocument'). ### Request Example ``` GET /olefile/openstream?path=Pictures ``` ### Response #### Success Response (200) - **stream_data** (bytes) - The raw binary data read from the stream. #### Response Example ```json { "stream_path": "Pictures", "stream_data": "" } ``` ``` -------------------------------- ### Write Sector in OleFile.io Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Writes data to a specified sector in an OLE file. If the provided data is smaller than the sector size, it is padded with null characters. Sector 0 is the second sector; use -1 to write to the first sector. ```python ole.write_sect(0x17, b'TEST') ``` -------------------------------- ### Open and Parse OLE Files with olefile Source: https://context7.com/decalage2/olefile/llms.txt Provides methods for opening and parsing OLE2 files using olefile.OleFileIO. It highlights the recommended context manager approach for automatic resource management, manual open/close, opening from in-memory bytes, and opening in write mode for modifications. Proper file handling is essential for preventing resource leaks. ```python import olefile # Method 1: Using context manager (recommended) with olefile.OleFileIO("document.doc") as ole: print(f"Root entry name: {ole.get_rootentry_name()}") # File is automatically closed when exiting the context # Method 2: Manual open and close ole = olefile.OleFileIO("document.doc") try: print(f"Root entry name: {ole.get_rootentry_name()}") finally: ole.close() # Method 3: Open from bytes in memory with open("document.doc", "rb") as f: data = f.read() ole = olefile.OleFileIO(data) ode.close() # Method 4: Open in write mode (for modifications) ole = olefile.OleFileIO("document.doc", write_mode=True) # ... perform modifications ... ole.close() ``` -------------------------------- ### List Streams and Storages in OLE Files with olefile Source: https://context7.com/decalage2/olefile/llms.txt Illustrates how to list the directory structure of an OLE2 file using olefile.OleFileIO.listdir(). It shows options for listing only streams, only storages, or both, and how to retrieve the type of each entry. This is useful for understanding the file's internal organization. ```python import olefile with olefile.OleFileIO("document.doc") as ole: # List all streams (default behavior) streams = ole.listdir() print("Streams in file:") for stream in streams: print(f" {'/'.join(stream)}") # List both streams and storages all_entries = ole.listdir(streams=True, storages=True) print("\nAll entries (streams and storages):") for entry in all_entries: entry_type = ole.get_type(entry) type_name = "stream" if entry_type == olefile.STGTY_STREAM else "storage" print(f" {'/'.join(entry)} ({type_name})") # List only storages storages = ole.listdir(streams=False, storages=True) print("\nStorages only:") for storage in storages: print(f" {'/'.join(storage)}") # Example output: # Streams in file: # WordDocument # 1Table # Data # ... ``` -------------------------------- ### Open OLE File from File-like Object in Python Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst This code shows how to open an OLE file from a file-like object (e.g., an open file handle or a stream) that has read and seek methods. If seek is unavailable, the file content can be read into a bytes string first and then passed to `OleFileIO`. ```python import olefile # Assuming 'f' is a file-like object with read and seek methods ole = olefile.OleFileIO(f) # If file-like object lacks seek method data = f.read() ole = olefile.OleFileIO(data) ``` -------------------------------- ### Read Stream Data from OLE File using olefile Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Demonstrates how to open and read data from a specific stream within an OLE file using the `openstream` method. The returned stream object behaves like a file-like object. ```python pics = ole.openstream('Pictures') data = pics.read() ``` -------------------------------- ### Enable Logging in OleFile.io Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Enables logging for the OleFile.io library. This function can be used to control the verbosity of messages, useful for debugging and monitoring file operations. ```python olefile.enable_logging() ``` -------------------------------- ### Generate Debug Output with olefile Source: https://github.com/decalage2/olefile/blob/master/doc/Contribute.rst This command generates debugging output for the olefile Python library. It is useful for diagnosing issues and providing detailed information when reporting bugs. The output is redirected to a file named 'debug.txt'. ```shell olefile.py -d -c file >debug.txt ``` -------------------------------- ### Enable and Configure Debug Logging for OleFile Source: https://context7.com/decalage2/olefile/llms.txt Shows how to enable detailed debug logging for the olefile library and configure Python's logging module to display these messages. This is useful for understanding the library's internal operations and troubleshooting parsing issues. ```python import olefile import logging # Enable olefile logging olefile.enable_logging() # Configure logging to see debug output logging.basicConfig(level=logging.DEBUG) # Now all olefile operations will produce debug output with olefile.OleFileIO("document.doc") as ole: streams = ole.listdir() # Debug messages will be printed to console # Example debug output: # DEBUG:olefile:File size: 52736 bytes (CE00h) # DEBUG:olefile:sector size: 512 bytes # DEBUG:olefile:mini sector size: 64 bytes # ... ``` -------------------------------- ### Open and Parse an OLE File Source: https://context7.com/decalage2/olefile/llms.txt Opens and parses an OLE file, providing access to its contents. Supports context manager for automatic cleanup. ```APIDOC ## OleFileIO ### Description Opens and parses an OLE2 file, providing an interface to access its internal structure and streams. It is recommended to use this class as a context manager for automatic resource management. ### Method `olefile.OleFileIO(filename, write_mode=False)` ### Parameters #### Path Parameters - **filename** (str) - The path to the OLE file. #### Query Parameters - **write_mode** (bool) - If True, opens the file in write mode for modifications. Defaults to False. #### Request Body None ### Request Example ```python import olefile # Method 1: Using context manager (recommended) with olefile.OleFileIO("document.doc") as ole: print(f"Root entry name: {ole.get_rootentry_name()}") # File is automatically closed when exiting the context # Method 2: Manual open and close ole = olefile.OleFileIO("document.doc") try: print(f"Root entry name: {ole.get_rootentry_name()}") finally: ole.close() # Method 3: Open from bytes in memory with open("document.doc", "rb") as f: data = f.read() ole = olefile.OleFileIO(data) ope.close() # Method 4: Open in write mode (for modifications) ole = olefile.OleFileIO("document.doc", write_mode=True) # ... perform modifications ... ole.close() ``` ### Response #### Success Response (OleFileIO Object) - **OleFileIO object** - An instance of OleFileIO providing methods to interact with the OLE file. #### Response Example ```json { "root_entry_name": "\u0005SummaryInformation" } ``` ``` -------------------------------- ### Extract Metadata from OleFile.io Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Extracts standard metadata properties from an OLE file. It returns an OleMetadata object containing properties like author, title, and creation date. The 'dump()' method can print all available metadata. ```python meta = ole.get_metadata() print('Author:', meta.author) print('Title:', meta.title) print('Creation date:', meta.create_time) meta.dump() ``` -------------------------------- ### List Streams and Storages in OLE File Source: https://context7.com/decalage2/olefile/llms.txt Lists the directory structure of an OLE file, including streams and storages. ```APIDOC ## listdir ### Description Lists the contents of the OLE file, returning a list of streams and/or storages. ### Method `ole.listdir(streams=True, storages=True)` ### Parameters #### Path Parameters None #### Query Parameters - **streams** (bool) - Whether to include streams in the output. Defaults to True. - **storages** (bool) - Whether to include storages in the output. Defaults to True. #### Request Body None ### Request Example ```python import olefile with olefile.OleFileIO("document.doc") as ole: # List all streams (default behavior) streams = ole.listdir() print("Streams in file:") for stream in streams: print(f" {'/'.join(stream)}") # List both streams and storages all_entries = ole.listdir(streams=True, storages=True) print("\nAll entries (streams and storages):") for entry in all_entries: entry_type = ole.get_type(entry) type_name = "stream" if entry_type == olefile.STGTY_STREAM else "storage" print(f" {'/'.join(entry)} ({type_name})") # List only storages storages = ole.listdir(streams=False, storages=True) print("\nStorages only:") for storage in storages: print(f" {'/'.join(storage)}") ``` ### Response #### Success Response (List of entries) - **List of tuples** - Each tuple represents an entry (stream or storage) as a path. #### Response Example ```json [ ["WordDocument"], ["1Table"], ["Data"] ] ``` ``` -------------------------------- ### Open OLE File in Write Mode using Python Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst This snippet demonstrates how to open an OLE file in read/write mode for modification using the `write_mode=True` option in `olefile.OleFileIO`. This feature is newer and requires thorough testing. Changes made in write mode are not yet fully documented. ```python import olefile # Open file in read/write mode ole = olefile.OleFileIO('test.doc', write_mode=True) ``` -------------------------------- ### Open OLE File from Bytes String in Python Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst This snippet demonstrates opening an OLE file directly from a bytes string in memory. The `olefile.OleFileIO` constructor intelligently determines if the input is a file path or file content based on its size (OLE files are >= 1536 bytes). ```python import olefile s = b'...' # Your OLE file content as bytes ole = olefile.OleFileIO(s) ``` -------------------------------- ### olefile.OleFileIO.write_sect Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Allows overwriting a specific sector of the OLE file. This is an advanced method for file modification. ```APIDOC ## PUT /olefile/write_sector ### Description This advanced method allows overwriting a specific sector within the OLE file. Use with caution as it directly modifies the file structure. ### Method PUT ### Endpoint /olefile/write_sector ### Parameters #### Request Body - **sector_index** (integer) - Required - The index of the sector to overwrite. - **data** (bytes) - Required - The new data to write into the sector. Must be the exact size of a sector. ### Request Example ```json { "sector_index": 5, "data": "" } ``` ### Response #### Success Response (200) - **message** (string) - Confirmation of successful sector overwrite. #### Response Example ```json { "message": "Sector 5 successfully overwritten." } ``` ``` -------------------------------- ### Extract OLE File Metadata Properties in Python Source: https://context7.com/decalage2/olefile/llms.txt Shows how to extract standard and document summary information properties from an OLE file using the olefile Python library. It demonstrates accessing properties like title, author, creation time, modification time, company, and document statistics, and includes a method to dump all available metadata. ```python import olefile with olefile.OleFileIO("document.doc") as ole: # Parse and extract all standard metadata meta = ole.get_metadata() # Summary Information properties print(f"Title: {meta.title}") print(f"Subject: {meta.subject}") print(f"Author: {meta.author}") print(f"Keywords: {meta.keywords}") print(f"Comments: {meta.comments}") print(f"Last saved by: {meta.last_saved_by}") print(f"Created: {meta.create_time}") print(f"Modified: {meta.last_saved_time}") # Document Summary Information properties print(f"Company: {meta.company}") print(f"Manager: {meta.manager}") print(f"Category: {meta.category}") # Statistics print(f"Number of pages: {meta.num_pages}") print(f"Number of words: {meta.num_words}") print(f"Number of characters: {meta.num_chars}") # Display all available metadata meta.dump() ``` -------------------------------- ### Test if a File is an OLE Container in Python Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst This snippet shows how to use the `olefile.isOleFile` function to determine if a given file is a valid OLE container. It checks the file's magic bytes before attempting to open it. The function accepts a file path or a file-like object. For in-memory data, use the `data` argument. ```python import olefile import sys # Test with file path if olefile.isOleFile('myfile.doc'): # open the file with OleFileIO pass else: sys.exit('Not an OLE file') # Test with in-memory data file_in_memory = b'...' # your file content as bytes if olefile.isOleFile(data=file_in_memory): # open the file with OleFileIO pass ``` -------------------------------- ### olefile.OleFileIO.exists Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Checks if a specified stream or storage exists within the OLE file. The check is case-insensitive. ```APIDOC ## GET /olefile/exists ### Description Checks for the existence of a given stream or storage within the OLE file. The path comparison is case-insensitive. ### Method GET ### Endpoint /olefile/exists ### Parameters #### Query Parameters * **path** (string) - Required - The path to the stream or storage to check (e.g., 'WordDocument', 'macros/vba'). ### Request Example ``` GET /olefile/exists?path=WordDocument ``` ### Response #### Success Response (200) - **exists** (boolean) - True if the stream/storage exists, False otherwise. #### Response Example ```json { "path": "WordDocument", "exists": true } ``` ``` -------------------------------- ### Overwrite Stream in OleFile.io Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Overwrites an existing stream within an OLE file. The new stream data must be the exact same size as the existing one. This method works on streams of any size. ```python ole = olefile.OleFileIO('test.doc', write_mode=True) data = ole.openstream('WordDocument').read() data = data.replace(b'foo', b'bar') ole.write_stream('WordDocument', data) ole.close() ``` -------------------------------- ### Close OleFile.io Object Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Closes an opened OleFileIO object, releasing the file handle on disk. This is an important step to ensure proper file management, especially in applications that process multiple OLE files. ```python ole.close() ``` -------------------------------- ### Parse User-Defined Properties in OleFile.io Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Parses streams containing user-defined properties. It can convert timestamp properties to datetime objects. The result is a list of dictionaries, each containing 'property_name' and 'value'. ```python variables = ole.get_userdefined_properties(streamname, convert_time=True) if len(variables): for index, variable in enumerate(variables): print('\t{} {}: {}'.format(index, variable['property_name'],variable['value'])) ``` -------------------------------- ### Parse Property Stream in OleFile.io Source: https://github.com/decalage2/olefile/blob/master/doc/Howto.rst Parses a property stream from an OLE file. It returns a dictionary of properties. Timestamps can be converted to datetime objects using 'convert_time=True'. Specific properties can be excluded from conversion using 'no_conversion'. ```python p = ole.getproperties('specialprops') p = ole.getproperties('specialprops', convert_time=True, no_conversion=[10]) ``` -------------------------------- ### Check OLE File Format with olefile Source: https://context7.com/decalage2/olefile/llms.txt Demonstrates how to check if a file or data is in OLE2 format using the olefile.isOleFile() function. It supports checking via file path, in-memory bytes, or a file-like object. This function is crucial for validating input before parsing. ```python import olefile # Check using file path is_ole = olefile.isOleFile("document.doc") print(f"Is OLE file: {is_ole}") # Output: Is OLE file: True # Check using file contents in memory with open("document.doc", "rb") as f: data = f.read() is_ole = olefile.isOleFile(data=data) print(f"Is OLE file: {is_ole}") # Output: Is OLE file: True # Check using file-like object with open("document.doc", "rb") as f: is_ole = olefile.isOleFile(f) print(f"Is OLE file: {is_ole}") # Output: Is OLE file: True ``` -------------------------------- ### Write and Modify OLE File Stream Contents in Python Source: https://context7.com/decalage2/olefile/llms.txt Provides Python code for modifying existing streams within an OLE file using the olefile library. It shows how to open the file in write mode, read an existing stream, modify its content (ensuring size consistency), and write the changes back. Error handling for OLE file operations is included. ```python import olefile # Open file in write mode ole = olefile.OleFileIO("document.doc", write_mode=True) try: # Read existing stream stream = ole.openstream("CustomData") original_data = stream.read() original_size = len(original_data) # Modify data (must be same size as original) modified_data = original_data[:10] + b"MODIFIED" + original_data[18:] # Ensure data is same size if len(modified_data) == original_size: ole.write_stream("CustomData", modified_data) print("Stream modified successfully") else: print(f"Error: Size mismatch. Original: {original_size}, Modified: {len(modified_data)}") except olefile.OleFileError as e: print(f"Error: {e}") finally: ole.close() ``` -------------------------------- ### Check for Stream/Storage Existence in OLE Files with olefile Source: https://context7.com/decalage2/olefile/llms.txt Demonstrates how to check for the existence of specific streams or storages within an OLE2 file using olefile.OleFileIO.exists() and olefile.OleFileIO.get_type(). It supports checking by name string or list of path components, and retrieving the entry type (stream, storage, or not found). ```python import olefile with olefile.OleFileIO("document.doc") as ole: # Check if a stream exists (case-insensitive) if ole.exists("WordDocument"): print("WordDocument stream exists") # Check with path-like syntax if ole.exists("Macros/VBA/Module1"): print("VBA module exists") # Check with list syntax if ole.exists(["Macros", "VBA", "Module1"]): print("VBA module exists") # Get type of entry entry_type = ole.get_type("WordDocument") if entry_type == olefile.STGTY_STREAM: print("WordDocument is a stream") elif entry_type == olefile.STGTY_STORAGE: print("WordDocument is a storage") elif entry_type == False: print("WordDocument does not exist") ``` -------------------------------- ### Extract User-Defined Properties from OLE Files in Python Source: https://context7.com/decalage2/olefile/llms.txt Details how to extract user-defined properties from an OLE file's summary information stream using the olefile library. It covers checking for the existence of the properties stream, extracting properties with optional timestamp conversion, and handling potential errors during the process. ```python import olefile with olefile.OleFileIO("document.doc") as ole: # Check if user-defined properties stream exists properties_path = "\x05DocumentSummaryInformation" if ole.exists(properties_path): try: # Extract user-defined properties with timestamp conversion props = ole.get_userdefined_properties( properties_path, convert_time=True, no_conversion=[10] # Don't convert property ID 10 ) print("User-defined properties:") for prop_id, value in props.items(): print(f" Property {prop_id}: {value}") except Exception as e: print(f"Error reading properties: {e}") ``` -------------------------------- ### Check if a File is OLE Format Source: https://context7.com/decalage2/olefile/llms.txt This function checks if a given file path, in-memory data, or file-like object is a valid OLE file. ```APIDOC ## isOleFile ### Description Checks if a file is a valid OLE2 file. ### Method `olefile.isOleFile(*, filename=None, data=None, file=None)` ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Request Example ```python import olefile # Check using file path is_ole = olefile.isOleFile("document.doc") print(f"Is OLE file: {is_ole}") # Check using file contents in memory with open("document.doc", "rb") as f: data = f.read() is_ole = olefile.isOleFile(data=data) print(f"Is OLE file: {is_ole}") # Check using file-like object with open("document.doc", "rb") as f: is_ole = olefile.isOleFile(f) print(f"Is OLE file: {is_ole}") ``` ### Response #### Success Response (True/False) - **is_ole** (boolean) - True if the file is an OLE file, False otherwise. #### Response Example ```json { "is_ole": true } ``` ``` -------------------------------- ### Check if Stream or Storage Exists Source: https://context7.com/decalage2/olefile/llms.txt Checks if a specific stream or storage exists within the OLE file and returns its type. ```APIDOC ## exists & get_type ### Description Checks for the existence of a stream or storage within the OLE file and retrieves its type. ### Method `ole.exists(path)` `ole.get_type(path)` ### Parameters #### Path Parameters - **path** (str or list) - The path to the stream or storage. Can be a string (e.g., "WordDocument") or a list of path components (e.g., ["Macros", "VBA", "Module1"]). #### Query Parameters None #### Request Body None ### Request Example ```python import olefile with olefile.OleFileIO("document.doc") as ole: # Check if a stream exists (case-insensitive) if ole.exists("WordDocument"): print("WordDocument stream exists") # Check with path-like syntax if ole.exists("Macros/VBA/Module1"): print("VBA module exists") # Check with list syntax if ole.exists(["Macros", "VBA", "Module1"]): print("VBA module exists") # Get type of entry entry_type = ole.get_type("WordDocument") if entry_type == olefile.STGTY_STREAM: print("WordDocument is a stream") elif entry_type == olefile.STGTY_STORAGE: print("WordDocument is a storage") elif entry_type == False: print("WordDocument does not exist") ``` ### Response #### Success Response (exists) - **exists** (boolean) - True if the path exists, False otherwise. #### Success Response (get_type) - **entry_type** (int or False) - Returns `olefile.STGTY_STREAM` for streams, `olefile.STGTY_STORAGE` for storages, or `False` if the path does not exist. #### Response Example ```json { "exists": true, "entry_type": 2 } ``` ``` === COMPLETE CONTENT === This response contains all available snippets from this library. No additional content exists. Do not make further requests.