### Basic WSGI Application Setup

Source: https://github.com/kludex/python-multipart/blob/main/docs/index.md

A minimal WSGI application that demonstrates basic server setup and response. It serves as a starting point before integrating multipart parsing.

```python
import python_multipart

def simple_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    # WSGI response bodies must be bytes on Python 3
    return [b'Hashes:\n']

from wsgiref.simple_server import make_server

httpd = make_server('', 8123, simple_app)
print("Serving on port 8123...")
httpd.serve_forever()
```

--------------------------------

### Full Example: Using create_form_parser

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/helper-functions.md

A comprehensive example showing how to set up callbacks for fields and files, create a parser using `create_form_parser` with custom configuration, and then process request body chunks. This demonstrates a complete parsing workflow.

```python
from python_multipart import create_form_parser

# Collect results
fields = []
files = []

def on_field(field):
    fields.append({
        'name': field.field_name.decode(),
        'value': field.value.decode(),
    })

def on_file(file):
    files.append({
        'field_name': file.field_name.decode(),
        'file_name': file.file_name.decode(),
        'size': file.size,
        'content_type': file.content_type,
    })

# From a web request
headers = {
    "Content-Type": b"multipart/form-data; boundary=----WebKit",
    "Content-Length": b"5000",
}

parser = create_form_parser(
    headers,
    on_field,
    on_file,
    config={"MAX_MEMORY_FILE_SIZE": 5 * 1024 * 1024}
)

# Feed data (the request_body_chunk_* names stand in for chunks read from the request)
parser.write(request_body_chunk_1)
parser.write(request_body_chunk_2)
parser.finalize()

print(f"Fields: {fields}")
print(f"Files: {files}")
```

--------------------------------

### Example: Create Form Parser with Headers

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/helper-functions.md

Demonstrates how to use `create_form_parser` by providing a dictionary of HTTP headers, including 'Content-Type'
with a boundary. This simplifies parser setup for multipart requests.

```python
headers = {
    "Content-Type": b"multipart/form-data; boundary=----WebKit",
    "Content-Length": b"12345",
}
parser = create_form_parser(headers, on_field, on_file)
```

--------------------------------

### Configuration Precedence Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/configuration.md

Demonstrates how default configuration, user-provided config, and parser-specific parameters are applied.

```python
from python_multipart import FormParser

default = FormParser.DEFAULT_CONFIG  # MAX_MEMORY_FILE_SIZE = 1 MiB
user_config = {"MAX_MEMORY_FILE_SIZE": 50 * 1024 * 1024}

parser = FormParser(
    content_type="multipart/form-data",
    boundary=b"boundary",
    on_field=None,
    on_file=None,
    config=user_config,  # Overrides defaults
)

print(parser.config["MAX_MEMORY_FILE_SIZE"])  # 52428800 (50 MiB)
```

--------------------------------

### Callback Invocation Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/base-parser.md

Shows how to define a custom callback function and register it using `set_callback`. The example then invokes the callback with sample data.

```python
from python_multipart import OctetStreamParser

def my_callback(data, start, end):
    chunk = data[start:end]
    print(f"Got chunk: {chunk}")

parser = OctetStreamParser()
parser.set_callback("data", my_callback)
parser.callback("data", b"hello", 0, 5)  # Calls my_callback(b"hello", 0, 5)
```

--------------------------------

### Fuzzer Output Example

Source: https://github.com/kludex/python-multipart/blob/main/fuzz/README.md

Example output from a running fuzzer, showing initialization, new test cases found, coverage, and execution statistics. This output helps monitor the fuzzing progress.
```text
#2 INITED cov: 32 ft: 32 corp: 1/1b exec/s: 0 rss: 49Mb
#3 NEW cov: 33 ft: 33 corp: 2/2b lim: 4 exec/s: 0 rss: 49Mb L: 1/1 MS: 1 ChangeByte-
#4 NEW cov: 97 ft: 97 corp: 3/4b lim: 4 exec/s: 0 rss: 49Mb L: 2/2 MS: 1 InsertByte-
#11 NEW cov: 116 ft: 119 corp: 4/5b lim: 4 exec/s: 0 rss: 49Mb L: 1/2 MS: 2 ChangeBinInt-EraseBytes-
#30 NEW cov: 131 ft: 134 corp: 5/8b lim: 4 exec/s: 0 rss: 49Mb L: 3/3 MS: 4 ChangeByte-ChangeBit-InsertByte-CopyPart-
#31 NEW cov: 135 ft: 138 corp: 6/11b lim: 4 exec/s: 0 rss: 49Mb L: 3/3 MS: 1 CrossOver-
#39 NEW cov: 135 ft: 142 corp: 7/15b lim: 4 exec/s: 0 rss: 49Mb L: 4/4 MS: 3 ChangeBit-CrossOver-CopyPart-
```

--------------------------------

### Full Configuration Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/configuration.md

This snippet demonstrates how to set up a comprehensive configuration dictionary for the FormParser, covering file upload behavior, size limits, and validation rules.

```python
from python_multipart import FormParser

config = {
    # File upload settings
    "UPLOAD_DIR": "/var/uploads",
    "UPLOAD_DELETE_TMP": False,  # Keep files on disk
    "UPLOAD_KEEP_FILENAME": True,
    "UPLOAD_KEEP_EXTENSIONS": True,

    # Size limits
    "MAX_BODY_SIZE": 500 * 1024 * 1024,  # 500 MiB total
    "MAX_MEMORY_FILE_SIZE": 20 * 1024 * 1024,  # 20 MiB in memory

    # Header limits
    "MAX_HEADER_COUNT": 16,
    "MAX_HEADER_SIZE": 8192,

    # Validation
    "UPLOAD_ERROR_ON_BAD_CTE": True,  # Strict mode
}

parser = FormParser(
    content_type="multipart/form-data",
    boundary=boundary,
    on_field=on_field,
    on_file=on_file,
    config=config,
)
```

--------------------------------

### File Constructor Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/file.md

Demonstrates creating a File instance with custom configuration and content type. Shows how to write data and check file properties after writing.
```python
from python_multipart.multipart import File

config = {
    "UPLOAD_DIR": "/tmp/uploads",
    "UPLOAD_KEEP_FILENAME": True,
    "MAX_MEMORY_FILE_SIZE": 5 * 1024 * 1024,  # 5 MiB
}

file = File(
    b"profile.jpg",
    field_name=b"avatar",
    config=config,
    content_type="image/jpeg"
)

# Write data (automatic spillover to disk when exceeding MAX_MEMORY_FILE_SIZE)
file.write(image_data)
file.finalize()

print(file.size)  # bytes written
print(file.in_memory)  # False if spilled
print(file.actual_file_name)  # b"/tmp/uploads/profile.jpg" or temp path
```

--------------------------------

### QuerystringParser Usage Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/types.md

Demonstrates how to instantiate and use the QuerystringParser with custom callback functions.

```python
from python_multipart.multipart import QuerystringParser

# The annotation is quoted because QuerystringCallbacks is not imported here;
# it only serves as a hint for type checkers.
callbacks: "QuerystringCallbacks" = {
    "on_field_name": lambda d, s, e: print(d[s:e]),
    "on_field_end": lambda: print("field done"),
}

parser = QuerystringParser(callbacks=callbacks)
```

--------------------------------

### BaseParser Initialization Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/base-parser.md

Demonstrates how to instantiate the BaseParser. Note that this class is typically not instantiated directly in practice.

```python
from python_multipart.multipart import BaseParser

# Not typically instantiated directly
parser = BaseParser()
```

--------------------------------

### FormParser Configuration Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Shows how to configure the FormParser with various settings like body size limits, upload directory, and filename handling.
```python
from python_multipart import FormParser

config = {
    "MAX_BODY_SIZE": 50 * 1024 * 1024,  # 50 MiB limit
    "MAX_MEMORY_FILE_SIZE": 10 * 1024 * 1024,  # 10 MiB in-memory
    "UPLOAD_DIR": "/var/uploads",
    "UPLOAD_KEEP_FILENAME": True,
    "UPLOAD_KEEP_EXTENSIONS": True,
    "MAX_HEADER_COUNT": 16,
    "MAX_HEADER_SIZE": 8192,
}

parser = FormParser(
    content_type="multipart/form-data",
    boundary=boundary,
    on_field=on_field,
    on_file=on_file,
    config=config,
)
```

--------------------------------

### Application/octet-stream Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Demonstrates an application/octet-stream request, where the entire body is treated as a single file.

```http
PUT /upload/document.pdf HTTP/1.1
Content-Type: application/octet-stream

[binary PDF data]
```

--------------------------------

### FormParser Configuration Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/types.md

Shows how to configure the FormParser with custom file upload settings. Ensure UPLOAD_DIR is set if UPLOAD_KEEP_FILENAME is True.

```python
from python_multipart.multipart import FormParser

# The annotation is quoted because FileConfig is not imported here;
# it only serves as a hint for type checkers.
config: "FileConfig" = {
    "UPLOAD_DIR": "/var/uploads",
    "UPLOAD_KEEP_FILENAME": True,
    "MAX_MEMORY_FILE_SIZE": 5 * 1024 * 1024,
}

parser = FormParser(
    content_type="multipart/form-data",
    boundary=boundary,
    on_field=None,
    on_file=None,
    config=config,
)
```

--------------------------------

### Application/x-www-form-urlencoded Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Shows the format for an application/x-www-form-urlencoded request, where data is sent as a query string.
```http
POST /login HTTP/1.1
Content-Type: application/x-www-form-urlencoded

username=alice&password=secret&remember=true
```

--------------------------------

### FormParser Configuration Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/types.md

Demonstrates how to configure FormParser with custom settings for body size, header limits, and upload options. Ensure all necessary FileConfig keys are also provided.

```python
from python_multipart.multipart import FormParser

# The annotation is quoted because FormParserConfig is not imported here;
# it only serves as a hint for type checkers.
config: "FormParserConfig" = {
    "MAX_BODY_SIZE": 50 * 1024 * 1024,  # 50 MiB
    "MAX_HEADER_COUNT": 16,
    "MAX_HEADER_SIZE": 8192,
    "UPLOAD_DIR": "/var/uploads",
    "UPLOAD_KEEP_FILENAME": True,
    "MAX_MEMORY_FILE_SIZE": 10 * 1024 * 1024,
}

parser = FormParser(
    content_type="multipart/form-data",
    boundary=boundary,
    on_field=on_field,
    on_file=on_file,
    config=config,
)
```

--------------------------------

### OctetStreamParser Basic Usage Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Demonstrates basic instantiation and writing of binary data to an OctetStreamParser.

```python
from python_multipart import OctetStreamParser

parser = OctetStreamParser(max_size=1024*1024)
parser.write(b"binary data")
parser.finalize()
```

--------------------------------

### FormParser Initialization and Usage Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Demonstrates how to instantiate FormParser with custom callbacks for fields, files, and completion, and how to feed data to it. Includes setting a maximum memory file size configuration.
```python
from python_multipart import FormParser

fields = []
files = []

def on_field(field):
    fields.append(field)
    print(f"Field: {field.field_name.decode()} = {field.value.decode()}")

def on_file(file):
    files.append(file)
    print(f"File: {file.file_name.decode()} ({file.size} bytes)")

def on_end():
    print(f"Parsing complete: {len(fields)} fields, {len(files)} files")

parser = FormParser(
    content_type="multipart/form-data",
    boundary=b"----WebKit",
    on_field=on_field,
    on_file=on_file,
    on_end=on_end,
    config={"MAX_MEMORY_FILE_SIZE": 5 * 1024 * 1024}
)

# Feed data
parser.write(b"... multipart data ...")
parser.finalize()
```

--------------------------------

### File Write Method Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/file.md

Shows how to write data chunks to a File instance. The write method returns the number of bytes successfully written.

```python
from python_multipart.multipart import File

file = File(b"document.pdf")
written = file.write(pdf_data_chunk)
print(f"Wrote {written} bytes")
```

--------------------------------

### Multipart/form-data Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Illustrates the structure of a multipart/form-data request, including boundary, Content-Disposition, and Content-Type headers for fields and files.

```http
POST /upload HTTP/1.1
Content-Type: multipart/form-data; boundary=----WebKit

------WebKit
Content-Disposition: form-data; name="username"

alice
------WebKit
Content-Disposition: form-data; name="avatar"; filename="pic.jpg"
Content-Type: image/jpeg

[binary image data]
------WebKit--
```

--------------------------------

### Example: Parse URL Encoded Form Data

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/helper-functions.md

Demonstrates using `parse_form` to parse URL-encoded data from a stream. This example shows how to set up the necessary headers, create a byte stream, and define a callback for processing parsed fields.
The `chunk_size` can be adjusted for performance.

```python
from python_multipart import parse_form
from io import BytesIO

# Simulated request stream
body = b"username=alice&password=secret"
stream = BytesIO(body)

fields = []

def on_field(field):
    fields.append(field)

headers = {
    "Content-Type": b"application/x-www-form-urlencoded",
    "Content-Length": str(len(body)).encode(),
}

parse_form(headers, stream, on_field, None, chunk_size=8192)
print(f"Parsed {len(fields)} fields")
```

--------------------------------

### Usage Example for Default Constants

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/types.md

Demonstrates how to import and use the default constants for header count and boundary length.

```python
from python_multipart.multipart import DEFAULT_MAX_HEADER_COUNT, MAX_BOUNDARY_LENGTH

print(f"Default headers: {DEFAULT_MAX_HEADER_COUNT}")
print(f"Max boundary: {MAX_BOUNDARY_LENGTH}")
```

--------------------------------

### Create Form Parser and Process Files

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/file.md

Example demonstrating how to create a form parser and handle incoming files during multipart/form-data parsing. The on_file callback processes each received file.

```python
from python_multipart import create_form_parser

files = []

def on_file(file):
    files.append(file)
    print(f"Received file: {file.file_name}")
    print(f"Size: {file.size} bytes")
    if not file.in_memory:
        print(f"Saved to: {file.actual_file_name}")

headers = {
    "Content-Type": b"multipart/form-data; boundary=----WebKit"
}

parser = create_form_parser(headers, None, on_file)
# ... feed multipart data ...
parser.finalize()
```

--------------------------------

### OctetStreamParser Streaming Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Shows how to stream binary data from a socket-like source using OctetStreamParser and its on_data callback.
```python
from python_multipart import OctetStreamParser

def on_data(data, start, end):
    chunk = data[start:end]
    print(f"Received {end - start} bytes")

parser = OctetStreamParser(
    callbacks={"on_data": on_data},
    max_size=10*1024*1024  # 10 MiB
)

# Stream binary data (`socket` stands in for an already-connected socket object)
while True:
    chunk = socket.recv(8192)
    if not chunk:
        break
    parser.write(chunk)

parser.finalize()
```

--------------------------------

### Configuration with Environment Variables

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/configuration.md

This example shows how to use environment variables to set configuration values for the FormParser, so that settings like the upload directory and maximum body size can be controlled from outside the code.

```python
import os
from python_multipart import FormParser

config = {
    "UPLOAD_DIR": os.environ.get("UPLOAD_DIR", "/tmp/uploads"),
    "MAX_BODY_SIZE": float(os.environ.get("MAX_BODY_SIZE", float('inf'))),
}

parser = FormParser(..., config=config)
```

--------------------------------

### Content-Transfer-Encoding Base64 Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

An example of using Content-Transfer-Encoding with base64 for a multipart/form-data field. The decoded value will be b"foobar".

```http
Content-Disposition: form-data; name="data"
Content-Transfer-Encoding: base64

Zm9vYmFy  # "foobar" in base64
```

--------------------------------

### Set Callback Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/base-parser.md

Demonstrates registering a callback using a lambda function and then unregistering it. This allows dynamic management of event handlers.
```python
from python_multipart import OctetStreamParser

parser = OctetStreamParser()

# Register a callback
parser.set_callback("data", lambda d, s, e: print(d[s:e]))

# Unregister a callback
parser.set_callback("data", None)
```

--------------------------------

### QuerystringParser Basic Usage Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Demonstrates writing URL-encoded data in chunks to a QuerystringParser configured for loose parsing.

```python
from python_multipart import QuerystringParser

parser = QuerystringParser(strict_parsing=False)
parser.write(b"username=alice&")
parser.write(b"password=secret&age=30")
parser.finalize()
```

--------------------------------

### Usage Pattern with Set Callback

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/base-parser.md

Illustrates creating a parser instance and then registering multiple callbacks for different events using the `set_callback` method. This allows for flexible event handling setup.

```python
from python_multipart import OctetStreamParser

# on_data is a user-defined callback taking (data, start, end)
parser = OctetStreamParser()
parser.set_callback("data", on_data)
parser.set_callback("start", lambda: print("Started"))
parser.set_callback("end", lambda: print("Finished"))
```

--------------------------------

### File Flush to Disk Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/file.md

Demonstrates manually forcing an in-memory file to be flushed to disk. This is useful for ensuring a file is stored on disk even if it hasn't exceeded the memory limit.

```python
from python_multipart.multipart import File

file = File(b"large.bin", config={"MAX_MEMORY_FILE_SIZE": float('inf')})
file.write(lots_of_data)
file.flush_to_disk()  # Force spillover
print(file.actual_file_name)  # Now has a path
```

--------------------------------

### Base64Decoder Write Method Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/decoders.md

Shows how to use the write method of Base64Decoder, demonstrating that data can be written in arbitrary chunks. The decoder handles caching and alignment for base64 decoding.
```python
from io import BytesIO
from python_multipart.decoders import Base64Decoder

output = BytesIO()  # Destination for the decoded bytes
decoder = Base64Decoder(output)

# Can write in any size chunks
decoder.write(b"Zm9v")  # "foo" in base64
decoder.write(b"YmFy")  # "bar" in base64
decoder.finalize()

output.getvalue()  # b"foobar"
```

--------------------------------

### Manual Field Creation and Parsing

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/field.md

Provides an example of manually creating and using a form parser to process URL-encoded data, populating lists of fields and files.

```python
from python_multipart import create_form_parser
from io import BytesIO

fields = []
files = []

def on_field(field):
    fields.append(field)

def on_file(file):
    files.append(file)

headers = {
    "Content-Type": b"application/x-www-form-urlencoded"
}

parser = create_form_parser(headers, on_field, on_file)
parser.write(b"name=Alice&age=30")
parser.finalize()

for field in fields:
    print(f"{field.field_name.decode()} = {field.value.decode()}")
```

--------------------------------

### MultipartParser write Method Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Demonstrates feeding multipart data chunks to the parser and processing events using registered callbacks. Be sure to call `finalize()` after all data has been written.
```python
from python_multipart import MultipartParser

parser = MultipartParser(boundary=b"----WebKit")

def on_part_begin():
    print("New part")

def on_headers_finished():
    print("Headers complete, body follows")

def on_part_data(data, start, end):
    chunk = data[start:end]
    print(f"Part data: {len(chunk)} bytes")

def on_part_end():
    print("Part complete")

parser.set_callback("part_begin", on_part_begin)
parser.set_callback("headers_finished", on_headers_finished)
parser.set_callback("part_data", on_part_data)
parser.set_callback("part_end", on_part_end)

# Feed multipart data
parser.write(b"\r\n------WebKit\r\n")
parser.write(b"Content-Disposition: form-data; name=\"file\"; filename=\"test.txt\"\r\n\r\n")
parser.write(b"file contents\r\n")
parser.write(b"------WebKit--\r\n")
parser.finalize()
```

--------------------------------

### QuerystringParser Detailed Parsing Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Illustrates parsing URL-encoded data with custom callbacks to capture field names and values, handling cases where values might be absent.
```python
from python_multipart import QuerystringParser

fields = {}
current_name = []
current_value = []

def on_field_name(data, start, end):
    current_name.append(data[start:end])

def on_field_data(data, start, end):
    current_value.append(data[start:end])

def on_field_end():
    # Join the collected byte chunks with an empty bytes separator
    name = b"".join(current_name).decode()
    value = b"".join(current_value).decode() if current_value else None
    fields[name] = value
    current_name.clear()
    current_value.clear()

parser = QuerystringParser(callbacks={
    "on_field_name": on_field_name,
    "on_field_data": on_field_data,
    "on_field_end": on_field_end,
})

parser.write(b"user=alice&role=admin&verified")
parser.finalize()

print(fields)  # {'user': 'alice', 'role': 'admin', 'verified': None}
```

--------------------------------

### Example: Boundary Extraction

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/helper-functions.md

Illustrates how `create_form_parser` automatically extracts the boundary string from the 'Content-Type' header for multipart requests. The boundary is essential for delimiting parts of the form data.

```python
# Input
headers = {"Content-Type": b"multipart/form-data; boundary=----WebKit"}

# Automatically extracts boundary as b"----WebKit"
parser = create_form_parser(headers, on_field, on_file)
```

--------------------------------

### FormParser write() Method Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Illustrates how to use the `write` method to feed data chunks to the FormParser, typically in a loop while streaming from a request body. `finalize` is called once the stream is exhausted.

```python
parser = FormParser(...)
# Stream from request body
while True:
    chunk = request.read(8192)
    if not chunk:
        break
    parser.write(chunk)

parser.finalize()
```

--------------------------------

### Base64Decoder Close Method Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/decoders.md

Demonstrates the proper usage of the close method for Base64Decoder within a try-finally block to ensure the underlying file object is also closed.

```python
decoder = Base64Decoder(file_obj)
try:
    decoder.write(base64_data)
    decoder.finalize()
finally:
    decoder.close()  # Close underlying file
```

--------------------------------

### MultipartParser Header Parsing Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Illustrates how to capture header field names and values using callbacks. The '_field' temporary key is used to store the field name before its corresponding value is processed.

```python
from python_multipart import MultipartParser

headers = {}

def on_header_field(data, start, end):
    headers['_field'] = data[start:end].decode()

def on_header_value(data, start, end):
    field = headers.pop('_field')
    value = data[start:end].decode()
    headers[field] = value

parser = MultipartParser(boundary=b"boundary")
parser.set_callback("header_field", on_header_field)
parser.set_callback("header_value", on_header_value)
```

--------------------------------

### FormParser close() Method Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Shows the correct sequence of calling `finalize` followed by `close` on a FormParser instance to ensure all data is processed and any associated resources are properly released.
```python
parser.finalize()
parser.close()
```

--------------------------------

### Catch All Python Multipart Parser Errors

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/exceptions.md

Provides a comprehensive example of catching various python-multipart exceptions, from specific parsing errors to general form parser configuration issues and unexpected exceptions.

```python
from python_multipart import create_form_parser
from python_multipart.exceptions import (
    FormParserError,
    MultipartParseError,
    QuerystringParseError,
    DecodeError,
    FileError,
)

def error_response(message):
    # Placeholder for actual error response logic
    print(f"Error: {message}")
    return message

# Wrapped in a function so that the `return` statements are valid Python
def handle_upload(headers, body_data, on_field=None, on_file=None):
    try:
        parser = create_form_parser(headers, on_field, on_file)
        parser.write(body_data)
        parser.finalize()
    except MultipartParseError as e:
        # Specific multipart parsing error
        return error_response(f"Multipart parse error at {e.offset}: {e}")
    except (QuerystringParseError, DecodeError) as e:
        # Other parsing errors
        return error_response(f"Parse error: {e}")
    except FileError as e:
        # File system error (likely disk full)
        return error_response(f"Upload storage error: {e}")
    except FormParserError as e:
        # Configuration/setup error
        return error_response(f"Form parser config error: {e}")
    except Exception as e:
        # Unexpected error
        return error_response(f"Unexpected error: {e}")
```

--------------------------------

### bytes_received Property

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Gets the total number of bytes written to the parser so far.

```APIDOC
## bytes_received Property

Total bytes written to the parser so far (updated by each `write()` call).
### Signature

```python
@property
def bytes_received(self) -> int
```

### Type

`int`
```

--------------------------------

### Usage Pattern with Initializer Callbacks

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/base-parser.md

Demonstrates instantiating a parser subclass and providing callbacks directly in the constructor. This is a convenient way to set up initial event handlers.

```python
from python_multipart.multipart import OctetStreamParser

def on_data(data, start, end):
    chunk = data[start:end]
    print(f"Received: {chunk}")

parser = OctetStreamParser(callbacks={"on_data": on_data})
parser.write(b"hello world")
parser.finalize()
```

--------------------------------

### MultipartParser Initialization

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/configuration.md

Initialize MultipartParser with boundary, callbacks, and size/header limits.

```python
MultipartParser(
    boundary: bytes | str,
    callbacks: dict = {},
    max_size: float = float('inf'),
    max_header_count: int = DEFAULT_MAX_HEADER_COUNT,
    max_header_size: int = DEFAULT_MAX_HEADER_SIZE,
)
```

--------------------------------

### OctetStreamParser

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Parses raw binary (application/octet-stream) request bodies. It emits start, data, and end callbacks. The constructor accepts callbacks and a maximum size for the body.

```APIDOC
## OctetStreamParser

Parses raw binary (application/octet-stream) request bodies. This is the simplest parser: it emits start, data, and end callbacks.

**Import:** `from python_multipart import OctetStreamParser`

### Constructor

```python
OctetStreamParser(
    callbacks: OctetStreamCallbacks = {},
    max_size: float = float("inf")
)
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| callbacks | dict | {} | Callback functions (see below). |
| max_size | float | inf | Maximum body size in bytes. If exceeded, data is truncated. |

**Raises:** `ValueError` if `max_size` is not a positive number.

### Callbacks

| Name | Signature | Description |
|------|-----------|-------------|
| on_start | `() -> None` | Emitted when the first data is written. |
| on_data | `(bytes, int, int) -> None` | Emitted for each chunk of data. |
| on_end | `() -> None` | Emitted when `finalize()` is called. |

### Methods

#### write

```python
def write(self, data: bytes) -> int
```

Write binary data to the parser.

**Returns:** Number of bytes processed (may be less than `len(data)` if size limit is exceeded).

**Example:**

```python
parser = OctetStreamParser(max_size=1024*1024)
parser.write(b"binary data")
parser.finalize()
```

#### finalize

```python
def finalize(self) -> None
```

Signal end of input and emit the `on_end` callback.

### Example

```python
from python_multipart import OctetStreamParser

def on_data(data, start, end):
    chunk = data[start:end]
    print(f"Received {end - start} bytes")

parser = OctetStreamParser(
    callbacks={"on_data": on_data},
    max_size=10*1024*1024  # 10 MiB
)

# Stream binary data (e.g., from socket)
while True:
    chunk = socket.recv(8192)
    if not chunk:
        break
    parser.write(chunk)

parser.finalize()
```
```

--------------------------------

### Create and Write to a Field

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/field.md

Demonstrates creating a Field instance with a name, writing data to it, finalizing it, and accessing its name and value.

```python
from python_multipart.multipart import Field

# Create a field with a name
field = Field(b"username")
field.write(b"alice")
field.finalize()

print(field.field_name)  # b"username"
print(field.value)  # b"alice"
```

--------------------------------

### Web Framework Usage Pattern

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Demonstrates a typical integration pattern for the FormParser within a web framework, including setting up callbacks and streaming the request body.
```python
from python_multipart import create_form_parser

class UploadHandler:
    def handle_request(self, request):
        fields = {}
        files = {}

        def on_field(field):
            fields[field.field_name.decode()] = field.value

        def on_file(file):
            files[file.field_name.decode()] = file

        parser = create_form_parser(
            request.headers,
            on_field,
            on_file,
            config={"MAX_MEMORY_FILE_SIZE": 20 * 1024 * 1024},
        )

        # Stream request body
        for chunk in request.stream:
            parser.write(chunk)

        parser.finalize()
        return self.process(fields, files)
```

--------------------------------

### Testing Basic WSGI Application with Curl

Source: https://github.com/kludex/python-multipart/blob/main/docs/index.md

Demonstrates how to test the basic WSGI application using curl to verify server response and content.

```console
$ curl -ik http://localhost:8123/
HTTP/1.0 200 OK
Date: Sun, 07 Apr 2013 01:49:03 GMT
Server: WSGIServer/0.1 Python/2.7.3
Content-type: text/plain
Content-Length: 8

Hashes:
```

--------------------------------

### Instance Methods

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/file.md

Provides methods to interact with the file data, including writing data, handling internal callbacks, flushing to disk, and finalizing the file.

```APIDOC
## Instance Methods

### write

```python
def write(self, data: bytes) -> int
```

Write bytes into the file. Calls `on_data()` internally.

### Parameters

* **data** (bytes) - Bytes to write.

**Returns:** Number of bytes written.

**Raises:** `FileError` if spillover to disk fails.

**Example:**

```python
file = File(b"document.pdf")
written = file.write(pdf_data_chunk)
print(f"Wrote {written} bytes")
```

### on_data

```python
def on_data(self, data: bytes) -> int
```

Callback invoked when data is written to the file. Handles memory-to-disk spillover if the file exceeds `MAX_MEMORY_FILE_SIZE`.

### Parameters

* **data** (bytes) - Bytes to write.

**Returns:** Bytes written (may be less than input if write fails).

**Behavior:**

1. Writes data to current file object (in-memory or disk)
2. Increments `_bytes_written` counter
3. If in-memory and `_bytes_written > MAX_MEMORY_FILE_SIZE`, calls `flush_to_disk()`

### flush_to_disk

```python
def flush_to_disk(self) -> None
```

Manually flush an in-memory file to disk. Creates a temporary file and copies the in-memory buffer to it.

**Raises:** `FileError` if unable to create the temporary file.

**Behavior:**

1. If already on disk, logs a warning and returns
2. Seeks to start of in-memory buffer
3. Creates a new temporary file via `_get_disk_file()`
4. Copies all data to the temporary file
5. Seeks to the write position
6. Replaces the file object
7. Closes the old in-memory buffer

**Example:**

```python
file = File(b"large.bin", config={"MAX_MEMORY_FILE_SIZE": float('inf')})
file.write(lots_of_data)
file.flush_to_disk()  # Force spillover
print(file.actual_file_name)  # Now has a path
```

### on_end

```python
def on_end(self) -> None
```

Callback invoked when the file is finalized. Flushes the underlying file object.
```

--------------------------------

### QuerystringParser Initialization

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/configuration.md

Initialize QuerystringParser with callbacks, strict parsing option, and maximum size.

```python
QuerystringParser(
    callbacks: dict = {},
    strict_parsing: bool = False,
    max_size: float = float('inf'),
)
```

--------------------------------

### FormParser finalize() Method Example

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Demonstrates the usage of the `finalize` method to signal the end of input data to the FormParser. This is crucial for triggering any pending `on_end` callbacks and completing the parsing process.
```python
parser.write(body_data)
parser.finalize()  # Must be called so that pending on_end callbacks run
```

--------------------------------

### Integrate with Async Framework (Starlette/FastAPI)

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/helper-functions.md

Demonstrates integration with asynchronous frameworks like Starlette or FastAPI. It uses `create_form_parser` and streams chunks from the request body for parsing.

```python
from python_multipart import create_form_parser


async def upload_handler(request):
    # Starlette exposes header names in lowercase, while create_form_parser
    # looks up the "Content-Type" key, so build the dict explicitly.
    headers = {"Content-Type": request.headers.get("content-type", "")}

    fields = []
    files = []

    parser = create_form_parser(
        headers,
        lambda f: fields.append(f),
        lambda f: files.append(f),
    )

    # Stream chunks from request body
    async for chunk in request.stream():
        parser.write(chunk)
    parser.finalize()

    return {"fields": len(fields), "files": len(files)}
```

--------------------------------

### MultipartParser Constructor

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Initializes a new MultipartParser instance. It requires a boundary string and can optionally accept callbacks, maximum size, and limits for headers.

```APIDOC
## MultipartParser Constructor

### Description

Initializes a new MultipartParser instance. It requires a boundary string and can optionally accept callbacks, maximum size, and limits for headers.

### Signature

```python
MultipartParser(
    boundary: bytes | str,
    callbacks: MultipartCallbacks = {},
    max_size: float = float("inf"),
    *,
    max_header_count: int = DEFAULT_MAX_HEADER_COUNT,
    max_header_size: int = DEFAULT_MAX_HEADER_SIZE,
)
```

### Parameters

- **boundary** (bytes | str) - Required - The boundary string from Content-Type header (without leading "--").
- **callbacks** (dict) - Optional - Callback functions.
- **max_size** (float) - Optional - Maximum body size in bytes. Defaults to infinity.
- **max_header_count** (int) - Optional - Maximum number of headers per part. Defaults to DEFAULT_MAX_HEADER_COUNT.
- **max_header_size** (int) - Optional - Maximum size of a single header line (bytes). Defaults to DEFAULT_MAX_HEADER_SIZE.

### Raises

- `ValueError` if `max_size` is invalid
- `FormParserError` if boundary exceeds 256 bytes
```

--------------------------------

### Import Core Components

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/overview.md

Import necessary classes and functions from the python_multipart library for parsing various content types.

```python
from python_multipart import (
    BaseParser,
    FormParser,
    MultipartParser,
    OctetStreamParser,
    QuerystringParser,
    create_form_parser,
    parse_form,
)
```

--------------------------------

### Testing Streaming Hash Calculation with Curl

Source: https://github.com/kludex/python-multipart/blob/main/docs/index.md

Demonstrates how to test the streaming SHA-256 hash calculation application using curl with a file upload. It verifies that the calculated hash matches the file's actual hash.

```console
$ echo "Foo bar" > /tmp/test.txt
$ shasum -a 256 /tmp/test.txt
0b64696c0f7ddb9e3435341720988d5455b3b0f0724688f98ec8e6019af3d931  /tmp/test.txt
$ curl -ik -F file=@/tmp/test.txt http://localhost:8123/
HTTP/1.0 200 OK
Date: Sun, 07 Apr 2013 02:09:10 GMT
Server: WSGIServer/0.1 Python/2.7.3
Content-type: text/plain

Hashes:
Part hash: 0b64696c0f7ddb9e3435341720988d5455b3b0f0724688f98ec8e6019af3d931
```

--------------------------------

### FormParser Constructor

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Initializes a FormParser instance. It takes the content type, callbacks for fields, files, and completion, and optional parameters like boundary and configuration.

```APIDOC
## FormParser Constructor

Initializes a FormParser instance.

### Signature

```python
FormParser(
    content_type: str,
    on_field: Callable[[Field], None] | None,
    on_file: Callable[[File], None] | None,
    on_end: Callable[[], None] | None = None,
    boundary: bytes | str | None = None,
    file_name: bytes | None = None,
    config: dict[Any, Any] = {},
) -> FormParser
```

### Parameters

* **content_type** (str) - The Content-Type header value (e.g., "multipart/form-data").
* **on_field** (Callable | None) - Callback invoked for each parsed field.
* **on_file** (Callable | None) - Callback invoked for each parsed file.
* **on_end** (Callable | None) - Optional callback when all data is parsed. Defaults to None.
* **boundary** (bytes | str | None) - The multipart boundary (required for multipart/form-data). Defaults to None.
* **file_name** (bytes | None) - File name for octet-stream uploads (used as fallback). Defaults to None.
* **config** (dict) - Configuration dictionary (see overview.md). Defaults to {}.

### Raises

* `FormParserError` if content_type is unknown or boundary is missing for multipart
* `ValueError` if invalid config values provided

### Content-Type Handling

| Content-Type                      | Behavior                                                  |
|-----------------------------------|-----------------------------------------------------------|
| multipart/form-data               | Creates `MultipartParser`, requires `boundary` parameter  |
| application/x-www-form-urlencoded | Creates `QuerystringParser`                               |
| application/x-url-encoded         | Creates `QuerystringParser`                               |
| application/octet-stream          | Creates `OctetStreamParser`                               |
| Other                             | Raises `FormParserError`                                  |
```

--------------------------------

### OctetStreamParser Initialization

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/configuration.md

Initialize OctetStreamParser with callbacks and maximum size.
```python
OctetStreamParser(
    callbacks: dict = {},
    max_size: float = float('inf'),
)
```

--------------------------------

### MultipartParser Constructor

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/parsers.md

Initializes the MultipartParser with a boundary string, optional callbacks, and size limits. It raises ValueError for invalid max_size and FormParserError if the boundary exceeds 256 bytes.

```python
MultipartParser(
    boundary: bytes | str,
    callbacks: MultipartCallbacks = {},
    max_size: float = float("inf"),
    *,
    max_header_count: int = DEFAULT_MAX_HEADER_COUNT,
    max_header_size: int = DEFAULT_MAX_HEADER_SIZE,
)
```

--------------------------------

### Default Configuration for FormParser

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Shows the default configuration dictionary used by FormParser, which can be overridden by user-provided values during initialization. Includes settings for body size, header limits, and file upload behavior.

```python
DEFAULT_CONFIG = {
    "MAX_BODY_SIZE": float('inf'),
    "MAX_HEADER_COUNT": 8,
    "MAX_HEADER_SIZE": 4224,
    "MAX_MEMORY_FILE_SIZE": 1 * 1024 * 1024,  # 1 MiB
    "UPLOAD_DIR": None,
    "UPLOAD_DELETE_TMP": True,
    "UPLOAD_KEEP_FILENAME": False,
    "UPLOAD_KEEP_EXTENSIONS": False,
    "UPLOAD_ERROR_ON_BAD_CTE": False,
}
```

--------------------------------

### Run Fuzz Target

Source: https://github.com/kludex/python-multipart/blob/main/fuzz/README.md

Execute a fuzzing script to discover bugs in python-multipart. This command initiates the fuzzing process.

```sh
python fuzz/fuzz_form.py
```

--------------------------------

### Flask Integration for Uploads

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/usage-examples.md

Demonstrates how to use `python-multipart` with Flask to process incoming form data and file uploads. It accesses the request data and headers to initialize the parser.
```python
from flask import Flask, request
from python_multipart import create_form_parser

app = Flask(__name__)


@app.route('/upload', methods=['POST'])
def upload():
    # Convert Flask headers to dict format
    headers = {
        'Content-Type': request.headers.get('Content-Type', '').encode(),
        'Content-Length': request.headers.get('Content-Length', '').encode(),
    }

    fields = []
    files = []

    parser = create_form_parser(
        headers,
        lambda f: fields.append(f),
        lambda f: files.append(f),
    )

    # Read the whole body at once; for large uploads, read request.stream
    # in chunks and call parser.write() on each chunk instead.
    parser.write(request.get_data())
    parser.finalize()

    return {"uploaded": len(files)}
```

--------------------------------

### File Constructor

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/file.md

Creates a new File instance to store uploaded file data. It handles configuration for memory usage, temporary file management, and filename preservation.

```APIDOC
## File Constructor

```python
File(
    file_name: bytes | None,
    field_name: bytes | None = None,
    config: FileConfig = {},
    *,
    content_type: str | None = None
) -> File
```

Creates a new file instance for storing uploaded file data.

### Parameters

* **file_name** (bytes | None) - The original filename from the upload.
* **field_name** (bytes | None) - The form field name this file was uploaded with (None for octet-stream).
* **config** (FileConfig) - Configuration dictionary (see overview.md for keys).
* **content_type** (str | None) - The Content-Type header value (e.g., "image/png").

**Raises:** `FileError` if unable to create a temporary file (when spillover occurs).

### Example

```python
from python_multipart.multipart import File

config = {
    "UPLOAD_DIR": "/tmp/uploads",
    "UPLOAD_KEEP_FILENAME": True,
    "MAX_MEMORY_FILE_SIZE": 5 * 1024 * 1024,  # 5 MiB
}

file = File(
    b"profile.jpg",
    field_name=b"avatar",
    config=config,
    content_type="image/jpeg"
)

# Write data (automatic spillover to disk when exceeding MAX_MEMORY_FILE_SIZE)
file.write(image_data)
file.finalize()

print(file.size)  # bytes written
print(file.in_memory)  # False if spilled
print(file.actual_file_name)  # b"/tmp/uploads/profile.jpg" or temp path
```
```

--------------------------------

### Configure Parser Settings

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/overview.md

Customize parser behavior by setting various configuration options, such as maximum body size, memory file size, and temporary file handling.

```python
config = {
    "MAX_BODY_SIZE": float('inf'),  # Max request body size in bytes
    "MAX_MEMORY_FILE_SIZE": 1 * 1024 * 1024,  # Max in-memory file size (1 MiB)
    "MAX_HEADER_COUNT": 8,  # Max headers per multipart part
    "MAX_HEADER_SIZE": 4224,  # Max single header line size (bytes)
    "UPLOAD_DIR": None,  # Directory for temporary files (None = system temp)
    "UPLOAD_DELETE_TMP": True,  # Auto-delete temporary files
    "UPLOAD_KEEP_FILENAME": False,  # Keep original filename
    "UPLOAD_KEEP_EXTENSIONS": False,  # Keep file extension
    "UPLOAD_ERROR_ON_BAD_CTE": False,  # Raise on unknown Content-Transfer-Encoding
}
```

--------------------------------

### FastAPI/Starlette Application Integration

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/README.md

Shows how to integrate python-multipart with FastAPI or Starlette applications to handle file uploads and form data.
```python
from fastapi import Request
from python_multipart import create_form_parser

# `app`, `on_field`, and `on_file` are assumed to be defined elsewhere.


@app.post("/upload")
async def upload(request: Request):
    # Starlette lowercases header names; create_form_parser expects "Content-Type".
    headers = {"Content-Type": request.headers.get("content-type", "")}
    parser = create_form_parser(headers, on_field, on_file)
    parser.write(await request.body())
    parser.finalize()
```

--------------------------------

### FormParser Constructor Signature

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/form-parser.md

Defines the parameters for initializing a FormParser instance, including content type, callbacks for fields, files, and completion, and optional boundary and configuration settings.

```python
FormParser(
    content_type: str,
    on_field: Callable[[Field], None] | None,
    on_file: Callable[[File], None] | None,
    on_end: Callable[[], None] | None = None,
    boundary: bytes | str | None = None,
    file_name: bytes | None = None,
    config: dict[Any, Any] = {},
) -> FormParser
```

--------------------------------

### Multipart Size Limit Handling

Source: https://github.com/kludex/python-multipart/blob/main/_autodocs/usage-examples.md

Configures a multipart parser with explicit size limits for the total body and in-memory files. The code demonstrates how to monitor the write operation to detect when the size limit is reached and stop processing.
```python
from python_multipart import create_form_parser

config = {
    "MAX_BODY_SIZE": 10 * 1024 * 1024,  # 10 MiB limit
    "MAX_MEMORY_FILE_SIZE": 5 * 1024 * 1024,
}

parser = create_form_parser(headers, on_field, on_file, config=config)

# Write data - parser silently truncates if size limit exceeded
bytes_written = 0
while True:
    chunk = stream.read(1024 * 1024)
    if not chunk:
        break
    written = parser.write(chunk)
    bytes_written += written

    # Check if we hit the limit (write returned less than chunk size)
    if written < len(chunk):
        print(f"Size limit reached at {bytes_written} bytes")
        break

parser.finalize()
```

--------------------------------

### Simple WSGI Application for Multipart Parsing

Source: https://github.com/kludex/python-multipart/blob/main/docs/index.md

This snippet demonstrates a basic WSGI application that uses python-multipart to parse incoming request bodies. It defines callbacks for handling fields and files, and sets up a simple HTTP server.

```python
import python_multipart


def simple_app(environ, start_response):
    ret = []

    # The following two callbacks just append the name to the return value.
    def on_field(field):
        ret.append(b"Parsed value parameter named: %s" % (field.field_name,))

    def on_file(file):
        ret.append(b"Parsed file parameter named: %s" % (file.field_name,))

    # Create headers object. We need to convert from WSGI to the actual
    # name of the header, since this library does not assume that you are
    # using WSGI.
    headers = {'Content-Type': environ['CONTENT_TYPE']}
    if 'CONTENT_LENGTH' in environ:
        headers['Content-Length'] = environ['CONTENT_LENGTH']

    # Parse the form.
    python_multipart.parse_form(headers, environ['wsgi.input'], on_field, on_file)

    # Return something.
    start_response('200 OK', [('Content-type', 'text/plain')])
    ret.append(b'\n')
    return ret


from wsgiref.simple_server import make_server
from wsgiref.validate import validator

httpd = make_server('', 8123, simple_app)
print("Serving on port 8123...")
httpd.serve_forever()
```

--------------------------------

### Testing the WSGI Application with Curl

Source: https://github.com/kludex/python-multipart/blob/main/docs/index.md

This console command shows how to test the simple WSGI application by sending a multipart form data request using curl.

```console
$ curl -ik -F "foo=bar" http://localhost:8123/
```
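
For scripted tests, roughly the same request as the curl command above can be built from Python. The sketch below uses only the standard library; `build_multipart_body` and `build_request` are hypothetical helper names, not part of python-multipart, and the encoding is deliberately minimal: plain text fields only, no file parts, and no escaping of quotes in field names.

```python
import urllib.request
import uuid


def build_multipart_body(fields, boundary):
    """Encode a dict of text fields as a multipart/form-data body (bytes).

    Hypothetical helper for illustration; `boundary` is given without the
    leading "--", which is added here for each part.
    """
    lines = []
    for name, value in fields.items():
        lines.append(b"--" + boundary)
        lines.append(b'Content-Disposition: form-data; name="' + name.encode() + b'"')
        lines.append(b"")  # blank line separates part headers from part body
        lines.append(value.encode())
    lines.append(b"--" + boundary + b"--")  # closing delimiter
    lines.append(b"")
    return b"\r\n".join(lines)


def build_request(url, fields):
    """Build (but do not send) a POST request carrying the given form fields."""
    boundary = uuid.uuid4().hex.encode()
    body = build_multipart_body(fields, boundary)
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "multipart/form-data; boundary=" + boundary.decode()},
        method="POST",
    )


# Usage, assuming the WSGI server from the previous snippet is running:
# req = build_request("http://localhost:8123/", {"foo": "bar"})
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Building the request separately from sending it keeps the encoding step easy to inspect without a running server.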