### Install rfc3986

Source: https://github.com/python-hyper/rfc3986/blob/main/README.rst

Use pip to install the library.

```bash
pip install rfc3986
```

--------------------------------

### Install rfc3986 using pip

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/index.md

Use pip to install the rfc3986 library. Run the command directly or through the `python -m pip` pattern.

```bash
pip install rfc3986
```

```bash
python -m pip install rfc3986
```

```bash
python3.11 -m pip install rfc3986
```

--------------------------------

### Encode IRI to URI with IDNA

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Encode an IRI to a standard URI using the `encode()` method, which applies IDNA encoding to hostnames containing Unicode characters. This requires the 'idna' package to be installed.

```python
import rfc3986

# ... (previous IRI parsing code) ...

# Encode IRI to URI (converts unicode to ASCII-compatible encoding)
try:
    uri = iri.encode()
    print(uri.unsplit())  # => 'https://xn--fiqs8s/' (punycode encoded)
except rfc3986.exceptions.MissingDependencyError:
    print("Install 'idna' package for IRI encoding: pip install rfc3986[idna]")
```

--------------------------------

### Create URI Variations from a Base

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Demonstrates the immutability of `URIBuilder`. New URIs can be created by adding paths to a base builder instance without altering the original.

```python
from rfc3986.builder import URIBuilder

base = URIBuilder().add_scheme('https').add_host('api.example.com')

users_url = base.add_path('/users').geturl()
# => 'https://api.example.com/users'

posts_url = base.add_path('/posts').geturl()
# => 'https://api.example.com/posts'
```

--------------------------------

### Require Presence of URI Components

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/user/validating.md

Use this to enforce that specific components like 'scheme' and 'host' must be present in the URI.
```python
>>> from rfc3986 import validators, uri_reference
>>> user_url = 'https://github.com/sigmavirus24/rfc3986'
>>> validator = validators.Validator().allow_schemes(
...     'https',
... ).allow_hosts(
...     'github.com',
... ).forbid_use_of_password(
... ).require_presence_of(
...     'scheme', 'host',
... )
>>> validator.validate(uri_reference('//github.com'))
Traceback (most recent call last):
...
rfc3986.exceptions.MissingComponentError
>>> validator.validate(uri_reference('https:/'))
Traceback (most recent call last):
...
rfc3986.exceptions.MissingComponentError
>>> validator.validate(uri_reference('https://github.com'))
>>> validator.validate(uri_reference(
...     'https://github.com/sigmavirus24/rfc3986'
... ))
```

--------------------------------

### Construct URIs with URIBuilder

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Use the immutable, chainable URIBuilder API to programmatically construct URIs with automatic encoding and normalization.

```python
from rfc3986.builder import URIBuilder

# Build a URI from scratch
url = URIBuilder().add_scheme(
    'https'
).add_host(
    'api.example.com'
).add_port(
    443
).add_path(
    '/v1/users'
).add_query_from(
    {'page': 1, 'limit': 10}
).add_fragment(
    'results'
).finalize().unsplit()
print(url)  # => 'https://api.example.com:443/v1/users?page=1&limit=10#results'

# Use geturl() shortcut instead of finalize().unsplit()
url = URIBuilder().add_scheme('https').add_host('example.com').add_path('/api').geturl()
print(url)  # => 'https://example.com/api'

# Add credentials (automatically URL-encoded)
url = URIBuilder().add_scheme(
    'https'
).add_credentials(
    'admin', 'p@ss:word!'  # Special characters are encoded
).add_host(
    'secure.example.com'
).add_path(
    '/admin'
).geturl()
print(url)  # => 'https://admin:p%40ss%3Aword%21@secure.example.com/admin'

# Build from an existing URI and modify it.
# extend_path appends to the existing path; add_path would replace it.
builder = URIBuilder.from_uri('https://api.github.com/users')
new_url = builder.extend_path('/sigmavirus24').add_query_from({'tab': 'repos'}).geturl()
print(new_url)  # => 'https://api.github.com/users/sigmavirus24?tab=repos'

# Extend existing path
url = URIBuilder().add_scheme('https').add_host('example.com').add_path(
    '/api'
).extend_path(
    '/v2/users'
).geturl()
print(url)  # => 'https://example.com/api/v2/users'
```

--------------------------------

### Parse URI with urlparse (stdlib compatible)

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Use `rfc3986.urlparse` as a direct replacement for `urllib.parse.urlparse`. It returns a `ParseResult` object with familiar attributes like `netloc`, `hostname`, and `params`. Modified copies can be created using `copy_with`, and components can be removed by passing `None`.
```python
import rfc3986

# Parse URI using stdlib-compatible interface
result = rfc3986.urlparse('https://user:password@example.com:8080/path/to/resource?query=value#section')

# Access components using familiar attributes
print(result.scheme)    # => 'https'
print(result.netloc)    # => 'user:password@example.com:8080' (alias for authority)
print(result.hostname)  # => 'example.com' (alias for host)
print(result.port)      # => 8080 (as integer)
print(result.path)      # => '/path/to/resource'
print(result.query)     # => 'query=value'
print(result.params)    # => 'query=value' (alias for query)
print(result.fragment)  # => 'section'
print(result.userinfo)  # => 'user:password'

# Get the full URL back
print(result.geturl())
# => 'https://user:password@example.com:8080/path/to/resource?query=value#section'

# Create modified copies
new_result = result.copy_with(scheme='http', port=443, path='/new/path')
print(new_result.geturl())
# => 'http://user:password@example.com:443/new/path?query=value#section'

# Remove components by passing None
minimal = result.copy_with(userinfo=None, query=None, fragment=None)
print(minimal.geturl())  # => 'https://example.com:8080/path/to/resource'
```

--------------------------------

### Normalize URIs with normalize_uri and URIReference.normalize()

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Normalize URIs according to RFC 3986 Section 6.2.2 using `normalize_uri` or the `URIReference.normalize()` method. Normalization lowercases the scheme and host while preserving the case of the path. Normalized URIs can be compared for equality.

```python
import rfc3986

# Normalize a mangled URI
mangled = 'hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth'
normalized = rfc3986.normalize_uri(mangled)
print(normalized)  # => 'http://example.com/Some/reallY/biZZare/pAth'

# Normalization preserves path case (only scheme and host are lowercased)
uri = rfc3986.uri_reference('HTTPS://GitHub.COM/User/Repo')
normal = uri.normalize()
print(normal.scheme)     # => 'https'
print(normal.host)       # => 'github.com'
print(normal.unsplit())  # => 'https://github.com/User/Repo'

# Normalized URIs can be compared for equality
uri1 = rfc3986.uri_reference('HTTP://Example.com/path')
uri2 = rfc3986.uri_reference('http://example.com/path')
print(uri1 == uri2)  # => True (comparison uses normalized equality)

# Check normalized equality explicitly
print(uri1.normalized_equality(uri2))  # => True
```

--------------------------------

### Validate URI Components with rfc3986 Validator

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/user/validating.md

Use the `Validator` class to define rules for URI components and check whether a given URI reference adheres to them. This is useful for ensuring URIs meet specific criteria before processing.

```python
>>> from rfc3986 import validators, uri_reference
>>> valid_uri = uri_reference('https://github.com/')
>>> validator = validators.Validator().allow_schemes(
...     'https',
... ).allow_hosts(
...     'github.com',
... ).forbid_use_of_password(
... ).require_presence_of(
...     'scheme', 'host',
... ).check_validity_of(
...     'scheme', 'host', 'path',
... )
>>> validator.validate(valid_uri)
>>> invalid_uri = valid_uri.copy_with(path='/#invalid/path')
>>> validator.validate(invalid_uri)
Traceback (most recent call last):
...
rfc3986.exceptions.InvalidComponentsError
```

--------------------------------

### Parse a URI Reference with uri_reference

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/user/parsing.md

Use `uri_reference` to parse URI references. The resulting object offers safe normalization of URIs.

```python
import rfc3986

uri = rfc3986.uri_reference('https://github.com/sigmavirus24/rfc3986')
```

--------------------------------

### Resolve Relative URIs with RFC 3986 Algorithm

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Use the `resolve_with` method to resolve a relative URI reference against a base URI, following RFC 3986 Section 5. Ensure the base URI has a scheme to avoid `ResolutionError`.

```python
import rfc3986

# Define a base URI
base = rfc3986.uri_reference('https://example.com/a/b/c')

# Resolve relative paths
relative1 = rfc3986.uri_reference('../d')
resolved1 = relative1.resolve_with(base)
print(resolved1.unsplit())  # => 'https://example.com/a/d'

relative2 = rfc3986.uri_reference('/absolute/path')
resolved2 = relative2.resolve_with(base)
print(resolved2.unsplit())  # => 'https://example.com/absolute/path'

relative3 = rfc3986.uri_reference('sibling')
resolved3 = relative3.resolve_with(base)
print(resolved3.unsplit())  # => 'https://example.com/a/b/sibling'

# Resolve with query and fragment
relative4 = rfc3986.uri_reference('?newquery')
resolved4 = relative4.resolve_with(base)
print(resolved4.unsplit())  # => 'https://example.com/a/b/c?newquery'

relative5 = rfc3986.uri_reference('#section')
resolved5 = relative5.resolve_with(base)
print(resolved5.unsplit())  # => 'https://example.com/a/b/c#section'

# Base URI must have a scheme, otherwise ResolutionError is raised
try:
    relative = rfc3986.uri_reference('/path')
    relative.resolve_with('//example.com/base')  # No scheme
except rfc3986.exceptions.ResolutionError as e:
    print(f"Error: {e}")
```

--------------------------------

### Normalize URIs

Source: https://github.com/python-hyper/rfc3986/blob/main/README.rst

Normalize URI components and compare them for functional equivalence.
```python
from rfc3986 import uri_reference

mangled = uri_reference('hTTp://exAMPLe.COM')
print(mangled.scheme)  # => hTTp
print(mangled.authority)  # => exAMPLe.COM

normal = mangled.normalize()
print(normal.scheme)  # => http
print(normal.authority)  # => example.com
```

```python
import webbrowser

if normal == mangled:
    webbrowser.open(normal.unsplit())
```

```python
mangled = uri_reference('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')
normal = mangled.normalize()

assert normal == 'hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth'
assert normal == 'http://example.com/Some/reallY/biZZare/pAth'
assert normal != 'http://example.com/some/really/bizzare/path'
```

```python
from rfc3986 import normalize_uri

assert (normalize_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth') ==
        'http://example.com/Some/reallY/biZZare/pAth')
```

--------------------------------

### Parse SSH URL with //

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/user/parsing.md

Adding '//' before the authority in an SSH URL lets the parser identify the authority and path components correctly according to RFC 3986.

```python
import rfc3986

rfc3986.uri_reference('//git@github.com:sigmavirus24/rfc3986')
```

--------------------------------

### Strict URI Parsing and Validation

Source: https://github.com/python-hyper/rfc3986/blob/main/README.rst

Parse URIs into URIReference objects and access their components.
```python
from rfc3986 import uri_reference

example = uri_reference('http://example.com')
email = uri_reference('mailto:user@domain.com')
ssh = uri_reference('ssh://user@git.openstack.org:29418/openstack/keystone.git')
```

```python
print(example.scheme)  # => http
print(email.path)  # => user@domain.com
print(ssh.userinfo)  # => user
print(ssh.host)  # => git.openstack.org
print(ssh.port)  # => 29418
```

```python
uni = uri_reference(b'http://httpbin.org/get?utf8=\xe2\x98\x83')  # ☃
print(uni.query)  # utf8=%E2%98%83
```

```python
import subprocess

if ssh.is_valid():
    subprocess.call(['git', 'clone', ssh.unsplit()])
```

--------------------------------

### Parse and Modify URIs

Source: https://github.com/python-hyper/rfc3986/blob/main/README.rst

Parse a URI using urlparse and create modified copies using copy_with.

```python
from rfc3986 import urlparse

ssh = urlparse('ssh://user@git.openstack.org:29418/openstack/glance.git')
print(ssh.scheme)  # => ssh
print(ssh.userinfo)  # => user
print(ssh.params)  # => None
print(ssh.port)  # => 29418
```

```python
new_ssh = ssh.copy_with(
    scheme='https',
    userinfo='',
    port=443,
    path='/openstack/glance'
)
print(new_ssh.scheme)  # => https
print(new_ssh.userinfo)  # => None
# etc.
```

--------------------------------

### Validate URIs with is_valid_uri

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Use the convenience function to check whether a string conforms to RFC 3986, optionally enforcing the presence of specific components.
```python
import rfc3986

# Basic validation
print(rfc3986.is_valid_uri('https://example.com/path'))  # => True
print(rfc3986.is_valid_uri('relative/path'))  # => True (a path-only reference is valid per the RFC)

# Require specific components
print(rfc3986.is_valid_uri(
    'https://example.com/path',
    require_scheme=True,
    require_authority=True,
    require_path=True
))  # => True

print(rfc3986.is_valid_uri(
    'mailto:user@example.com',
    require_authority=True
))  # => False (mailto URIs have no authority)

print(rfc3986.is_valid_uri(
    '//example.com/path',
    require_scheme=True
))  # => False (no scheme present)

# Validate URI with all components required
print(rfc3986.is_valid_uri(
    'https://example.com/path?query=value#fragment',
    require_scheme=True,
    require_authority=True,
    require_path=True,
    require_query=True,
    require_fragment=True
))  # => True
```

--------------------------------

### Parse URI with uri_reference

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Use `uri_reference` to parse a URI string into a `URIReference` object. Access individual components like scheme, authority, path, query, and fragment. Sub-authority components (userinfo, host, port) are also accessible. This function supports both string and bytes input. You can create modified copies of a URI reference using `copy_with`.
```python
import rfc3986

# Parse a simple URI
uri = rfc3986.uri_reference('https://user:pass@github.com:443/sigmavirus24/rfc3986?tab=readme#installation')

# Access individual components
print(uri.scheme)     # => 'https'
print(uri.authority)  # => 'user:pass@github.com:443'
print(uri.path)       # => '/sigmavirus24/rfc3986'
print(uri.query)      # => 'tab=readme'
print(uri.fragment)   # => 'installation'

# Access sub-authority components
print(uri.userinfo)  # => 'user:pass'
print(uri.host)      # => 'github.com'
print(uri.port)      # => '443'

# Get authority details as a dictionary
auth_info = uri.authority_info()
print(auth_info)  # => {'userinfo': 'user:pass', 'host': 'github.com', 'port': '443'}

# Parse a URI with bytes input
uri_bytes = rfc3986.uri_reference(b'http://httpbin.org/get?utf8=%E2%98%83')
print(uri_bytes.query)  # => 'utf8=%E2%98%83'

# Check if URI is absolute (per RFC 3986: has a scheme and no fragment)
print(uri.is_absolute())  # => False (this URI carries a fragment)

# Create a copy with modified components
new_uri = uri.copy_with(scheme='http', port=None)
print(new_uri.unsplit())
# => 'http://user:pass@github.com/sigmavirus24/rfc3986?tab=readme#installation'
```

--------------------------------

### Extend Existing Query Parameters in URI

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Use `extend_query_with` to add query parameters to an existing URI builder instance. The builder remains immutable, allowing for the creation of variations from a common base.

```python
from rfc3986.builder import URIBuilder

url = URIBuilder().add_scheme('https').add_host('search.example.com').add_query_from(
    {'q': 'python'}
).extend_query_with(
    {'page': 2, 'sort': 'date'}
).geturl()
print(url)  # => 'https://search.example.com?q=python&page=2&sort=date'
```

--------------------------------

### RFC 3986 Compiled Matchers

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/api-ref/miscellaneous.md

Compiled regular expressions for efficient URI component matching.
```APIDOC
## RFC 3986 Compiled Matchers

### Description
Compiled regular expressions for efficient URI component matching.

### Endpoints

#### URI_MATCHER
- **Description**: Compiled version of `rfc3986.abnf_regexp.URL_PARSING_RE`.

#### SUBAUTHORITY_MATCHER
- **Description**: Compiled combination of `rfc3986.abnf_regexp.USERINFO_RE`, `rfc3986.abnf_regexp.HOST_PATTERN`, `rfc3986.abnf_regexp.PORT_RE`.

#### SCHEME_MATCHER
- **Description**: Compiled version of `rfc3986.abnf_regexp.SCHEME_RE`.

#### IPv4_MATCHER
- **Description**: Compiled version of `rfc3986.abnf_regexp.IPv4_RE`.

#### PATH_MATCHER
- **Description**: Compiled version of `rfc3986.abnf_regexp.PATH_RE`.

#### QUERY_MATCHER
- **Description**: Compiled version of `rfc3986.abnf_regexp.QUERY_RE`.

#### RELATIVE_REF_MATCHER
- **Description**: Compiled combination of `rfc3986.abnf_regexp.SCHEME_RE`, `rfc3986.abnf_regexp.HIER_PART_RE`, `rfc3986.abnf_regexp.QUERY_RE`.
```

--------------------------------

### Handle Missing Component Error with Validator

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Catch `MissingComponentError` to handle cases where a required URI component (like scheme or host) is absent when validated against specific rules.

```python
import rfc3986
from rfc3986 import exceptions, validators

# MissingComponentError - required component not present
validator = validators.Validator().require_presence_of('scheme', 'host')
try:
    validator.validate(rfc3986.uri_reference('//example.com/path'))
except exceptions.MissingComponentError as e:
    print(f"Missing components: {e.components}")  # => ['scheme']
```

--------------------------------

### Parse and Access IRI Components

Source: https://context7.com/python-hyper/rfc3986/llms.txt

The `iri_reference` function parses Internationalized Resource Identifiers (IRIs), which can include Unicode characters. Access components like scheme, host, and path directly.
```python
import rfc3986

# Parse an IRI with unicode characters
iri = rfc3986.iri_reference('https://example.com/path?name=')
print(iri.query)  # => 'name='

# Parse IRI with unicode hostname (requires 'idna' package)
# pip install rfc3986[idna]
iri = rfc3986.iri_reference('https://中国/')  # Chinese domain

# Access IRI components
print(iri.scheme)  # => 'https'
print(iri.host)    # => '中国'
print(iri.path)    # => '/'
```

--------------------------------

### Parse a URL with urlparse

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/user/parsing.md

Use urlparse to parse a URL string. This function is intended to completely replace urllib.parse.urlparse.

```python
import rfc3986

url = rfc3986.urlparse('https://github.com/sigmavirus24/rfc3986')
```

--------------------------------

### Perform Advanced Validation with Validator

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Use the Validator class for complex rules like scheme/host/port restrictions, password prohibition, and component format checking.
```python
from rfc3986 import validators, uri_reference

# Create a validator that allows only HTTPS to specific hosts
validator = validators.Validator().allow_schemes(
    'https',
).allow_hosts(
    'github.com', 'api.github.com',
).require_presence_of(
    'scheme', 'host',
)

# Valid URIs pass without exception
validator.validate(uri_reference('https://github.com/user/repo'))
validator.validate(uri_reference('https://api.github.com/users'))

# Invalid scheme raises UnpermittedComponentError
try:
    validator.validate(uri_reference('http://github.com/user/repo'))
except validators.exceptions.UnpermittedComponentError as e:
    print(f"Error: {e}")  # scheme was required to be one of ['https'] but was 'http'

# Missing required component raises MissingComponentError
try:
    validator.validate(uri_reference('//github.com/user/repo'))
except validators.exceptions.MissingComponentError as e:
    print(f"Error: {e}")  # scheme was required but missing

# Forbid passwords in URIs (security best practice)
secure_validator = validators.Validator().allow_schemes(
    'https',
).forbid_use_of_password()

secure_validator.validate(uri_reference('https://user@github.com'))  # OK - no password

try:
    secure_validator.validate(uri_reference('https://user:secret@github.com'))
except validators.exceptions.PasswordForbidden as e:
    print(f"Error: {e}")  # contained a password when validation forbade it

# Validate component format validity
strict_validator = validators.Validator().check_validity_of(
    'scheme', 'host', 'path', 'query', 'fragment',
)

try:
    invalid_uri = uri_reference('https://example.com').copy_with(path='/#invalid')
    strict_validator.validate(invalid_uri)
except validators.exceptions.InvalidComponentsError as e:
    print(f"Error: {e}")  # path was found to be invalid

# Allow specific ports only
api_validator = validators.Validator().allow_schemes(
    'https',
).allow_ports(
    '443', '8443',
)
api_validator.validate(uri_reference('https://api.example.com:443/v1'))  # OK
```

--------------------------------

### Allow Trusted Domains and Schemes

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/user/validating.md

Use this validator to ensure URLs use only specific schemes and domains. The same validator object is reused for multiple validations.

```python
>>> from rfc3986 import validators, uri_reference
>>> user_url = 'https://github.com/sigmavirus24/rfc3986'
>>> validator = validators.Validator().allow_schemes(
...     'https',
... ).allow_hosts(
...     'github.com',
... )
>>> validator.validate(uri_reference(
...     'https://github.com/sigmavirus24/rfc3986'
... ))
>>> validator.validate(uri_reference(
...     'https://github.com/'
... ))
>>> validator.validate(uri_reference(
...     'http://example.com'
... ))
Traceback (most recent call last):
...
rfc3986.exceptions.UnpermittedComponentError
```

--------------------------------

### Handle Password Forbidden Error with Validator

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Catch `PasswordForbidden` exceptions when a validator is configured to disallow passwords in the URI's authority component and one is present.

```python
import rfc3986
from rfc3986 import exceptions, validators

# PasswordForbidden - password present when forbidden
validator = validators.Validator().forbid_use_of_password()
try:
    validator.validate(rfc3986.uri_reference('https://user:pass@example.com'))
except exceptions.PasswordForbidden as e:
    print(f"Password forbidden: {e}")
```

--------------------------------

### Handle Resolution Error in RFC 3986

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Catch ResolutionError when a relative URI cannot be resolved against a base URI, for example because the base URI has no scheme.
```python
import rfc3986
from rfc3986 import exceptions

# ResolutionError - the base URI is not absolute (it has no scheme)
try:
    relative = rfc3986.uri_reference('/path')
    relative.resolve_with(rfc3986.uri_reference('//no-scheme.com'))
except exceptions.ResolutionError as e:
    print(f"Resolution error: {e}")
```

--------------------------------

### Parse IRI from Bytes with Explicit Encoding

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Parse an IRI directly from bytes by specifying the encoding. This is useful when dealing with raw byte data that represents an IRI.

```python
import rfc3986

# Parse IRI from bytes with explicit encoding
iri_bytes = rfc3986.iri_reference(b'https://example.com/caf\xc3\xa9', encoding='utf-8')
print(iri_bytes.path)  # => '/caf%C3%A9'
```

--------------------------------

### Parse SSH URL without //

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/user/parsing.md

Parsing an SSH URL without the '//' prefix results in the authority being part of the path due to strict RFC 3986 conformance.

```python
import rfc3986

rfc3986.uri_reference('git@github.com:sigmavirus24/rfc3986')
```

--------------------------------

### Validate URIs

Source: https://github.com/python-hyper/rfc3986/blob/main/README.rst

Validate URIs using helper functions or URIReference methods.
```python
from rfc3986 import is_valid_uri

assert is_valid_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')
```

```python
from rfc3986 import is_valid_uri

assert is_valid_uri('http://localhost:8774/v2/resource',
                    require_scheme=True,
                    require_authority=True,
                    require_path=True)

# Assert that a mailto URI is invalid if you require an authority
# component
assert is_valid_uri('mailto:user@example.com', require_authority=True) is False
```

```python
from rfc3986 import uri_reference

http = uri_reference('http://localhost:8774/v2/resource')
assert http.is_valid(require_scheme=True,
                     require_authority=True,
                     require_path=True)

# Assert that a mailto URI is invalid if you require an authority
# component
mailto = uri_reference('mailto:user@example.com')
assert mailto.is_valid(require_authority=True) is False
```

--------------------------------

### Handle Invalid Port Exception

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Catch `InvalidPort` exceptions when a port number in a URI is out of the valid range (0-65535) or is not a numeric value.

```python
import rfc3986
from rfc3986 import exceptions, validators

# InvalidPort - port number out of range or non-numeric
try:
    result = rfc3986.urlparse('https://example.com:99999/')
except exceptions.InvalidPort as e:
    print(f"Invalid port: {e}")
```

--------------------------------

### RFC 3986 ABNF Regular Expressions

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/api-ref/miscellaneous.md

Regular expressions for matching various components of a URI as defined by RFC 3986.

```APIDOC
## RFC 3986 ABNF Regular Expressions

### Description
Regular expressions for matching various components of a URI as defined by RFC 3986.

### Endpoints

#### HOST_PATTERN
- **Description**: Pattern to match and validate the host piece of an authority.
- **Composed of**: `rfc3986.abnf_regexp.REG_NAME`, `rfc3986.abnf_regexp.IPv4_RE`, `rfc3986.abnf_regexp.IP_LITERAL_RE`.
- **Reference**: [RFC 3986 Section 3.2.2](https://datatracker.ietf.org/doc/html/rfc3986.html#section-3.2.2)

#### USERINFO_RE
- **Description**: Pattern to match and validate the user information portion of an authority component.
- **Reference**: [RFC 3986 Section 3.2.1](https://datatracker.ietf.org/doc/html/rfc3986.html#section-3.2.1)

#### PORT_RE
- **Description**: Pattern to match and validate the port portion of an authority component.
- **Reference**: [RFC 3986 Section 3.2.3](https://datatracker.ietf.org/doc/html/rfc3986.html#section-3.2.3)

#### PCT_ENCODED
- **Description**: Regular expression for percent-encoded characters.

#### PERCENT_ENCODED
- **Description**: Regular expression to match percent-encoded character values.

#### PCHAR
- **Description**: Regular expression to match printable characters.

#### PATH_RE
- **Description**: Regular expression to match and validate the path component of a URI.
- **Reference**: [RFC 3986 Section 3.3](https://datatracker.ietf.org/doc/html/rfc3986.html#section-3.3)

#### PATH_EMPTY
- **Description**: Component of `PATH_RE` representing an empty path.

#### PATH_ROOTLESS
- **Description**: Component of `PATH_RE` representing a rootless path.

#### PATH_NOSCHEME
- **Description**: Component of `PATH_RE` representing a path without a scheme.

#### PATH_ABSOLUTE
- **Description**: Component of `PATH_RE` representing an absolute path.

#### PATH_ABEMPTY
- **Description**: Component of `PATH_RE` representing an absolute or empty path.

#### QUERY_RE
- **Description**: Regular expression to parse and validate the query component of a URI.

#### FRAGMENT_RE
- **Description**: Regular expression to parse and validate the fragment component of a URI.

#### RELATIVE_PART_RE
- **Description**: Regular expression to parse the relative URI when resolving URIs.

#### HIER_PART_RE
- **Description**: The hierarchical part of a URI. This regular expression is used when resolving relative URIs.
- **Reference**: [RFC 3986 Section 3](https://datatracker.ietf.org/doc/html/rfc3986.html#section-3)
```

--------------------------------

### Handle Invalid Authority Exception

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Catch `InvalidAuthority` exceptions when attempting to parse a URI with a malformed authority component, such as an invalid IPv6 address format.

```python
import rfc3986
from rfc3986 import exceptions, validators

# InvalidAuthority - malformed authority component
try:
    uri = rfc3986.uri_reference('https://[invalid-ipv6/path')
    uri.authority_info()  # Triggers parsing of invalid authority
except exceptions.InvalidAuthority as e:
    print(f"Invalid authority: {e}")
```

--------------------------------

### Handle Unpermitted Component Error with Validator

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Catch `UnpermittedComponentError` when a URI component's value is not within the set of allowed values defined by a validator.

```python
import rfc3986
from rfc3986 import exceptions, validators

# UnpermittedComponentError - component value not in allowed set
validator = validators.Validator().allow_schemes('https')
try:
    validator.validate(rfc3986.uri_reference('ftp://files.example.com'))
except exceptions.UnpermittedComponentError as e:
    print(f"Unpermitted {e.component_name}: {e.component_value}")
    print(f"Allowed values: {e.allowed_values}")
```

--------------------------------

### Handle Invalid Components Error in RFC 3986

Source: https://context7.com/python-hyper/rfc3986/llms.txt

Catch InvalidComponentsError when a URI component fails format validation. The error object provides access to the invalid components.
```python
import rfc3986
from rfc3986 import exceptions, validators

# InvalidComponentsError - component fails format validation
validator = validators.Validator().check_validity_of('path')
try:
    uri = rfc3986.uri_reference('https://example.com').copy_with(path='/path#invalid')
    validator.validate(uri)
except exceptions.InvalidComponentsError as e:
    print(f"Invalid components: {e.components}")
```

--------------------------------

### Prevent User Credential Leaks

Source: https://github.com/python-hyper/rfc3986/blob/main/docs/source/user/validating.md

Configure the validator to disallow passwords in the user information part of a URI's authority component.

```python
>>> from rfc3986 import validators, uri_reference
>>> user_url = 'https://github.com/sigmavirus24/rfc3986'
>>> validator = validators.Validator().allow_schemes(
...     'https',
... ).allow_hosts(
...     'github.com',
... ).forbid_use_of_password()
>>> validator.validate(uri_reference(
...     'https://github.com/sigmavirus24/rfc3986'
... ))
>>> validator.validate(uri_reference(
...     'https://github.com/'
... ))
>>> validator.validate(uri_reference(
...     'http://example.com'
... ))
Traceback (most recent call last):
...
rfc3986.exceptions.UnpermittedComponentError
>>> validator.validate(uri_reference(
...     'https://sigmavirus24@github.com'
... ))
>>> validator.validate(uri_reference(
...     'https://sigmavirus24:not-my-real-password@github.com'
... ))
Traceback (most recent call last):
...
rfc3986.exceptions.PasswordForbidden
```
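
--------------------------------

### Sketch: Detect a Password in the Userinfo (standard library only)

The credential rule in the validator example above comes down to whether the userinfo part of the authority contains a `:` separator. The following is a minimal sketch of that idea using only `urllib.parse`. It is not rfc3986's implementation, and `has_password` is a hypothetical helper name introduced here for illustration.

```python
from urllib.parse import urlsplit


def has_password(url: str) -> bool:
    """Return True when the userinfo part of ``url`` contains a ':' separator.

    Hypothetical helper for illustration only; it is not part of rfc3986.
    """
    # urlsplit exposes the text after the first ':' in the userinfo as
    # ``password``; the attribute is None when no ':' is present.
    return urlsplit(url).password is not None


for candidate in (
    'https://github.com/',
    'https://sigmavirus24@github.com',
    'https://sigmavirus24:not-my-real-password@github.com',
):
    print(candidate, has_password(candidate))
```

A check like this can serve as a quick pre-filter, but it only looks at the password; scheme and host restrictions still need the `Validator` rules shown earlier.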