### Install Build Tools

Source: https://facelessuser.github.io/soupsieve

Install the 'build' package, which is required for manually building and installing Soup Sieve from source.

```bash
pip install build
```

--------------------------------

### Install Documentation Dependencies

Source: https://facelessuser.github.io/soupsieve/about/development

Installs all necessary Python packages for building and previewing documentation. Ensure you are in the project root.

```bash
pip install -r requirements/docs.txt
```

--------------------------------

### PEP440 Versioning Examples

Source: https://facelessuser.github.io/soupsieve/about/security

Examples of versioning following the PEP440 standard, including major, minor, and patch releases.

```text
8.0
8.1
8.1.3
```

--------------------------------

### Build and Install Soup Sieve from Source

Source: https://facelessuser.github.io/soupsieve

Manually build the wheel package for Soup Sieve and install it. Replace '' with the specific version number. This method is for advanced users or when installing from a local source.

```bash
python -m build -w
```

```bash
pip install dist/soupsieve--py3-none-any.whl
```

--------------------------------

### Install Tox

Source: https://facelessuser.github.io/soupsieve/about/development

Installs Tox, a tool for automating testing and dependency management in virtual environments.

```bash
pip install tox
```

--------------------------------

### CSS :nth-child() Syntax Examples

Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes

Demonstrates the basic syntax for :nth-child() with keywords, numerical indices, and the an+b formula.

```css
:nth-child(even)
```

```css
:nth-child(odd)
```

```css
:nth-child(2)
```

```css
:nth-child(2n+2)
```

--------------------------------

### Install Beautiful Soup 4

Source: https://facelessuser.github.io/soupsieve

Install the Beautiful Soup 4 library, a prerequisite for Soup Sieve. This command is typically run using pip.
```bash
pip install beautifulsoup4
```

--------------------------------

### CSS :nth-last-of-type() Syntax Examples

Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes

Provides syntax examples for the :nth-last-of-type() pseudo-class, covering keywords and mathematical expressions for selecting elements of a specific type.

```css
element:nth-last-of-type(even)
```

```css
element:nth-last-of-type(odd)
```

```css
element:nth-last-of-type(2)
```

```css
element:nth-last-of-type(2n+2)
```

--------------------------------

### Python Example: Using :open with BeautifulSoup

Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes

Demonstrates selecting an open `<details>` element using the :open pseudo-class in Python.

```python
>>> from bs4 import BeautifulSoup as bs
>>> html = """
... ... ... ...

...

A summary

...

Content

...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('details:open')) [

A summary

Content

]
```

--------------------------------

### Install Soup Sieve

Source: https://facelessuser.github.io/soupsieve

Install the Soup Sieve library directly using pip. This is the standard method if Soup Sieve is not automatically included with Beautiful Soup 4.

```bash
pip install soupsieve
```

--------------------------------

### CSS :nth-last-child() Syntax Examples

Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes

Demonstrates the basic syntax for the :nth-last-child() pseudo-class, including keywords and mathematical expressions.

```css
:nth-last-child(even)
```

```css
:nth-last-child(odd)
```

```css
:nth-last-child(2)
```

```css
:nth-last-child(2n+2)
```

--------------------------------

### CSS Namespace Declaration Example

Source: https://facelessuser.github.io/soupsieve/api

Illustrates the syntax for declaring namespaces in CSS using the `@namespace` at-rule.

```css
@namespace url("http://www.w3.org/1999/xhtml");
@namespace svg url("http://www.w3.org/2000/svg");
```

--------------------------------

### Python Example: Using :optional with BeautifulSoup

Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes

Demonstrates selecting optional form elements using the :optional pseudo-class in Python.

```python
>>> from bs4 import BeautifulSoup as bs
>>> html = """
... ... ... ... ... ...
... """
>>> soup = bs(html, 'html5lib')
>>> print(soup.select(':optional'))
[]
```

--------------------------------

### Define and use custom CSS selectors

Source: https://facelessuser.github.io/soupsieve/api

Create custom selectors by providing a dictionary to the `custom` argument in `select`. Custom selectors must start with `:--`.

```python
import soupsieve as sv
import bs4

markup = """

Header 1

Header 2

child

*|*)", ":--parent-paragraph": "p:--parent" } # Use custom selectors print(sv.select(':--header', soup, custom=custom)) print(sv.select(':--parent-paragraph', soup, custom=custom)) ``` -------------------------------- ### Run Spell Checker Source: https://facelessuser.github.io/soupsieve/about/development Initiates the spell check process on the project's documentation files. Requires Aspell to be installed and in the system path. ```bash pyspelling ``` -------------------------------- ### Python Example: Using :nth-of-type() with BeautifulSoup Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Demonstrates selecting elements using :nth-of-type() with even, odd, specific index, and pattern-based selectors in Python. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

...

... ... ... ... ... ...

...

... ... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('span:nth-of-type(even)')) [, , ] >>> print(soup.select('span:nth-of-type(odd)')) [, , ] >>> print(soup.select('p:nth-of-type(2)')) [

] >>> print(soup.select('p:nth-of-type(-n+3)')) [

] ``` -------------------------------- ### Python Example: Using :only-child with BeautifulSoup Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Demonstrates selecting an element that is the only child within its parent using the :only-child pseudo-class in Python. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('p:only-child')) [

] ``` -------------------------------- ### Select Elements with Attribute Value Starting With a Prefix Source: https://facelessuser.github.io/soupsieve/selectors/basic Use `[attribute|=value]` to select elements where an attribute's value starts with a specific string, followed by a hyphen or the end of the string. Useful for language codes. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Some text

...

Some more text

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('div[lang|="en"]')) [

Some text

Some more text

] ``` -------------------------------- ### Python Example: Using :only-of-type with BeautifulSoup Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Demonstrates selecting an element that is the only one of its type among its siblings using the :only-of-type pseudo-class in Python. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

...

... ...

...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('span:only-of-type')) [] ``` -------------------------------- ### Selecting Elements Without Explicit Namespace Source: https://facelessuser.github.io/soupsieve/selectors/basic Shows how a simple element selector `a` matches elements from any namespace if no default namespace is specified. This includes elements from the SVG namespace in this example. ```python >>> print(soup.select('a', namespaces={'svg': 'http://www.w3.org/2000/svg'})) [Soup Sieve Docs, MDN Web Docs] ``` -------------------------------- ### Python BeautifulSoup :nth-last-child() Examples Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Shows how to use BeautifulSoup with :nth-last-child() to select elements based on their position counting from the end. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('p:nth-last-child(even)')) [

] >>> print(soup.select('p:nth-last-child(odd)')) [

] >>> print(soup.select('p:nth-last-child(2)')) [

] >>> print(soup.select('p:nth-last-child(-n+3)')) [

] ``` -------------------------------- ### Select Elements with Attribute Value Starting With a Substring Source: https://facelessuser.github.io/soupsieve/selectors/basic Use `[attribute^=value]` to select elements where an attribute's value begins with a specified substring. This is case-sensitive. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('[href^=http]')) [Example link, Example org link] ``` -------------------------------- ### Select elements by language tag with :lang() (Level 3) Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Use :lang() to select elements whose language matches the provided tag or starts with the tag followed by a hyphen. Language is determined by the document type. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('p:lang(de)')) [

] ``` -------------------------------- ### Custom Selectors Source: https://facelessuser.github.io/soupsieve/api Soup Sieve allows the creation of custom selectors, which are aliases for complex CSS selectors. Custom pseudo-class names must start with `:--`. ```APIDOC ## Custom Selectors ### Description Allows assigning complex selectors to custom pseudo-class names. ### Usage Pass a dictionary to the `custom` parameter of `select`, `match`, or `filter` functions. Keys are custom pseudo-class names (e.g., `:--header`), and values are the CSS selectors they represent. ### Rules - Custom pseudo-class names must start with `:--`. - Pseudo-class names are not case-sensitive; duplicate names with different casing will cause an error. - Custom selectors can depend on other custom selectors. - Circular dependencies will result in a `SelectorSyntaxError`. ### Example 1: Simple Custom Selector ```python import soupsieve as sv import bs4 markup = """

Header 1

Header 2

child

""" soup = bs4.BeautifulSoup(markup, 'lxml') print(sv.select(':--header', soup, custom={':--header': 'h1, h2, h3, h4, h5, h6'})) # Expected Output: [

Header 1

Header 2

] ``` ### Example 2: Dependent Custom Selectors ```python custom = { ":--parent": ":has(> *|*)", ":--parent-paragraph": "p:--parent" } print(sv.select(':--parent-paragraph', soup, custom=custom)) # Expected Output: [

child

]
```
```

--------------------------------

### Serve Documentation Locally

Source: https://facelessuser.github.io/soupsieve/about/development

Builds and serves the project documentation locally, enabling live preview of changes. Access it at localhost:8000.

```bash
python3 -m zensical serve -f zensical.yml
```

--------------------------------

### Build Documentation

Source: https://facelessuser.github.io/soupsieve/about/development

Cleans previous builds and generates the documentation. This is a prerequisite for spell checking.

```bash
python3 -m zensical build --clean -f zensical.yml
```

--------------------------------

### Run All Tox Environments

Source: https://facelessuser.github.io/soupsieve/about/development

Executes all defined test, linting, and documentation environments using Tox.

```bash
tox
```

--------------------------------

### Handle CSS Identifiers Starting with Numbers

Source: https://facelessuser.github.io/soupsieve/differences

Soup Sieve requires CSS identifiers (like class names) to be valid. Use CSS escapes for selectors starting with numbers, e.g., `.2class` becomes `r'.\32 class'`.

```python
soup.select(r'.\32 class')
```

--------------------------------

### Run Linting with Tox

Source: https://facelessuser.github.io/soupsieve/about/development

Executes the linting environment using Tox to check code style.

```bash
tox -e lint
```

--------------------------------

### Run Document Building and Spell Checking with Tox

Source: https://facelessuser.github.io/soupsieve/about/development

Executes the document building and spell checking environments using Tox.

```bash
tox -e documents
```

--------------------------------

### Run Project Tests

Source: https://facelessuser.github.io/soupsieve/about/development

Executes the project's unit test suite using pytest.
```bash
pytest
```

--------------------------------

### Multi-line Selectors with Soup Sieve

Source: https://facelessuser.github.io/soupsieve

Demonstrates using multi-line strings for complex CSS selectors, improving readability. Selectors can span multiple lines just like in CSS files.

```python
>>> selector = """
... .a,
... .b,
... .c
... """
>>> sv.select(selector, soup)
[

Cat

Dog

Mouse

] ``` -------------------------------- ### Python BeautifulSoup :nth-last-of-type() Examples Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Illustrates using BeautifulSoup with :nth-last-of-type() to select elements of a specific type based on their position counting from the end among siblings of the same type. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

...

... ... ... ... ... ...

...

... ... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('span:nth-last-of-type(even)')) [, , ] >>> print(soup.select('span:nth-last-of-type(odd)')) [, , ] >>> print(soup.select('p:nth-last-of-type(2)')) [

] >>> print(soup.select('p:nth-last-of-type(-n+3)')) [

]
```

--------------------------------

### Soup Sieve Namespace Configuration

Source: https://facelessuser.github.io/soupsieve/api

Demonstrates how to define a namespace dictionary in Soup Sieve, mapping prefixes to namespace URIs. The empty string key represents the default namespace.

```python
namespace = {
    "": "http://www.w3.org/1999/xhtml",  # Default namespace is for XHTML
    "svg": "http://www.w3.org/2000/svg",  # The SVG namespace defined with prefix of "svg"
}
```

--------------------------------

### Initialize SelectorLang

Source: https://facelessuser.github.io/soupsieve/about/development

Initializes the SelectorLang class. Used for defining language-based selectors.

```python
class SelectorLang:
    """Selector language rules."""

    def __init__(self, languages):
        """Initialize."""
```

--------------------------------

### Selectors with Comments using Soup Sieve

Source: https://facelessuser.github.io/soupsieve

Shows how to include comments within multi-line selectors for better documentation of complex queries. Comments are ignored by the selector engine.

```python
>>> selector = """
... /* This isn't complicated, but we're going to annotate it anyways.
...    This is the a class */
... .a,
... /* This is the b class */
... .b,
... /* This is the c class */
... .c
... """
>>> sv.select(selector, soup)
[

Cat

Dog

Mouse

]
```

--------------------------------

### Select the Root Element with :root

Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes

Use :root to select the root element of the document tree. In an HTML document, this is typically the `<html>` element.

```python
>>> from bs4 import BeautifulSoup as bs
>>> html = """
... ... ... ...

Here is some text.

...

Here is some more text.

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select(':root')) [

Here is some text.

Here is some more text.

]
```

--------------------------------

### soupsieve.DEBUG Flag

Source: https://facelessuser.github.io/soupsieve/api

Demonstrates the use of the `DEBUG` flag to print detailed parsing information when compiling a selector.

```APIDOC
## `soupsieve.DEBUG`

Print debug output when parsing a selector.

```
>>> import soupsieve as sv
>>> sv.compile('p:has(#id) > span.some-class:contains(text)', flags=sv.DEBUG)
## PARSING: 'p:has(#id) > span.some-class:contains(text)'
TOKEN: 'tag' --> 'p' at position 0
TOKEN: 'pseudo_class' --> ':has(' at position 1
    is_pseudo: True
    is_open: True
    is_relative: True
TOKEN: 'id' --> '#id' at position 6
TOKEN: 'pseudo_close' --> ')' at position 9
TOKEN: 'combine' --> ' > ' at position 10
TOKEN: 'tag' --> 'span' at position 13
TOKEN: 'class' --> '.some-class' at position 17
TOKEN: 'pseudo_contains' --> ':contains(text)' at position 28
## END PARSING
SoupSieve(pattern='p:has(#id) > span.some-class:contains(text)', namespaces=None, custom=None, flags=1)
```
```

--------------------------------

### Pretty Print Compiled CSS Selector

Source: https://facelessuser.github.io/soupsieve/about/development

Demonstrates how to pretty print the compiled CSS selector structure for debugging purposes. Access the compiled selectors via `.selectors` and use the `.pretty()` method.
```python
>>> import soupsieve as sv
>>> sv.compile('this > that.class[name=value]').selectors.pretty()
SelectorList(
    selectors=(
        Selector(
            tag=SelectorTag(
                name='that',
                prefix=None),
            ids=(),
            classes=(
                'class',
                ),
            attributes=(
                SelectorAttribute(
                    attribute='name',
                    prefix='',
                    pattern=re.compile(
                        '^value$'),
                    xml_type_pattern=None),
                ),
            nth=(),
            selectors=(),
            relation=SelectorList(
                selectors=(
                    Selector(
                        tag=SelectorTag(
                            name='this',
                            prefix=None),
                        ids=(),
                        classes=(),
                        attributes=(),
                        nth=(),
                        selectors=(),
                        relation=SelectorList(
                            selectors=(),
                            is_not=False,
                            is_html=False),
                        rel_type='>',
                        contains=(),
                        lang=(),
                        flags=0),
                    ),
                is_not=False,
                is_html=False),
            rel_type=None,
            contains=(),
            lang=(),
            flags=0),
        ),
    is_not=False,
    is_html=False)
```

--------------------------------

### Create BeautifulSoup Object

Source: https://facelessuser.github.io/soupsieve

Initialize a BeautifulSoup object with HTML content and a parser. This is a prerequisite for using Soup Sieve with BeautifulSoup.

```python
>>> import bs4
>>> text = """
...

... ...

Cat

...

Dog

...

Mouse

...

... """ >>> soup = bs4.BeautifulSoup(text, 'html5lib') ``` -------------------------------- ### Selecting Elements with Namespaces Source: https://facelessuser.github.io/soupsieve/selectors/basic Demonstrates how to select elements belonging to a specific namespace using `svg|a`. This requires defining the namespace mapping in the `namespaces` argument. ```python >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

SVG Example

...

Soup Sieve Docs

... ... ... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('svg|a', namespaces={'svg': 'http://www.w3.org/2000/svg'})) [MDN Web Docs] ``` -------------------------------- ### Initialize Selector Source: https://facelessuser.github.io/soupsieve/about/development Initializes a Selector object with various parameters for matching CSS selector components. Used for complex selector parsing. ```python class Selector: """Selector.""" def __init__( self, tag, ids, classes, attributes, nth, selectors, relation, rel_type, contains, lang, flags ): """Initialize.""" ``` -------------------------------- ### Compile Selector with Soup Sieve Source: https://facelessuser.github.io/soupsieve Pre-compile a Soup Sieve selector for performance when it will be used multiple times. Compiled selectors offer the same functionality but can be faster. ```python >>> selector = sv.compile('p:is(.a, .b, .c)') >>> selector.filter(soup.div) [

Cat

Dog

Mouse

]
```

--------------------------------

### Selecting Attributes with Namespaced Prefix

Source: https://facelessuser.github.io/soupsieve/selectors/basic

Demonstrates selecting an attribute with a specific namespace prefix using `[xlink|href]`. This requires mapping the `xlink` prefix to its URI in the `namespaces` dictionary.

```python
>>> print(soup.select('[xlink|href]', namespaces={'xlink': 'http://www.w3.org/1999/xlink'}))
[MDN Web Docs]
```

--------------------------------

### Run Specific Tox Python Version Environment

Source: https://facelessuser.github.io/soupsieve/about/development

Targets a specific Python version environment for testing with Tox, e.g., Python 3.10.

```bash
tox -e py310
```

--------------------------------

### Namespace Selector Syntax

Source: https://facelessuser.github.io/soupsieve/selectors/basic

This section outlines the general syntax for namespace selectors in Soup Sieve, covering various combinations of namespace prefixes, universal selectors, and element/attribute names.

```text
ns|element
ns|*
*|*
*|element
|element
[ns|attr]
[*|attr]
[|attr]
```

--------------------------------

### Select Any Link Element

Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes

Use the :any-link pseudo-class to select all `<a>` and `<area>` elements that have an `href` attribute, regardless of their visited state. This is useful for collecting every link on a page.

```python
>>> from bs4 import BeautifulSoup as bs
>>> html = """
... ... ... ...

A link to click

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select(':any-link')) [click] ``` -------------------------------- ### Initialize SelectorTag Source: https://facelessuser.github.io/soupsieve/about/development Initializes a SelectorTag object to represent an HTML tag name and its namespace prefix for matching purposes. ```python class SelectorTag: """Selector tag.""" def __init__(self, name, prefix): """Initialize.""" ``` -------------------------------- ### Selecting Elements with Default Namespace Source: https://facelessuser.github.io/soupsieve/selectors/basic Illustrates how defining a default namespace (e.g., `''` for `http://www.w3.org/1999/xhtml`) restricts the `a` selector to only match elements within that default namespace. ```python >>> print(soup.select('a', namespaces={'': 'http://www.w3.org/1999/xhtml', 'svg': 'http://www.w3.org/2000/svg'})) [Soup Sieve Docs] ``` -------------------------------- ### Select Elements as Generator with soupsieve.iselect() Source: https://facelessuser.github.io/soupsieve/api Use `iselect` for memory-efficient selection of multiple elements, returning a generator instead of a list. It accepts the same arguments as `select`. ```python def iselect(select, node, namespaces=None, limit=0, flags=0, **kwargs): """Select the specified tags.""" ``` -------------------------------- ### soupsieve.compile() Source: https://facelessuser.github.io/soupsieve/api Pre-compiles a CSS selector pattern, returning a SoupSieve object. This object provides the same selector functions as the module but without needing to specify the selector, namespaces, or flags repeatedly. ```APIDOC ## `soupsieve.compile()` ### Description Compile CSS pattern. ### Parameters - **pattern** (string) - The CSS selector pattern to compile. - **namespaces** (dict, optional) - An optional namespace dictionary. - **flags** (int, optional) - Flags to modify compilation behavior. - **kwargs** - Additional keyword arguments. 
### Returns A `SoupSieve` object with compiled selector functions. ``` -------------------------------- ### Select Element by ID Source: https://facelessuser.github.io/soupsieve/selectors/basic Use ID selectors to match an element with a specific 'id' attribute. The ID must match exactly. ```python from bs4 import BeautifulSoup as bs html = """

Here is some text.

Here is some more text.

""" soup = bs(html, 'html5lib') print(soup.select('#some-id')) ``` -------------------------------- ### Select elements by text direction using :dir() Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Use the :dir() pseudo-class to select elements based on their text directionality. Accepts 'ltr' or 'rtl'. ```python from bs4 import BeautifulSoup as bs html = """

זאת השפה העברית Text

""" soup = bs(html, 'html5lib') print(soup.select(':dir(rtl)')) ``` -------------------------------- ### Initialize SelectorNull Source: https://facelessuser.github.io/soupsieve/about/development Initializes a SelectorNull object. This selector is designed to match nothing, serving as a null or empty state in selector logic. ```python class SelectorNull: """Null Selector.""" def __init__(self): """Initialize.""" ``` -------------------------------- ### Syntax for :-soup-contains-own() Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Illustrates the basic syntax for using the :-soup-contains-own() pseudo-class, which selects elements containing specific text directly within themselves, not in descendants. ```css :-soup-contains-own(text) :-soup-contains-own("This text", "or this text") ``` -------------------------------- ### Select optional form elements with :optional Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Matches form elements (, ... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select(':placeholder-shown')) [, ] ``` -------------------------------- ### Select the first child element using :first-child Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes The :first-child pseudo-class selects an element if it is the first child of its parent. This is useful for styling or selecting the initial element in a sibling group. ```python from bs4 import BeautifulSoup as bs html = """

""" soup = bs(html, 'html5lib') print(soup.select('p:first-child')) ``` -------------------------------- ### Select open elements with :open Source: https://facelessuser.github.io/soupsieve/selectors/pseudo-classes Matches elements that have open and closed states, specifically when they are in the open state. Currently targets

and