### Install python-abp

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Install or upgrade the python-abp library using pip. Ensure you have Python 2.7 or 3.5+ installed.

```bash
pip install --upgrade python-abp
```

--------------------------------

### Install python-abp from PyPI

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Install the python-abp library using pip. The `-U` flag installs the latest version, upgrading an existing installation if necessary.

```bash
$ pip install -U python-abp
```

--------------------------------

### Install python-abp from Local Source

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Install the python-abp library from a local source directory after cloning the repository. Use the full path to the cloned directory.

```bash
$ pip install -U /path/to/python-abp
```

--------------------------------

### Install python-abp within Virtual Environment

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Install python-abp into a virtual environment by calling that environment's own pip executable. Works for installation from PyPI or from a local source directory.

```bash
$ env/bin/pip install -U python-abp
```

--------------------------------

### Example Filter List Output

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Demonstrates the parsed output of a sample filter list, showing different types of filter entries and their attributes.
```python
Header(version='Adblock Plus 2.0')
Metadata(key='Title', value='Example list')
EmptyLine()
Filter(text='abc.com,cdf.com##div#ad1', selector={'type': 'css', 'value': 'div#ad1'}, action='hide', options=[('domain', [('abc.com', True), ('cdf.com', True)])])
Filter(text='abc.com/ad$image', selector={'type': 'url-pattern', 'value': 'abc.com/ad'}, action='block', options=[('image', True)])
Filter(text='@@/abc\.com/', selector={'type': 'url-regexp', 'value': 'abc\.com'}, action='allow', options=[])
```

--------------------------------

### flrender CLI - Render Filter List Fragments

Source: https://context7.com/adblockplus/python-abp/llms.txt

Command-line script to combine filter list fragments into a complete, timestamped filter list. Supports local and remote includes, named source directories, and verbose logging. Install via 'pip install python-abp'.

```bash
# Render fragment.txt -> filterlist.txt
flrender fragment.txt filterlist.txt

# Render and print to stdout
flrender fragment.txt

# Read from stdin, write to stdout
cat fragment.txt | flrender

# Specify named source directories for local %include source:path% references
flrender -i easylist=/home/user/easylist \
         -i abpfilters=/home/user/abp-filters \
         easylist_top.txt output/easylist.txt

# Verbose mode: log each included fragment to stderr
flrender -v -i easylist=/home/user/easylist easylist_top.txt output.txt

# Error: unknown source (missing -i option)
# flrender easylist.txt output.txt
# => Unknown source: 'easylist' when including 'easylist:easylist/easylist_general_block.txt'
#    from 'easylist.txt'
```

--------------------------------

### Use python-abp functions in R

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Call python-abp functions from R using the imported module object (e.g., `abp`). The example calls `line2dict`.
```R
> abp$line2dict("@@||g.doubleclick.net/pagead/$subdocument,domain=hon30.org")
```

--------------------------------

### Parse Single Filter List Line with `parse_line`

Source: https://context7.com/adblockplus/python-abp/llms.txt

Use `parse_line` to parse a single text line from a filter list. The `position` argument controls recognized line types: 'start' for headers, 'metadata' for key-value pairs, and 'body' (default) for filters, comments, includes, and empty lines. The example includes error handling for invalid syntax.

```python
from abp.filters import parse_line
from abp.filters.parser import ParseError

examples = [
    ("[Adblock Plus 2.0]", "start"),
    ("! Title: EasyList", "metadata"),
    ("||example.com^$script,domain=foo.com", "body"),
    ("@@||safe.example.com^", "body"),
    ("example.com##div.ad-banner", "body"),
    ("%include easylist:easylist/easylist_general_block.txt%", "body"),
    ("", "body"),
]

for text, pos in examples:
    parsed = parse_line(text, pos)
    print(f"[{pos}] {parsed.type}: {parsed}")

# Error handling
try:
    parse_line("%bad instruction%")
except ParseError as e:
    print(f"ParseError: {e}")
```

--------------------------------

### Render Filter List to File

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Use the 'flrender' script to combine filter list fragments into a single filter list file. The output is saved to the specified file.

```bash
flrender fragment.txt filterlist.txt
```

--------------------------------

### flrender CLI

Source: https://context7.com/adblockplus/python-abp/llms.txt

Command-line script to render filter list fragments, combining local and remote includes into a complete, timestamped filter list file.

```APIDOC
## `flrender` CLI: Render filter list fragments from the command line

The `flrender` script combines filter list fragments (with local and remote includes) into a complete, timestamped filter list file. It is installed as a console entry point by `pip install python-abp`.
```bash
# Render fragment.txt -> filterlist.txt
flrender fragment.txt filterlist.txt

# Render and print to stdout
flrender fragment.txt

# Read from stdin, write to stdout
cat fragment.txt | flrender

# Specify named source directories for local %include source:path% references
flrender -i easylist=/home/user/easylist \
         -i abpfilters=/home/user/abp-filters \
         easylist_top.txt output/easylist.txt

# Verbose mode: log each included fragment to stderr
flrender -v -i easylist=/home/user/easylist easylist_top.txt output.txt

# Error: unknown source (missing -i option)
# flrender easylist.txt output.txt
# => Unknown source: 'easylist' when including 'easylist:easylist/easylist_general_block.txt'
#    from 'easylist.txt'
```
```

--------------------------------

### Render Filter List to Stdout

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Use the 'flrender' script to render a filter list fragment. If only one argument is provided, the result is sent to stdout.

```bash
flrender fragment.txt
```

--------------------------------

### Create Python 2 Virtual Environment

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Create a virtual environment named 'env' using the `virtualenv` command for Python 2.

```bash
$ virtualenv env
```

--------------------------------

### Handle Unknown Sources in flrender

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

If 'flrender' encounters an unknown source during rendering, it reports the missing source and the context of the include instruction.

```bash
flrender easylist.txt output/easylist.txt
Unknown source: 'easylist' when including 'easylist:easylist/easylist_general_block.txt' from 'easylist.txt'
```

--------------------------------

### Create Python 3 Virtual Environment

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Create a virtual environment named 'env' using the `venv` module for Python 3.
```bash
$ python3 -m venv env
```

--------------------------------

### render_filterlist(name, sources, top_source=None)

Source: https://context7.com/adblockplus/python-abp/llms.txt

Renders a filter list fragment into a full filter list by recursively resolving include directives, adding a version timestamp, and stripping legacy checksums. It also validates the presence of a header.

```APIDOC
## `render_filterlist(name, sources, top_source=None)`: Render fragment into full filter list

Resolves all `%include …%` directives recursively, stamps the list with a `Version` timestamp, strips legacy checksums, and validates that a valid header is present. Returns an iterator of parsed line namedtuples ready for serialization with `.to_string()`.

### Parameters

- **name** (`string`) - The name or path of the top-level filter list fragment to render.
- **sources** (`dict`) - A dictionary mapping logical names to `Source` objects (e.g., `WebSource`, `FSSource`) for resolving included files.
- **top_source** (`Source`, optional) - The source to use for the initial fragment. Defaults to `None`.

### Returns

- `iterator` - An iterator yielding parsed line namedtuples.

### Raises

- `MissingHeader` - If the filter list fragment is missing a valid header.
- `IncludeError` - If there is an error resolving include directives.

### Example

```python
import io

from abp.filters.renderer import render_filterlist, IncludeError, MissingHeader
from abp.filters.sources import FSSource, TopSource, WebSource

# Sources map logical names to filesystem directories or web protocols
sources = {
    "http": WebSource("http"),
    "https": WebSource("https"),
    "easylist": FSSource("/path/to/easylist-repo"),
}

try:
    lines = render_filterlist(
        "my_fragment.txt",  # top-level input fragment
        sources,
        top_source=TopSource()
    )
    with io.open("output_filterlist.txt", "w", encoding="utf-8") as out:
        for line in lines:
            out.write(line.to_string() + "\n")
except MissingHeader as e:
    print(f"Invalid fragment - no header: {e}")
except IncludeError as e:
    print(f"Include resolution failed: {e}")
```
```

--------------------------------

### Filter List Fragment Sources (FSSource, TopSource, WebSource)

Source: https://context7.com/adblockplus/python-abp/llms.txt

Abstracts the retrieval of filter list fragments. FSSource reads from a local directory, TopSource handles stdin or a top-level file, and WebSource fetches fragments over HTTP/HTTPS. FSSource needs a valid directory path, and WebSource needs network access.
```python
from abp.filters.sources import FSSource, TopSource, WebSource, NotFound

# Filesystem source: maps a logical name to a directory
fs = FSSource("/home/user/easylist", encoding="utf-8")
for line in fs.get("easylist/easylist_general_block.txt"):
    print(line)

# Web source: fetches remote includes
web_https = WebSource("https")
try:
    for line in web_https.get("//easylist-downloads.adblockplus.org/easylist.txt"):
        print(line)
        break  # just show the first line
except NotFound as e:
    print(f"Remote file not found: {e}")

# TopSource: for the top-level fragment (stdin or file path)
top = TopSource()
for line in top.get("my_fragment.txt"):
    print(line)

# Using all sources together in rendering
from abp.filters.renderer import render_filterlist

sources = {
    "http": WebSource("http"),
    "https": WebSource("https"),
    "easylist": FSSource("/repos/easylist"),
    "abpfilters": FSSource("/repos/abp-filters"),
}
lines = render_filterlist("top_fragment.txt", sources, TopSource())
```

--------------------------------

### Render Filter List from Stdin

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

When 'flrender' is run without positional arguments, it reads from stdin and writes the rendered filter list to stdout.

```bash
flrender
```

--------------------------------

### Render Filter List with Includes

Source: https://context7.com/adblockplus/python-abp/llms.txt

Resolves include directives, adds version timestamps, and validates headers to render a complete filter list. The example catches include resolution and missing header errors.
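The idea behind include resolution can be seen in isolation with a small standard-library sketch. This is not python-abp's implementation: `resolve_includes`, the `INCLUDE` pattern, and the in-memory `fragments` dictionary are hypothetical names introduced for illustration, and the sketch only expands `%include source:path%` lines (no version stamp, checksum stripping, or header validation).

```python
import re

# Hypothetical in-memory stand-in for named sources: {source: {path: text}}.
fragments = {
    "easylist": {
        "top.txt": "[Adblock Plus 2.0]\n%include easylist:block.txt%\n||c.example^",
        "block.txt": "||a.example^\n||b.example^",
    },
}

# Assumed shape of an include line: %include <source>:<path>%
INCLUDE = re.compile(r"^%include\s+(?P<source>[^:%]+):(?P<path>[^%]+)%$")


def resolve_includes(source, path, seen=()):
    """Yield the lines of a fragment, expanding include lines recursively."""
    key = (source, path)
    if key in seen:
        raise ValueError("Include loop: {}:{}".format(source, path))
    if source not in fragments:
        raise KeyError("Unknown source: {!r}".format(source))
    for line in fragments[source][path].splitlines():
        match = INCLUDE.match(line.strip())
        if match:
            # Recurse into the included fragment, remembering the chain so far.
            yield from resolve_includes(
                match.group("source"), match.group("path").strip(), seen + (key,)
            )
        else:
            yield line


rendered = list(resolve_includes("easylist", "top.txt"))
print("\n".join(rendered))
```

With python-abp itself, this work (plus the version stamp and header check) is done by `render_filterlist`: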
```python
import io

from abp.filters.renderer import render_filterlist, IncludeError, MissingHeader
from abp.filters.sources import FSSource, TopSource, WebSource

# Sources map logical names to filesystem directories or web protocols
sources = {
    "http": WebSource("http"),
    "https": WebSource("https"),
    "easylist": FSSource("/path/to/easylist-repo"),
}

try:
    lines = render_filterlist(
        "my_fragment.txt",  # top-level input fragment
        sources,
        top_source=TopSource()
    )
    with io.open("output_filterlist.txt", "w", encoding="utf-8") as out:
        for line in lines:
            out.write(line.to_string() + "\n")
except MissingHeader as e:
    print(f"Invalid fragment - no header: {e}")
except IncludeError as e:
    print(f"Include resolution failed: {e}")
```

--------------------------------

### Parsing a Filter List

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Demonstrates how to read an entire filter list file and iterate over its parsed lines using `parse_filterlist`.

```APIDOC
## Parsing a Filter List

### Description

Use the `parse_filterlist` function to read a filter list from a file-like object and iterate over its parsed lines. Each line is parsed into a structured object.

### Usage

```python
from abp.filters import parse_filterlist

with open('filterlist.txt') as filterlist:
    for line in parse_filterlist(filterlist):
        print(line)
```

### Example Output

```
Header(version='Adblock Plus 2.0')
Metadata(key='Title', value='Example list')
EmptyLine()
Filter(text='abc.com,cdf.com##div#ad1', selector={'type': 'css', 'value': 'div#ad1'}, action='hide', options=[('domain', [('abc.com', True), ('cdf.com', True)])])
Filter(text='abc.com/ad$image', selector={'type': 'url-pattern', 'value': 'abc.com/ad'}, action='block', options=[('image', True)])
Filter(text='@@/abc\.com/', selector={'type': 'url-regexp', 'value': 'abc\.com'}, action='allow', options=[])
```
```

--------------------------------

### R Interface for ABP Filters

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Shows how to use the `python-abp` library within R using the `reticulate` package.

```APIDOC
## Using the Library with R

### Description

The `python-abp` library can be used from R through the `reticulate` package, which gives R users access to the filter parsing functions.

### Installation (if not already installed)

```bash
pip install -U python-abp
```

### R Usage Example

1. **Load reticulate and import the ABP module:**

```R
library(reticulate)
# If using a virtual environment, specify its path:
# use_virtualenv("~/path/to/env", required=TRUE)
abp <- import("abp.filters.rpy")
```

2. **Use ABP functions:**

Functions can be accessed using the `$` operator, for example:

```R
abp$line2dict("@@||g.doubleclick.net/pagead/$subdocument,domain=hon30.org")
```

### Further Information

For more details on using `reticulate`, refer to the `reticulate` package guide: https://rstudio.github.io/reticulate/
```

--------------------------------

### Include Remote and Local Filter List Fragments

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Filter list fragments can include other fragments using http(s) URLs or local paths within a repository. Use the '-i' option to specify the path to local repository sources.

```text
%include http://www.server.org/dir/list.txt%
```

```text
%include easylist:easylist/easylist_general_block.txt%
```

```bash
flrender -i easylist=/home/abc/easylist input.txt output.txt
```

--------------------------------

### Import python-abp in R with reticulate

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Import the python-abp library into an R session using the `reticulate` package. Specify the virtual environment path if necessary.

```R
> library(reticulate)
> use_virtualenv("~/path/to/env", required=TRUE) # If using virtualenv
> abp <- import("abp.filters.rpy")
```

--------------------------------

### Generate Incremental Diffs with `fldiff`

Source: https://context7.com/adblockplus/python-abp/llms.txt

Use the `fldiff` script to generate diff files between the latest filter list and archived versions. Specify an output directory to organize the generated diffs.

```bash
# Basic usage: diff latest against all archived lists, write diffs to current dir
fldiff easylist.txt archive/easylist_*.txt
```

```bash
# Specify output directory for diff files
fldiff -o diffs/easylist/ easylist.txt archive/easylist_*.txt
# Writes: diffs/easylist/diff202401010000.txt, diffs/easylist/diff202312010000.txt, ...
```

```bash
# Diff file format produced:
# [Adblock Plus Diff]
# ! Version: 202402010000
# - ||removed-filter.example.com^
# + ||added-filter.example.com^
```

--------------------------------

### Generate Filter List Diff

Source: https://context7.com/adblockplus/python-abp/llms.txt

Compares two filter lists and generates diff lines in the ABP diff format, which is useful for incremental updates. The example also saves the diff to a file.

```python
import io

from abp.filters.renderer import render_diff

base_list = """[Adblock Plus 2.0]
! Title: My List
! Version: 202401010000
||old-ad.example.com^
||keep.example.com^
""".splitlines(keepends=True)

latest_list = """[Adblock Plus 2.0]
! Title: My List
! Version: 202402010000
||new-ad.example.com^
||keep.example.com^
""".splitlines(keepends=True)

for diff_line in render_diff(base_list, latest_list):
    print(diff_line)
# Output:
# [Adblock Plus Diff]
# ! Version: 202402010000
# - ||old-ad.example.com^
# + ||new-ad.example.com^

# Save diff to a file
with io.open("diff202401010000.txt", "w", encoding="utf-8") as out:
    for diff_line in render_diff(base_list, latest_list):
        out.write(diff_line + "\n")
```

--------------------------------

### render_diff(base, latest)

Source: https://context7.com/adblockplus/python-abp/llms.txt

Generates an incremental diff between two filter lists, following the Adblock Plus diff format. It includes a header, metadata changes, and lines indicating added or removed filters.

```APIDOC
## `render_diff(base, latest)`: Generate incremental diff between two filter lists

Compares a base (older) filter list with the latest version and yields diff lines following the ABP diff format: a `[Adblock Plus Diff]` header, metadata change lines, lines prefixed with `- ` for removed filters, and lines prefixed with `+ ` for added filters.

### Parameters

- **base** (`list` of `string`) - The older filter list, as a list of strings.
- **latest** (`list` of `string`) - The newer filter list, as a list of strings.

### Returns

- `iterator` - An iterator yielding diff lines as strings.

### Example

```python
import io

from abp.filters.renderer import render_diff

# The header must be the first line, so the string starts right after the quotes.
base_list = """[Adblock Plus 2.0]
! Title: My List
! Version: 202401010000
||old-ad.example.com^
||keep.example.com^
""".splitlines(keepends=True)

latest_list = """[Adblock Plus 2.0]
! Title: My List
! Version: 202402010000
||new-ad.example.com^
||keep.example.com^
""".splitlines(keepends=True)

for diff_line in render_diff(base_list, latest_list):
    print(diff_line)
# Output:
# [Adblock Plus Diff]
# ! Version: 202402010000
# - ||old-ad.example.com^
# + ||new-ad.example.com^

# Save diff to a file
with io.open("diff202401010000.txt", "w", encoding="utf-8") as out:
    for diff_line in render_diff(base_list, latest_list):
        out.write(diff_line + "\n")
```
```

--------------------------------

### Filter List Fragment Sources

Source: https://context7.com/adblockplus/python-abp/llms.txt

Abstracts the retrieval of filter list fragments during rendering. Supports filesystem, top-level input, and web sources.

```APIDOC
## `FSSource` / `TopSource` / `WebSource`: Filter list fragment sources

These classes abstract the retrieval of filter list fragments during rendering. `FSSource` reads from a local directory (safe path resolution prevents directory traversal). `TopSource` handles the top-level input file or stdin. `WebSource` fetches fragments over HTTP or HTTPS.
```python
from abp.filters.sources import FSSource, TopSource, WebSource, NotFound

# Filesystem source: maps a logical name to a directory
fs = FSSource("/home/user/easylist", encoding="utf-8")
for line in fs.get("easylist/easylist_general_block.txt"):
    print(line)

# Web source: fetches remote includes
web_https = WebSource("https")
try:
    for line in web_https.get("//easylist-downloads.adblockplus.org/easylist.txt"):
        print(line)
        break  # just show the first line
except NotFound as e:
    print(f"Remote file not found: {e}")

# TopSource: for the top-level fragment (stdin or file path)
top = TopSource()
for line in top.get("my_fragment.txt"):
    print(line)

# Using all sources together in rendering
from abp.filters.renderer import render_filterlist

sources = {
    "http": WebSource("http"),
    "https": WebSource("https"),
    "easylist": FSSource("/repos/easylist"),
    "abpfilters": FSSource("/repos/abp-filters"),
}
lines = render_filterlist("top_fragment.txt", sources, TopSource())
```
```

--------------------------------

### Generate Filter List Diffs

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Use the 'fldiff' script to find differences between the latest filter list and archived versions. Specify an output directory with '-o'; if it is omitted, diffs are written to the current directory.

```bash
fldiff -o diffs/easylist/ easylist.txt archive/*
```

--------------------------------

### Load Filter Hit Statistics

Source: https://context7.com/adblockplus/python-abp/llms.txt

Loads filter hit statistics from a CSV file, optionally filtering by source names. Numeric columns are coerced to integers.

```APIDOC
## `load_filterhit_statistics(path, sources=None)`: Load filter hit statistics from CSV

Reads a CSV file of filter hit statistics (as produced by Adblock Plus telemetry pipelines) and yields row dictionaries with integer-coerced numeric columns (`onehour_sessions`, `hits`, `domains`, `rootdomains`).
Optionally filters rows to only those from specified source names.

```python
from abp.stats.filterhits import load_filterhit_statistics

# Load all stats
for entry in load_filterhit_statistics("filterhits.csv"):
    print(entry)
    # {'filter': '||ads.example.com^', 'source': 'easylist',
    #  'hits': 9420, 'domains': 150, 'rootdomains': 80, 'onehour_sessions': 312}

# Load only stats from specific sources
easylist_hits = list(
    load_filterhit_statistics(
        "filterhits.csv",
        sources={"easylist", "easylist_privacy"}
    )
)
print(f"Loaded {len(easylist_hits)} entries from easylist sources")

# Print the five filters with the most hits
top_filters = sorted(easylist_hits, key=lambda e: e["hits"], reverse=True)[:5]
for entry in top_filters:
    print(f"{entry['hits']:>10,} {entry['filter']}")
```
```

--------------------------------

### Parse Filter List with Python-ABP

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Use `parse_filterlist` to iterate over lines in a filter list file. Requires opening the file in read mode.

```python
from abp.filters import parse_filterlist

with open('filterlist.txt') as filterlist:
    for line in parse_filterlist(filterlist):
        print(line)
```

--------------------------------

### Load Filter Hit Statistics from CSV

Source: https://context7.com/adblockplus/python-abp/llms.txt

Reads filter hit statistics from a CSV file, yielding row dictionaries. Numeric columns are integer-coerced. Optionally filters rows by specified source names. Useful for analyzing filter performance.
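To make the integer coercion and source filtering concrete, here is a small standard-library sketch of the same idea. It is not the library's code: `load_rows`, the sample rows, and the exact column layout are assumptions, built only from the numeric column names documented for `load_filterhit_statistics`.

```python
import csv
import io

# Numeric column names as documented; everything else here is made up.
NUMERIC = ("onehour_sessions", "hits", "domains", "rootdomains")

SAMPLE_CSV = """filter,source,onehour_sessions,hits,domains,rootdomains
||ads.example.com^,easylist,312,9420,150,80
||track.example.net^,other_list,40,77,9,5
"""


def load_rows(fileobj, sources=None):
    """Yield one dict per CSV row, with the numeric columns converted to int."""
    for row in csv.DictReader(fileobj):
        if sources is not None and row["source"] not in sources:
            continue  # skip rows from sources the caller did not ask for
        for column in NUMERIC:
            row[column] = int(row[column])
        yield dict(row)


rows = list(load_rows(io.StringIO(SAMPLE_CSV), sources={"easylist"}))
print(rows)
```

The library function takes a file path rather than a file object; its own usage follows.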
```python
from abp.stats.filterhits import load_filterhit_statistics

# Load all stats
for entry in load_filterhit_statistics("filterhits.csv"):
    print(entry)
    # {'filter': '||ads.example.com^', 'source': 'easylist',
    #  'hits': 9420, 'domains': 150, 'rootdomains': 80, 'onehour_sessions': 312}

# Load only stats from specific sources
easylist_hits = list(
    load_filterhit_statistics(
        "filterhits.csv",
        sources={"easylist", "easylist_privacy"}
    )
)
print(f"Loaded {len(easylist_hits)} entries from easylist sources")

# Print the five filters with the most hits
top_filters = sorted(easylist_hits, key=lambda e: e["hits"], reverse=True)[:5]
for entry in top_filters:
    print(f"{entry['hits']:>10,} {entry['filter']}")
```

--------------------------------

### Process Filter List into Blocks

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Use `to_blocks` from `abp.filters.blocks` to split a parsed filter list into blocks of filters separated by comments. The example prints each block as a JSON dictionary.

```python
import json

from abp.filters import parse_filterlist
from abp.filters.blocks import to_blocks

fl_path = 'filterlist.txt'  # placeholder: path to your filter list

with open(fl_path) as f:
    for block in to_blocks(parse_filterlist(f)):
        print(json.dumps(block.to_dict(), indent=2))
```

--------------------------------

### Processing Blocks of Filters

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Details the `to_blocks` function for processing filter lists that are separated into blocks by comments.

```APIDOC
## Processing Blocks of Filters

### Description

The `to_blocks` function, located in `abp.filters.blocks`, processes filter lists in which filters are separated into blocks by comments. It takes an iterator of parsed filter lines and yields blocks of filters.

### Usage

```python
import json

from abp.filters import parse_filterlist
from abp.filters.blocks import to_blocks

fl_path = 'filterlist.txt'  # placeholder: path to your filter list

with open(fl_path) as f:
    for block in to_blocks(parse_filterlist(f)):
        print(json.dumps(block.to_dict(), indent=2))
```

### Further Information

For more information on the `to_blocks` function and its capabilities, use `help(abp.filters.blocks)` in an interactive Python session.
```

--------------------------------

### R-compatible Filter Parsing

Source: https://context7.com/adblockplus/python-abp/llms.txt

Provides functions to parse filter list lines into Python dictionaries compatible with R via the reticulate bridge.

```APIDOC
## `line2dict` / `lines2dicts` (R interop via `abp.filters.rpy`): R-compatible filter parsing

These functions wrap `parse_line` and return plain Python dicts (with UTF-8 byte strings) compatible with the rPython/reticulate R-to-Python bridge, enabling use of python-abp from R data analysis workflows.

```python
# Python usage
from abp.filters.rpy import line2dict, lines2dicts

# Parse a single line to a dict
result = line2dict("@@||g.doubleclick.net/pagead/$subdocument,domain=hon30.org")
print(result)
# {'text': '@@||g.doubleclick.net/pagead/$subdocument,domain=hon30.org',
#  'selector': {'type': 'url-pattern', 'value': '||g.doubleclick.net/pagead/'},
#  'action': 'allow',
#  'options': {'subdocument': True, 'domain': {'hon30.org': True}},
#  'type': 'Filter'}

# Parse multiple lines at once
lines = [
    "||ads.example.com^",
    "example.com##div.ad",
    "! This is a comment",
]
dicts = lines2dicts(lines)
for d in dicts:
    print(d["type"], "->", d.get("text") or d.get("value", ""))
```

```r
# R usage via reticulate
library(reticulate)
use_virtualenv("~/path/to/env", required = TRUE)
abp <- import("abp.filters.rpy")

result <- abp$line2dict("@@||g.doubleclick.net/pagead/$subdocument,domain=hon30.org")
print(result$action)  # "allow"
print(result$type)    # "Filter"
```
```

--------------------------------

### to_blocks(parsed_lines)

Source: https://context7.com/adblockplus/python-abp/llms.txt

Groups parsed filter list lines into comment-delimited blocks. Each block contains filters, a description, and variables, which is useful for processing structured filter lists.

```APIDOC
## `to_blocks(parsed_lines)`: Group filters into comment-delimited blocks

Takes the output of `parse_filterlist` and yields `FiltersBlock` objects. Each block contains a `.filters` list, a `.description` string (concatenated preceding comments), and a `.variables` dict for `!:key=value` comment annotations. Useful for processing structured filter lists like EasyList or the ABP exception list.

### Parameters

- **parsed_lines** (`iterator`) - An iterator yielding parsed filter list lines, typically from `parse_filterlist`.

### Returns

- `iterator` - An iterator yielding `FiltersBlock` objects.

### Example

```python
import json

from abp.filters import parse_filterlist
from abp.filters.blocks import to_blocks

# The header must be the first line, so the string starts right after the quotes.
filterlist_text = """[Adblock Plus 2.0]
! Title: Partner List
!:partner_id=1234
!:type=partner
! Block ads for partner 1234
||partner-ads.example.com^$script
||partner-tracker.example.com^
!:partner_id=5678
!:type=partner
! Block ads for partner 5678
||other-ads.example.com^
""".splitlines()

for block in to_blocks(parse_filterlist(filterlist_text)):
    print(json.dumps(block.to_dict(), indent=2))
# Output:
# {
#   "variables": {"partner_id": "1234", "type": "partner"},
#   "description": "Block ads for partner 1234",
#   "filters": [
#     {"text": "||partner-ads.example.com^$script", "type": "Filter", ...},
#     {"text": "||partner-tracker.example.com^", "type": "Filter", ...}
#   ]
# }
# {
#   "variables": {"partner_id": "5678", "type": "partner"},
#   "description": "Block ads for partner 5678",
#   "filters": [
#     {"text": "||other-ads.example.com^", "type": "Filter", ...}
#   ]
# }
```
```

--------------------------------

### Parsing an Individual Filter Line

Source: https://github.com/adblockplus/python-abp/blob/master/README.rst

Explains the use of the lower-level `parse_line` function for parsing single lines from a filter list.

```APIDOC
## Parsing an Individual Filter Line

### Description

The `abp.filters` module provides a lower-level function, `parse_line`, for parsing individual lines of a filter list. This function returns a parsed line object, similar to the items yielded by `parse_filterlist`.

### Usage

For more details on `parse_line` and its usage, refer to the module's docstrings or use Python's `help()` function interactively: `help(abp.filters.parse_line)`.
```

--------------------------------

### parse_filter

Source: https://context7.com/adblockplus/python-abp/llms.txt

Parses a raw filter rule string into its components: text, selector, action, and options.

```APIDOC
## parse_filter(text)

### Description

Parses a raw filter rule string (without the leading `!` or `%include` syntax) and returns a `Filter` namedtuple with `text`, `selector`, `action`, and `options` fields. This is a lower-level entry point useful when the input is known to be a filter (not a comment or metadata line).

### Parameters

#### Path Parameters

- **text** (string) - Required - The raw filter rule string to parse.

### Request Example

```python
from abp.filters.parser import parse_filter

filters = [
    "||ads.example.com^$script,third-party",
    "@@||safe.example.com^$document",
    "example.com,other.com##div.sponsored",
    "example.com#?#div:-abp-properties(color: red)",  # extended CSS
    "example.com#$#abort-on-property-read adProp",  # snippet
]

for f_text in filters:
    f = parse_filter(f_text)
    print(f"action={f.action}, selector_type={f.selector['type']}, options={f.options}")
```

### Response

#### Success Response (Filter namedtuple)

- **text** (string) - The original filter rule text.
- **selector** (dict) - A dictionary containing the selector type and value.
  - **type** (string) - The type of selector (e.g., `"url-pattern"`, `"css"`, `"extended-css"`, `"snippet"`).
  - **value** (string) - The selector value.
- **action** (string) - The action to perform (e.g., `"block"`, `"allow"`, `"hide"`).
- **options** (list of tuples) - A list of options applied to the filter.

#### Response Example

```
action=block, selector_type=url-pattern, options=[('script', True), ('third-party', True)]
action=allow, selector_type=url-pattern, options=[('document', True)]
action=hide, selector_type=css, options=[('domain', [('example.com', True), ('other.com', True)])]
action=hide, selector_type=extended-css, options=[]
action=hide, selector_type=snippet, options=[]
```
```

--------------------------------

### parse_line

Source: https://context7.com/adblockplus/python-abp/llms.txt

Parses a single text line and returns a typed namedtuple, with behavior configurable by the `position` argument.

```APIDOC
## parse_line(line, position="body")

### Description

Parses one text line and returns a single typed namedtuple. The `position` argument controls which line types are recognized: `"start"` recognizes headers, `"metadata"` also recognizes metadata key-value pairs, and `"body"` (default) recognizes filters, comments, includes, and empty lines.

### Parameters

#### Path Parameters

- **line** (string) - Required - The text line to parse.
- **position** (string) - Optional - The expected position of the line in a filter list (`"start"`, `"metadata"`, or `"body"`). Defaults to `"body"`.

### Request Example

```python
from abp.filters import parse_line
from abp.filters.parser import ParseError

examples = [
    ("[Adblock Plus 2.0]", "start"),
    ("! Title: EasyList", "metadata"),
    ("||example.com^$script,domain=foo.com", "body"),
    ("@@||safe.example.com^", "body"),
    ("example.com##div.ad-banner", "body"),
    ("%include easylist:easylist/easylist_general_block.txt%", "body"),
    ("", "body"),
]

for text, pos in examples:
    parsed = parse_line(text, pos)
    print(f"[{pos}] {parsed.type}: {parsed}")

# Error handling
try:
    parse_line("%bad instruction%")
except ParseError as e:
    print(f"ParseError: {e}")
```

### Response

#### Success Response (namedtuple)

Returns a namedtuple representing the parsed line (e.g., `Header`, `Metadata`, `Filter`, `Include`, `EmptyLine`).

#### Response Example

```
[start] header: Header(version='Adblock Plus 2.0')
[metadata] metadata: Metadata(key='Title', value='EasyList')
[body] filter: Filter(text='||example.com^$script,...', action='block', ...)
[body] filter: Filter(text='@@||safe.example.com^', action='allow', ...)
[body] filter: Filter(text='example.com##div.ad-banner', action='hide', ...)
[body] include: Include(target='easylist:easylist/easylist_general_block.txt')
[body] emptyline: EmptyLine()
ParseError: Invalid instruction: %bad instruction%
```
```

--------------------------------

### R-compatible Filter Parsing (line2dict, lines2dicts)

Source: https://context7.com/adblockplus/python-abp/llms.txt

Wraps `parse_line` to return plain Python dicts compatible with the rPython/reticulate bridge, so python-abp can be used from R data analysis workflows. The R session must have reticulate pointed at the Python environment where python-abp is installed.
```python
# Python usage
from abp.filters.rpy import line2dict, lines2dicts

# Parse a single line to a dict
result = line2dict("@@||g.doubleclick.net/pagead/$subdocument,domain=hon30.org")
print(result)
# {'text': '@@||g.doubleclick.net/pagead/$subdocument,domain=hon30.org',
#  'selector': {'type': 'url-pattern', 'value': '||g.doubleclick.net/pagead/'},
#  'action': 'allow',
#  'options': {'subdocument': True, 'domain': {'hon30.org': True}},
#  'type': 'Filter'}

# Parse multiple lines at once
lines = [
    "||ads.example.com^",
    "example.com##div.ad",
    "! This is a comment",
]
dicts = lines2dicts(lines)
for d in dicts:
    print(d["type"], "->", d.get("text") or d.get("value", ""))
```

```r
# R usage via reticulate
library(reticulate)
use_virtualenv("~/path/to/env", required = TRUE)
abp <- import("abp.filters.rpy")

result <- abp$line2dict("@@||g.doubleclick.net/pagead/$subdocument,domain=hon30.org")
print(result$action)  # "allow"
print(result$type)    # "Filter"
```

--------------------------------

### Serialize Parsed Filter to Text

Source: https://context7.com/adblockplus/python-abp/llms.txt

Converts a parsed filter object back into its canonical ABP filter list string representation. Use for round-tripping filters.

```python
from abp.filters.parser import parse_filter, unparse_filter

# Round-trip a URL blocking filter
original = "||ads.example.com^$script,domain=foo.com|~bar.com"
parsed = parse_filter(original)
restored = unparse_filter(parsed)
print(restored)
```

```python
from abp.filters.parser import parse_filter, unparse_filter

# Round-trip an element hiding filter
original_hide = "example.com,other.com##div.ad"
parsed_hide = parse_filter(original_hide)
restored_hide = unparse_filter(parsed_hide)
print(restored_hide)
```

--------------------------------

### Group Filters into Blocks

Source: https://context7.com/adblockplus/python-abp/llms.txt

Groups parsed filter list lines into blocks, extracting filters, descriptions, and variables.
Useful for processing structured filter lists.

```python
import json
from abp.filters import parse_filterlist
from abp.filters.blocks import to_blocks

filterlist_text = """[Adblock Plus 2.0]
! Title: Partner List
!:partner_id=1234
!:type=partner
! Block ads for partner 1234
||partner-ads.example.com^$script
||partner-tracker.example.com^
!:partner_id=5678
!:type=partner
! Block ads for partner 5678
||other-ads.example.com^
""".splitlines()

for block in to_blocks(parse_filterlist(filterlist_text)):
    print(json.dumps(block.to_dict(), indent=2))
```

--------------------------------

### unparse_filter(filter)

Source: https://context7.com/adblockplus/python-abp/llms.txt

Serializes a parsed filter object back into its canonical Adblock Plus filter list string representation. This is useful for round-tripping filters through parsing and serialization.

```APIDOC
## `unparse_filter(filter)`: Serialize a parsed filter back to text

Takes a `Filter` namedtuple (as produced by `parse_filter` or `parse_line`) and converts it back to its canonical ABP filter list string representation.

### Parameters
- **filter** (`Filter` namedtuple) - The parsed filter object to serialize.

### Returns
- `string` - The canonical ABP filter list string representation of the input filter.

### Example
```python
from abp.filters.parser import parse_filter, unparse_filter

# Round-trip a URL blocking filter
original = "||ads.example.com^$script,domain=foo.com|~bar.com"
parsed = parse_filter(original)
restored = unparse_filter(parsed)
print(restored)
# ||ads.example.com^$script,domain=foo.com|~bar.com

# Round-trip an element hiding filter
original_hide = "example.com,other.com##div.ad"
parsed_hide = parse_filter(original_hide)
restored_hide = unparse_filter(parsed_hide)
print(restored_hide)
# example.com,other.com##div.ad
```
```

--------------------------------

### Parse Filter to Dictionary

Source: https://context7.com/adblockplus/python-abp/llms.txt

Converts a filter string into a dictionary representation. Useful for serialization.

```python
from abp.filters.parser import parse_filter

f = parse_filter("||ads.example.com^$domain=foo.com|~bar.com")
print(f.to_dict())
```

--------------------------------

### Parse Single Filter Rule with `parse_filter`

Source: https://context7.com/adblockplus/python-abp/llms.txt

Use `parse_filter` to parse a raw filter rule string; it does not handle comment or include syntax. It returns a `Filter` namedtuple with `text`, `selector`, `action`, and `options` fields. This lower-level function is intended for input that is already known to be a filter rule.

```python
from abp.filters.parser import parse_filter, FilterAction, SelectorType

filters = [
    "||ads.example.com^$script,third-party",
    "@@||safe.example.com^$document",
    "example.com,other.com##div.sponsored",
    "example.com#?#div:-abp-properties(color: red)",  # extended CSS
    "example.com#$#abort-on-property-read adProp",    # snippet
]

for f_text in filters:
    f = parse_filter(f_text)
    print(f"action={f.action}, selector_type={f.selector['type']}, options={f.options}")
```

--------------------------------

### parse_filterlist

Source: https://context7.com/adblockplus/python-abp/llms.txt

Parses an iterable of text lines representing an entire filter list and yields typed namedtuple objects for each line.

```APIDOC
## parse_filterlist(lines)

### Description
Parses an iterable of text lines (e.g., an open file) and yields a sequence of typed namedtuple objects representing each line: `Header`, `Metadata`, `Comment`, `Filter`, `Include`, or `EmptyLine`. Position-aware parsing automatically distinguishes header/metadata lines at the top from filter body lines below.

### Parameters
#### Path Parameters
- **lines** (iterable) - Required - An iterable of text lines to parse.

### Request Example
```python
from abp.filters import parse_filterlist

filterlist_text = [
    "[Adblock Plus 2.0]",
    "! Title: My List",
    "! Version: 202401010000",
    "",
    "abc.com,cdf.com##div#ad1",
    "abc.com/ad$image",
    "@@/abc\\.com/"
]

for line in parse_filterlist(filterlist_text):
    print(f"type={line.type!r:12} {line}")
```

### Response
#### Success Response (yields namedtuples)
- **Header**: Represents a header line.
- **Metadata**: Represents a metadata line (key-value pair).
- **Comment**: Represents a comment line.
- **Filter**: Represents a filter rule.
- **Include**: Represents an include directive.
- **EmptyLine**: Represents an empty line.
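
As a rough illustration of consuming the yielded objects listed above, the sketch below tallies line types and gathers metadata by checking each object's `type` attribute. The namedtuple classes and the `summarize` helper are hypothetical stand-ins defined here so the example runs without python-abp installed; with the library available, the output of `parse_filterlist` could presumably be passed to `summarize` instead of `sample`.

```python
from collections import Counter, namedtuple


def make_line_type(name, fields):
    """Build a stand-in line class (hypothetical, not the library's own)."""
    cls = namedtuple(name, fields)
    cls.type = name.lower()  # mirrors the `type` attribute used in the examples
    return cls


Header = make_line_type("Header", "version")
Metadata = make_line_type("Metadata", "key value")
Filter = make_line_type("Filter", "text selector action options")
EmptyLine = make_line_type("EmptyLine", "")


def summarize(lines):
    """Count line types and collect metadata key-value pairs into a dict."""
    counts = Counter()
    metadata = {}
    for line in lines:
        counts[line.type] += 1
        if line.type == "metadata":
            metadata[line.key] = line.value
    return counts, metadata


# Sample objects shaped like the parsed output shown in this section.
sample = [
    Header("Adblock Plus 2.0"),
    Metadata("Title", "My List"),
    EmptyLine(),
    Filter(
        "abc.com/ad$image",
        {"type": "url-pattern", "value": "abc.com/ad"},
        "block",
        [("image", True)],
    ),
]

counts, metadata = summarize(sample)
print(dict(counts))
print(metadata)
```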

#### Response Example
```
type='header'     Header(version='Adblock Plus 2.0')
type='metadata'   Metadata(key='Title', value='My List')
type='metadata'   Metadata(key='Version', value='202401010000')
type='emptyline'  EmptyLine()
type='filter'     Filter(text='abc.com,cdf.com##div#ad1', selector={'type': 'css', 'value': 'div#ad1'}, action='hide', options=[('domain', [('abc.com', True), ('cdf.com', True)])])
type='filter'     Filter(text='abc.com/ad$image', selector={'type': 'url-pattern', 'value': 'abc.com/ad'}, action='block', options=[('image', True)])
type='filter'     Filter(text='@@/abc\\.com/', selector={'type': 'url-regexp', 'value': 'abc\\.com'}, action='allow', options=[])
```
```

--------------------------------

### Parse Entire Filter List with `parse_filterlist`

Source: https://context7.com/adblockplus/python-abp/llms.txt

Use `parse_filterlist` to parse an iterable of text lines representing a filter list. It yields typed namedtuple objects for each line, distinguishing headers, metadata, comments, filters, includes, and empty lines based on their position.

```python
from abp.filters import parse_filterlist

# Raw string keeps the backslash in the regexp filter intact; the header
# starts immediately after the opening quotes so it is the first line.
filterlist_text = r"""[Adblock Plus 2.0]
! Title: My List
! Version: 202401010000

abc.com,cdf.com##div#ad1
abc.com/ad$image
@@/abc\.com/
""".splitlines()

for line in parse_filterlist(filterlist_text):
    print(f"type={line.type!r:12} {line}")
```
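
The `options` field in the parsed output above is a list of `(name, value)` tuples, and the `domain` option carries a nested list of `(domain, flag)` pairs. The sketch below shows one way to unpack that shape. `split_domains` is a hypothetical helper, not part of python-abp, and it assumes that a `False` flag marks an excluded (`~`-prefixed) domain, which the examples in this section do not show explicitly.

```python
def split_domains(options):
    """Return (included, excluded) domain lists from a Filter's options.

    Hypothetical helper: assumes the `domain` option value is a list of
    (domain, flag) pairs, with flag False meaning the domain is excluded.
    """
    included, excluded = [], []
    for name, value in options:
        if name != "domain":
            continue  # skip options such as ('image', True)
        for domain, is_included in value:
            (included if is_included else excluded).append(domain)
    return included, excluded


# Options written by hand in the shape shown in the parsed output above.
options = [("image", True), ("domain", [("abc.com", True), ("cdf.com", False)])]
included, excluded = split_domains(options)
print(included)  # ['abc.com']
print(excluded)  # ['cdf.com']
```

Keeping the helper independent of the library's classes means it only relies on the tuple layout, so it can be applied to `f.options` from any parsed filter if that layout holds.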