
{
    Selector::parse(input).map_err(|e| {
        match e {
            SelectorErrorKind::UnexpectedToken(_) => "Unexpected token in selector".to_string(),
            SelectorErrorKind::EndOfLine => "Selector ended unexpectedly".to_string(),
            SelectorErrorKind::QualRuleInvalid => "Invalid qualified rule".to_string(),
            _ => format!("Selector parse error: {}", e),
        }
    })
}

// Valid selector
match parse_selector("div.class") {
    Ok(_) => println!("Valid selector"),
    Err(e) => println!("Error: {}", e),
}

// Invalid selector
match parse_selector("[invalid") {
    Ok(_) => println!("Valid selector"),
    Err(e) => println!("Error: {}", e),
}
```

--------------------------------

### ElementRef::attr - Get Attribute Value

Source: https://context7.com/rust-scraper/scraper/llms.txt

Convenience method to directly access an attribute value from an ElementRef.

```APIDOC
## ElementRef::attr

### Description
Convenience method to directly access an attribute value from an ElementRef.

### Parameters
#### Query Parameters
- **attr_name** (str) - Required - The name of the attribute to retrieve.
```

--------------------------------

### ElementRef::attr - Get Attribute Value

Source: https://context7.com/rust-scraper/scraper/llms.txt

Convenience method to directly access an attribute value from an ElementRef. Useful for quickly retrieving specific attribute values like 'href' or 'src', or for checking whether a boolean attribute is present.

```rust
use scraper::{Html, Selector};

let html = r#"
    <a href="https://example.com">Link</a>
    <img src="/image.png" alt="Description">
    <input required>
"#;

let document = Html::parse_fragment(html);

// Get href from link
let link_sel = Selector::parse("a").unwrap();
let link = document.select(&link_sel).next().unwrap();
println!("URL: {}", link.attr("href").unwrap_or("no href"));
// Output: URL: https://example.com

// Get multiple attributes from image
let img_sel = Selector::parse("img").unwrap();
let img = document.select(&img_sel).next().unwrap();
println!("Image src: {}, alt: {}",
    img.attr("src").unwrap_or(""),
    img.attr("alt").unwrap_or("no alt"));
// Output: Image src: /image.png, alt: Description

// Check for boolean attribute
let input_sel = Selector::parse("input").unwrap();
let input = document.select(&input_sel).next().unwrap();
let is_required = input.attr("required").is_some();
println!("Input is required: {}", is_required);
// Output: Input is required: true
```

--------------------------------

### Extract Text Content with ElementRef::text

Source: https://context7.com/rust-scraper/scraper/llms.txt

Use `ElementRef::text()` to get an iterator over all text nodes within an element and its descendants, including text from nested elements. Collect it into a `Vec` to keep the individual segments, or into a `String` to concatenate them.

```rust
use scraper::{Html, Selector};

let html = r#"<p>Hello, <strong>bold</strong> and <em>italic</em> text!</p>"#;

let document = Html::parse_fragment(html);
let selector = Selector::parse("p").unwrap();
let paragraph = document.select(&selector).next().unwrap();

// Collect all text segments
let text_parts: Vec<_> = paragraph.text().collect();
println!("Text parts: {:?}", text_parts);
// Output: Text parts: ["Hello, ", "bold", " and ", "italic", " text!"]

// Join into single string
let full_text: String = paragraph.text().collect();
println!("Full text: {}", full_text);
// Output: Full text: Hello, bold and italic text!
```

```rust
use scraper::{Html, Selector};

let html = r#"
    <div class="article">
        <h1>Title</h1>
        <p>First paragraph with <a href="/link">link</a>.</p>
        <p>Second paragraph.</p>
    </div>
"#;

let document = Html::parse_fragment(html);
let article_sel = Selector::parse(".article").unwrap();
let article = document.select(&article_sel).next().unwrap();

// Trim each segment, drop whitespace-only ones, and join with single spaces
let all_text: String = article.text()
    .map(|s| s.trim())
    .filter(|s| !s.is_empty())
    .collect::<Vec<_>>()
    .join(" ");
println!("Article text: {}", all_text);
// Output: Article text: Title First paragraph with link . Second paragraph.
```

--------------------------------

### Html::select - Select Elements from Document

Source: https://context7.com/rust-scraper/scraper/llms.txt

Returns an iterator over all elements in the document that match the given CSS selector.

```APIDOC
## Html::select

### Description
Returns an iterator over all elements in the document that match the given selector. The iterator yields `ElementRef` objects that provide access to element data.

### Parameters
#### Query Parameters
- **selector** (Selector) - Required - The CSS selector used to match elements in the document.
```

--------------------------------

### ElementRef::html and ElementRef::inner_html

Source: https://context7.com/rust-scraper/scraper/llms.txt

Methods to serialize an element or its children back to an HTML string.

```APIDOC
## ElementRef::html and ElementRef::inner_html

### Description
Methods to serialize an element back to an HTML string. `html()` includes the element itself, while `inner_html()` returns only the children.

### Method
Method call on ElementRef

### Response
- **String** - The serialized HTML representation.
```

--------------------------------

### Parse CSS Selector in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Create a CSS selector using `Selector::parse`. The selector can then be used to query elements within an HTML document or fragment. `.unwrap()` is used for simplicity and panics on an invalid selector; handle the returned `Result` in real code.

```rust
use scraper::Selector;

let selector = Selector::parse("h1.foo").unwrap();
```

--------------------------------

### ElementRef::child_elements and descendent_elements

Source: https://context7.com/rust-scraper/scraper/llms.txt

Methods to iterate over child and descendant elements without requiring a selector.

```APIDOC
## ElementRef::child_elements and descendent_elements

### Description
Methods to iterate over child and descendant elements without needing a selector. These methods skip text nodes.

### Method
Method call on ElementRef

### Response
- **Iterator** - An iterator over the child or descendant elements.
```

--------------------------------

### Parse Complete HTML Document

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses a full HTML string into a document tree, automatically injecting missing structural elements like html, head, and body.

```rust
use scraper::Html;

// Parse a complete HTML document
let html = r#"
    <!DOCTYPE html>
    <html>
        <head><title>Hello, world!</title></head>
        <body>
            <h1>Welcome</h1>
            <p>This is a paragraph.</p>
        </body>
    </html>
"#;

let document = Html::parse_document(html);

// Access the root element
let root = document.root_element();
println!("Root element: {}", root.value().name());
// Output: Root element: html

// Serialize the entire document back to HTML
let serialized = document.html();
println!("{}", serialized);
// Prints the whole document serialized as an HTML string
```

--------------------------------

### Selector::parse

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses a CSS selector string into a reusable Selector object.

```APIDOC
## Selector::parse

### Description
Parses a CSS selector string into a reusable Selector object. Supports standard CSS selector syntax including element, class, ID, attribute selectors, combinators, and pseudo-classes.

### Parameters
#### Request Body
- **selector_string** (string) - Required - The CSS selector string to parse.

### Request Example
"div.container#main"

### Response
#### Success Response (200)
- **selector** (Selector) - A parsed Selector object.

#### Error Response (400)
- **error** (String) - Returns an error if the selector syntax is invalid.
```

--------------------------------

### HtmlTreeSink

Source: https://context7.com/rust-scraper/scraper/llms.txt

Enables DOM manipulation such as removing, reparenting, and modifying nodes.

```APIDOC
## HtmlTreeSink

### Description
The `HtmlTreeSink` wrapper enables DOM manipulation using the `TreeSink` trait from html5ever. This allows removing, reparenting, and modifying nodes.

### Method
Struct wrapper for DOM manipulation

### Response
- **Html** - The modified document returned after calling `finish()`.
```

--------------------------------

### Html::select - Select Elements from Document

Source: https://context7.com/rust-scraper/scraper/llms.txt

Returns an iterator over all elements matching a CSS selector. Use this to find all occurrences of specific tags or classes within the entire document.

```rust
use scraper::{Html, Selector};

let html = r#"

    <ul>
        <li class="item">First</li>
        <li class="item">Second</li>
        <li class="item">Third</li>
    </ul>
"#;

let document = Html::parse_fragment(html);
let selector = Selector::parse("li.item").unwrap();

// Iterate over all matching elements
for element in document.select(&selector) {
    println!("Element: {} - Text: {}",
        element.value().name(),
        element.text().collect::<String>());
}
// Output:
// Element: li - Text: First
// Element: li - Text: Second
// Element: li - Text: Third

// Get specific element (first match)
let first = document.select(&selector).next();
if let Some(el) = first {
    println!("First item: {}", el.inner_html());
    // Output: First item: First
}

// Count matches
let count = document.select(&selector).count();
println!("Found {} items", count);
// Output: Found 3 items

// Reverse iteration
let reversed: Vec<_> = document.select(&selector)
    .rev()
    .map(|e| e.inner_html())
    .collect();
println!("Reversed: {:?}", reversed);
// Output: Reversed: ["Third", "Second", "First"]
```

--------------------------------

### Parse HTML Fragment

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses partial HTML content without adding document-level wrappers. Ideal for snippets or user-generated content.

```rust
use scraper::Html;

// Parse an HTML fragment (no DOCTYPE, html, head, body wrapper)
let fragment = Html::parse_fragment("<h1>Hello, world!</h1>");

// The fragment creates a minimal tree with the content
let root = fragment.root_element();
println!("Fragment root: {}", root.value().name());
// Output: Fragment root: html

// The fragment content is inside the root
println!("{}", root.inner_html());
// Output: <h1>Hello, world!</h1>
```

--------------------------------

### Html::parse_document

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses a string of HTML as a complete document, automatically adding missing structure.

```APIDOC
## Html::parse_document

### Description
Parses a string of HTML as a complete document, automatically adding `<html>`, `<head>`, and `<body>` elements if missing.

### Parameters
#### Request Body
- **html** (string) - Required - The HTML string to parse as a document.

### Request Example
"<h1>Welcome</h1>"

### Response
#### Success Response (200)
- **document** (Html) - A parsed document object representing the HTML tree.
```

--------------------------------

### Html::parse_fragment

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses a string of HTML as a fragment without the document structure.

```APIDOC
## Html::parse_fragment

### Description
Parses a string of HTML as a fragment without the document structure. Useful for parsing partial HTML content.

### Parameters
#### Request Body
- **fragment** (string) - Required - The HTML fragment string to parse.

### Request Example
"<h1>Hello, world!</h1>"

### Response
#### Success Response (200)
- **fragment** (Html) - A parsed fragment object representing the HTML tree.
```

--------------------------------

### Parse HTML Document in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Use `Html::parse_document` to parse a full HTML document string into a traversable structure. Ensure the input is a valid HTML document.

```rust
use scraper::Html;

let html = r#"
    <!DOCTYPE html>
    <meta charset="utf-8">
    <title>Hello, world!</title>
    <h1 class="foo">Hello, <i>world!</i></h1>
"#;

let document = Html::parse_document(html);
```

--------------------------------

### Iterate Elements with child_elements and descendent_elements

Source: https://context7.com/rust-scraper/scraper/llms.txt

Use `child_elements()` to iterate over direct child elements, skipping text nodes. Use `descendent_elements()` to iterate over all descendant elements in the DOM tree. These methods avoid the need for selectors when traversing the element structure.

```rust
use scraper::Html;

let html = r#"
    <nav>
        <a href="/">Home</a>
        <a href="/about">About</a>
        <div>
            <a href="/more">More</a>
        </div>
    </nav>
"#;

let document = Html::parse_fragment(html);
let root = document.root_element();

// Get direct child elements (skips text nodes)
println!("Child elements:");
for child in root.child_elements() {
    println!(" - {}", child.value().name());
}
// Output:
// Child elements:
// - nav

// Get all descendant elements (the iteration starts with the root itself)
println!("Descendant elements:");
for desc in root.descendent_elements() {
    println!(" - {}", desc.value().name());
}
// Output:
// Descendant elements:
// - html
// - nav
// - a
// - a
// - div
// - a
```

--------------------------------

### Parse HTML Fragment in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Use `Html::parse_fragment` for parsing HTML snippets that may not form a complete document. This is useful for handling partial HTML content.

```rust
use scraper::Html;

let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");
```

--------------------------------

### Parse CSS Selector

Source: https://context7.com/rust-scraper/scraper/llms.txt

Converts CSS selector strings into reusable Selector objects. Supports standard syntax, combinators, and pseudo-classes.

```rust
use scraper::Selector;

// Simple element selector
let selector = Selector::parse("div").unwrap();

// Class selector
let selector = Selector::parse(".container").unwrap();

// ID selector
let selector = Selector::parse("#main").unwrap();

// Combined selectors
let selector = Selector::parse("div.container#main").unwrap();

// Attribute selectors
let selector = Selector::parse(r#"input[type="text"]"#).unwrap();
let selector = Selector::parse("a[href^='https://']").unwrap();

// Descendant and child combinators
let selector = Selector::parse("ul li").unwrap(); // descendant
let selector = Selector::parse("ul > li").unwrap(); // direct child

// Pseudo-classes (for example :has, :is, :where)
let selector = Selector::parse(":has(a)").unwrap();
let selector = Selector::parse(":is(h1, h2, h3)").unwrap();
let selector = Selector::parse(":where(article, section) p").unwrap();

// Handle invalid selectors
match Selector::parse("<invalid>") {
    Ok(_) => println!("Valid selector"),
    Err(e) => println!("Invalid selector: {}", e),
    // Output: Invalid selector: Token "<" was not expected
}
```

--------------------------------

### ElementRef::select - Select Descendant Elements

Source: https://context7.com/rust-scraper/scraper/llms.txt

Selects elements within a specific element's subtree, supporting the :scope pseudo-class.

```APIDOC
## ElementRef::select

### Description
Selects elements within a specific element's subtree. Supports the :scope pseudo-class to reference the scoping element.

### Parameters
#### Query Parameters
- **selector** (Selector) - Required - The CSS selector used to match descendant elements.
```

--------------------------------

### DOM Manipulation with HtmlTreeSink

Source: https://context7.com/rust-scraper/scraper/llms.txt

The `HtmlTreeSink` wrapper allows DOM manipulation using the `TreeSink` trait. This enables operations like removing, reparenting, and modifying nodes. After manipulation, `finish()` returns the modified `Html` document.

```rust
use html5ever::tree_builder::TreeSink;
use scraper::{Html, Selector, HtmlTreeSink};

let html = "<p>Keep me</p><p class='remove'>Remove me</p><p>Also keep</p>";
let selector = Selector::parse(".remove").unwrap();

// Parse and collect node IDs to remove
let document = Html::parse_document(html);
let node_ids: Vec<_> = document.select(&selector).map(|x| x.id()).collect();

// Wrap in TreeSink for manipulation
let tree = HtmlTreeSink::new(document);

// Remove nodes
for id in node_ids {
    tree.remove_from_parent(&id);
}

// Finish manipulation and get back the Html
let document = tree.finish();
println!("{}", document.html());
// Output: <html><head></head><body><p>Keep me</p><p>Also keep</p></body></html>
```

--------------------------------

### Select Descendent Elements in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Query for elements that are descendants of another selected element. This narrows a selection to one part of the DOM.

```rust
use scraper::{Html, Selector};

let html = r#"
    <ul>
        <li>Foo</li>
        <li>Bar</li>
        <li>Baz</li>
    </ul>
"#;

let fragment = Html::parse_fragment(html);
let ul_selector = Selector::parse("ul").unwrap();
let li_selector = Selector::parse("li").unwrap();

let ul = fragment.select(&ul_selector).next().unwrap();
for element in ul.select(&li_selector) {
    assert_eq!("li", element.value().name());
}
```

--------------------------------

### Select Elements with CSS Selector in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Iterate over elements matching a CSS selector within an HTML fragment. The `select` method returns an iterator of matching elements.

```rust
use scraper::{Html, Selector};

let html = r#"
    <ul>
        <li>Foo</li>
        <li>Bar</li>
        <li>Baz</li>
    </ul>
"#;

let fragment = Html::parse_fragment(html);
let selector = Selector::parse("li").unwrap();

for element in fragment.select(&selector) {
    assert_eq!("li", element.value().name());
}
```

--------------------------------

### Check if Element Matches Selector in Rust

Source: https://context7.com/rust-scraper/scraper/llms.txt

Use the `matches` method on a `Selector` to check whether an `ElementRef` satisfies the selector's criteria without performing a full document search. This is useful for conditional logic on individual elements.

```rust
use scraper::{Html, Selector};

let html = r#"
    <div class="card featured">
        <h2>Featured Item</h2>
    </div>
    <div class="card">
        <h2>Regular Item</h2>
    </div>
"#;

let document = Html::parse_fragment(html);
let card_sel = Selector::parse(".card").unwrap();
let featured_sel = Selector::parse(".featured").unwrap();
let title_sel = Selector::parse("h2").unwrap();

// Check each card
for card in document.select(&card_sel) {
    let is_featured = featured_sel.matches(&card);
    let title = card.select(&title_sel).next()
        .map(|h| h.text().collect::<String>())
        .unwrap_or_default();
    println!("{}: featured={}", title, is_featured);
}
```

--------------------------------

### Access Element Attributes in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Retrieve the value of an attribute from a selected HTML element. The `attr` method returns an `Option<&str>`.

```rust
use scraper::{Html, Selector};

let fragment = Html::parse_fragment(r#"<input name="foo" value="bar">"#);
let selector = Selector::parse(r#"input[name="foo"]"#).unwrap();

let input = fragment.select(&selector).next().unwrap();
assert_eq!(Some("bar"), input.value().attr("value"));
```

--------------------------------

### ElementRef::value - Access Element Data

Source: https://context7.com/rust-scraper/scraper/llms.txt

Returns a reference to the underlying Element struct, providing access to the element's name, ID, classes, and attributes.

```APIDOC
## ElementRef::value

### Description
Returns a reference to the underlying `Element` struct, providing access to the element's name, ID, classes, and attributes.
```

--------------------------------

### ElementRef::select - Select Descendant Elements

Source: https://context7.com/rust-scraper/scraper/llms.txt

Selects elements within a specific element's subtree. Supports the :scope pseudo-class to reference the scoping element. Use this for targeted selections within a known parent element.

```rust
use scraper::{Html, Selector};

let html = r#"

    <div id="container">
        <div class="box">
            <span>Nested 1</span>
            <p>Paragraph in box</p>
        </div>
        <span>Direct child</span>
    </div>
"#;

let document = Html::parse_fragment(html);

// First, select the container
let container_sel = Selector::parse("#container").unwrap();
let container = document.select(&container_sel).next().unwrap();

// Then select descendants within it
let span_sel = Selector::parse("span").unwrap();
for span in container.select(&span_sel) {
    println!("Span text: {}", span.text().collect::<String>());
}
// Output:
// Span text: Nested 1
// Span text: Direct child

// Use :scope to select direct children only
let direct_span_sel = Selector::parse(":scope > span").unwrap();
for span in container.select(&direct_span_sel) {
    println!("Direct child span: {}", span.text().collect::<String>());
}
// Output: Direct child span: Direct child
```

--------------------------------

### ElementRef::value - Access Element Data

Source: https://context7.com/rust-scraper/scraper/llms.txt

Returns a reference to the underlying `Element` struct, providing access to the element's name, ID, classes, and attributes. Use this to inspect the properties of a selected element.

```rust
use scraper::{Html, Selector, CaseSensitivity};

let html = r#"<div id="main" class="container active" data-count="42">Content</div>"#;

let document = Html::parse_fragment(html);
let selector = Selector::parse("div").unwrap();
let element = document.select(&selector).next().unwrap();

// Get element value (the Element struct)
let el = element.value();

// Element name
println!("Tag name: {}", el.name());
// Output: Tag name: div

// Element ID
println!("ID: {:?}", el.id());
// Output: ID: Some("main")

// Check for class
println!("Has 'container': {}", el.has_class("container", CaseSensitivity::CaseSensitive));
// Output: Has 'container': true

// Iterate over classes
let classes: Vec<_> = el.classes().collect();
println!("Classes: {:?}", classes);
// Output: Classes: ["active", "container"]

// Get attribute value
println!("data-count: {:?}", el.attr("data-count"));
// Output: data-count: Some("42")

// Iterate over all attributes
for (name, value) in el.attrs() {
    println!(" {}: {}", name, value);
}
// Output:
// class: container active
// data-count: 42
// id: main
```

--------------------------------

### ElementRef::text

Source: https://context7.com/rust-scraper/scraper/llms.txt

Extracts all text nodes within an element and its descendants.

```APIDOC
## ElementRef::text

### Description
Returns an iterator over all text nodes within an element and its descendants. Text from nested elements is included.

### Method
Method call on ElementRef

### Response
- **Iterator** - An iterator yielding text segments found within the element.
```

--------------------------------

### Manipulate DOM by Removing Elements in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Modify the HTML document by removing elements that match a specific selector. This involves parsing the HTML, collecting the IDs of the matching nodes, and then using `HtmlTreeSink` to perform the removal.

```rust
use html5ever::tree_builder::TreeSink;
use scraper::{Html, Selector, HtmlTreeSink};

let html = "<html><body>hello<p class=\"hello\">REMOVE ME</p></body></html>";
let selector = Selector::parse(".hello").unwrap();
let document = Html::parse_document(html);
let node_ids: Vec<_> = document.select(&selector).map(|x| x.id()).collect();
let tree = HtmlTreeSink::new(document);
for id in node_ids {
    tree.remove_from_parent(&id);
}
let document = tree.finish();
assert_eq!(document.html(), "<html><head></head><body>hello</body></html>");
```

--------------------------------

### Generic Element Selection with Selectable Trait in Rust

Source: https://context7.com/rust-scraper/scraper/llms.txt

The `Selectable` trait lets generic functions work with both `Html` documents and `ElementRef` instances, so the same selection logic can be reused at different scopes within the HTML structure.

```rust
use scraper::{Html, Selector};
use scraper::selectable::Selectable;

// Generic function that works with Html or ElementRef
fn extract_links<'a, S>(selectable: S) -> Vec<String>
where
    S: Selectable<'a>,
{
    let selector = Selector::parse("a[href]").unwrap();
    selectable
        .select(&selector)
        .filter_map(|el| el.attr("href").map(String::from))
        .collect()
}

let html = r#"

    <nav>
        <a href="/home">Home</a>
        <a href="/about">About</a>
    </nav>
    <main>
        <a href="/article">Article</a>
    </main>
"#;

let document = Html::parse_fragment(html);

// Use with full document
let all_links = extract_links(&document);
println!("All links: {:?}", all_links);

// Use with specific element
let nav_sel = Selector::parse("nav").unwrap();
let nav = document.select(&nav_sel).next().unwrap();
let nav_links = extract_links(nav);
println!("Nav links: {:?}", nav_links);
```

--------------------------------

### Access Descendent Text in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Extract all text nodes within a selected element and its descendants. The `.text()` method returns an iterator over string slices, which can be collected.

```rust
use scraper::{Html, Selector};

let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");
let selector = Selector::parse("h1").unwrap();

let h1 = fragment.select(&selector).next().unwrap();
let text = h1.text().collect::<Vec<_>>();

assert_eq!(vec!["Hello, ", "world!"], text);
```
