### Serialize Element to HTML with ElementRef::html and inner_html

Source: https://context7.com/rust-scraper/scraper/llms.txt

Use `html()` to get the full HTML string of an element, including the element itself. Use `inner_html()` to get only the HTML content of its children. These methods are useful for serializing parts of the DOM back into strings.

```rust
use scraper::{Html, Selector};

let html = r#"<div><p>Hello, world!</p></div>"#;
let document = Html::parse_fragment(html);
let selector = Selector::parse("div").unwrap();
let div = document.select(&selector).next().unwrap();

// Get full HTML including the element
println!("html(): {}", div.html());
// Output: html(): <div><p>Hello, world!</p></div>

// Get only inner content
println!("inner_html(): {}", div.inner_html());
// Output: inner_html(): <p>Hello, world!</p>

// Select nested element
let p_sel = Selector::parse("p").unwrap();
let p = document.select(&p_sel).next().unwrap();
println!("Paragraph html: {}", p.html());
// Output: Paragraph html: <p>Hello, world!</p>
println!("Paragraph inner: {}", p.inner_html());
// Output: Paragraph inner: Hello, world!
```

--------------------------------

### Serialize HTML and Inner HTML in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Get the HTML representation of a selected element using `.html()` or its inner HTML content using `.inner_html()`. Useful for extracting or displaying specific parts of the DOM.

```rust
use scraper::{Html, Selector};

let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");
let selector = Selector::parse("h1").unwrap();
let h1 = fragment.select(&selector).next().unwrap();

assert_eq!("<h1>Hello, <i>world!</i></h1>", h1.html());
assert_eq!("Hello, <i>world!</i>", h1.inner_html());
```

--------------------------------

### Error Handling for Selector Parsing in Rust

Source: https://context7.com/rust-scraper/scraper/llms.txt

The `SelectorErrorKind` enum provides specific reasons for selector parsing failures. This example demonstrates how to map these errors into user-friendly messages.

```rust
use scraper::Selector;
use scraper::error::SelectorErrorKind;

// Parse a selector, converting the error into a readable message
fn parse_selector(input: &str) -> Result<Selector, String> {
    Selector::parse(input).map_err(|e| match e {
        SelectorErrorKind::UnexpectedToken(_) => "Unexpected token in selector".to_string(),
        SelectorErrorKind::EndOfLine => "Selector ended unexpectedly".to_string(),
        SelectorErrorKind::QualRuleInvalid => "Invalid qualified rule".to_string(),
        _ => format!("Selector parse error: {}", e),
    })
}

// Valid selector
match parse_selector("div.class") {
    Ok(_) => println!("Valid selector"),
    Err(e) => println!("Error: {}", e),
}

// Invalid selector
match parse_selector("[invalid") {
    Ok(_) => println!("Valid selector"),
    Err(e) => println!("Error: {}", e),
}
```

--------------------------------

### ElementRef::attr - Get Attribute Value

Source: https://context7.com/rust-scraper/scraper/llms.txt

Convenience method to directly access an attribute value from an ElementRef.

```APIDOC
## ElementRef::attr

### Description
Convenience method to directly access an attribute value from an ElementRef.

### Parameters
#### Query Parameters
- **attr_name** (str) - Required - The name of the attribute to retrieve.
```

--------------------------------

### ElementRef::attr - Get Attribute Value

Source: https://context7.com/rust-scraper/scraper/llms.txt

Convenience method to directly access an attribute value from an ElementRef. Useful for quickly retrieving specific attribute values like 'href' or 'src', or for checking for the presence of boolean attributes.

```rust
use scraper::{Html, Selector};

let html = r#"
    <a href="https://example.com">Link</a>
    <img src="/image.png" alt="Description">
    <input required>
"#;
let document = Html::parse_fragment(html);

// Get href from link
let link_sel = Selector::parse("a").unwrap();
let link = document.select(&link_sel).next().unwrap();
println!("URL: {}", link.attr("href").unwrap_or("no href"));
// Output: URL: https://example.com

// Get multiple attributes from image
let img_sel = Selector::parse("img").unwrap();
let img = document.select(&img_sel).next().unwrap();
println!(
    "Image src: {}, alt: {}",
    img.attr("src").unwrap_or(""),
    img.attr("alt").unwrap_or("no alt")
);
// Output: Image src: /image.png, alt: Description

// Check for boolean attribute
let input_sel = Selector::parse("input").unwrap();
let input = document.select(&input_sel).next().unwrap();
let is_required = input.attr("required").is_some();
println!("Input is required: {}", is_required);
// Output: Input is required: true
```

--------------------------------

### Extract Text Content with ElementRef::text

Source: https://context7.com/rust-scraper/scraper/llms.txt

Use `ElementRef::text()` to get an iterator over all text nodes within an element and its descendants. This method includes text from nested elements. It can be collected into a Vec for individual segments or joined into a single String.

```rust
use scraper::{Html, Selector};

let html = r#"<p>Hello, <b>bold</b> and <i>italic</i> text!</p>"#;
let document = Html::parse_fragment(html);
let selector = Selector::parse("p").unwrap();
let paragraph = document.select(&selector).next().unwrap();

// Collect all text segments
let text_parts: Vec<_> = paragraph.text().collect();
println!("Text parts: {:?}", text_parts);
// Output: Text parts: ["Hello, ", "bold", " and ", "italic", " text!"]

// Join into single string
let full_text: String = paragraph.text().collect();
println!("Full text: {}", full_text);
// Output: Full text: Hello, bold and italic text!
```

```rust
use scraper::{Html, Selector};

let html = r#"
    <div class="article">
        <h1>Title</h1>
        <p>First paragraph with <a href="/link">link</a>.</p>
        <p>Second paragraph.</p>
    </div>
"#;
let document = Html::parse_fragment(html);
let article_sel = Selector::parse(".article").unwrap();
let article = document.select(&article_sel).next().unwrap();

// Trim each segment, drop whitespace-only segments, and join with spaces
let all_text: String = article.text()
    .map(|s| s.trim())
    .filter(|s| !s.is_empty())
    .collect::<Vec<_>>()
    .join(" ");
println!("Article text: {}", all_text);
// Output: Article text: Title First paragraph with link . Second paragraph.
```

--------------------------------

### Html::select - Select Elements from Document

Source: https://context7.com/rust-scraper/scraper/llms.txt

Returns an iterator over all elements in the document that match the given CSS selector.

```APIDOC
## Html::select

### Description
Returns an iterator over all elements in the document that match the given selector. The iterator yields `ElementRef` objects that provide access to element data.

### Parameters
#### Query Parameters
- **selector** (Selector) - Required - The CSS selector used to match elements in the document.
```

--------------------------------

### ElementRef::html and ElementRef::inner_html

Source: https://context7.com/rust-scraper/scraper/llms.txt

Methods to serialize an element or its children back to an HTML string.

```APIDOC
## ElementRef::html and ElementRef::inner_html

### Description
Methods to serialize an element back to an HTML string. `html()` includes the element itself, while `inner_html()` returns only the children.

### Method
Method call on ElementRef

### Response
- **String** - The serialized HTML representation.
```

--------------------------------

### Parse CSS Selector in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Create a CSS selector using `Selector::parse`. This selector can then be used to query elements within an HTML document or fragment. `.unwrap()` is used for simplicity; consider proper error handling.

```rust
use scraper::Selector;

let selector = Selector::parse("h1.foo").unwrap();
```

--------------------------------

### ElementRef::child_elements and descendent_elements

Source: https://context7.com/rust-scraper/scraper/llms.txt

Methods to iterate over child and descendant elements without requiring a selector.

```APIDOC
## ElementRef::child_elements and descendent_elements

### Description
Methods to iterate over child and descendant elements without needing a selector. These methods skip text nodes.

### Method
Method call on ElementRef

### Response
- **Iterator** - An iterator over the child or descendant elements.
```

--------------------------------

### Parse Complete HTML Document

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses a full HTML string into a document tree, automatically injecting missing structural elements like html, head, and body.

```rust
use scraper::Html;

// Parse a complete HTML document
let html = r#"
    <!DOCTYPE html>
    <html>
        <head><title>Hello, world!</title></head>
        <body>
            <h1>Welcome</h1>
            <p>This is a paragraph.</p>
        </body>
    </html>
"#;
let document = Html::parse_document(html);

// Access the root element
let root = document.root_element();
println!("Root element: {}", root.value().name());
// Output: Root element: html

// Serialize the entire document back to HTML
let serialized = document.html();
println!("{}", serialized);
// Prints the whole document as a single HTML string
```

--------------------------------

### Selector::parse

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses a CSS selector string into a reusable Selector object.

```APIDOC
## Selector::parse

### Description
Parses a CSS selector string into a reusable Selector object. Supports standard CSS selector syntax including element, class, ID, attribute selectors, combinators, and pseudo-classes.

### Parameters
#### Request Body
- **selector_string** (string) - Required - The CSS selector string to parse.

### Request Example
"div.container#main"

### Response
#### Success Response (200)
- **selector** (Selector) - A parsed Selector object.

#### Error Response (400)
- **error** (String) - Returns an error if the selector syntax is invalid.
```

--------------------------------

### HtmlTreeSink

Source: https://context7.com/rust-scraper/scraper/llms.txt

Enables DOM manipulation such as removing, reparenting, and modifying nodes.

```APIDOC
## HtmlTreeSink

### Description
The `HtmlTreeSink` wrapper enables DOM manipulation using the `TreeSink` trait from html5ever. This allows removing, reparenting, and modifying nodes.

### Method
Struct wrapper for DOM manipulation

### Response
- **Html** - The modified document returned after calling `finish()`.
```

--------------------------------

### Html::select - Select Elements from Document

Source: https://context7.com/rust-scraper/scraper/llms.txt

Returns an iterator over all elements matching a CSS selector. Use this to find all occurrences of specific tags or classes within the entire document.

```rust
use scraper::{Html, Selector};

let html = r#"
    <ul>
        <li class="item">First</li>
        <li class="item">Second</li>
        <li class="item">Third</li>
    </ul>
"#;
let document = Html::parse_fragment(html);
let selector = Selector::parse("li.item").unwrap();

// Iterate over all matching elements
for element in document.select(&selector) {
    println!(
        "Element: {} - Text: {}",
        element.value().name(),
        element.text().collect::<String>()
    );
}
// Output:
// Element: li - Text: First
// Element: li - Text: Second
// Element: li - Text: Third

// Get specific element (first match)
let first = document.select(&selector).next();
if let Some(el) = first {
    println!("First item: {}", el.inner_html());
    // Output: First item: First
}

// Count matches
let count = document.select(&selector).count();
println!("Found {} items", count);
// Output: Found 3 items

// Reverse iteration
let reversed: Vec<_> = document.select(&selector)
    .rev()
    .map(|e| e.inner_html())
    .collect();
println!("Reversed: {:?}", reversed);
// Output: Reversed: ["Third", "Second", "First"]
```

--------------------------------

### Parse HTML Fragment

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses partial HTML content without adding document-level wrappers. Ideal for snippets or user-generated content.

```rust
use scraper::Html;

// Parse an HTML fragment (no DOCTYPE, html, head, body wrapper)
let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");

// The fragment creates a minimal tree with the content
let root = fragment.root_element();
println!("Fragment root: {}", root.value().name());
// Output: Fragment root: html

// The fragment content is inside the root
println!("{}", root.inner_html());
// Output: <h1>Hello, <i>world!</i></h1>
```

--------------------------------

### Html::parse_document

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses a string of HTML as a complete document, automatically adding missing structure.

```APIDOC
## Html::parse_document

### Description
Parses a string of HTML as a complete document, automatically adding `<html>`, `<head>`, and `<body>` elements if missing.

### Parameters
#### Request Body
- **html** (string) - Required - The HTML string to parse as a document.

### Request Example
"<h1>Welcome</h1>"

### Response
#### Success Response (200)
- **document** (Html) - A parsed document object representing the HTML tree.
```

--------------------------------

### Html::parse_fragment

Source: https://context7.com/rust-scraper/scraper/llms.txt

Parses a string of HTML as a fragment without the document structure.

```APIDOC
## Html::parse_fragment

### Description
Parses a string of HTML as a fragment without the document structure. Useful for parsing partial HTML content.

### Parameters
#### Request Body
- **fragment** (string) - Required - The HTML fragment string to parse.

### Request Example
"<h1>Hello, world!</h1>"

### Response
#### Success Response (200)
- **fragment** (Html) - A parsed fragment object representing the HTML tree.
```

--------------------------------

### Parse HTML Document in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Use `Html::parse_document` to parse a full HTML document string into a traversable structure. Ensure the input is a valid HTML document.

```rust
use scraper::Html;

let html = r#"
    <!DOCTYPE html>
    <meta charset="utf-8">
    <title>Hello, world!</title>
    <h1 class="foo">Hello, <i>world!</i></h1>
"#;
let document = Html::parse_document(html);
```

--------------------------------

### Iterate Elements with child_elements and descendent_elements

Source: https://context7.com/rust-scraper/scraper/llms.txt

Use `child_elements()` to iterate over direct child elements, skipping text nodes. Use `descendent_elements()` to iterate over all descendant elements in the DOM tree. These methods avoid the need for selectors when traversing the element structure.

```rust
use scraper::Html;

let html = r#"<nav><a>Home</a><a>About</a><div><a>More</a></div></nav>"#;
let document = Html::parse_fragment(html);
let root = document.root_element();

// Get direct child elements (skips text nodes)
println!("Child elements:");
for child in root.child_elements() {
    println!(" - {}", child.value().name());
}
// Output:
// Child elements:
// - nav

// Get all descendant elements
println!("Descendant elements:");
for desc in root.descendent_elements() {
    println!(" - {}", desc.value().name());
}
// Output:
// Descendant elements:
// - html
// - nav
// - a
// - a
// - div
// - a
```

--------------------------------

### Parse HTML Fragment in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Use `Html::parse_fragment` for parsing HTML snippets that may not form a complete document. This is useful for handling partial HTML content.

```rust
use scraper::Html;

let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");
```

--------------------------------

### Parse CSS Selector

Source: https://context7.com/rust-scraper/scraper/llms.txt

Converts CSS selector strings into reusable Selector objects. Supports standard syntax, combinators, and specific pseudo-classes.

```rust
use scraper::Selector;

// Simple element selector
let selector = Selector::parse("div").unwrap();

// Class selector
let selector = Selector::parse(".container").unwrap();

// ID selector
let selector = Selector::parse("#main").unwrap();

// Combined selectors
let selector = Selector::parse("div.container#main").unwrap();

// Attribute selectors
let selector = Selector::parse(r#"input[type="text"]"#).unwrap();
let selector = Selector::parse("a[href^='https://']").unwrap();

// Descendant and child combinators
let selector = Selector::parse("ul li").unwrap(); // descendant
let selector = Selector::parse("ul > li").unwrap(); // direct child

// Pseudo-classes (supported: :has, :is, :where)
let selector = Selector::parse(":has(a)").unwrap();
let selector = Selector::parse(":is(h1, h2, h3)").unwrap();
let selector = Selector::parse(":where(article, section) p").unwrap();

// Handle invalid selectors
match Selector::parse("<invalid>") {
    Ok(_) => println!("Valid selector"),
    Err(e) => println!("Invalid selector: {}", e),
    // Output: Invalid selector: Token "<" was not expected
}
```

--------------------------------

### ElementRef::select - Select Descendant Elements

Source: https://context7.com/rust-scraper/scraper/llms.txt

Selects elements within a specific element's subtree, supporting the :scope pseudo-class.

```APIDOC
## ElementRef::select

### Description
Selects elements within a specific element's subtree. Supports the :scope pseudo-class to reference the scoping element.

### Parameters
#### Query Parameters
- **selector** (Selector) - Required - The CSS selector used to match descendant elements.
```

--------------------------------

### DOM Manipulation with HtmlTreeSink

Source: https://context7.com/rust-scraper/scraper/llms.txt

The `HtmlTreeSink` wrapper allows DOM manipulation using the `TreeSink` trait. This enables operations like removing, reparenting, and modifying nodes. After manipulation, `finish()` returns the modified `Html` document.

```rust
use html5ever::tree_builder::TreeSink;
use scraper::{Html, Selector, HtmlTreeSink};

let html = "<div><p>Keep me</p><p class='remove'>Remove me</p><p>Also keep</p></div>";
let selector = Selector::parse(".remove").unwrap();

// Parse and collect node IDs to remove
let document = Html::parse_document(html);
let node_ids: Vec<_> = document.select(&selector).map(|x| x.id()).collect();

// Wrap in TreeSink for manipulation
let tree = HtmlTreeSink::new(document);

// Remove nodes
for id in node_ids {
    tree.remove_from_parent(&id);
}

// Finish manipulation and get back the Html
let document = tree.finish();
println!("{}", document.html());
// Output: <html><head></head><body><div><p>Keep me</p><p>Also keep</p></div></body></html>
```

--------------------------------

### Select Descendent Elements in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Query for elements that are descendants of another selected element. This allows for more specific selections within a particular part of the DOM.

```rust
use scraper::{Html, Selector};

let html = r#"
    <ul>
        <li>Foo</li>
        <li>Bar</li>
        <li>Baz</li>
    </ul>
"#;
let fragment = Html::parse_fragment(html);
let ul_selector = Selector::parse("ul").unwrap();
let li_selector = Selector::parse("li").unwrap();

let ul = fragment.select(&ul_selector).next().unwrap();
for element in ul.select(&li_selector) {
    assert_eq!("li", element.value().name());
}
```

--------------------------------

### Select Elements with CSS Selector in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Iterate over elements matching a CSS selector within an HTML fragment. The `select` method returns an iterator of matching elements.

```rust
use scraper::{Html, Selector};

let html = r#"
    <ul>
        <li>Foo</li>
        <li>Bar</li>
        <li>Baz</li>
    </ul>
"#;
let fragment = Html::parse_fragment(html);
let selector = Selector::parse("li").unwrap();

for element in fragment.select(&selector) {
    assert_eq!("li", element.value().name());
}
```

--------------------------------

### Check if Element Matches Selector in Rust

Source: https://context7.com/rust-scraper/scraper/llms.txt

Use the `matches` method on a `Selector` to check if an `ElementRef` satisfies the selector's criteria without performing a full document search. This is useful for conditional logic on individual elements.

```rust
use scraper::{Html, Selector};

let html = r#"
    <div class="card featured"><h2>Featured Item</h2></div>
    <div class="card"><h2>Regular Item</h2></div>
"#;
let document = Html::parse_fragment(html);
let card_sel = Selector::parse(".card").unwrap();
let featured_sel = Selector::parse(".featured").unwrap();
// Parse the title selector once, outside the loop
let title_sel = Selector::parse("h2").unwrap();

// Check each card
for card in document.select(&card_sel) {
    let is_featured = featured_sel.matches(&card);
    let title = card.select(&title_sel).next()
        .map(|h| h.text().collect::<String>())
        .unwrap_or_default();
    println!("{}: featured={}", title, is_featured);
}
```

--------------------------------

### Access Element Attributes in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Retrieve the value of an attribute from a selected HTML element. The `attr` method returns an `Option<&str>`.

```rust
use scraper::{Html, Selector};

let fragment = Html::parse_fragment(r#"<input name="foo" value="bar">"#);
let selector = Selector::parse(r#"input[name="foo"]"#).unwrap();

let input = fragment.select(&selector).next().unwrap();
assert_eq!(Some("bar"), input.value().attr("value"));
```

--------------------------------

### ElementRef::value - Access Element Data

Source: https://context7.com/rust-scraper/scraper/llms.txt

Returns a reference to the underlying Element struct, providing access to the element's name, ID, classes, and attributes.

```APIDOC
## ElementRef::value

### Description
Returns a reference to the underlying `Element` struct, providing access to the element's name, ID, classes, and attributes.
```

--------------------------------

### ElementRef::select - Select Descendant Elements

Source: https://context7.com/rust-scraper/scraper/llms.txt

Selects elements within a specific element's subtree. Supports the :scope pseudo-class to reference the scoping element. Use this for targeted selections within a known parent element.

```rust
use scraper::{Html, Selector};

let html = r#"
    <div id="container">
        <div class="box">
            <span>Nested 1</span>
            <p>Paragraph in box</p>
        </div>
        <span>Direct child</span>
    </div>
"#;
let document = Html::parse_fragment(html);

// First, select the container
let container_sel = Selector::parse("#container").unwrap();
let container = document.select(&container_sel).next().unwrap();

// Then select descendants within it
let span_sel = Selector::parse("span").unwrap();
for span in container.select(&span_sel) {
    println!("Span text: {}", span.text().collect::<String>());
}
// Output:
// Span text: Nested 1
// Span text: Direct child

// Use :scope to select direct children only
let direct_span_sel = Selector::parse(":scope > span").unwrap();
for span in container.select(&direct_span_sel) {
    println!("Direct child span: {}", span.text().collect::<String>());
}
// Output: Direct child span: Direct child
```

--------------------------------

### ElementRef::value - Access Element Data

Source: https://context7.com/rust-scraper/scraper/llms.txt

Returns a reference to the underlying `Element` struct, providing access to the element's name, ID, classes, and attributes. Use this to inspect the properties of a selected element.

```rust
use scraper::{Html, Selector, CaseSensitivity};

let html = r#"<div id="main" class="container active" data-count="42">Content</div>"#;
let document = Html::parse_fragment(html);
let selector = Selector::parse("div").unwrap();
let element = document.select(&selector).next().unwrap();

// Get element value (the Element struct)
let el = element.value();

// Element name
println!("Tag name: {}", el.name());
// Output: Tag name: div

// Element ID
println!("ID: {:?}", el.id());
// Output: ID: Some("main")

// Check for class
println!("Has 'container': {}", el.has_class("container", CaseSensitivity::CaseSensitive));
// Output: Has 'container': true

// Iterate over classes
let classes: Vec<_> = el.classes().collect();
println!("Classes: {:?}", classes);
// Output: Classes: ["active", "container"]

// Get attribute value
println!("data-count: {:?}", el.attr("data-count"));
// Output: data-count: Some("42")

// Iterate over all attributes
for (name, value) in el.attrs() {
    println!(" {}: {}", name, value);
}
// Output:
// class: container active
// data-count: 42
// id: main
```

--------------------------------

### ElementRef::text

Source: https://context7.com/rust-scraper/scraper/llms.txt

Extracts all text nodes within an element and its descendants.

```APIDOC
## ElementRef::text

### Description
Returns an iterator over all text nodes within an element and its descendants. Text from nested elements is included.

### Method
Method call on ElementRef

### Response
- **Iterator** - An iterator yielding text segments found within the element.
```

--------------------------------

### Manipulate DOM by Removing Elements in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Modify the HTML document by removing elements that match a specific selector. This involves parsing the HTML, selecting nodes, and then using `HtmlTreeSink` to perform the removal.

```rust
use html5ever::tree_builder::TreeSink;
use scraper::{Html, Selector, HtmlTreeSink};

let html = "<html><body>hello<p class=\"hello\">REMOVE ME</p></body></html>";
let selector = Selector::parse(".hello").unwrap();
let document = Html::parse_document(html);

// Collect the IDs first so the document is no longer borrowed
let node_ids: Vec<_> = document.select(&selector).map(|x| x.id()).collect();

let tree = HtmlTreeSink::new(document);
for id in node_ids {
    tree.remove_from_parent(&id);
}
let document = tree.finish();

assert_eq!(document.html(), "<html><head></head><body>hello</body></html>");
```

--------------------------------

### Generic Element Selection with Selectable Trait in Rust

Source: https://context7.com/rust-scraper/scraper/llms.txt

The `Selectable` trait enables generic functions to work with both `Html` documents and `ElementRef` instances, allowing for reusable selection logic across different scopes within the HTML structure.

```rust
use scraper::{Html, Selector};
use scraper::selectable::Selectable;

// Generic function that works with Html or ElementRef
fn extract_links<'a, S>(selectable: S) -> Vec<String>
where
    S: Selectable<'a>,
{
    let selector = Selector::parse("a[href]").unwrap();
    selectable
        .select(&selector)
        .filter_map(|el| el.attr("href").map(String::from))
        .collect()
}

let html = r#"
    <nav>
        <a href="/home">Home</a>
        <a href="/about">About</a>
    </nav>
    <main>
        <a href="/article">Article</a>
    </main>
"#;
let document = Html::parse_fragment(html);

// Use with full document
let all_links = extract_links(&document);
println!("All links: {:?}", all_links);

// Use with specific element
let nav_sel = Selector::parse("nav").unwrap();
let nav = document.select(&nav_sel).next().unwrap();
let nav_links = extract_links(nav);
println!("Nav links: {:?}", nav_links);
```

--------------------------------

### Access Descendent Text in Rust

Source: https://github.com/rust-scraper/scraper/blob/master/scraper/README.md

Extract all text nodes within a selected element and its descendants. The `.text()` method returns an iterator over string slices, which can be collected.

```rust
use scraper::{Html, Selector};

let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");
let selector = Selector::parse("h1").unwrap();

let h1 = fragment.select(&selector).next().unwrap();
let text = h1.text().collect::<Vec<_>>();

assert_eq!(vec!["Hello, ", "world!"], text);
```
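
Because `.text()` yields one string slice per text node, callers often want to trim and join the segments before using them. The following is a minimal sketch of that step using only the standard library. `join_text` is a hypothetical helper name and not part of scraper, and the segments in `main` are a hard-coded stand-in for what `h1.text()` yields in the example above.

```rust
/// Joins text segments into one string: trims each segment, drops
/// segments that are empty after trimming, and separates the rest
/// with a single space.
/// NOTE: `join_text` is a hypothetical helper, not a scraper API.
fn join_text<'a, I>(segments: I) -> String
where
    I: IntoIterator<Item = &'a str>,
{
    segments
        .into_iter()
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    // Assumed stand-in for the segments `h1.text()` yields for
    // `<h1>Hello, <i>world!</i></h1>`.
    let segments = vec!["Hello, ", "world!"];
    println!("{}", join_text(segments));
    // Output: Hello, world!
}
```

Since the iterator returned by `.text()` yields `&str` items, it should be possible to pass it directly, as in `join_text(h1.text())`. Joining with a space also puts a space before punctuation that follows an inline element, so adjust the separator if that matters for your output.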