### Install htmlquery Package

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Use go get to install the htmlquery package.

```sh
go get github.com/antchfx/htmlquery
```

--------------------------------

### Quick Start Example: Bing Search Results

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

A complete example that loads a URL, queries for news items, and extracts titles and links from the search results.

```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadURL("https://www.bing.com/search?q=golang")
	if err != nil {
		panic(err)
	}
	// Find all news items.
	list, err := htmlquery.QueryAll(doc, "//ol/li")
	if err != nil {
		panic(err)
	}
	for i, n := range list {
		a := htmlquery.FindOne(n, "//a")
		if a != nil {
			fmt.Printf("%d %s(%s)\n", i, htmlquery.InnerText(a), htmlquery.SelectAttr(a, "href"))
		}
	}
}
```

--------------------------------

### Optimized Node Searching with Pre-compiled XPath (Go)

Source: https://context7.com/antchfx/htmlquery/llms.txt

Use `QuerySelectorAll` with a pre-compiled `*xpath.Expr` for maximum performance in tight loops. This bypasses string parsing and the LRU cache, making it ideal for repeated evaluations of the same XPath expression. The example compiles an expression once and reuses it across multiple HTML documents.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"github.com/antchfx/xpath"
)

func main() {
	// Compile once, reuse many times
	expr, err := xpath.Compile("//a[@href]")
	if err != nil {
		panic(err)
	}

	// The href values below are illustrative placeholders.
	pages := []string{
		`<a href="/one">One</a>`,
		`<a href="/two">Two</a><a href="/three">Three</a>`,
	}
	for _, page := range pages {
		doc, _ := htmlquery.Parse(strings.NewReader(page))
		nodes := htmlquery.QuerySelectorAll(doc, expr)
		for _, n := range nodes {
			fmt.Printf("%s -> %s\n", htmlquery.InnerText(n), htmlquery.SelectAttr(n, "href"))
		}
	}
}
```

--------------------------------

### Find Nodes in HTML Document (Go)

Source: https://context7.com/antchfx/htmlquery/llms.txt

Use `Find` for panic-on-error node searching with XPath. It is suitable for hardcoded expressions where invalid syntax indicates a programming error. This example finds all table rows and then the cells within each row.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `
<table>
  <tr><td>A</td><td>B</td></tr>
  <tr><td>C</td><td>D</td></tr>
</table>
`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	rows := htmlquery.Find(doc, "//tr")
	for _, row := range rows {
		cols := htmlquery.Find(row, "td")
		for j, col := range cols {
			fmt.Printf("cell[%d]: %s\n", j, htmlquery.InnerText(col))
		}
	}
}
```

--------------------------------

### Selecting HTML Element Attributes (Go)

Source: https://context7.com/antchfx/htmlquery/llms.txt

Use `SelectAttr` to get the value of a specified attribute from an HTML node. It returns an empty string if the attribute is not present. This function also works with attribute nodes obtained directly from XPath queries.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<img src="photo.jpg" alt="A mountain">`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	img := htmlquery.FindOne(doc, "//img")
	fmt.Println(htmlquery.SelectAttr(img, "src"))     // Output: photo.jpg
	fmt.Println(htmlquery.SelectAttr(img, "alt"))     // Output: A mountain
	fmt.Println(htmlquery.SelectAttr(img, "missing")) // Output: (empty string)

	// Attribute nodes via XPath
	srcNode := htmlquery.FindOne(doc, "//img/@src")
	fmt.Println(htmlquery.InnerText(srcNode)) // Output: photo.jpg
}
```

--------------------------------

### Extracting Inner Text from HTML Nodes (Go)

Source: https://context7.com/antchfx/htmlquery/llms.txt

The `InnerText` function retrieves the combined text content of a node and its descendants, stripping all tags and ignoring comments. This is useful for getting clean text from HTML elements.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `

<div><p>Hello, <b>World</b>!</p><!-- a comment --></div>

`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	div := htmlquery.FindOne(doc, "//div")
	fmt.Println(htmlquery.InnerText(div)) // Output: Hello, World!
}
```

--------------------------------

### Load and Query HTML from a Local File

Source: https://context7.com/antchfx/htmlquery/llms.txt

Opens an HTML file from the local filesystem and parses it. Finds and prints the text content of all paragraph elements. Requires importing the htmlquery package.

```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadDoc("/var/www/html/index.html")
	if err != nil {
		fmt.Printf("failed to load file: %v\n", err)
		return
	}

	// Find all paragraphs
	for i, p := range htmlquery.Find(doc, "//p") {
		fmt.Printf("p[%d]: %s\n", i+1, htmlquery.InnerText(p))
	}
}
```

--------------------------------

### Load HTML Document from File Path

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Load an HTML document from a local file path.

```go
filePath := "/home/user/sample.html"
doc, err := htmlquery.LoadDoc(filePath)
```

--------------------------------

### Load HTML Document from URL

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Load an HTML document directly from a given URL.

```go
doc, err := htmlquery.LoadURL("http://example.com/")
```

--------------------------------

### Load HTML from File

Source: https://context7.com/antchfx/htmlquery/llms.txt

Opens an HTML file from the local filesystem and parses it into a node tree.

```APIDOC
## LoadDoc(path string) (*html.Node, error)

### Description
Opens an HTML file from the local filesystem and parses it into a node tree.

### Method
```go
LoadDoc(path string) (*html.Node, error)
```

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters
- **path** (string) - The path to the HTML file.

### Request Example
```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadDoc("/var/www/html/index.html")
	if err != nil {
		fmt.Printf("failed to load file: %v\n", err)
		return
	}

	// Find all paragraphs
	for i, p := range htmlquery.Find(doc, "//p") {
		fmt.Printf("p[%d]: %s\n", i+1, htmlquery.InnerText(p))
	}
}
```

### Response
#### Success Response (200)
- **doc** (*html.Node) - The root node of the parsed HTML document.
- **err** (error) - An error if opening or parsing the file fails.

#### Response Example
None provided.
```

--------------------------------

### Load and Query HTML from a URL

Source: https://context7.com/antchfx/htmlquery/llms.txt

Fetches an HTML page from a URL, handling compression and charsets. Extracts the page title and lists all links with their href attributes. Requires importing the htmlquery package.

```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadURL("https://example.com/")
	if err != nil {
		fmt.Printf("failed to load URL: %v\n", err)
		return
	}

	// Extract the page title
	title := htmlquery.FindOne(doc, "//title")
	if title != nil {
		fmt.Println("Page title:", htmlquery.InnerText(title))
	}

	// List all links
	for _, a := range htmlquery.Find(doc, "//a[@href]") {
		fmt.Printf("Link: %s -> %s\n",
			htmlquery.InnerText(a),
			htmlquery.SelectAttr(a, "href"),
		)
	}
}
```

--------------------------------

### CreateXPathNavigator

Source: https://context7.com/antchfx/htmlquery/llms.txt

Creates an `xpath.NodeNavigator` from an `*html.Node` tree, enabling advanced XPath evaluations.

```APIDOC
## CreateXPathNavigator(top *html.Node) *NodeNavigator

### Description
Creates an `xpath.NodeNavigator` backed by an `*html.Node` tree, enabling direct use of the `github.com/antchfx/xpath` evaluation API for advanced use cases such as XPath `Evaluate()` (returning numbers, booleans, or strings rather than node-sets).

### Parameters
- **top** (*html.Node) - The root node of the HTML document to create a navigator for.

### Return Value
*NodeNavigator - An XPath navigator for the given HTML node tree.
```

--------------------------------

### Create XPath Navigator for Advanced Evaluation

Source: https://context7.com/antchfx/htmlquery/llms.txt

Creates an XPath navigator from an HTML node tree for advanced XPath evaluations, such as returning numbers, booleans, or strings from an expression's `Evaluate()` method.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"github.com/antchfx/xpath"
)

func main() {
	html := `<ul><li>A</li><li>B</li><li>C</li></ul>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	// Count nodes using XPath Evaluate (returns float64 for count())
	expr, _ := xpath.Compile("count(//li)")
	nav := htmlquery.CreateXPathNavigator(doc)
	count := expr.Evaluate(nav).(float64)
	fmt.Printf("Number of <li> elements: %.0f\n", count) // Output: Number of <li> elements: 3

	// Check a boolean expression
	exprBool, _ := xpath.Compile("boolean(//li[text()='B'])")
	result := exprBool.Evaluate(htmlquery.CreateXPathNavigator(doc)).(bool)
	fmt.Println("Contains 'B':", result) // Output: Contains 'B': true
}
```

--------------------------------

### Load HTML from URL

Source: https://context7.com/antchfx/htmlquery/llms.txt

Fetches an HTML page over HTTP, automatically handling gzip and deflate Content-Encoding, charset detection, and response body cleanup.

```APIDOC
## LoadURL(url string) (*html.Node, error)

### Description
Fetches an HTML page over HTTP, automatically handling gzip and deflate Content-Encoding, charset detection, and response body cleanup.

### Method
```go
LoadURL(url string) (*html.Node, error)
```

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters
- **url** (string) - The URL of the HTML page to fetch.

### Request Example
```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadURL("https://example.com/")
	if err != nil {
		fmt.Printf("failed to load URL: %v\n", err)
		return
	}

	// Extract the page title
	title := htmlquery.FindOne(doc, "//title")
	if title != nil {
		fmt.Println("Page title:", htmlquery.InnerText(title))
	}

	// List all links
	for _, a := range htmlquery.Find(doc, "//a[@href]") {
		fmt.Printf("Link: %s -> %s\n",
			htmlquery.InnerText(a),
			htmlquery.SelectAttr(a, "href"),
		)
	}
}
```

### Response
#### Success Response (200)
- **doc** (*html.Node) - The root node of the parsed HTML document.
- **err** (error) - An error if fetching or parsing fails.

#### Response Example
None provided.
```

--------------------------------

### Query All Elements with XPath

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Execute an XPath query to find all matching nodes in an HTML document. `QueryAll` returns an error if the XPath expression is invalid; the example below panics on that error.

```go
nodes, err := htmlquery.QueryAll(doc, "//a")
if err != nil {
	panic(`not a valid XPath expression.`)
}
```

--------------------------------

### Load HTML Document from String Reader

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Parse an HTML document from an io.Reader, such as a string reader.

```go
s := `....`
doc, err := htmlquery.Parse(strings.NewReader(s))
```

--------------------------------

### Find Single Node and Attributes (Go)

Source: https://context7.com/antchfx/htmlquery/llms.txt

Use `FindOne` to retrieve the first matching node, or `nil` if nothing matches. This function is useful for accessing specific elements such as `<html>` or `<title>` and for retrieving attribute values. The example also shows the result when the target node is missing.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<html lang="en"><head><title>Test</title></head><body></body></html>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	htmlNode := htmlquery.FindOne(doc, "//html")
	fmt.Println(htmlquery.SelectAttr(htmlNode, "lang")) // Output: en

	title := htmlquery.FindOne(doc, "//title")
	fmt.Println(htmlquery.InnerText(title)) // Output: Test

	missing := htmlquery.FindOne(doc, "//article")
	fmt.Println(missing) // Output: <nil>
}
```

--------------------------------

### Find All Anchor Elements

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Find all anchor (`<a>`) elements within the loaded HTML document.

```go
list := htmlquery.Find(doc, "//a")
```

--------------------------------

### Configure HTML Query Cache

Source: https://context7.com/antchfx/htmlquery/llms.txt

Control the built-in LRU XPath expression cache using `SelectorCacheMaxEntries` and `DisableSelectorCache`. Caching is enabled by default with a capacity of 50 entries and is safe for concurrent use.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `

<p>Hello</p>

`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	// Increase cache capacity for workloads with many distinct XPath expressions
	htmlquery.SelectorCacheMaxEntries = 200

	node, _ := htmlquery.Query(doc, "//p")
	fmt.Println(htmlquery.InnerText(node)) // Output: Hello

	// Disable the cache (e.g., for testing or one-shot scripts)
	htmlquery.DisableSelectorCache = true
	node2, _ := htmlquery.Query(doc, "//p")
	fmt.Println(htmlquery.InnerText(node2)) // Output: Hello

	// Re-enable
	htmlquery.DisableSelectorCache = false
}
```

--------------------------------

### Parse HTML from a Reader

Source: https://context7.com/antchfx/htmlquery/llms.txt

Parses HTML from any io.Reader. Useful when HTML is already in memory or arrives from a stream. Requires importing the htmlquery package.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	s := `

<html><body><h1>Hello, World!</h1></body></html>

`
	doc, err := htmlquery.Parse(strings.NewReader(s))
	if err != nil {
		panic(err)
	}

	node := htmlquery.FindOne(doc, "//h1")
	fmt.Println(htmlquery.InnerText(node)) // Output: Hello, World!
}
```

--------------------------------

### Evaluate Count of Image Elements

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Compile an XPath expression to count all image elements and evaluate the result.

```go
expr, _ := xpath.Compile("count(//img)")
v := expr.Evaluate(htmlquery.CreateXPathNavigator(doc)).(float64)
fmt.Printf("total count is %f", v)
```

--------------------------------

### Serialize HTML Node to String

Source: https://context7.com/antchfx/htmlquery/llms.txt

Serializes an HTML node to a string. Use `self=true` to include the node's tag, or `self=false` for only its inner HTML.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `

<div><h1>Title</h1><p>Body text.</p></div>
`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	div := htmlquery.FindOne(doc, "//div")

	fmt.Println(htmlquery.OutputHTML(div, true))
	// Output: <div><h1>Title</h1><p>Body text.</p></div>

	fmt.Println(htmlquery.OutputHTML(div, false))
	// Output: <h1>Title</h1><p>Body text.</p>

}
```

--------------------------------

### QuerySelector and QuerySelectorAll

Source: https://context7.com/antchfx/htmlquery/llms.txt

These functions accept a pre-compiled `*xpath.Expr` to bypass string parsing and the LRU cache, offering optimal performance for repeated evaluations of the same expression.

```APIDOC
## QuerySelector and QuerySelectorAll

### Description
Accept a pre-compiled `*xpath.Expr` to bypass both string parsing and the LRU cache entirely. Best for tight loops where the same expression is evaluated many times.

### Method Signatures
`QuerySelector(top *html.Node, selector *xpath.Expr) *html.Node`
`QuerySelectorAll(top *html.Node, selector *xpath.Expr) []*html.Node`

### Parameters
- **top** (*html.Node) - The root node to search within.
- **selector** (*xpath.Expr) - A pre-compiled XPath expression.

### Returns
- `QuerySelector`: The first matching HTML node, or nil if no match is found.
- `QuerySelectorAll`: A slice of matching HTML nodes.

### Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"github.com/antchfx/xpath"
)

func main() {
	// Compile once, reuse many times
	expr, err := xpath.Compile("//a[@href]")
	if err != nil {
		panic(err)
	}

	// The href values below are illustrative placeholders.
	pages := []string{
		`<a href="/one">One</a>`,
		`<a href="/two">Two</a><a href="/three">Three</a>`,
	}
	for _, page := range pages {
		doc, _ := htmlquery.Parse(strings.NewReader(page))
		nodes := htmlquery.QuerySelectorAll(doc, expr)
		for _, n := range nodes {
			fmt.Printf("%s -> %s\n", htmlquery.InnerText(n), htmlquery.SelectAttr(n, "href"))
		}
	}
}
```
```

--------------------------------

### Query All Nodes Matching XPath

Source: https://context7.com/antchfx/htmlquery/llms.txt

Returns all nodes that match the given XPath expression. Handles invalid XPath expressions gracefully by returning an error. Requires importing the htmlquery package.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<ul><li>One</li><li>Two</li><li>Three</li></ul>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	nodes, err := htmlquery.QueryAll(doc, "//li")
	if err != nil {
		fmt.Printf("invalid XPath: %v\n", err)
		return
	}
	for _, n := range nodes {
		fmt.Println(htmlquery.InnerText(n))
	}
	// Output:
	// One
	// Two
	// Three
}
```

--------------------------------

### Query First Node

Source: https://context7.com/antchfx/htmlquery/llms.txt

Returns the first matching node, or nil if none match. Returns an error only if the expression is malformed.

```APIDOC
## Query(top *html.Node, expr string) (*html.Node, error)

### Description
Returns the first matching node, or `nil` if none match. Returns an error only if the expression is malformed.

### Method
```go
Query(top *html.Node, expr string) (*html.Node, error)
```

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters
- **top** (*html.Node) - The root node from which to start the search.
- **expr** (string) - The XPath expression to evaluate.

### Request Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `

<div id="main">
  <p>First</p>
  <p>Second</p>
</div>

`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	node, err := htmlquery.Query(doc, `//div[@id="main"]/p[1]`)
	if err != nil {
		fmt.Printf("XPath error: %v\n", err)
		return
	}
	if node == nil {
		fmt.Println("no match")
		return
	}
	fmt.Println(htmlquery.InnerText(node)) // Output: First
}
```

### Response
#### Success Response (200)
- **node** (*html.Node) - The first `*html.Node` pointer matching the XPath expression, or `nil` if no match is found.
- **err** (error) - An error if the XPath expression is malformed.

#### Response Example
None provided.
```

--------------------------------

### Query First Node Matching XPath

Source: https://context7.com/antchfx/htmlquery/llms.txt

Returns the first node that matches the XPath expression, or nil if no match is found. Returns an error only if the XPath expression is malformed. Requires importing the htmlquery package.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `

<div id="main">
  <p>First</p>
  <p>Second</p>
</div>

`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	node, err := htmlquery.Query(doc, `//div[@id="main"]/p[1]`)
	if err != nil {
		fmt.Printf("XPath error: %v\n", err)
		return
	}
	if node == nil {
		fmt.Println("no match")
		return
	}
	fmt.Println(htmlquery.InnerText(node)) // Output: First
}
```

--------------------------------

### Find Nested Image Source within Anchor

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Find the first anchor element, then find the first image element within it, and extract its 'src' attribute.

```go
a := htmlquery.FindOne(doc, "//a")
img := htmlquery.FindOne(a, "//img")
fmt.Println(htmlquery.SelectAttr(img, "src")) // output @src value
```

--------------------------------

### Find Anchor Elements with Href Attribute

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Find all anchor (`<a>`) elements that have an 'href' attribute.

```go
list := htmlquery.Find(doc, "//a[@href]")
```

--------------------------------

### Parse HTML from io.Reader

Source: https://context7.com/antchfx/htmlquery/llms.txt

Parses an HTML document from any io.Reader and returns the root *html.Node. This is useful when HTML is already in memory or arrives from any stream.

```APIDOC
## Parse(r io.Reader) (*html.Node, error)

### Description
Parses an HTML document from any `io.Reader` and returns the root `*html.Node`. This is the lowest-level loader, useful when HTML is already in memory or arrives from any stream.

### Method
```go
Parse(r io.Reader) (*html.Node, error)
```

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	s := `

<html><body><h1>Hello, World!</h1></body></html>

`
	doc, err := htmlquery.Parse(strings.NewReader(s))
	if err != nil {
		panic(err)
	}

	node := htmlquery.FindOne(doc, "//h1")
	fmt.Println(htmlquery.InnerText(node)) // Output: Hello, World!
}
```

### Response
#### Success Response (200)
- **doc** (*html.Node) - The root node of the parsed HTML document.
- **err** (error) - An error if parsing fails.

#### Response Example
None provided.
```

--------------------------------

### OutputHTML

Source: https://context7.com/antchfx/htmlquery/llms.txt

Serializes an HTML node back to an HTML string. The `self` parameter determines whether the node's own tag is included.

```APIDOC
## OutputHTML(n *html.Node, self bool) string

### Description
Serializes a node back to an HTML string. When `self` is `true` the node's own tag is included; when `false` only the inner HTML (children) is returned.

### Parameters
- **n** (*html.Node) - The HTML node to serialize.
- **self** (bool) - If true, include the node's own tag; otherwise, only include its children.

### Return Value
string - The serialized HTML string.
```

--------------------------------

### Checking for Attribute Existence (Go)

Source: https://context7.com/antchfx/htmlquery/llms.txt

The `ExistsAttr` function returns `true` if an attribute is present on a node, irrespective of its value. This is useful for checking the existence of boolean attributes or attributes that might be empty.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<input type="checkbox" checked>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	input := htmlquery.FindOne(doc, "//input")
	fmt.Println(htmlquery.ExistsAttr(input, "checked"))  // Output: true
	fmt.Println(htmlquery.ExistsAttr(input, "disabled")) // Output: false
}
```

--------------------------------

### FindOne

Source: https://context7.com/antchfx/htmlquery/llms.txt

Finds the first node matching the given XPath expression within the specified HTML node. Returns nil if no match is found. This is a panic-on-error variant.

```APIDOC
## FindOne

### Description
Finds the first node matching the given XPath expression within the specified HTML node. Returns nil if no match is found. This is a panic-on-error variant.

### Method Signature
`FindOne(top *html.Node, expr string) *html.Node`

### Parameters
- **top** (*html.Node) - The root node to search within.
- **expr** (string) - The XPath expression to evaluate.

### Returns
- `*html.Node` - The first matching HTML node, or nil if no match is found.

### Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<html lang="en"><head><title>Test</title></head><body></body></html>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	htmlNode := htmlquery.FindOne(doc, "//html")
	fmt.Println(htmlquery.SelectAttr(htmlNode, "lang")) // Output: en

	title := htmlquery.FindOne(doc, "//title")
	fmt.Println(htmlquery.InnerText(title)) // Output: Test

	missing := htmlquery.FindOne(doc, "//article")
	fmt.Println(missing) // Output: <nil>
}
```
```

--------------------------------

### Cache Configuration

Source: https://context7.com/antchfx/htmlquery/llms.txt

Controls the built-in LRU XPath expression cache. Caching is enabled by default.

```APIDOC
## Cache Configuration

### `DisableSelectorCache` / `SelectorCacheMaxEntries`
Package-level variables that control the built-in LRU XPath expression cache. Caching is enabled by default with a capacity of 50 entries and is safe for concurrent use.

### Parameters
- **DisableSelectorCache** (bool) - Set to `true` to disable the cache. Defaults to `false`.
- **SelectorCacheMaxEntries** (int) - Sets the maximum number of entries in the LRU cache. Defaults to 50.
```

--------------------------------

### QueryAll Nodes

Source: https://context7.com/antchfx/htmlquery/llms.txt

Returns all nodes matching the XPath expression, or an error if the expression is invalid. Prefer this over Find when invalid expressions should be handled gracefully.

```APIDOC
## QueryAll(top *html.Node, expr string) ([]*html.Node, error)

### Description
Returns all nodes matching the XPath expression, or an error if the expression is invalid. Prefer this over `Find` when invalid expressions should be handled gracefully.

### Method
```go
QueryAll(top *html.Node, expr string) ([]*html.Node, error)
```

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters
- **top** (*html.Node) - The root node from which to start the search.
- **expr** (string) - The XPath expression to evaluate.

### Request Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `
<ul>
  <li>One</li>
  <li>Two</li>
  <li>Three</li>
</ul>
`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	nodes, err := htmlquery.QueryAll(doc, "//li")
	if err != nil {
		fmt.Printf("invalid XPath: %v\n", err)
		return
	}
	for _, n := range nodes {
		fmt.Println(htmlquery.InnerText(n))
	}
	// Output:
	// One
	// Two
	// Three
}
```

### Response
#### Success Response (200)
- **nodes** ([]*html.Node) - A slice of `*html.Node` pointers matching the XPath expression.
- **err** (error) - An error if the XPath expression is invalid.

#### Response Example
None provided.
```

--------------------------------

### Find the Third Anchor Element

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Locate and return the third anchor (`<a>`) element in the document using XPath indexing.

```go
a := htmlquery.FindOne(doc, "//a[3]")
```

--------------------------------

### Extract Href Attribute Values from Anchor Elements

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Find all anchor (`<a>`) elements with an 'href' attribute and extract only the 'href' attribute values.

```go
list := htmlquery.Find(doc, "//a/@href")
for _, n := range list {
	fmt.Println(htmlquery.InnerText(n)) // output @href value
}
```

--------------------------------

### ExistsAttr

Source: https://context7.com/antchfx/htmlquery/llms.txt

Returns true if the named attribute is present on the node, regardless of its value.

```APIDOC
## ExistsAttr

### Description
Returns `true` if the named attribute is present on the node, regardless of its value.

### Method Signature
`ExistsAttr(n *html.Node, name string) bool`

### Parameters
- **n** (*html.Node) - The HTML node.
- **name** (string) - The name of the attribute to check for.

### Returns
- `bool` - `true` if the attribute exists, `false` otherwise.

### Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<input type="checkbox" checked>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	input := htmlquery.FindOne(doc, "//input")
	fmt.Println(htmlquery.ExistsAttr(input, "checked"))  // Output: true
	fmt.Println(htmlquery.ExistsAttr(input, "disabled")) // Output: false
}
```
```

--------------------------------

### Find

Source: https://context7.com/antchfx/htmlquery/llms.txt

Finds all nodes matching the given XPath expression within the specified HTML node. This is a panic-on-error variant, suitable for hardcoded expressions where errors indicate a programming mistake.

```APIDOC
## Find

### Description
Finds all nodes matching the given XPath expression within the specified HTML node. This is a panic-on-error variant, suitable for hardcoded expressions where errors indicate a programming mistake.

### Method Signature
`Find(top *html.Node, expr string) []*html.Node`

### Parameters
- **top** (*html.Node) - The root node to search within.
- **expr** (string) - The XPath expression to evaluate.

### Returns
- `[]*html.Node` - A slice of matching HTML nodes.

### Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `
<table>
  <tr><td>A</td><td>B</td></tr>
  <tr><td>C</td><td>D</td></tr>
</table>
`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	rows := htmlquery.Find(doc, "//tr")
	for _, row := range rows {
		cols := htmlquery.Find(row, "td")
		for j, col := range cols {
			fmt.Printf("cell[%d]: %s\n", j, htmlquery.InnerText(col))
		}
	}
}
```
```

--------------------------------

### Disable Query Caching

Source: https://github.com/antchfx/htmlquery/blob/master/README.md

Globally disable the query selector cache by setting the htmlquery.DisableSelectorCache variable to true. This can impact performance.

```go
htmlquery.DisableSelectorCache = true
```

--------------------------------

### SelectAttr

Source: https://context7.com/antchfx/htmlquery/llms.txt

Returns the value of the named attribute on an element node, or an empty string if absent. Also handles attribute nodes returned directly from XPath attribute-axis queries.

```APIDOC
## SelectAttr

### Description
Returns the value of the named attribute on an element node, or an empty string if absent. Also handles attribute nodes returned directly from XPath attribute-axis queries.

### Method Signature
`SelectAttr(n *html.Node, name string) string`

### Parameters
- **n** (*html.Node) - The HTML node.
- **name** (string) - The name of the attribute to retrieve.

### Returns
- `string` - The attribute's value, or an empty string if the attribute is not found.

### Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<img src="photo.jpg" alt="A mountain">`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	img := htmlquery.FindOne(doc, "//img")
	fmt.Println(htmlquery.SelectAttr(img, "src"))     // Output: photo.jpg
	fmt.Println(htmlquery.SelectAttr(img, "alt"))     // Output: A mountain
	fmt.Println(htmlquery.SelectAttr(img, "missing")) // Output: (empty string)

	// Attribute nodes via XPath
	srcNode := htmlquery.FindOne(doc, "//img/@src")
	fmt.Println(htmlquery.InnerText(srcNode)) // Output: photo.jpg
}
```
```

--------------------------------

### InnerText

Source: https://context7.com/antchfx/htmlquery/llms.txt

Returns the concatenated text content of a node and all its descendants, stripping all tags and skipping comment nodes.

```APIDOC
## InnerText

### Description
Returns the concatenated text content of a node and all its descendants, stripping all tags and skipping comment nodes.

### Method Signature
`InnerText(n *html.Node) string`

### Parameters
- **n** (*html.Node) - The HTML node from which to extract text.

### Returns
- `string` - The extracted text content.

### Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `

<div><p>Hello, <b>World</b>!</p><!-- a comment --></div>

`
	doc, _ := htmlquery.Parse(strings.NewReader(html))

	div := htmlquery.FindOne(doc, "//div")
	fmt.Println(htmlquery.InnerText(div)) // Output: Hello, World!
}
```
```