### Install htmlquery Package
Source: https://github.com/antchfx/htmlquery/blob/master/README.md
Use `go get` to install the htmlquery package.
```sh
go get github.com/antchfx/htmlquery
```
--------------------------------
### Quick Start Example: Bing Search Results
Source: https://github.com/antchfx/htmlquery/blob/master/README.md
A complete example that loads a Bing results page, queries the news items (`//ol/li`), and prints the title and link of the first anchor in each one.
```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadURL("https://www.bing.com/search?q=golang")
	if err != nil {
		panic(err)
	}
	// Find all news items.
	list, err := htmlquery.QueryAll(doc, "//ol/li")
	if err != nil {
		panic(err)
	}
	for i, n := range list {
		// Take the first link inside each item, if there is one.
		a := htmlquery.FindOne(n, "//a")
		if a != nil {
			fmt.Printf("%d %s(%s)\n", i, htmlquery.InnerText(a), htmlquery.SelectAttr(a, "href"))
		}
	}
}
```
--------------------------------
### Optimized Node Searching with Pre-compiled XPath (Go)
Source: https://context7.com/antchfx/htmlquery/llms.txt
Utilize `QuerySelectorAll` with a pre-compiled `*xpath.Expr` for maximum performance in tight loops. This bypasses string parsing and the LRU cache, making it ideal for repeated evaluations of the same XPath expression. The example shows compiling an expression once and reusing it across multiple HTML documents.
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"github.com/antchfx/xpath"
)

func main() {
	// Compile once, reuse many times
	expr, err := xpath.Compile("//a[@href]")
	if err != nil {
		panic(err)
	}
	// Sample pages; the href values are illustrative.
	pages := []string{
		`<html><body><a href="/one">One</a></body></html>`,
		`<html><body><a href="/two">Two</a><a href="/three">Three</a></body></html>`,
	}
	for _, page := range pages {
		doc, _ := htmlquery.Parse(strings.NewReader(page))
		nodes := htmlquery.QuerySelectorAll(doc, expr)
		for _, n := range nodes {
			fmt.Printf("%s -> %s\n", htmlquery.InnerText(n), htmlquery.SelectAttr(n, "href"))
		}
	}
}
```
--------------------------------
### Find Nodes in HTML Document (Go)
Source: https://context7.com/antchfx/htmlquery/llms.txt
`Find` returns every node matching an XPath expression and panics if the expression is invalid, so it suits hardcoded expressions where bad syntax is a programming error. This example finds all table rows and then the cells within each row.
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<table>
  <tr><td>A</td><td>B</td></tr>
  <tr><td>C</td><td>D</td></tr>
</table>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	rows := htmlquery.Find(doc, "//tr")
	for _, row := range rows {
		// "td" is evaluated relative to the current row.
		cols := htmlquery.Find(row, "td")
		for j, col := range cols {
			fmt.Printf("cell[%d]: %s\n", j, htmlquery.InnerText(col))
		}
	}
}
```
--------------------------------
### Selecting HTML Element Attributes (Go)
Source: https://context7.com/antchfx/htmlquery/llms.txt
Use `SelectAttr` to get the value of a specified attribute from an HTML node. It returns an empty string if the attribute is not present. This function also works with attribute nodes obtained directly from XPath queries.
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<img src="photo.jpg" alt="A mountain">`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	img := htmlquery.FindOne(doc, "//img")
	fmt.Println(htmlquery.SelectAttr(img, "src"))     // Output: photo.jpg
	fmt.Println(htmlquery.SelectAttr(img, "alt"))     // Output: A mountain
	fmt.Println(htmlquery.SelectAttr(img, "missing")) // Output: (empty string)
	// Attribute nodes via XPath
	srcNode := htmlquery.FindOne(doc, "//img/@src")
	fmt.Println(htmlquery.InnerText(srcNode)) // Output: photo.jpg
}
```
--------------------------------
### Extracting Inner Text from HTML Nodes (Go)
Source: https://context7.com/antchfx/htmlquery/llms.txt
The `InnerText` function retrieves the combined text content of a node and its descendants, stripping all tags and ignoring comments. This is useful for getting clean text from HTML elements.
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	// The inner tags and the comment are illustrative.
	html := `<div><p>Hello, <b>World</b>!</p><!-- comments are skipped --></div>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	div := htmlquery.FindOne(doc, "//div")
	fmt.Println(htmlquery.InnerText(div)) // Output: Hello, World!
}
```
--------------------------------
### Load and Query HTML from a Local File
Source: https://context7.com/antchfx/htmlquery/llms.txt
Opens an HTML file from the local filesystem and parses it. Finds and prints the text content of all paragraph elements. Requires importing the htmlquery package.
```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadDoc("/var/www/html/index.html")
	if err != nil {
		fmt.Printf("failed to load file: %v\n", err)
		return
	}
	// Find all paragraphs
	for i, p := range htmlquery.Find(doc, "//p") {
		fmt.Printf("p[%d]: %s\n", i+1, htmlquery.InnerText(p))
	}
}
```
--------------------------------
### Load HTML Document from File Path
Source: https://github.com/antchfx/htmlquery/blob/master/README.md
Load an HTML document from a local file path.
```go
filePath := "/home/user/sample.html"
doc, err := htmlquery.LoadDoc(filePath)
```
--------------------------------
### Load HTML Document from URL
Source: https://github.com/antchfx/htmlquery/blob/master/README.md
Load an HTML document directly from a given URL.
```go
doc, err := htmlquery.LoadURL("http://example.com/")
```
--------------------------------
### Load HTML from File
Source: https://context7.com/antchfx/htmlquery/llms.txt
Opens an HTML file from the local filesystem and parses it into a node tree.
```APIDOC
## LoadDoc(path string) (*html.Node, error)
### Description
Opens an HTML file from the local filesystem and parses it into a node tree.
### Method
```go
LoadDoc(path string) (*html.Node, error)
```
### Parameters
- **path** (string) - The path to the HTML file.
### Request Example
```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadDoc("/var/www/html/index.html")
	if err != nil {
		fmt.Printf("failed to load file: %v\n", err)
		return
	}
	// Find all paragraphs
	for i, p := range htmlquery.Find(doc, "//p") {
		fmt.Printf("p[%d]: %s\n", i+1, htmlquery.InnerText(p))
	}
}
```
### Returns
- **doc** (*html.Node) - The root node of the parsed HTML document.
- **err** (error) - A non-nil error if opening or parsing the file fails.
```
--------------------------------
### Load and Query HTML from a URL
Source: https://context7.com/antchfx/htmlquery/llms.txt
Fetches an HTML page from a URL, handling compression and charsets. Extracts the page title and lists all links with their href attributes. Requires importing the htmlquery package.
```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadURL("https://example.com/")
	if err != nil {
		fmt.Printf("failed to load URL: %v\n", err)
		return
	}
	// Extract the page title
	title := htmlquery.FindOne(doc, "//title")
	if title != nil {
		fmt.Println("Page title:", htmlquery.InnerText(title))
	}
	// List all links
	for _, a := range htmlquery.Find(doc, "//a[@href]") {
		fmt.Printf("Link: %s -> %s\n",
			htmlquery.InnerText(a),
			htmlquery.SelectAttr(a, "href"),
		)
	}
}
```
--------------------------------
### CreateXPathNavigator
Source: https://context7.com/antchfx/htmlquery/llms.txt
Creates an `xpath.NodeNavigator` from an `*html.Node` tree, enabling advanced XPath evaluations.
```APIDOC
## CreateXPathNavigator(top *html.Node) *NodeNavigator
### Description
Creates an `xpath.NodeNavigator` backed by an `*html.Node` tree, enabling direct use of the `github.com/antchfx/xpath` evaluation API for advanced use cases such as XPath `Evaluate()` (returning numbers, booleans, or strings rather than node-sets).
### Parameters
- **top** (*html.Node) - The root node of the HTML document to create a navigator for.
### Return Value
*NodeNavigator - An XPath navigator for the given HTML node tree.
```
--------------------------------
### Create XPath Navigator for Advanced Evaluation
Source: https://context7.com/antchfx/htmlquery/llms.txt
Creates an XPath navigator from an HTML node tree for advanced XPath evaluations, such as returning numbers, booleans, or strings using `xpath.Evaluate()`.
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"github.com/antchfx/xpath"
)

func main() {
	html := `<ul><li>A</li><li>B</li><li>C</li></ul>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	// Count nodes using XPath Evaluate (returns float64 for count())
	expr, _ := xpath.Compile("count(//li)")
	nav := htmlquery.CreateXPathNavigator(doc)
	count := expr.Evaluate(nav).(float64)
	fmt.Printf("Number of <li> elements: %.0f\n", count) // Output: Number of <li> elements: 3
	// Check a boolean expression
	exprBool, _ := xpath.Compile("boolean(//li[text()='B'])")
	result := exprBool.Evaluate(htmlquery.CreateXPathNavigator(doc)).(bool)
	fmt.Println("Contains 'B':", result) // Output: Contains 'B': true
}
```
--------------------------------
### Load HTML from URL
Source: https://context7.com/antchfx/htmlquery/llms.txt
Fetches an HTML page over HTTP, automatically handling gzip and deflate Content-Encoding, charset detection, and response body cleanup.
```APIDOC
## LoadURL(url string) (*html.Node, error)
### Description
Fetches an HTML page over HTTP, automatically handling gzip and deflate Content-Encoding, charset detection, and response body cleanup.
### Method
```go
LoadURL(url string) (*html.Node, error)
```
### Parameters
- **url** (string) - The URL of the HTML page to fetch.
### Request Example
```go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadURL("https://example.com/")
	if err != nil {
		fmt.Printf("failed to load URL: %v\n", err)
		return
	}
	// Extract the page title
	title := htmlquery.FindOne(doc, "//title")
	if title != nil {
		fmt.Println("Page title:", htmlquery.InnerText(title))
	}
	// List all links
	for _, a := range htmlquery.Find(doc, "//a[@href]") {
		fmt.Printf("Link: %s -> %s\n",
			htmlquery.InnerText(a),
			htmlquery.SelectAttr(a, "href"),
		)
	}
}
```
### Returns
- **doc** (*html.Node) - The root node of the parsed HTML document.
- **err** (error) - A non-nil error if fetching or parsing fails.
```
--------------------------------
### Query All Elements with XPath
Source: https://github.com/antchfx/htmlquery/blob/master/README.md
Execute an XPath query to find all matching nodes in an HTML document. `QueryAll` returns an error if the XPath expression is invalid; the example below chooses to panic on that error.
```go
nodes, err := htmlquery.QueryAll(doc, "//a")
if err != nil {
panic(`not a valid XPath expression.`)
}
```
--------------------------------
### Load HTML Document from String Reader
Source: https://github.com/antchfx/htmlquery/blob/master/README.md
Parse an HTML document from an io.Reader, such as a string reader.
```go
s := `....`
doc, err := htmlquery.Parse(strings.NewReader(s))
```
--------------------------------
### Find Single Node and Attributes (Go)
Source: https://context7.com/antchfx/htmlquery/llms.txt
Use `FindOne` to retrieve the first matching node, or `nil` if nothing matches. It is useful for reaching a specific element such as the `<html>` tag and reading its attributes. The example also shows what happens when the target node is missing.
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<html lang="en"><head><title>Test</title></head><body></body></html>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	htmlNode := htmlquery.FindOne(doc, "//html")
	fmt.Println(htmlquery.SelectAttr(htmlNode, "lang")) // Output: en
	title := htmlquery.FindOne(doc, "//title")
	fmt.Println(htmlquery.InnerText(title)) // Output: Test
	missing := htmlquery.FindOne(doc, "//article")
	fmt.Println(missing) // Output: <nil>
}
```
--------------------------------
### Find All Anchor Elements
Source: https://github.com/antchfx/htmlquery/blob/master/README.md
Find all anchor (`<a>`) elements within the loaded HTML document.
```go
list := htmlquery.Find(doc, "//a")
```
--------------------------------
### Configure HTML Query Cache
Source: https://context7.com/antchfx/htmlquery/llms.txt
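Control the built-in LRU cache of compiled XPath expressions with `SelectorCacheMaxEntries` and `DisableSelectorCache`. Caching is enabled by default with a capacity of 50 entries and is safe for concurrent use. The example runs a string expression through `QueryAll`, one of the calls that goes through this cache.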
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<ul><li>One</li><li>Two</li><li>Three</li></ul>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	nodes, err := htmlquery.QueryAll(doc, "//li")
	if err != nil {
		fmt.Printf("invalid XPath: %v\n", err)
		return
	}
	for _, n := range nodes {
		fmt.Println(htmlquery.InnerText(n))
	}
	// Output:
	// One
	// Two
	// Three
}
```
--------------------------------
### Query First Node
Source: https://context7.com/antchfx/htmlquery/llms.txt
Returns the first matching node, or nil if none match. Returns an error only if the expression is malformed.
```APIDOC
## Query(top *html.Node, expr string) (*html.Node, error)
### Description
Returns the first matching node, or `nil` if none match. Returns an error only if the expression is malformed.
### Method
```go
Query(top *html.Node, expr string) (*html.Node, error)
```
### Parameters
- **top** (*html.Node) - The root node from which to start the search.
- **expr** (string) - The XPath expression to evaluate.
### Request Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<div id="main"><p>First</p><p>Second</p></div>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	node, err := htmlquery.Query(doc, `//div[@id="main"]/p[1]`)
	if err != nil {
		fmt.Printf("XPath error: %v\n", err)
		return
	}
	if node == nil {
		fmt.Println("no match")
		return
	}
	fmt.Println(htmlquery.InnerText(node)) // Output: First
}
```
### Returns
- **node** (*html.Node) - The first node matching the XPath expression, or `nil` if no match is found.
- **err** (error) - A non-nil error if the XPath expression is malformed.
```
--------------------------------
### Query First Node Matching XPath
Source: https://context7.com/antchfx/htmlquery/llms.txt
Returns the first node that matches the XPath expression, or nil if no match is found. Returns an error only if the XPath expression is malformed. Requires importing the htmlquery package.
```go
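package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<div id="main"><p>First</p><p>Second</p></div>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	// Query returns the first match, nil for no match, and an error
	// only when the expression itself is malformed.
	node, err := htmlquery.Query(doc, `//div[@id="main"]/p[1]`)
	if err != nil {
		fmt.Printf("XPath error: %v\n", err)
		return
	}
	if node == nil {
		fmt.Println("no match")
		return
	}
	fmt.Println(htmlquery.InnerText(node)) // Output: First
}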
```
--------------------------------
### Disable Query Caching
Source: https://github.com/antchfx/htmlquery/blob/master/README.md
Globally disable the query selector cache by setting the `htmlquery.DisableSelectorCache` variable to `true`. Without the cache, string expressions are compiled again on each query, which can slow down repeated queries.
```go
htmlquery.DisableSelectorCache = true
```
--------------------------------
### SelectAttr
Source: https://context7.com/antchfx/htmlquery/llms.txt
Returns the value of the named attribute on an element node, or an empty string if absent. Also handles attribute nodes returned directly from XPath attribute-axis queries.
```APIDOC
## SelectAttr
### Description
Returns the value of the named attribute on an element node, or an empty string if absent. Also handles attribute nodes returned directly from XPath attribute-axis queries.
### Method Signature
`SelectAttr(n *html.Node, name string) string`
### Parameters
- **n** (*html.Node) - The HTML node.
- **name** (string) - The name of the attribute to retrieve.
### Returns
- `string` - The attribute's value, or an empty string if the attribute is not found.
### Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	html := `<img src="photo.jpg" alt="A mountain">`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	img := htmlquery.FindOne(doc, "//img")
	fmt.Println(htmlquery.SelectAttr(img, "src"))     // Output: photo.jpg
	fmt.Println(htmlquery.SelectAttr(img, "alt"))     // Output: A mountain
	fmt.Println(htmlquery.SelectAttr(img, "missing")) // Output: (empty string)
	// Attribute nodes via XPath
	srcNode := htmlquery.FindOne(doc, "//img/@src")
	fmt.Println(htmlquery.InnerText(srcNode)) // Output: photo.jpg
}
```
```
--------------------------------
### InnerText
Source: https://context7.com/antchfx/htmlquery/llms.txt
Returns the concatenated text content of a node and all its descendants, stripping all tags and skipping comment nodes.
```APIDOC
## InnerText
### Description
Returns the concatenated text content of a node and all its descendants, stripping all tags and skipping comment nodes.
### Method Signature
`InnerText(n *html.Node) string`
### Parameters
- **n** (*html.Node) - The HTML node from which to extract text.
### Returns
- `string` - The extracted text content.
### Example
```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	// The inner tags and the comment are illustrative.
	html := `<div><p>Hello, <b>World</b>!</p><!-- comments are skipped --></div>`
	doc, _ := htmlquery.Parse(strings.NewReader(html))
	div := htmlquery.FindOne(doc, "//div")
	fmt.Println(htmlquery.InnerText(div)) // Output: Hello, World!
}
```
```