### Install Crawlbase Go SDK

Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md

Use `go get` to install the Crawlbase Go SDK. Requires Go 1.21+.

```sh
go get github.com/crawlbase/crawlbase-go
```

--------------------------------

### Quickstart: Basic API Request

Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md

Initialize the CrawlingAPI with your token and make a GET request. The example performs a basic crawl and prints the body when the status code is 200.

```go
package main

import (
	"fmt"
	"log"

	"github.com/crawlbase/crawlbase-go"
)

func main() {
	api, err := crawlbase.NewCrawlingAPI("YOUR_TOKEN")
	if err != nil {
		log.Fatal(err)
	}

	res, err := api.Get("https://github.com/anthropic", nil)
	if err != nil {
		log.Fatal(err)
	}
	if res.StatusCode == 200 {
		// Body is a []byte; convert it so the page text is printed
		// instead of a list of byte values.
		fmt.Println(string(res.Body))
	}
}
```

--------------------------------

### Fetch URL with Context using CrawlingAPI.GetWithContext in Go

Source: https://context7.com/crawlbase/crawlbase-go/llms.txt

Execute a GET request with context support for cancellation and deadlines. This is crucial for applications that need to respect upstream timeouts, such as HTTP handlers or gRPC servers.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/crawlbase/crawlbase-go"
)

func main() {
	api, err := crawlbase.NewCrawlingAPI("YOUR_TOKEN")
	if err != nil {
		log.Fatal(err)
	}

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	res, err := api.GetWithContext(ctx, "https://example.com/", nil)
	if err != nil {
		log.Fatal(err) // includes context deadline exceeded / canceled errors
	}
	fmt.Printf("StatusCode=%d PCStatus=%d\n", res.StatusCode, res.PCStatus)
	// Output: StatusCode=200 PCStatus=200
}
```

--------------------------------

### Fetch URL with CrawlingAPI.Get in Go

Source: https://context7.com/crawlbase/crawlbase-go/llms.txt

Perform a GET request to a URL using the Crawlbase proxy. Check res.PCStatus for crawl success, as HTTP status codes may not reflect the outcome.
Consider using a JavaScript token for anti-bot challenges. ```go package main import ( "fmt" "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } res, err := api.Get("https://example.com/", nil) if err != nil { log.Fatal(err) } // Branch on PCStatus — the Crawlbase verdict on the target. switch res.PCStatus { case 200: fmt.Printf("Success: %d bytes from %s\n", len(res.Body), res.URL) case 520, 525: // 520 = empty body, 525 = anti-bot block. Switch to JS token. fmt.Println("Anti-bot detected — retry with JavaScript token") case 521, 522, 523: // Target unreachable / timed out. fmt.Println("Target unreachable — backoff and retry") default: fmt.Printf("Crawl failed: pc_status=%d\n", res.PCStatus) } } ``` -------------------------------- ### CrawlingAPI.Get Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Fetches a target URL via GET request through the Crawlbase proxy. Options can be provided as a map[string]string for Crawling API parameters. Always check `res.PCStatus` to determine the success of the crawl. ```APIDOC ## CrawlingAPI.Get ### Description Fetches `targetURL` through the Crawlbase proxy. Pass `nil` for options for a plain crawl, or provide any Crawling API parameters as a `map[string]string`. Always check `res.PCStatus` (not just `res.StatusCode`) to determine whether the target was successfully crawled. ### Method GET ### Endpoint `/v1/` (Implicitly used by the SDK) ### Parameters #### Path Parameters None #### Query Parameters None (Parameters are passed via the `options` map) #### Request Body None ### Parameters (Options Map) - **param_name** (string) - Required/Optional - Description of the Crawling API parameter. 
### Request Example ```go package main import ( "fmt" "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } res, err := api.Get("https://example.com/", nil) if err != nil { log.Fatal(err) } // Branch on PCStatus — the Crawlbase verdict on the target. switch res.PCStatus { case 200: fmt.Printf("Success: %d bytes from %s\n", len(res.Body), res.URL) case 520, 525: // 520 = empty body, 525 = anti-bot block. Switch to JS token. fmt.Println("Anti-bot detected — retry with JavaScript token") case 521, 522, 523: // Target unreachable / timed out. fmt.Println("Target unreachable — backoff and retry") default: fmt.Printf("Crawl failed: pc_status=%d\n", res.PCStatus) } } ``` ### Response #### Success Response (200) - **PCStatus** (int) - Crawlbase's verdict on the target. - **Body** ([]byte) - The raw body of the crawled page. - **Headers** (map[string]string) - Lowercased HTTP headers. - **URL** (string) - The final URL after any redirects. - **RID** (string) - The request ID. - **OriginalStatus** (int) - The original HTTP status code from the target. - **JSON** (interface{}) - Parsed JSON response if applicable. #### Response Example ```json { "PCStatus": 200, "Body": "...", "Headers": {"content-type": "text/html"}, "URL": "https://example.com/", "RID": "some-request-id", "OriginalStatus": 200, "JSON": null } ``` ``` -------------------------------- ### CrawlingAPI.GetWithContext Source: https://context7.com/crawlbase/crawlbase-go/llms.txt The `*WithContext` variant of `Get`, allowing for cancellation and deadline propagation. Essential for HTTP handlers, gRPC servers, or any code that needs to respect upstream timeouts or cancellation signals. ```APIDOC ## CrawlingAPI.GetWithContext ### Description The `*WithContext` variant of `Get`. Required in HTTP handlers, gRPC servers, or any code path that must respect upstream timeouts or cancellation signals. 
### Method GET ### Endpoint `/v1/` (Implicitly used by the SDK) ### Parameters #### Path Parameters None #### Query Parameters None (Parameters are passed via the `options` map) #### Request Body None ### Parameters (Options Map) - **param_name** (string) - Required/Optional - Description of the Crawling API parameter. ### Request Example ```go package main import ( "context" "fmt" "log" "time" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI("YOUR_TOKEN") if err != nil { log.Fatal(err) } ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) defer cancel() res, err := api.GetWithContext(ctx, "https://example.com/", nil) if err != nil { log.Fatal(err) // includes context deadline exceeded / canceled errors } fmt.Printf("StatusCode=%d PCStatus=%d\n", res.StatusCode, res.PCStatus) // Output: StatusCode=200 PCStatus=200 } ``` ### Response #### Success Response (200) - **PCStatus** (int) - Crawlbase's verdict on the target. - **Body** ([]byte) - The raw body of the crawled page. - **Headers** (map[string]string) - Lowercased HTTP headers. - **URL** (string) - The final URL after any redirects. - **RID** (string) - The request ID. - **OriginalStatus** (int) - The original HTTP status code from the target. - **JSON** (interface{}) - Parsed JSON response if applicable. #### Response Example ```json { "PCStatus": 200, "Body": "...", "Headers": {"content-type": "text/html"}, "URL": "https://example.com/", "RID": "some-request-id", "OriginalStatus": 200, "JSON": null } ``` ``` -------------------------------- ### JavaScript Rendering with Options Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md Use a JavaScript token and specify options like `page_wait`, `ajax_wait`, and `scroll` for rendering dynamic content. 
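
Because the options map is a `map[string]string`, numeric and boolean settings have to be passed as strings. The following is a minimal sketch of a helper that builds that map from typed values. `RenderOptions` and `ToMap` are hypothetical names used for illustration and are not part of the SDK; the sketch assumes only the three option keys named above and that `page_wait` is expressed in milliseconds.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// RenderOptions is a hypothetical typed wrapper around the three
// rendering options named above. It is not part of the SDK.
type RenderOptions struct {
	PageWait time.Duration // fixed delay after load, sent as milliseconds
	AjaxWait bool          // wait for XHR/fetch activity to settle
	Scroll   bool          // scroll to trigger lazy-loaded content
}

// ToMap converts the typed options into the map[string]string form
// that the options argument expects. Unset options are omitted.
func (o RenderOptions) ToMap() map[string]string {
	m := make(map[string]string)
	if o.PageWait > 0 {
		m["page_wait"] = strconv.FormatInt(o.PageWait.Milliseconds(), 10)
	}
	if o.AjaxWait {
		m["ajax_wait"] = "true"
	}
	if o.Scroll {
		m["scroll"] = "true"
	}
	return m
}

func main() {
	opts := RenderOptions{PageWait: 2 * time.Second, AjaxWait: true, Scroll: true}
	fmt.Println(opts.ToMap())
}
```

The map returned by `ToMap` has the same shape as the literal passed to `Get` in the snippet that follows.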
```go
api, _ := crawlbase.NewCrawlingAPI("YOUR_JS_TOKEN")
res, _ := api.Get("https://spa.example.com", map[string]string{
	"page_wait": "2000",
	"ajax_wait": "true",
	"scroll":    "true",
})
```

--------------------------------

### Construct CrawlingAPI Client in Go

Source: https://context7.com/crawlbase/crawlbase-go/llms.txt

Instantiate a CrawlingAPI client using either a normal token for static content or a JavaScript token for SPAs. Ensure a token is provided to avoid ErrTokenRequired.

```go
package main

import (
	"fmt"
	"log"

	"github.com/crawlbase/crawlbase-go"
)

func main() {
	// Normal token: faster and cheaper for static HTML/JSON.
	api, err := crawlbase.NewCrawlingAPI("YOUR_NORMAL_TOKEN")
	if err != nil {
		log.Fatal(err) // crawlbase: token is required (if empty)
	}

	// JavaScript token: required for SPAs and browser-rendered pages.
	jsAPI, err := crawlbase.NewCrawlingAPI("YOUR_JS_TOKEN")
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(api.Token)   // YOUR_NORMAL_TOKEN
	fmt.Println(jsAPI.Token) // YOUR_JS_TOKEN
}
```

--------------------------------

### Manage Multiple Tokens

Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md

Hold separate clients for normal and JavaScript tokens if you need to switch between them.

```go
api, _ := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN"))
js, _ := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_JS_TOKEN"))
```

--------------------------------

### CrawlingAPI.Get with async and callback options

Source: https://context7.com/crawlbase/crawlbase-go/llms.txt

Enable asynchronous crawling by setting `async=true` and provide a webhook URL with `callback`. Crawlbase will deliver the results to your specified webhook.

```APIDOC
## CrawlingAPI.Get with `async` + `callback`

### Description
Use async mode to queue requests without holding a concurrency slot. Crawlbase delivers the result to your webhook URL; the `res.RID` field is the correlation ID.
### Method `Get(url string, options map[string]string) (*Response, error)` ### Parameters #### URL - **url** (string) - The URL to crawl asynchronously. #### Options - **async** (string) - Set to "true" to enable asynchronous crawling. - **callback** (string) - The webhook URL to receive the crawl results. ### Request Example ```go res, err := api.Get("https://example.com/", map[string]string{ "async": "true", "callback": "https://your-app.com/crawlbase-webhook", }) ``` ### Response #### Success Response (200) - **RID** (string) - The correlation ID for the asynchronous request. #### Response Example ```go fmt.Printf("Queued. RID=%s\n", res.RID) ``` ### Note Crawlbase will POST the crawl result to the specified callback URL with the RID in the headers. ``` -------------------------------- ### JavaScript Rendering with CrawlingAPI.Get Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Use the JavaScript token with rendering options like `page_wait`, `ajax_wait`, and `scroll` to handle SPAs and client-side rendered content. ```go package main import ( "fmt" "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { jsAPI, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_JS_TOKEN")) if err != nil { log.Fatal(err) } res, err := jsAPI.Get("https://spa.example.com", map[string]string{ "page_wait": "2000", // wait 2 s after load "ajax_wait": "true", // also wait for XHR/fetch to settle "scroll": "true", // scroll to trigger lazy-loaded content }) if err != nil { log.Fatal(err) } fmt.Printf("Rendered HTML length: %d\n", len(res.Body)) } ``` -------------------------------- ### Custom http.Client for Tracing and Transport Overrides Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Demonstrates replacing the default http.Client to inject tracing, custom TLS configurations, or a test transport. The Timeout field can also be tuned for per-client request deadlines. 
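
Since the client swap described above accepts a plain `*http.Client`, tracing can be added by wrapping its transport. The sketch below shows one way to do that with a custom `http.RoundTripper` using only the standard library. `tracingTransport`, `newTracedClient`, and `traceOneRequest` are hypothetical names, and a local `httptest` server stands in for the real endpoint so the example runs offline.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// tracingTransport is a hypothetical wrapper (not part of the SDK) that
// records one line per request before handing back the response.
type tracingTransport struct {
	next http.RoundTripper

	mu    sync.Mutex
	trace []string
}

// RoundTrip implements http.RoundTripper by delegating to next and
// recording method, host, status, and elapsed time.
func (t *tracingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	start := time.Now()
	resp, err := t.next.RoundTrip(req)
	status := 0
	if resp != nil {
		status = resp.StatusCode
	}
	entry := fmt.Sprintf("%s %s status=%d took=%s",
		req.Method, req.URL.Host, status, time.Since(start))
	t.mu.Lock()
	t.trace = append(t.trace, entry)
	t.mu.Unlock()
	return resp, err
}

// Entries returns a copy of the recorded trace lines.
func (t *tracingTransport) Entries() []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	out := make([]string, len(t.trace))
	copy(out, t.trace)
	return out
}

// newTracedClient builds an http.Client whose transport is traced.
func newTracedClient(timeout time.Duration) (*http.Client, *tracingTransport) {
	tt := &tracingTransport{next: http.DefaultTransport}
	return &http.Client{Timeout: timeout, Transport: tt}, tt
}

// traceOneRequest sends one GET to a local test server and reports how
// many trace entries were recorded.
func traceOneRequest() (int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	client, tt := newTracedClient(5 * time.Second)
	resp, err := client.Get(srv.URL)
	if err != nil {
		return 0, err
	}
	resp.Body.Close()
	return len(tt.Entries()), nil
}

func main() {
	n, err := traceOneRequest()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("trace entries:", n)
}
```

With the SDK, a client built this way would be assigned to `api.HTTPClient`, as the example that follows does with its own transport.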
```go package main import ( "crypto/tls" "fmt" "log" "net/http" "os" "time" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } // Override timeout for slow targets. api.Timeout = 120 * time.Second // Swap in a custom http.Client (e.g. with custom TLS or tracing). api.HTTPClient = &http.Client{ Timeout: api.Timeout, Transport: &http.Transport{ TLSClientConfig: &tls.Config{InsecureSkipVerify: false}, MaxIdleConns: 100, IdleConnTimeout: 90 * time.Second, }, } res, err := api.Get("https://example.com/", nil) if err != nil { log.Fatal(err) } fmt.Printf("StatusCode=%d PCStatus=%d\n", res.StatusCode, res.PCStatus) } ``` -------------------------------- ### Async Request with Webhook Callback Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md Initiate an asynchronous request and provide a callback URL for receiving the result. The `RID` can be used to correlate the eventual webhook delivery. ```go res, _ := api.Get("https://example.com/", map[string]string{ "async": "true", "callback": "https://your-app.com/webhook", }) fmt.Println(res.RID) // correlate the eventual webhook delivery ``` -------------------------------- ### Use Built-in Scraper for Product Details Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md Specify a scraper like `amazon-product-details` in the options to extract structured data from e-commerce sites. ```go api, _ := crawlbase.NewCrawlingAPI("YOUR_TOKEN") res, _ := api.Get( "https://www.amazon.com/dp/B08N5WRWNW", map[string]string{"scraper": "amazon-product-details"}, ) fmt.Println(res.JSON["name"], res.JSON["price"]) ``` -------------------------------- ### Generate Screenshot Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md Request a screenshot of a webpage using the `screenshot` option. The response body contains image bytes, which can be saved to a file. 
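
Before writing screenshot bytes to disk, it can be worth confirming that they really are an image and not an error payload. The sketch below checks for the fixed 8-byte signature that begins every PNG file. `looksLikePNG` is a hypothetical helper, not an SDK function, and it inspects only the header rather than validating the whole image.

```go
package main

import (
	"bytes"
	"fmt"
)

// pngSignature is the fixed 8-byte header that starts every PNG file.
var pngSignature = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}

// looksLikePNG reports whether b begins with the PNG signature.
// It does not check that the rest of the data is a valid image.
func looksLikePNG(b []byte) bool {
	return bytes.HasPrefix(b, pngSignature)
}

func main() {
	fmt.Println(looksLikePNG(pngSignature))
	fmt.Println(looksLikePNG([]byte("not an image")))
}
```

A check like this would sit between decoding the response and the `os.WriteFile` call.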
```go
api, _ := crawlbase.NewCrawlingAPI("YOUR_JS_TOKEN")
res, _ := api.Get("https://www.apple.com/", map[string]string{
	"screenshot": "true",
})
img, _ := crawlbase.ImageBytes(res)
_ = os.WriteFile("apple.png", img, 0o644)
```

--------------------------------

### Context for Cancellation

Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md

Use `GetWithContext` to pass a `context.Context` for request cancellation, such as timeouts. Always call `defer cancel()` so the context's resources are released.

```go
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
res, err := api.GetWithContext(ctx, "https://example.com/", nil)
```

--------------------------------

### Retry with Exponential Backoff and Jitter

Source: https://context7.com/crawlbase/crawlbase-go/llms.txt

Implements a production retry pattern using PCStatus to differentiate retriable failures from hard errors. Includes exponential backoff with jitter for retriable server errors.

```go
package main

import (
	"fmt"
	"log"
	"math"
	"math/rand"
	"os"
	"time"

	"github.com/crawlbase/crawlbase-go"
)

func crawlWithRetry(api *crawlbase.CrawlingAPI, targetURL string, maxAttempts int) (*crawlbase.Response, error) {
	for i := 0; i < maxAttempts; i++ {
		res, err := api.Get(targetURL, nil)
		if err != nil {
			return nil, err // network/transport error, not retriable
		}
		if res.PCStatus == 200 {
			return res, nil // success
		}
		// Hard client errors are not retriable.
		if res.StatusCode >= 400 && res.StatusCode < 500 {
			return nil, fmt.Errorf("client error %d for %s", res.StatusCode, targetURL)
		}
		// 520/521/522/523/525 are retriable. Backoff with jitter.
jitter := time.Duration(rand.Float64() * math.Pow(2, float64(i)) * float64(time.Second)) time.Sleep(jitter) } return nil, fmt.Errorf("exhausted %d attempts for %s", maxAttempts, targetURL) } func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } res, err := crawlWithRetry(api, "https://example.com/", 5) if err != nil { log.Fatal(err) } fmt.Printf("Crawled %d bytes from %s\n", len(res.Body), res.URL) } ``` -------------------------------- ### CrawlingAPI.Get with JavaScript rendering options Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Use the JavaScript token with rendering options like `page_wait`, `ajax_wait`, and `scroll` to handle SPAs and pages with client-side rendering. ```APIDOC ## CrawlingAPI.Get with JavaScript rendering options ### Description Pass the JavaScript token together with rendering options to handle SPAs and pages that hide content behind client-side execution. Combine `page_wait` (fixed delay in ms), `ajax_wait` (network-idle), and `scroll` (lazy-load trigger) as needed. ### Method `Get(url string, options map[string]string) (*Response, error)` ### Parameters #### URL - **url** (string) - The URL to crawl. #### Options - **page_wait** (string) - Fixed delay in milliseconds after page load. - **ajax_wait** (string) - Wait for XHR/fetch requests to settle (e.g., "true"). - **scroll** (string) - Scroll to trigger lazy-loaded content (e.g., "true"). ### Request Example ```go res, err := jsAPI.Get("https://spa.example.com", map[string]string{ "page_wait": "2000", "ajax_wait": "true", "scroll": "true", }) ``` ### Response #### Success Response (200) - **Body** ([]byte) - The rendered HTML content. #### Response Example ```go fmt.Printf("Rendered HTML length: %d\n", len(res.Body)) ``` ``` -------------------------------- ### Submit PUT requests with Crawlbase Go SDK Source: https://context7.com/crawlbase/crawlbase-go/llms.txt The PUT counterpart to Post. 
Same body-encoding rules and options bag; useful for targets that expose REST APIs requiring PUT for updates. ```go package main import ( "fmt" "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } payload := `{"status": "active", "id": "42"}` res, err := api.Put( "https://api.example.com/items/42", payload, map[string]string{"post_content_type": "application/json"}, ) if err != nil { log.Fatal(err) } fmt.Printf("PUT response: status=%d pc_status=%d\n", res.StatusCode, res.PCStatus) } ``` -------------------------------- ### Screenshot Capture with CrawlingAPI.Get Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Capture a full-page screenshot by passing `screenshot=true` (requires JavaScript token). The response body is a base64-encoded PNG; use `ImageBytes` to decode it. ```go package main import ( "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_JS_TOKEN")) if err != nil { log.Fatal(err) } res, err := api.Get("https://www.apple.com/", map[string]string{ "screenshot": "true", }) if err != nil { log.Fatal(err) } if res.PCStatus != 200 { log.Fatalf("screenshot failed: pc_status=%d", res.PCStatus) } // Decode base64 body → raw PNG bytes. img, err := crawlbase.ImageBytes(res) if err != nil { log.Fatal(err) } if err := os.WriteFile("apple.png", img, 0o644); err != nil { log.Fatal(err) } // apple.png is a full-page PNG screenshot of apple.com } ``` -------------------------------- ### CrawlingAPI.Get with screenshot option Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Capture a full-page screenshot by setting `screenshot=true`. This requires a JavaScript token, and the response body will be a base64-encoded PNG. ```APIDOC ## CrawlingAPI.Get with `screenshot` option ### Description Take a full-page screenshot by passing `screenshot=true` (requires JavaScript token). 
The response body is a base64-encoded PNG; use `ImageBytes` to decode it. ### Method `Get(url string, options map[string]string) (*Response, error)` ### Parameters #### URL - **url** (string) - The URL to capture a screenshot of. #### Options - **screenshot** (string) - Set to "true" to capture a screenshot. ### Request Example ```go res, err := api.Get("https://www.apple.com/", map[string]string{ "screenshot": "true", }) ``` ### Response #### Success Response (200) - **Body** ([]byte) - The base64-encoded PNG image data. #### Response Example ```go img, err := crawlbase.ImageBytes(res) if err != nil { log.Fatal(err) } os.WriteFile("apple.png", img, 0o644) ``` ``` -------------------------------- ### Async Crawling with Webhook Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Use async mode with `async=true` and `callback=` to queue requests without holding a concurrency slot. Crawlbase delivers the result to your webhook URL; the `res.RID` field is the correlation ID. ```go package main import ( "fmt" "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } res, err := api.Get("https://example.com/", map[string]string{ "async": "true", "callback": "https://your-app.com/crawlbase-webhook", }) if err != nil { log.Fatal(err) } // RID correlates the eventual webhook POST delivery. fmt.Printf("Queued. RID=%s\n", res.RID) // Crawlbase will POST the crawl result to your-app.com/crawlbase-webhook // with the same RID in the headers. } ``` -------------------------------- ### Geo-Routing with CrawlingAPI.Get Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Route requests through proxies in a specific country using the `country` option (ISO 3166-1 alpha-2 code). This is useful for accessing geo-gated content or performing price comparisons. 
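
User-supplied country values are easy to get wrong (lowercase, stray spaces, three-letter codes), so a small shape check before building the options map can catch mistakes early. The sketch below is one way to do that. `normalizeCountry` and `geoOptions` are hypothetical helpers, and the code assumes the option expects an uppercase two-letter value as in the `"DE"` example. It checks only the shape of the code, not whether the code is assigned in ISO 3166-1 or supported by Crawlbase.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errBadCountry = errors.New("country must be a two-letter code")

// normalizeCountry trims and upper-cases code, then checks that it is
// exactly two ASCII letters. It does not consult a list of real codes.
func normalizeCountry(code string) (string, error) {
	c := strings.ToUpper(strings.TrimSpace(code))
	if len(c) != 2 {
		return "", errBadCountry
	}
	for i := 0; i < len(c); i++ {
		if c[i] < 'A' || c[i] > 'Z' {
			return "", errBadCountry
		}
	}
	return c, nil
}

// geoOptions returns an options map containing only the country key.
func geoOptions(code string) (map[string]string, error) {
	c, err := normalizeCountry(code)
	if err != nil {
		return nil, err
	}
	return map[string]string{"country": c}, nil
}

func main() {
	opts, err := geoOptions(" de ")
	if err != nil {
		fmt.Println("invalid country:", err)
		return
	}
	fmt.Println(opts["country"])
}
```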
```go package main import ( "fmt" "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } // Crawl as if from Germany. res, err := api.Get( "https://www.amazon.de/dp/B08N5WRWNW", map[string]string{ "country": "DE", "scraper": "amazon-product-details", }, ) if err != nil { log.Fatal(err) } fmt.Println(res.JSON["price"]) // German price in EUR } ``` -------------------------------- ### Retry with Exponential Backoff Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md Implement a retry mechanism with exponential backoff for handling transient errors. The function returns the response on success or an error after exceeding attempts. ```go func crawl(api *crawlbase.CrawlingAPI, url string, attempts int) (*crawlbase.Response, error) { for i := 0; i < attempts; i++ { res, err := api.Get(url, nil) if err != nil { return nil, err } if res.StatusCode == 200 && res.PCStatus == 200 { return res, nil } if res.StatusCode >= 400 && res.StatusCode < 500 { return nil, fmt.Errorf("client error %d: %s", res.StatusCode, url) } d := time.Duration(rand.Float64() * math.Pow(2, float64(i)) * float64(time.Second)) time.Sleep(d) } return nil, fmt.Errorf("failed: %s", url) } ``` -------------------------------- ### Handle Crawlbase API Responses and Retries in Go Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md Use this code to inspect the `PCStatus` from a Crawlbase API response to determine appropriate retry strategies. Different status codes indicate issues like empty bodies, anti-bot measures, or target unreachability, each requiring a specific retry approach. ```go res, err := api.Get(url, nil) if err != nil { return err } switch res.PCStatus { case 200: use(res.Body) case 520, 525: // 520 = empty body, 525 = anti-bot couldn't be solved. // Switch to JS token and retry. case 521, 522, 523: // Target unreachable / timed out. Backoff + retry. 
default: log.Printf("crawl failed: url=%s pc_status=%d", url, res.PCStatus) } ``` -------------------------------- ### NewCrawlingAPI Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Constructs a CrawlingAPI client bound to a single token. Use the normal token for static content or the JavaScript token for SPAs. Returns ErrTokenRequired if the token is empty. The client is safe for concurrent use. ```APIDOC ## NewCrawlingAPI ### Description Creates a `CrawlingAPI` client bound to a single token. Pass the normal (TCP) token for static HTML/JSON targets or the JavaScript token for SPA pages requiring browser rendering. The constructor returns `ErrTokenRequired` on an empty token. The client is goroutine-safe; construct once and share across goroutines. ### Usage ```go package main import ( "fmt" "log" "github.com/crawlbase/crawlbase-go" ) func main() { // Normal token — faster and cheaper for static HTML/JSON. api, err := crawlbase.NewCrawlingAPI("YOUR_NORMAL_TOKEN") if err != nil { log.Fatal(err) // crawlbase: token is required (if empty) } // JavaScript token — required for SPAs and browser-rendered pages. jsAPI, err := crawlbase.NewCrawlingAPI("YOUR_JS_TOKEN") if err != nil { log.Fatal(err) } fmt.Println(api.Token) // YOUR_NORMAL_TOKEN fmt.Println(jsAPI.Token) // YOUR_JS_TOKEN } ``` ``` -------------------------------- ### CrawlingAPI.Get with scraper option Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Utilize built-in scrapers by passing `scraper=` in options to extract structured data from supported websites. The response body and `res.JSON` will be populated with extracted fields. ```APIDOC ## CrawlingAPI.Get with `scraper` option ### Description Built-in scrapers extract structured data from supported sites without writing a parser. Pass `scraper=` in options; the response body and `res.JSON` are populated with the extracted fields. 
### Method `Get(url string, options map[string]string) (*Response, error)` ### Parameters #### URL - **url** (string) - The URL to crawl. #### Options - **scraper** (string) - The name of the built-in scraper to use (e.g., "amazon-product-details"). ### Request Example ```go res, err := api.Get( "https://www.amazon.com/dp/B08N5WRWNW", map[string]string{"scraper": "amazon-product-details"}, ) ``` ### Response #### Success Response (200) - **PCStatus** (int) - The status code of the scraping process. - **JSON** (map[string]interface{}) - A map containing the extracted structured data. #### Response Example ```go fmt.Println(res.JSON["name"]) fmt.Println(res.JSON["price"]) fmt.Println(res.JSON["rating"]) ``` ### Available Scrapers - "google-serp" - "google-shopping" - "walmart-product-details" - "amazon-product-details" - "email-extractor" - "twitter-profile" - "linkedin-profile" - etc. ``` -------------------------------- ### Inspect Crawlbase Go SDK response struct Source: https://context7.com/crawlbase/crawlbase-go/llms.txt The Response struct is returned by every API verb. StatusCode reflects the SDK→Crawlbase HTTP status; PCStatus reflects Crawlbase's verdict on the crawled target. Always branch on PCStatus for retry logic. 
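
Branching on `PCStatus` is easier to reuse and test when the decision is separated from the request itself. The sketch below maps the status codes documented for `Get` (200, 520/525, 521 to 523) onto a small set of actions. The `Action` type, its constants, and `classifyPCStatus` are hypothetical names, not SDK identifiers; any code not listed in this document falls through to a generic failure.

```go
package main

import "fmt"

// Action is a hypothetical enum describing what a caller should do next.
type Action int

const (
	ActionUse              Action = iota // crawl succeeded, use the body
	ActionRetryWithJSToken               // empty body or anti-bot block
	ActionBackoffRetry                   // target unreachable or timed out
	ActionFail                           // anything else
)

// String makes Action values readable in logs.
func (a Action) String() string {
	switch a {
	case ActionUse:
		return "use"
	case ActionRetryWithJSToken:
		return "retry-with-js-token"
	case ActionBackoffRetry:
		return "backoff-retry"
	default:
		return "fail"
	}
}

// classifyPCStatus maps a PCStatus value to the next action, following
// the code meanings given in the Get examples.
func classifyPCStatus(pc int) Action {
	switch pc {
	case 200:
		return ActionUse
	case 520, 525:
		return ActionRetryWithJSToken
	case 521, 522, 523:
		return ActionBackoffRetry
	default:
		return ActionFail
	}
}

func main() {
	for _, pc := range []int{200, 520, 522, 404} {
		fmt.Printf("pc_status=%d -> %s\n", pc, classifyPCStatus(pc))
	}
}
```

A retry loop can then switch on the returned `Action` instead of repeating the status list at every call site.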
```go package main import ( "fmt" "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } res, err := api.Get("https://httpbin.org/headers", nil) if err != nil { log.Fatal(err) } fmt.Println("StatusCode: ", res.StatusCode) // HTTP status of SDK→Crawlbase request fmt.Println("PCStatus: ", res.PCStatus) // Crawlbase verdict on the target (200 = success) fmt.Println("OriginalStatus: ", res.OriginalStatus) // HTTP status the target returned to Crawlbase fmt.Println("URL: ", res.URL) // Final URL after target-side redirects fmt.Println("RID: ", res.RID) // Request ID (set for async/store calls) fmt.Println("Body length: ", len(res.Body)) // Raw page content fmt.Println("Headers[url]: ", res.Headers["url"]) // All response headers, lowercased // res.JSON is auto-populated when Content-Type is JSON (scraper/format=json calls). if res.JSON != nil { fmt.Println("JSON keys:", res.JSON) } } ``` -------------------------------- ### CrawlingAPI.Put Source: https://context7.com/crawlbase/crawlbase-go/llms.txt The PUT counterpart to `Post`. It allows submitting PUT requests to target URLs, supporting the same body-encoding rules and options bag. This is useful for targets that expose REST APIs requiring PUT for updates. ```APIDOC ## PUT /crawl ### Description Submits a PUT request to the target URL through Crawlbase. Supports the same body-encoding rules and options bag as the `Post` method. Useful for targets that expose REST APIs requiring PUT for updates. ### Method PUT ### Endpoint `/crawl` ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body - **body** (string | []byte) - Required - The data to be sent in the PUT request. - **options** (map[string]string) - Optional - Additional options, such as `post_content_type` to specify the Content-Type. 
### Request Example ```go payload := `{"status": "active", "id": "42"}` res, err := api.Put("https://api.example.com/items/42", payload, map[string]string{"post_content_type": "application/json"}) ``` ### Response #### Success Response (200) - **Body** ([]byte) - The raw response body from the crawled page. - **StatusCode** (int) - HTTP status of the SDK to Crawlbase request. - **PCStatus** (int) - Crawlbase's verdict on the crawled target (200 = success). - **OriginalStatus** (int) - HTTP status the target returned to Crawlbase. - **URL** (string) - Final URL after target-side redirects. - **RID** (string) - Request ID (set for async/store calls). - **Headers** (map[string]string) - All response headers, lowercased. - **JSON** (interface{}) - Auto-populated when Content-Type is JSON. ``` -------------------------------- ### Built-in Scrapers with CrawlingAPI.Get Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Utilize built-in scrapers by passing `scraper=` in options to extract structured data from supported sites. The response body and `res.JSON` are populated with extracted fields. ```go package main import ( "fmt" "log" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } res, err := api.Get( "https://www.amazon.com/dp/B08N5WRWNW", map[string]string{"scraper": "amazon-product-details"}, ) if err != nil { log.Fatal(err) } if res.PCStatus != 200 { log.Fatalf("scrape failed: pc_status=%d", res.PCStatus) } // res.JSON is auto-parsed — no json.Unmarshal needed. fmt.Println(res.JSON["name"]) // e.g. "Echo Dot (4th Gen)" fmt.Println(res.JSON["price"]) // e.g. "$49.99" fmt.Println(res.JSON["rating"]) // e.g. "4.7" // Other available scrapers: // "google-serp", "google-shopping", "walmart-product-details", // "email-extractor", "twitter-profile", "linkedin-profile", etc. 
} ``` -------------------------------- ### Submit POST requests with Crawlbase Go SDK Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Submits a POST request to the target URL through Crawlbase. Accepts url.Values (form-encoded), map[string]string, string, or []byte as the body. Set post_content_type in options to override Content-Type for JSON bodies. ```go package main import ( "fmt" "log" "net/url" "os" "github.com/crawlbase/crawlbase-go" ) func main() { api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_TOKEN")) if err != nil { log.Fatal(err) } // Form-encoded POST (default). formData := url.Values{} formData.Set("q", "golang") formData.Set("page", "1") res, err := api.Post("https://producthunt.com/search", formData, nil) if err != nil { log.Fatal(err) } fmt.Printf("Form POST: %d bytes\n", len(res.Body)) // JSON POST — override Content-Type via post_content_type option. jsonBody := `{"query": "golang", "page": 1}` res2, err := api.Post( "https://api.example.com/search", jsonBody, map[string]string{"post_content_type": "application/json"}, ) if err != nil { log.Fatal(err) } fmt.Printf("JSON POST: %d bytes\n", len(res2.Body)) } ``` -------------------------------- ### CrawlingAPI.Get with geo-routing Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Route requests through proxies in a specific country using the `country` option (ISO 3166-1 alpha-2 code). This is useful for accessing geo-gated content or comparing prices. ```APIDOC ## CrawlingAPI.Get with geo-routing ### Description Route requests through proxies in a specific country with the `country` option (ISO 3166-1 alpha-2 code). Useful for geo-gated content or price comparisons. ### Method `Get(url string, options map[string]string) (*Response, error)` ### Parameters #### URL - **url** (string) - The URL to crawl from a specific country. #### Options - **country** (string) - The ISO 3166-1 alpha-2 code of the country to route through (e.g., "DE" for Germany). 
- **scraper** (string) - Optional: The name of the built-in scraper to use. ### Request Example ```go res, err := api.Get( "https://www.amazon.de/dp/B08N5WRWNW", map[string]string{ "country": "DE", "scraper": "amazon-product-details", }, ) ``` ### Response #### Success Response (200) - **JSON** (map[string]interface{}) - A map containing the extracted structured data, relevant to the specified country. #### Response Example ```go fmt.Println(res.JSON["price"]) ``` ``` -------------------------------- ### ImageBytes Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Decodes the base64-encoded body of a screenshot response into raw PNG bytes. It is recommended to verify `res.PCStatus == 200` before calling this function to ensure the body is a valid screenshot and not an error payload. ```APIDOC ## ImageBytes ### Description Decodes the base64-encoded body of a screenshot response into raw PNG bytes. Always verify `res.PCStatus == 200` before calling to avoid decoding an error payload. ### Parameters - **res** (*crawlbase.Response) - Required - The response object from a Crawlbase API call, expected to contain a screenshot in its body. ### Returns - **[]byte** - The raw PNG bytes of the screenshot. - **error** - An error if the body cannot be decoded as a base64 string or is not a valid PNG. ### Usage Example ```go raw, err := crawlbase.ImageBytes(res) if err != nil { log.Fatal(err) // base64 decode error — body was not a valid screenshot } _ = os.WriteFile("github.png", raw, 0o644) img, format, err := image.Decode(bytes.NewReader(raw)) ``` ``` -------------------------------- ### Decode screenshot response body with Crawlbase Go SDK Source: https://context7.com/crawlbase/crawlbase-go/llms.txt Decodes the base64-encoded body of a screenshot response into raw PNG bytes. Always verify res.PCStatus == 200 before calling to avoid decoding an error payload. 
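
To make the decoding step concrete, the sketch below shows what turning a base64 string into PNG bytes and reading its dimensions looks like with the standard library alone. It illustrates the idea and is not the SDK's implementation of `ImageBytes`. `decodedPNGSize` and `sampleBase64PNG` are hypothetical helpers, and standard padded base64 is assumed. A generated blank image stands in for a real screenshot body.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"image"
	"image/png"
)

// decodedPNGSize decodes a base64 string and reads only the PNG header
// to report the image width and height.
func decodedPNGSize(b64 string) (width, height int, err error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return 0, 0, fmt.Errorf("base64 decode: %w", err)
	}
	cfg, err := png.DecodeConfig(bytes.NewReader(raw))
	if err != nil {
		return 0, 0, fmt.Errorf("png header: %w", err)
	}
	return cfg.Width, cfg.Height, nil
}

// sampleBase64PNG builds a blank w-by-h PNG and base64-encodes it,
// standing in for a screenshot response body.
func sampleBase64PNG(w, h int) (string, error) {
	var buf bytes.Buffer
	if err := png.Encode(&buf, image.NewRGBA(image.Rect(0, 0, w, h))); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

func main() {
	body, err := sampleBase64PNG(4, 3)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	w, h, err := decodedPNGSize(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("width=%d height=%d\n", w, h)
}
```

Reading only the header with `png.DecodeConfig` avoids decoding every pixel when the dimensions are all that is needed.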
```go
package main

import (
	"bytes"
	"fmt"
	"image"
	_ "image/png" // registers the PNG decoder used by image.Decode
	"log"
	"os"

	"github.com/crawlbase/crawlbase-go"
)

func main() {
	api, err := crawlbase.NewCrawlingAPI(os.Getenv("CRAWLBASE_JS_TOKEN"))
	if err != nil {
		log.Fatal(err)
	}

	res, err := api.Get("https://github.com", map[string]string{"screenshot": "true"})
	if err != nil {
		log.Fatal(err)
	}
	if res.PCStatus != 200 {
		log.Fatalf("capture failed: pc_status=%d", res.PCStatus)
	}

	raw, err := crawlbase.ImageBytes(res)
	if err != nil {
		log.Fatal(err) // base64 decode error: body was not a valid screenshot
	}

	// Write to disk.
	if err := os.WriteFile("github.png", raw, 0o644); err != nil {
		log.Fatal(err)
	}

	// Or decode directly into an image.Image for processing.
	img, format, err := image.Decode(bytes.NewReader(raw))
	if err != nil {
		log.Fatal(err)
	}
	bounds := img.Bounds()
	fmt.Printf("Format=%s Width=%d Height=%d\n", format, bounds.Dx(), bounds.Dy())
	// Output: Format=png Width=1280 Height=...
}
```

--------------------------------

### CrawlingAPI.Post

Source: https://context7.com/crawlbase/crawlbase-go/llms.txt

Submits a POST request to the target URL through Crawlbase. It supports various body types including form-encoded data, JSON, and raw bytes. The `post_content_type` option can be used to specify the Content-Type for JSON bodies.

```APIDOC
## POST /crawl

### Description
Submits a POST request to the target URL through Crawlbase. Accepts `url.Values` (form-encoded), `map[string]string`, `string`, or `[]byte` as the body. Set `post_content_type` in options to override `Content-Type` for JSON bodies.

### Method
POST

### Endpoint
`/crawl`

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
- **body** (url.Values | map[string]string | string | []byte) - Required - The data to be sent in the POST request.
- **options** (map[string]string) - Optional - Additional options, such as `post_content_type` to override Content-Type.

### Request Example
```go
formData := url.Values{}
formData.Set("q", "golang")
formData.Set("page", "1")
res, err := api.Post("https://producthunt.com/search", formData, nil)

jsonBody := `{"query": "golang", "page": 1}`
res2, err := api.Post("https://api.example.com/search", jsonBody, map[string]string{"post_content_type": "application/json"})
```

### Response
#### Success Response (200)
- **Body** ([]byte) - The raw response body from the crawled page.
- **StatusCode** (int) - HTTP status of the SDK→Crawlbase request.
- **PCStatus** (int) - Crawlbase's verdict on the crawled target (200 = success).
- **OriginalStatus** (int) - HTTP status the target returned to Crawlbase.
- **URL** (string) - Final URL after target-side redirects.
- **RID** (string) - Request ID (set for async/store calls).
- **Headers** (map[string]string) - All response headers, lowercased.
- **JSON** (interface{}) - Auto-populated when Content-Type is JSON.
```

--------------------------------

### Response Struct

Source: https://context7.com/crawlbase/crawlbase-go/llms.txt

The `Response` struct is returned by all Crawlbase API methods. It contains detailed information about the request and the crawled target, including status codes, final URLs, and response headers. It's crucial to check `PCStatus` for retry logic.

```APIDOC
## Response Struct

### Description
The `Response` struct is returned by every API verb. `StatusCode` reflects the SDK→Crawlbase HTTP status; `PCStatus` reflects Crawlbase's verdict on the crawled target. Always branch on `PCStatus` for retry logic.

### Fields
- **StatusCode** (int) - HTTP status of the SDK→Crawlbase request.
- **PCStatus** (int) - Crawlbase verdict on the target (200 = success).
- **OriginalStatus** (int) - HTTP status the target returned to Crawlbase.
- **URL** (string) - Final URL after target-side redirects.
- **RID** (string) - Request ID (set for async/store calls).
- **Body** ([]byte) - Raw page content.
- **Headers** (map[string]string) - All response headers, lowercased.
- **JSON** (interface{}) - Auto-populated when Content-Type is JSON (scraper/format=json calls).

### Usage Example
```go
fmt.Println("StatusCode: ", res.StatusCode)
fmt.Println("PCStatus: ", res.PCStatus)
fmt.Println("OriginalStatus: ", res.OriginalStatus)
fmt.Println("URL: ", res.URL)
fmt.Println("RID: ", res.RID)
fmt.Println("Body length: ", len(res.Body))
fmt.Println("Headers[url]: ", res.Headers["url"])
if res.JSON != nil {
	fmt.Println("JSON: ", res.JSON)
}
```
```

--------------------------------

### Geo-routing Request

Source: https://github.com/crawlbase/crawlbase-go/blob/main/README.md

Specify a country code in the options to route your request through a specific geographic location.

```go
res, err := api.Get(
	"https://www.amazon.com/dp/B08N5WRWNW",
	map[string]string{"country": "DE"},
)
if err != nil {
	log.Fatal(err)
}
```
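
--------------------------------

### Sketch: mapping PCStatus to a retry decision

The Response Struct notes above recommend branching on `PCStatus` for retry logic. The following is a minimal sketch of that idea, not part of the SDK: the `Action` type and the `nextAction` helper are hypothetical names introduced here for illustration, and the code groupings (520/525, 521 to 523) simply mirror the switch statement in the `CrawlingAPI.Get` example. It deliberately avoids importing the SDK so it can run on its own; in real use you would pass `res.PCStatus`.

```go
package main

import "fmt"

// Action is a hypothetical label for what a caller might do next.
// It is not defined by the Crawlbase SDK.
type Action string

const (
	ActionDone    Action = "done"
	ActionUseJS   Action = "retry-with-js-token"
	ActionBackoff Action = "backoff-and-retry"
	ActionGiveUp  Action = "give-up"
)

// nextAction maps a pc_status value to a follow-up action, using the
// same groupings as the CrawlingAPI.Get example in this document.
func nextAction(pcStatus int) Action {
	switch pcStatus {
	case 200:
		return ActionDone
	case 520, 525:
		// Empty body or anti-bot block: a JavaScript token may help.
		return ActionUseJS
	case 521, 522, 523:
		// Target unreachable or timed out.
		return ActionBackoff
	default:
		return ActionGiveUp
	}
}

func main() {
	// Stand-in values; with the SDK this would be res.PCStatus.
	for _, code := range []int{200, 525, 522, 404} {
		fmt.Printf("pc_status=%d -> %s\n", code, nextAction(code))
	}
}
```

Keeping the decision in one small function makes the retry policy easy to unit test and to adjust if your targets behave differently.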