### Perform GET Requests with Crawlbase API

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Examples of making GET requests using the Crawlbase API, including basic usage and passing additional options like user agent and format. Includes error handling.

```Ruby
begin
  response = api.get('https://www.facebook.com/britneyspears')
  puts response.status_code
  puts response.original_status
  puts response.pc_status
  puts response.body
rescue => exception
  puts exception.backtrace
end
```

```Ruby
options = {
  user_agent: 'Mozilla/5.0 (Windows NT 6.2; rv:20.0) Gecko/20121202 Firefox/30.0',
  format: 'json'
}

response = api.get('https://www.reddit.com/r/pics/comments/5bx4bx/thanks_obama/', options)
puts response.status_code
puts response.body # read the API json response
```

--------------------------------

### Install Crawlbase Ruby Gem

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Instructions for adding the Crawlbase gem to your Ruby application's Gemfile and installing it using Bundler or directly via `gem install`.

```Ruby
gem 'crawlbase'
```

```Bash
bundle
```

```Bash
gem install crawlbase
```

--------------------------------

### Crawlbase Leads API Usage

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Initialize and use the Crawlbase Leads API to retrieve lead information for domains. Includes initialization and an example GET request with error handling.

```Ruby
leads_api = Crawlbase::LeadsAPI.new(token: 'YOUR_TOKEN')
```

```APIDOC
Crawlbase::LeadsAPI.get(domain: string)
  - Makes a GET request to retrieve lead information for a specified domain.
  - Parameters:
    - domain: The domain for which to retrieve lead information (string).
  - Returns:
    - A response object with properties like `success`, `remaining_requests`, `status_code`, and `body`.
```

```Ruby
begin
  response = leads_api.get('stripe.com')
  puts response.success
  puts response.remaining_requests
  puts response.status_code
  puts response.body
rescue => exception
  puts exception.backtrace
end
```

--------------------------------

### Crawlbase Scraper API Usage

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Initialize and use the Crawlbase Scraper API for specialized scraping tasks. Includes initialization and an example GET request with error handling.

```Ruby
scraper_api = Crawlbase::ScraperAPI.new(token: 'YOUR_TOKEN')
```

```APIDOC
Crawlbase::ScraperAPI.get(url: string, options: Hash)
  - Makes a GET request using the Scraper API.
  - Parameters:
    - url: The URL to scrape (string).
    - options: A hash of additional parameters supported by the Scraper API (Hash).
  - Returns:
    - A response object with properties like `remaining_requests`, `status_code`, and `body`.
```

```Ruby
begin
  response = scraper_api.get('https://www.amazon.com/Halo-SleepSack-Swaddle-Triangle-Neutral/dp/B01LAG1TOS')
  puts response.remaining_requests
  puts response.status_code
  puts response.body
rescue => exception
  puts exception.backtrace
end
```

--------------------------------

### Crawlbase Crawling API Methods (GET/POST)

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Documentation for the primary Crawlbase API methods for making GET and POST requests. Includes method signatures, parameters, and expected return values.

```APIDOC
Crawlbase::API.get(url: string, options: Hash)
  - Makes a GET request to the specified URL using the Crawlbase API.
  - Parameters:
    - url: The URL to scrape (string).
    - options: A hash of additional parameters supported by the Crawlbase API (Hash).
  - Returns:
    - A response object with properties like `status_code`, `original_status`, `pc_status`, and `body`.

Crawlbase::API.post(url: string, data: string|Hash, options: Hash)
  - Makes a POST request to the specified URL using the Crawlbase API.
  - Parameters:
    - url: The URL to send the POST request to (string).
    - data: The data to send in the POST request body. Can be a JSON hash or a string.
    - options: A hash of additional parameters supported by the Crawlbase API, e.g., `post_content_type: 'json'` (Hash).
  - Returns:
    - A response object with properties like `status_code`, `original_status`, `pc_status`, and `body`.
```

--------------------------------

### Crawlbase JavaScript Requests

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

How to use the Crawlbase API for scraping JavaScript-rendered websites. Requires a JavaScript token and only supports GET requests. Includes examples with and without additional options.

```Ruby
api = Crawlbase::API.new(token: 'YOUR_JAVASCRIPT_TOKEN')
```

```Ruby
response = api.get('https://www.nfl.com')
puts response.status_code
puts response.body
```

```Ruby
# Options are passed as the second positional hash, as in the GET examples.
response = api.get('https://www.freelancer.com', { page_wait: 5000 })
puts response.status_code
```

--------------------------------

### Perform POST Requests with Crawlbase API

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Examples of making POST requests using the Crawlbase API, demonstrating how to send data and specify content types like JSON.

```Ruby
api.post('https://producthunt.com/search', { text: 'example search' })
```

```Ruby
response = api.post('https://httpbin.org/post', { some_json: 'with some value' }, { post_content_type: 'json' })
puts response.status_code
puts response.body
```

--------------------------------

### Retrieve and Manage Stored Data using Crawlbase Storage API in Ruby

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Provides examples for retrieving, deleting, and querying stored data using the Crawlbase Storage API. It covers fetching data by URL or RID, deleting specific items, performing bulk operations, listing RIDs, and getting the total count of stored documents.
Error handling is included for robust operations.

```Ruby
begin
  response = storage_api.get('https://www.apple.com')
  puts response.original_status
  puts response.pc_status
  puts response.url
  puts response.status_code
  puts response.rid
  puts response.body
  puts response.stored_at
rescue => exception
  puts exception.backtrace
end
```

```Ruby
begin
  response = storage_api.get(RID)
  puts response.original_status
  puts response.pc_status
  puts response.url
  puts response.status_code
  puts response.rid
  puts response.body
  puts response.stored_at
rescue => exception
  puts exception.backtrace
end
```

```Ruby
if storage_api.delete(RID)
  puts 'delete success'
else
  puts "Unable to delete: #{storage_api.body['error']}"
end
```

```Ruby
begin
  response = storage_api.bulk([RID1, RID2, RID3, ...])
  puts response.original_status
  puts response.pc_status
  puts response.url
  puts response.status_code
  puts response.rid
  puts response.body
  puts response.stored_at
rescue => exception
  puts exception.backtrace
end
```

```Ruby
begin
  response = storage_api.rids
  puts response.status_code
  puts response.rid
  puts response.body
rescue => exception
  puts exception.backtrace
end
```

```Ruby
storage_api.rids(100)
```

```Ruby
total_count = storage_api.total_count
puts "total_count: #{total_count}"
```

--------------------------------

### Retrieve Screenshots with Crawlbase Screenshots API in Ruby

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Demonstrates various ways to retrieve screenshots using the Crawlbase Screenshots API's `get` method. It covers basic retrieval, processing the image data using a block, and saving the screenshot directly to a specified file path. Error handling is included for robust implementation.
```Ruby
screenshots_api = Crawlbase::ScreenshotsAPI.new(token: 'YOUR_TOKEN')

begin
  response = screenshots_api.get('https://www.apple.com')
  puts response.success
  puts response.remaining_requests
  puts response.status_code
  puts response.screenshot_path # do something with screenshot_path here
rescue => exception
  puts exception.backtrace
end
```

```Ruby
screenshots_api = Crawlbase::ScreenshotsAPI.new(token: 'YOUR_TOKEN')

begin
  response = screenshots_api.get('https://www.apple.com') do |file|
    # do something (reading/writing) with the image file here
  end
  puts response.success
  puts response.remaining_requests
  puts response.status_code
rescue => exception
  puts exception.backtrace
end
```

```Ruby
screenshots_api = Crawlbase::ScreenshotsAPI.new(token: 'YOUR_TOKEN')

begin
  response = screenshots_api.get('https://www.apple.com', save_to_path: '~/screenshot.jpg') do |file|
    # do something (reading/writing) with the image file here
  end
  puts response.success
  puts response.remaining_requests
  puts response.status_code
rescue => exception
  puts exception.backtrace
end
```

--------------------------------

### Initialize Crawlbase API Client

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Demonstrates how to require the Crawlbase gem and initialize the API client with your token. It also shows how to configure a custom timeout for API requests.

```Ruby
require 'crawlbase'
```

```Ruby
api = Crawlbase::API.new(token: 'YOUR_TOKEN')
```

```Ruby
api = Crawlbase::API.new(token: 'YOUR_TOKEN', timeout: 120)
```

--------------------------------

### Initialize Crawlbase Screenshots API Client in Ruby

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Initializes a new instance of the Crawlbase Screenshots API client using a provided API token. This client is essential for interacting with the Crawlbase Screenshots service to capture web page screenshots.
```Ruby
screenshots_api = Crawlbase::ScreenshotsAPI.new(token: 'YOUR_TOKEN')
```

--------------------------------

### Initialize Crawlbase Storage API Client in Ruby

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Initializes a new instance of the Crawlbase Storage API client using a private API token. This client is used to interact with the Crawlbase Storage service for retrieving and managing stored data.

```Ruby
storage_api = Crawlbase::StorageAPI.new(token: 'YOUR_TOKEN')
```

--------------------------------

### Access Original and Crawlbase Status Codes

Source: https://github.com/crawlbase/crawlbase-ruby/blob/main/README.md

Demonstrates how to retrieve the original HTTP status code and the Crawlbase processing status code from the API response object.

```Ruby
response = api.get('https://sfbay.craigslist.org/')
puts response.original_status
puts response.pc_status
```
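The two status readers can be combined into a single check. The following is a minimal sketch, not part of the crawlbase gem: `crawl_outcome` and `FakeResponse` are hypothetical names introduced here, and the sketch assumes that a `pc_status` of 200 means Crawlbase processed the request and that an `original_status` of 200 means the target site answered successfully. A `Struct` stands in for the real response so the example runs without the gem or a token.

```Ruby
# Stand-in for the gem's response object (hypothetical); it models only the
# two readers used below so the sketch runs without network access.
FakeResponse = Struct.new(:original_status, :pc_status)

# Hypothetical helper, not provided by the crawlbase gem.
# Assumption: 200 signals success for both status codes.
def crawl_outcome(response)
  # to_i tolerates the values arriving as either integers or strings.
  original = response.original_status.to_i
  pc = response.pc_status.to_i

  if pc != 200
    :crawlbase_failed # Crawlbase did not report a successful crawl
  elsif original != 200
    :target_failed    # crawl was processed, but the target site did not return 200
  else
    :ok
  end
end

puts crawl_outcome(FakeResponse.new(200, 200))
puts crawl_outcome(FakeResponse.new(404, 200))
puts crawl_outcome(FakeResponse.new(0, 520))
```

With a real client, the object returned by `api.get` could be passed to the helper in place of `FakeResponse`, since only `original_status` and `pc_status` are read.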