### Setup and Run Basic Scrape

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/examples/basic_scrape/README.md

Sets up a Python virtual environment, installs the SDK, configures the API key, and runs a basic scrape script.

```bash
python -m venv .venv
. .venv/Scripts/activate  # Windows PowerShell: .venv\Scripts\Activate.ps1
pip install scrapi-sdk
# PowerShell syntax; in bash use: export SCRAPI_API_KEY="your_api_key_here_or_blank"
$env:SCRAPI_API_KEY="your_api_key_here_or_blank"
python main.py
```

--------------------------------

### Install ScrAPI SDK

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Install the ScrAPI SDK using pip.

```bash
pip install scrapi-sdk
```

--------------------------------

### Install ScrAPI SDK with HTML Helpers

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Install the ScrAPI SDK with the optional HTML helpers.

```bash
pip install "scrapi-sdk[html]"
```

--------------------------------

### Browser Commands Example

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Chains multiple browser commands to interact with a web page: inputting text, selecting options, waiting, clicking, scrolling, and evaluating JavaScript. This example requires `use_browser=True`.

```python
from scrapi_sdk import ScrapeRequest

request = ScrapeRequest("https://www.roboform.com/filling-test-all-fields")
request.use_browser = True
request.accept_dialogs = True

request.browser_commands \
    .input("input[name='01___title']", "Mr") \
    .input("input[name='02frstname']", "Werner") \
    .input("input[name='04lastname']", "van Deventer") \
    .select("select[name='40cc__type']", "Discover") \
    .wait(3000) \
    .wait_for("input[type='reset']") \
    .click("input[type='reset']") \
    .wait(1000) \
    .scroll(1000) \
    .evaluate("console.log('any valid code...')")
```

--------------------------------

### Set up Python Virtual Environment

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Creates and activates a Python virtual environment, installs `scrapi-sdk` in editable mode with the development and HTML extras, and runs the tests with pytest.

```bash
python -m venv .venv
. .venv/Scripts/activate  # Windows PowerShell: .venv\Scripts\Activate.ps1
pip install -e .[dev,html]
pytest
```

--------------------------------

### Quick Start: Synchronous Scraping

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Performs a synchronous web scrape using `ScrapiClient`. Replace `YOUR_API_KEY` with your actual API key.

```python
from scrapi_sdk import ScrapeRequest, ScrapiClient

with ScrapiClient("YOUR_API_KEY") as client:
    response = client.scrape(ScrapeRequest("https://deventerprise.com"))
    print(response.content if response else "No response")
```

--------------------------------

### Quick Start: Asynchronous Scraping

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Performs an asynchronous web scrape using `AsyncScrapiClient`. Replace `YOUR_API_KEY` with your actual API key.

```python
import asyncio

from scrapi_sdk import AsyncScrapiClient


async def main() -> None:
    async with AsyncScrapiClient("YOUR_API_KEY") as client:
        response = await client.scrape("https://deventerprise.com")
        print(response.content if response else "No response")


asyncio.run(main())
```

--------------------------------

### Use HTML Helper Utilities

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Uses `numbers_only` to extract numbers from text and `html_with_no_script` to clean HTML by removing script tags.

```python
from scrapi_sdk import html_with_no_script, numbers_only

# Extract the numeric part of a price string.
print(numbers_only("USD 1,299.95", include_decimal_points=True))

# Illustrative input: any HTML string containing <script> tags.
print(html_with_no_script("<div><script>alert('x')</script><p>safe</p></div>"))
```

--------------------------------

### Get Supported Countries

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Fetches the list of countries supported by the service. Each country object contains its key, name, and available proxy count.

```python
# `client` is an open ScrapiClient, as in the synchronous Quick Start.
countries = client.get_supported_countries()
for country in countries:
    print(country.key, country.name, country.proxy_count)
```

--------------------------------

### Get Credit Balance

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Retrieves the current credit balance for the account.

```python
balance = client.get_credit_balance()
print(balance)
```

--------------------------------

### Get Supported Cities

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Fetches the list of cities supported for a given country. Each city object includes its key, name, and available proxy count.

```python
cities = client.get_supported_cities("USA")
for city in cities:
    print(city.key, city.name, city.proxy_count)
```

--------------------------------

### Scrape Request Options Example

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Configures a `ScrapeRequest` with various options, including proxy settings, browser mode, and response format. The response format must be `ResponseFormat.JSON` when using this SDK client.

```python
from scrapi_sdk import ProxyType, ResponseFormat, ScrapeRequest

request = ScrapeRequest("https://deventerprise.com")
request.proxy_type = ProxyType.RESIDENTIAL
request.proxy_country = "USA"
request.use_browser = True
request.solve_captchas = True
request.include_screenshot = True
request.response_format = ResponseFormat.JSON
```

--------------------------------

### Build Python Package Locally

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Upgrades pip, build, and twine, builds the package locally, and then checks the build artifacts with twine.

```bash
python -m pip install --upgrade pip build twine
python -m build
python -m twine check dist/*
```

--------------------------------

### Upload Python Package to PyPI

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Uploads the built Python package to the main PyPI repository. Requires the `TWINE_USERNAME` and `TWINE_PASSWORD` environment variables.

```powershell
# PowerShell
$env:TWINE_USERNAME="__token__"
$env:TWINE_PASSWORD="pypi-..."
python -m twine upload dist/*
```

--------------------------------

### Upload Python Package to TestPyPI

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Uploads the built Python package to the TestPyPI repository. Requires the `TWINE_USERNAME` and `TWINE_PASSWORD` environment variables.

```powershell
# PowerShell
$env:TWINE_USERNAME="__token__"
$env:TWINE_PASSWORD="pypi-..."
python -m twine upload -r testpypi dist/*
```

--------------------------------

### Handle Scrapi Exceptions

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Catches and handles `ScrapiException`, which is raised for any client or API error and carries the HTTP status code.

```python
from scrapi_sdk import ScrapeRequest, ScrapiClient, ScrapiException

with ScrapiClient("YOUR_API_KEY") as client:
    try:
        response = client.scrape(ScrapeRequest("https://deventerprise.com"))
    except ScrapiException as ex:
        print(f"Error ({ex.status_code}): {ex}")
        raise
```

--------------------------------

### Configure Scrape Request Defaults

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

`ScrapeRequestDefaults` sets default options, such as proxy type, browser usage, and captcha solving, that apply to every new `ScrapeRequest`. Individual requests can still override them explicitly.

```python
from scrapi_sdk import ProxyType, ScrapeRequest, ScrapeRequestDefaults

ScrapeRequestDefaults.proxy_type = ProxyType.RESIDENTIAL
ScrapeRequestDefaults.use_browser = True
ScrapeRequestDefaults.solve_captchas = True
ScrapeRequestDefaults.headers["Sample"] = "Custom-Value"

request = ScrapeRequest("https://deventerprise.com")
request.proxy_type = ProxyType.TOR  # explicit override

assert request.proxy_type == ProxyType.TOR
assert request.use_browser is True
assert request.solve_captchas is True
assert request.headers["Sample"] == "Custom-Value"
```

--------------------------------

### HTML Helper Utilities

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

Provides optional utility functions for parsing and manipulating HTML content, such as extracting numbers or removing script tags.

Install the optional dependency first:

```bash
pip install "scrapi-sdk[html]"
```

Helpers exported from `scrapi_sdk`:

- `numbers_only(text, include_decimal_points=False, trim=True)`
- `html_with_no_script(html)`
- `next_element(node)`
- `is_visible(node, check_parent_nodes=True)`

--------------------------------

### Scrape Response Data

Source: https://github.com/deventerprisesoftware/scrapi-sdk-python/blob/master/README.md

`ScrapeResponse` includes all API response details: URLs, timing, status, content, headers, cookies, solved captchas, and error messages.

```python
response = client.scrape("https://deventerprise.com")

if response:
    print(response.request_url)
    print(response.response_url)
    print(response.duration)
    print(response.attempts)
    print(response.credits_used)
    print(response.status_code)
    print(response.screenshot_url)
    print(response.pdf_url)
    print(response.video_url)
    print(response.content)
    print(response.content_hash)  # SHA1 of UTF-16LE content to match .NET SDK parity.

    for captcha_name, solved_count in response.captchas_solved.items():
        print(f"{captcha_name}: {solved_count}")

    for key, value in response.headers.items():
        print(f"{key}: {value}")

    for key, value in response.cookies.items():
        print(f"{key}: {value}")

    for message in response.error_messages or []:
        print(message)
```

If `beautifulsoup4` is installed, `response.html` returns a parsed `BeautifulSoup` object.
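
The `content_hash` comment in the Scrape Response Data example describes the value as a SHA1 of the UTF-16LE encoded content. The following is a minimal sketch of recomputing such a digest locally with the Python standard library. It assumes the hash is represented as a hex string, which the source does not confirm, and `utf16le_sha1` and `content_matches_hash` are hypothetical helper names, not part of the SDK.

```python
import hashlib


def utf16le_sha1(content: str) -> str:
    """Return the SHA-1 hex digest of `content` encoded as UTF-16LE (no BOM)."""
    return hashlib.sha1(content.encode("utf-16-le")).hexdigest()


def content_matches_hash(content: str, expected_hash: str) -> bool:
    """Compare case-insensitively; assumes `expected_hash` is hex-encoded."""
    return utf16le_sha1(content) == expected_hash.strip().lower()


if __name__ == "__main__":
    # Stand-in for response.content; no network call is made here.
    digest = utf16le_sha1("<html>example</html>")
    print(digest)
    print(content_matches_hash("<html>example</html>", digest.upper()))
```

If the SDK reports the hash in another representation (for example Base64), only the final encoding step of this sketch would need to change.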