### Install and Run Development Tasks with uv and Poe

Source: https://github.com/apify/apify-client-python/blob/master/CLAUDE.md

Commands for installing development dependencies, running code checks (linting, type-checking, tests), formatting code, and managing docstrings.

```bash
uv run poe install-dev       # Install dev deps + git hooks
uv run poe check-code        # Run all checks (lint, type-check, unit-tests, docstring check)
uv run poe lint              # Ruff format check + ruff check
uv run poe format            # Auto-fix lint issues and format
uv run poe type-check        # Run ty type checker
uv run poe unit-tests        # Run unit tests
uv run poe check-docstrings  # Verify async docstrings match sync
uv run poe fix-docstrings    # Auto-fix async docstrings

# Run a single test
uv run pytest tests/unit/test_file.py
uv run pytest tests/unit/test_file.py::test_name
```

--------------------------------

### Install and Run Development Tasks with uv and Poe

Source: https://github.com/apify/apify-client-python/blob/master/GEMINI.md

Commands for installing development dependencies, running code checks (linting, type-checking, tests), formatting code, and managing docstrings using uv and Poe.

```bash
uv run poe install-dev
uv run poe check-code
uv run poe lint
uv run poe format
uv run poe type-check
uv run poe unit-tests
uv run poe check-docstrings
uv run poe fix-docstrings
```

--------------------------------

### Run Documentation Locally

Source: https://github.com/apify/apify-client-python/blob/master/CONTRIBUTING.md

Starts the local documentation server to preview changes made to the project documentation.

```shell
uv run poe run-docs
```

--------------------------------

### Install Development Dependencies

Source: https://github.com/apify/apify-client-python/blob/master/CONTRIBUTING.md

Installs all necessary dependencies for local development using the project task runner.
```shell
uv run poe install-dev
```

--------------------------------

### Install Apify Client for Python

Source: https://context7.com/apify/apify-client-python/llms.txt

Installs the Apify API Client for Python using pip. This is the first step to integrate Apify's services into your Python applications.

```bash
pip install apify-client
```

--------------------------------

### Python: Manage Key-Value Store with Apify Client

Source: https://context7.com/apify/apify-client-python/llms.txt

Illustrates how to manage key-value storage using the Apify Python client. Covers getting or creating stores, setting records (JSON, text, binary), getting records, checking existence, listing keys, and deleting records.

```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')

# Get or create a key-value store
kvs_collection = client.key_value_stores()
kvs = kvs_collection.get_or_create(name='my-store')
kvs_client = client.key_value_store(kvs.id)

# Set a JSON record
kvs_client.set_record('config', {'api_key': 'xyz', 'timeout': 30})

# Set a text record
kvs_client.set_record('readme', 'This is a text file', content_type='text/plain')

# Set binary data
with open('image.png', 'rb') as f:
    kvs_client.set_record('screenshot', f.read(), content_type='image/png')

# Get a record (automatically parsed based on content type)
record = kvs_client.get_record('config')
if record:
    print(f"Key: {record['key']}")
    print(f"Value: {record['value']}")  # Parsed JSON object
    print(f"Content-Type: {record['content_type']}")

# Check if a key exists
if kvs_client.record_exists('config'):
    print("Config exists!")

# List all keys
keys = kvs_client.list_keys(limit=100)
for key_info in keys.items:
    print(f"Key: {key_info.key}, Size: {key_info.size}")

# Delete a record
kvs_client.delete_record('old-config')
```

--------------------------------

### Create and Execute Apify Tasks

Source:
https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/03_guides/02_manage_tasks_for_reusable_input.mdx

Demonstrates how to initialize an Apify client, create a task with specific input for an Actor, and execute that task. The examples cover both asynchronous and synchronous patterns using the Apify API client for Python.

```python
import asyncio

from apify_client import ApifyClientAsync


async def main():
    client = ApifyClientAsync("MY-APIFY-TOKEN")

    # Create a task (the Actor ID, task name, and input are separate keyword arguments)
    task = await client.tasks().create(
        actor_id="apify/instagram-hashtag-scraper",
        name="my-task",
        task_input={"hashtags": ["apify"]},
    )

    # Run the task
    run = await client.task(task["id"]).call()


asyncio.run(main())
```

```python
from apify_client import ApifyClient

client = ApifyClient("MY-APIFY-TOKEN")

# Create a task (the Actor ID, task name, and input are separate keyword arguments)
task = client.tasks().create(
    actor_id="apify/instagram-hashtag-scraper",
    name="my-task",
    task_input={"hashtags": ["apify"]},
)

# Run the task
run = client.task(task["id"]).call()
```

--------------------------------

### Configure Retry Behavior in ApifyClient

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/05_retries.mdx

This example demonstrates how to initialize the ApifyClient with custom retry settings. It shows both asynchronous and synchronous client configurations for handling network instability.
```python
from apify_client import ApifyClientAsync

# Async client with custom retry settings
client = ApifyClientAsync(
    token="MY_TOKEN",
    max_retries=5,
    min_delay_between_retries_millis=1000
)
```

```python
from apify_client import ApifyClient

# Sync client with custom retry settings
client = ApifyClient(
    token="MY_TOKEN",
    max_retries=5,
    min_delay_between_retries_millis=1000
)
```

--------------------------------

### Retrieve and Paginate Actor Datasets

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/03_guides/03_retrieve_actor_data.mdx

Demonstrates how to fetch dataset items from an Actor run. The examples show both asynchronous and synchronous approaches to iterating through paginated results.

```python
import asyncio

from apify_client import ApifyClientAsync


async def main():
    client = ApifyClientAsync("MY-APIFY-TOKEN")
    # call() starts the Actor and waits for the run to finish
    run = await client.actor("my-actor").call()
    dataset = client.dataset(run["defaultDatasetId"])
    async for item in dataset.iterate_items():
        print(item)


asyncio.run(main())
```

```python
from apify_client import ApifyClient

client = ApifyClient("MY-APIFY-TOKEN")
# call() starts the Actor and waits for the run to finish
run = client.actor("my-actor").call()
dataset = client.dataset(run["defaultDatasetId"])
for item in dataset.iterate_items():
    print(item)
```

--------------------------------

### Error Handling Examples

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/04_error_handling.mdx

Demonstrates how to handle API errors using both asynchronous and synchronous clients.

```APIDOC
## Error Handling with Apify Client Python

### Description
The Apify client for Python automatically processes API responses. Date strings are converted to Python `datetime.datetime` objects. In case of an error, the client raises an `ApifyApiError` exception. This exception encapsulates the raw JSON errors from the API and offers extra information to aid in debugging.

### Usage
This section provides examples for handling errors with both asynchronous and synchronous clients.

#### Async Client Example
```python
# ErrorAsyncExample code goes here
```

#### Sync Client Example
```python
# ErrorSyncExample code goes here
```

### Error Type
- **ApifyApiError** (Exception) - Raised when an API error occurs. It wraps the raw API error response and provides additional context.
```

--------------------------------

### Initialize Asynchronous ApifyClientAsync

Source: https://context7.com/apify/apify-client-python/llms.txt

Shows how to initialize the asynchronous ApifyClientAsync for use with asyncio. It includes an example of calling an Actor and retrieving dataset items asynchronously.

```python
import asyncio
from apify_client import ApifyClientAsync

async def main():
    client = ApifyClientAsync(token='MY-APIFY-TOKEN')

    # All operations return awaitables
    actor_client = client.actor('apify/web-scraper')
    run = await actor_client.call()

    if run:
        dataset_client = client.dataset(run.default_dataset_id)
        items = await dataset_client.list_items()
        print(items.items)

asyncio.run(main())
```

--------------------------------

### Python: Run Actor and Retrieve Results (Async/Sync)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/01_introduction/index.mdx

Demonstrates how to use the Apify Python client to run an Actor and retrieve its results. It includes examples for both asynchronous and synchronous client interfaces, showcasing basic interaction with the Apify API.
```python
import asyncio

from apify_client import ApifyClient, ApifyClientAsync


# Example using the asynchronous client (awaiting requires ApifyClientAsync)
async def run_actor_async():
    client = ApifyClientAsync("YOUR_API_TOKEN")
    run_result = await client.actor("apify/hello-world").call()
    print("Async run result:", run_result)


# Example using the synchronous client
def run_actor_sync():
    client = ApifyClient("YOUR_API_TOKEN")
    run_result = client.actor("apify/hello-world").call()
    print("Sync run result:", run_result)


if __name__ == "__main__":
    # To run the async example:
    # asyncio.run(run_actor_async())

    # To run the sync example:
    run_actor_sync()
```

--------------------------------

### Run Actor and Wait for Results (Synchronous)

Source: https://context7.com/apify/apify-client-python/llms.txt

Demonstrates how to use `ActorClient.call()` to start an Actor, provide input, and wait for its completion. It includes fetching and printing results from the Actor's dataset.

```python
from datetime import timedelta
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')

# Get Actor client and run with input
actor_client = client.actor('apify/instagram-hashtag-scraper')

# Run with input data and wait for completion
run = actor_client.call(
    run_input={'hashtags': ['travel', 'photography'], 'resultsLimit': 100},
    run_timeout=timedelta(minutes=10),  # Max time for the run itself
    memory_mbytes=1024,  # Memory allocation in MB
)

if run:
    print(f"Run finished with status: {run.status}")
    print(f"Dataset ID: {run.default_dataset_id}")

    # Fetch results from the run's dataset
    dataset = client.dataset(run.default_dataset_id)
    results = dataset.list_items()
    for item in results.items:
        print(item)
```

--------------------------------

### Start Actor Without Waiting (Synchronous)

Source: https://context7.com/apify/apify-client-python/llms.txt

Illustrates using `ActorClient.start()` to initiate an Actor run without blocking. It returns a Run object immediately, so the run's progress can be checked later.
```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')
actor_client = client.actor('apify/web-scraper')

# Start the Actor and get run reference immediately
run = actor_client.start(
    run_input={
        'startUrls': [{'url': 'https://example.com'}],
        'maxRequestsPerCrawl': 100,
    },
    memory_mbytes=2048,
)

print(f"Run started with ID: {run.id}")
print(f"Status: {run.status}")

# Later, check the run status
run_client = client.run(run.id)
updated_run = run_client.get()
print(f"Current status: {updated_run.status if updated_run else 'Unknown'}")
```

--------------------------------

### Run Development Tasks with uv and Poe

Source: https://github.com/apify/apify-client-python/blob/master/.rules.md

Common development commands for the Apify Python client, including installation, linting, formatting, type checking, and testing.

```bash
uv run poe install-dev
uv run poe check-code
uv run poe lint
uv run poe format
uv run poe type-check
uv run poe unit-tests
uv run poe check-docstrings
uv run poe fix-docstrings
uv run pytest tests/unit/test_file.py
```

--------------------------------

### Load Dataset Items into Pandas DataFrame

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/03_guides/04_integration_with_data_libraries.mdx

Demonstrates how to retrieve items from an Apify dataset and convert them into a Pandas DataFrame. This example covers both asynchronous and synchronous client implementations.
```python
import asyncio

import pandas as pd
from apify_client import ApifyClientAsync


async def main():
    client = ApifyClientAsync("MY-APIFY-TOKEN")
    dataset_client = client.dataset("MY-DATASET-ID")
    # Await the request first; the records are in the page's `items` attribute
    page = await dataset_client.list_items()
    df = pd.DataFrame(page.items)
    print(df.head())


asyncio.run(main())
```

```python
import pandas as pd
from apify_client import ApifyClient

client = ApifyClient("MY-APIFY-TOKEN")
dataset_client = client.dataset("MY-DATASET-ID")
# `items` is an attribute of the returned page, not a method
items = dataset_client.list_items().items
df = pd.DataFrame(items)
print(df.head())
```

--------------------------------

### Python: Push Items to Dataset with Apify Client

Source: https://context7.com/apify/apify-client-python/llms.txt

Demonstrates how to add one or more items to an Apify dataset using the Python client. Items can be dictionaries or lists of dictionaries. It also shows how to get or create a dataset.

```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')

# Get or create a dataset
datasets_client = client.datasets()
dataset = datasets_client.get_or_create(name='my-results')
dataset_client = client.dataset(dataset.id)

# Push a single item
dataset_client.push_items({'title': 'Product A', 'price': 29.99})

# Push multiple items
dataset_client.push_items([
    {'title': 'Product A', 'price': 29.99},
    {'title': 'Product B', 'price': 49.99},
    {'title': 'Product C', 'price': 19.99},
])
```

--------------------------------

### Fetch Dataset Items with Pagination

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/08_pagination.mdx

Demonstrates how to retrieve all items from a dataset using the Apify client. The examples show both asynchronous and synchronous approaches to handling paginated results.
```python
from apify_client import ApifyClientAsync

async def main():
    client = ApifyClientAsync("MY-TOKEN")
    dataset_client = client.dataset("my-dataset-id")

    # Fetch all items using pagination
    items = []
    async for item in dataset_client.iterate_items():
        items.append(item)

    print(f"Fetched {len(items)} items")
```

```python
from apify_client import ApifyClient

client = ApifyClient("MY-TOKEN")
dataset_client = client.dataset("my-dataset-id")

# Fetch all items using pagination
items = []
for item in dataset_client.iterate_items():
    items.append(item)

print(f"Fetched {len(items)} items")
```

--------------------------------

### Call Actor and Wait for Completion (Python)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/07_convenience_methods.mdx

Demonstrates how to use the `call` method to start an Actor and wait for it to finish, with either the asynchronous or the synchronous client. This method handles network timeouts internally, simplifying the process of running Actors and retrieving their results.

```python
from apify_client import ApifyClient, ApifyClientAsync


# Example for asynchronous client (awaiting requires ApifyClientAsync)
async def call_actor_async():
    client = ApifyClientAsync("YOUR_API_TOKEN")
    run = await client.actor("apify/hello-world").call()
    print(f"Async run finished with status: {run['status']}")


# Example for synchronous client
def call_actor_sync():
    client = ApifyClient("YOUR_API_TOKEN")
    run = client.actor("apify/hello-world").call()
    print(f"Sync run finished with status: {run['status']}")


# To run the async function:
# import asyncio
# asyncio.run(call_actor_async())

# To run the sync function:
# call_actor_sync()
```

--------------------------------

### Managing Webhooks with Apify Python Client

Source: https://context7.com/apify/apify-client-python/llms.txt

Demonstrates how to set up and manage webhooks using the Apify Python client to receive notifications for Actor run events. It includes creating a webhook with specified event types and request URLs, and starting an Actor with an ad-hoc webhook.
```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')

# Create a webhook
webhooks_client = client.webhooks()
webhook = webhooks_client.create(
    event_types=['ACTOR.RUN.SUCCEEDED', 'ACTOR.RUN.FAILED'],
    request_url='https://my-server.com/webhook',
    payload_template='{"runId": {{resource.id}}, "status": "{{resource.status}}"}',
)

# Run Actor with ad-hoc webhook
actor_client = client.actor('apify/web-scraper')
run = actor_client.start(
    run_input={'startUrls': [{'url': 'https://example.com'}]},
    webhooks=[{
        'event_types': ['ACTOR.RUN.SUCCEEDED'],
        'request_url': 'https://my-server.com/run-complete',
    }],
)
```

--------------------------------

### Managing Schedules with Apify Python Client

Source: https://context7.com/apify/apify-client-python/llms.txt

Explains how to create, update, and list schedules for automatically running Actors at specified intervals using the Apify Python client. It includes examples of defining cron expressions and actions for schedules.

```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')

# Create a schedule
schedules_client = client.schedules()
schedule = schedules_client.create(
    name='daily-scrape',
    cron_expression='0 8 * * *',  # Every day at 8 AM
    actions=[{
        'type': 'RUN_ACTOR',
        'actorId': 'apify/web-scraper',
        'runInput': {'startUrls': [{'url': 'https://example.com'}]},
    }],
)
print(f"Schedule created: {schedule.id}")

# Update a schedule
schedule_client = client.schedule(schedule.id)
schedule_client.update(cron_expression='0 */6 * * *')  # Every 6 hours

# List all schedules
for sched in schedules_client.list().items:
    print(f"{sched.name}: {sched.cron_expression}")
```

--------------------------------

### Unit Test Mocking with pytest-httpserver

Source: https://github.com/apify/apify-client-python/blob/master/CLAUDE.md

Example of how to write unit tests using `pytest-httpserver` to mock HTTP responses, enabling tests to run without network access.
```python
from pytest_httpserver import HTTPServer

from apify_client import ApifyClient


def test_example(httpserver: HTTPServer) -> None:
    # Register the mocked endpoint and its JSON response
    httpserver.expect_request('/v2/endpoint').respond_with_json({'data': ...})
    # Point the client at the local mock server instead of the real API
    client = ApifyClient(token='test', api_url=httpserver.url_for('/').removesuffix('/'))
    # assert client behavior
```

--------------------------------

### Python: Export Dataset Items as Bytes with Apify Client

Source: https://context7.com/apify/apify-client-python/llms.txt

Explains how to export dataset items as raw bytes in various formats like CSV, JSON, and XLSX using the Apify Python client. Includes examples for saving to files.

```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')
dataset_client = client.dataset('dataset-id')

# Export as CSV
csv_data = dataset_client.get_items_as_bytes(
    item_format='csv',
    fields=['title', 'url', 'price'],
)
with open('results.csv', 'wb') as f:
    f.write(csv_data)

# Export as JSON
json_data = dataset_client.get_items_as_bytes(item_format='json')

# Export as Excel
xlsx_data = dataset_client.get_items_as_bytes(item_format='xlsx')
with open('results.xlsx', 'wb') as f:
    f.write(xlsx_data)
```

--------------------------------

### Integrating Apify Datasets with Pandas in Python

Source: https://context7.com/apify/apify-client-python/llms.txt

Provides a Python example of loading data from an Apify Actor run directly into a Pandas DataFrame. It covers running an Actor, retrieving dataset items, creating a DataFrame, and performing basic analysis and export.
```python
import pandas as pd
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')

# Run an Actor and get results
actor_client = client.actor('apify/web-scraper')
run = actor_client.call(run_input={
    'startUrls': [{'url': 'https://example.com'}],
})

if run:
    # Get dataset items
    dataset_client = client.dataset(run.default_dataset_id)
    data = dataset_client.list_items()

    # Load into Pandas DataFrame
    df = pd.DataFrame(data.items)

    # Analyze the data
    print(df.head())
    print(df.describe())

    # Filter and process
    filtered = df[df['price'] > 50]

    # Export to various formats
    df.to_csv('results.csv', index=False)
    df.to_excel('results.xlsx', index=False)
```

--------------------------------

### Retrieve Actor Dataset Results (Python)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/01_introduction/quick-start.mdx

Fetches results from an Actor's default dataset using the dataset ID. This example shows how to list items from the dataset using both asynchronous and synchronous clients. The dataset ID is the `defaultDatasetId` field of the Actor run details.

```python
# `client_async`, `client_sync`, `run_async`, and `run_sync` come from the previous steps.
# The default dataset ID is part of the run details; it is not the run ID.
dataset_id = run_async['defaultDatasetId']  # Or run_sync['defaultDatasetId']

# For asynchronous client (inside an async function)
dataset_items_async = await client_async.dataset(dataset_id).list_items()
print("Async dataset items:")
for item in dataset_items_async.items:
    print(item)

# For synchronous client
dataset_items_sync = client_sync.dataset(dataset_id).list_items()
print("Sync dataset items:")
for item in dataset_items_sync.items:
    print(item)
```

--------------------------------

### Custom Python Logging Formatter for Apify Client

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/06_logging.mdx

This example demonstrates how to create and use a custom log formatter for the Apify client's logger in Python.
It allows you to include additional debugging properties provided via the 'extra' argument, such as attempt count, status code, and URL, in your log messages.

```python
import logging


class ApifyClientFormatter(logging.Formatter):
    # Extra attributes that may be attached to a record via the 'extra' argument
    EXTRA_ATTRS = ('attempt', 'status_code', 'url', 'client_method', 'resource_id')

    def __init__(self):
        # Define the default format string
        super().__init__('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

    def format(self, record):
        # Render the record with the default format first
        message = super().format(record)

        # Check for extra attributes and include them if they exist
        extra_attrs = [
            f'{name}={getattr(record, name)}'
            for name in self.EXTRA_ATTRS
            if hasattr(record, name)
        ]
        if extra_attrs:
            return f'{message} - {" ".join(extra_attrs)}'
        return message


# Set up the logger and handler as in the previous example
logger = logging.getLogger('apify_client')
console_handler = logging.StreamHandler()

# Create an instance of the custom formatter
formatter = ApifyClientFormatter()

# Set the custom formatter for the handler
console_handler.setFormatter(formatter)

# Add the handler to the logger if it's not already there
if not logger.handlers:
    logger.addHandler(console_handler)

print("Custom Apify client log formatter applied.")

# Example of logging with extra data (would typically be done by the client library)
# logger.info('API request sent', extra={'attempt': 1, 'status_code': 200, 'url': 'https://api.apify.com/v2/some/endpoint', 'client_method': 'get_run', 'resource_id': 'some_run_id'})
```

--------------------------------

### Wait for Actor Run Completion (Synchronous)

Source: https://context7.com/apify/apify-client-python/llms.txt

Demonstrates using `RunClient.wait_for_finish()` to pause execution until a previously started Actor run
completes. This is useful for managing long-running tasks initiated with `start()`.

```python
from datetime import timedelta
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')

# Start an Actor
actor_client = client.actor('apify/web-scraper')
run = actor_client.start(run_input={'startUrls': [{'url': 'https://example.com'}]})

# Do other work...

# Later, wait for the run to finish
run_client = client.run(run.id)
finished_run = run_client.wait_for_finish(wait_duration=timedelta(minutes=30))

if finished_run:
    print(f"Run completed with status: {finished_run.status}")
```

--------------------------------

### Initialize Synchronous ApifyClient

Source: https://context7.com/apify/apify-client-python/llms.txt

Demonstrates how to initialize the synchronous ApifyClient for interacting with the Apify API. It shows basic initialization with an API token and configuration with custom settings like maximum retries.

```python
from apify_client import ApifyClient

# Initialize with API token
client = ApifyClient(token='MY-APIFY-TOKEN')

# Configure with custom settings
client = ApifyClient(
    token='MY-APIFY-TOKEN',
    max_retries=8,  # Number of retry attempts for failed requests
)
```

--------------------------------

### Run Actor and Provide Input (Python)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/01_introduction/quick-start.mdx

Demonstrates how to run an Actor using its ID and provide input data. The input is passed as a dictionary matching the Actor's input schema. Supports both async and sync clients.

```python
actor_id = "apify/hello-world"
input_data = {
    "message": "Hello from Python!"
}

# For asynchronous client (inside an async function)
run_async = await client_async.actor(actor_id).call(run_input=input_data)
print(f"Async run finished with ID: {run_async['id']}")

# For synchronous client
run_sync = client_sync.actor(actor_id).call(run_input=input_data)
print(f"Sync run finished with ID: {run_sync['id']}")
```

--------------------------------

### Initialize Client and Execute Actor

Source: https://github.com/apify/apify-client-python/blob/master/README.md

Demonstrates how to instantiate the ApifyClient with an API token, trigger an Actor execution, and retrieve the resulting dataset items. This snippet highlights the synchronous workflow for managing Actor lifecycles.

```python
from apify_client import ApifyClient

apify_client = ApifyClient('MY-APIFY-TOKEN')

# Start an Actor and wait for it to finish
actor_call = apify_client.actor('john-doe/my-cool-actor').call()

# Fetch results from the Actor's default dataset
dataset_items = apify_client.dataset(actor_call['defaultDatasetId']).list_items().items
```

--------------------------------

### Asynchronous Actor Execution and Log Streaming

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/01_async_support.mdx

Demonstrates how to initialize the asynchronous client, run an Actor, and stream its logs concurrently.

```APIDOC
## ASYNC /actors/{actorId}/runs

### Description
Initiates an Actor run asynchronously and streams the logs to the console using the ApifyClientAsync client.

### Method
POST

### Endpoint
/v2/acts/{actorId}/runs

### Parameters
#### Path Parameters
- **actorId** (string) - Required - The unique identifier of the Actor to run.

### Request Example
```python
from apify_client import ApifyClientAsync

async def main():
    client = ApifyClientAsync('MY-APIFY-TOKEN')
    run = await client.actor('my-actor-id').call()
    async for log in client.run(run['id']).log().stream():
        print(log)

import asyncio
asyncio.run(main())
```

### Response
#### Success Response (201)
- **id** (string) - The ID of the created Actor run.
- **status** (string) - The initial status of the run (e.g., RUNNING).

#### Response Example
{
  "id": "run-123",
  "status": "RUNNING"
}
```

--------------------------------

### Create, Run, and Update an Actor Task with Python

Source: https://context7.com/apify/apify-client-python/llms.txt

Demonstrates how to create, run, update, and retrieve information about an Actor task using the Apify Python client. It covers task creation with specific configurations, initiating a run, updating task settings like memory, and fetching the last run's details.

```python
from apify_client import ApifyClient

client = ApifyClient(token='YOUR_APIFY_TOKEN')

# Create a task for an Actor
tasks_client = client.tasks()
task = tasks_client.create(
    actor_id='apify/web-scraper',
    name='daily-scrape',
    task_input={
        'body': {
            'startUrls': [{'url': 'https://example.com'}],
            'maxRequestsPerCrawl': 100,
        }
    },
    memory_mbytes=2048,
)

# Run the task
task_client = client.task(task.id)
run = task_client.call()  # Runs with saved configuration

if run:
    print(f"Task run completed: {run.status}")

# Update task settings
task_client.update(
    task_input={'body': {'maxRequestsPerCrawl': 200}},
    memory_mbytes=4096,
)

# Get the last run of a task
last_run_client = task_client.last_run()
last_run = last_run_client.get()
```

--------------------------------

### Run Actor and stream logs asynchronously

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/01_async_support.mdx

Demonstrates how to initialize the ApifyClientAsync, trigger an Actor run, and asynchronously stream the logs to the
console. This implementation uses the await keyword to ensure non-blocking execution during the Actor's lifecycle.

```python
import asyncio
from apify_client import ApifyClientAsync

async def main():
    client = ApifyClientAsync("MY-APIFY-TOKEN")
    actor_run = await client.actor("my-actor-id").call()
    async for log_line in client.run(actor_run["id"]).log().stream():
        print(log_line)

asyncio.run(main())
```

--------------------------------

### Authenticate Apify Client (Python)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/01_introduction/quick-start.mdx

Initializes the client with an API token for authentication, using `ApifyClientAsync` for asynchronous code and `ApifyClient` for synchronous code. Keep your API token secure.

```python
from apify_client import ApifyClient, ApifyClientAsync

# For asynchronous client
client_async = ApifyClientAsync("MY-APIFY-TOKEN")

# For synchronous client
client_sync = ApifyClient("MY-APIFY-TOKEN")
```

--------------------------------

### Execute Linting and Formatting

Source: https://github.com/apify/apify-client-python/blob/master/CONTRIBUTING.md

Runs linting to identify issues and formatting to ensure consistent code style using Ruff.

```shell
uv run poe lint
uv run poe format
```

--------------------------------

### Upload Apify Python Package to PyPI

Source: https://github.com/apify/apify-client-python/blob/master/CONTRIBUTING.md

Uploads the built Apify Python client package to PyPI using the `uv publish` command. This requires a valid API token for authentication and should be used cautiously; prefer the automated release workflows when possible.

```sh
uv publish --token YOUR_API_TOKEN
```

--------------------------------

### Manage Actor Collections with Apify Client

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/02_single_collection_clients.mdx

Demonstrates how to interact with a collection of actors using the Apify client.
It shows both asynchronous and synchronous approaches to listing or creating resources within a collection.

```python
import asyncio

from apify_client import ApifyClientAsync


async def main():
    client = ApifyClientAsync("MY-TOKEN")
    actors = client.actors()
    actor_list = await actors.list()


asyncio.run(main())
```

```python
from apify_client import ApifyClient

client = ApifyClient("MY-TOKEN")
actors = client.actors()
actor_list = actors.list()
```

--------------------------------

### Retrieve User Profile and Account Usage

Source: https://context7.com/apify/apify-client-python/llms.txt

This snippet demonstrates how to initialize the ApifyClient and fetch the authenticated user's profile information, account limits, and current monthly usage statistics. It requires a valid API token to authenticate successfully.

```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')

# Get current user info
user_client = client.user()
user = user_client.get()

if user:
    print(f"Username: {user.username}")
    print(f"Email: {user.email}")

# Get account limits and usage
limits = user_client.limits()
monthly_usage = user_client.monthly_usage()
```

--------------------------------

### Stream Actor Logs Incrementally (Python)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/09_streaming.mdx

Demonstrates how to stream logs from an Actor run incrementally using both asynchronous and synchronous clients. This approach is memory-efficient and allows for real-time processing of log data. Ensure the response is consumed within a 'with' block to manage connections properly.
```python
from apify_client import ApifyClient, ApifyClientAsync


# Example for asynchronous client
async def stream_logs_async():
    client = ApifyClientAsync("YOUR_API_TOKEN")
    async with client.log("ACTOR_RUN_ID").stream() as stream:
        async for line in stream:
            print(line, end='')


# Example for synchronous client
def stream_logs_sync():
    client = ApifyClient("YOUR_API_TOKEN")
    with client.log("ACTOR_RUN_ID").stream() as stream:
        for line in stream:
            print(line, end='')
```

--------------------------------

### Retrieve Actor Data (Async and Sync)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/03_guides/03_retrieve_actor_data.mdx

Demonstrates how to fetch datasets from an Actor's runs, paginate through their items, and merge them into a single dataset for unified analysis using both asynchronous and synchronous Python clients.

```APIDOC
## Retrieve Actor Data

### Description
Actor output data is stored in datasets, which can be retrieved from individual Actor runs. Dataset items support pagination for efficient retrieval, and multiple datasets can be merged into a single dataset for further analysis. This merged dataset can then be exported into various formats such as CSV, JSON, XLSX, or XML. Additionally, integrations provide powerful tools to automate data workflows.

### Method
N/A (Client-side operation)

### Endpoint
N/A (Client-side operation)

### Parameters
N/A

### Request Example
N/A

### Response
#### Success Response (200)
N/A

#### Response Example
N/A

### Code Examples

#### Async Client
```python
import asyncio

from apify_client import ApifyClientAsync


async def main():
    # Initialize the async client with your API token
    client = ApifyClientAsync("YOUR_API_TOKEN")

    # Example: Retrieve dataset items from a specific Actor run
    run_id = "YOUR_RUN_ID"
    dataset_client = client.dataset("YOUR_DATASET_ID")

    # Fetch a page of items; pass offset and limit to paginate
    items = (await dataset_client.list_items()).items

    # Process the retrieved items
    for item in items:
        print(item)

    # Example: Merging datasets (if applicable)
    # merged_dataset_items = (await client.dataset("YOUR_MERGED_DATASET_ID").list_items()).items
    # print(f"Merged dataset items: {merged_dataset_items}")


if __name__ == "__main__":
    asyncio.run(main())
```

#### Sync Client
```python
from apify_client import ApifyClient


def main():
    # Initialize the ApifyClient with your API token
    client = ApifyClient("YOUR_API_TOKEN")

    # Example: Retrieve dataset items from a specific Actor run
    run_id = "YOUR_RUN_ID"
    dataset_client = client.dataset("YOUR_DATASET_ID")

    # Fetch a page of items; pass offset and limit to paginate
    items = dataset_client.list_items().items

    # Process the retrieved items
    for item in items:
        print(item)

    # Example: Merging datasets (if applicable)
    # merged_dataset_items = client.dataset("YOUR_MERGED_DATASET_ID").list_items().items
    # print(f"Merged dataset items: {merged_dataset_items}")


if __name__ == "__main__":
    main()
```
```

--------------------------------

### Passing Input to Actor (Async and Sync Clients)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/03_guides/01_passing_input_to_actor.mdx

Demonstrates how to pass input to an Actor, such as the `apify/instagram-hashtag-scraper`, using both asynchronous and synchronous Python
clients.

```APIDOC
## Passing Input to Actor

### Description
This section details how to pass input data to an Actor using the `call` method, which takes the Actor's input, starts the run, waits for it to finish, and returns the run details.

### Method
`client.actor(actor_id).call()`

### Endpoint
N/A (Client-side method)

### Parameters
#### Request Body (Implicit via `run_input` parameter)
- **run_input** (object) - Required - The input object for the Actor.

### Request Example (Async Client)
```python
import asyncio

from apify_client import ApifyClientAsync


async def main():
    # Initialize the client
    client = ApifyClientAsync("YOUR_API_TOKEN")

    # Actor input
    run_input = {
        "hashtags": ["python", "apify"],
        "resultsLimit": 10,
        "sortBy": "recentPost"
    }

    # Call the Actor and wait for it to finish
    run = await client.actor("apify/instagram-hashtag-scraper").call(run_input=run_input)

    # Print Actor run details
    print(run)


asyncio.run(main())
```

### Request Example (Sync Client)
```python
from apify_client import ApifyClient

# Initialize the client
client = ApifyClient("YOUR_API_TOKEN")

# Actor input
run_input = {
    "hashtags": ["python", "apify"],
    "resultsLimit": 10,
    "sortBy": "recentPost"
}

# Call the Actor and wait for it to finish
run = client.actor("apify/instagram-hashtag-scraper").call(run_input=run_input)

# Print Actor run details
print(run)
```

### Response
#### Success Response (200)
- **status** (string) - The status of the Actor run.
- **id** (string) - The unique ID of the Actor run.
- **defaultDatasetId** (string) - The ID of the default dataset for the run.
- **defaultKeyValueStoreId** (string) - The ID of the default key-value store for the run.
- **defaultLogId** (string) - The ID of the default log for the run.

#### Response Example
```json
{
  "id": "someRunId",
  "status": "SUCCEEDED",
  "defaultDatasetId": "someDatasetId",
  "defaultKeyValueStoreId": "someKeyValueStoreId",
  "defaultLogId": "someLogId"
}
```
```

--------------------------------

### Apify Client Resource Hierarchy

Source: https://github.com/apify/apify-client-python/blob/master/CLAUDE.md

Illustrates the hierarchical structure of resource clients within the ApifyClient, showing how to access specific resources like actors, datasets, and runs.

```text
ApifyClient
├── .actor(id)    → ActorClient (single resource operations)
├── .actors()     → ActorCollectionClient (list/create)
├── .dataset(id)  → DatasetClient
├── .datasets()   → DatasetCollectionClient
├── .run(id)      → RunClient
├── .runs()       → RunCollectionClient
└── ... (schedules, tasks, webhooks, key-value stores, request queues, etc.)
```

--------------------------------

### Run Integration Tests

Source: https://github.com/apify/apify-client-python/blob/master/CONTRIBUTING.md

Executes integration tests that interact with the Apify platform. Requires specific environment variables, such as API tokens, to be set.

```shell
uv run poe integration-tests
```

--------------------------------

### Python: Iterate Over All Dataset Items with Apify Client

Source: https://context7.com/apify/apify-client-python/llms.txt

Shows how to use a generator to iterate over all items in an Apify dataset one by one, automatically handling pagination. This is memory-efficient for large datasets.
```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')
dataset_client = client.dataset('dataset-id')

# Iterate over all items (handles pagination automatically)
for item in dataset_client.iterate_items():
    print(item)

# With filters
for item in dataset_client.iterate_items(
    limit=5000,  # Max items to return
    fields=['name', 'price'],
    skip_empty=True,
):
    print(item)  # Replace with your own processing
```

--------------------------------

### Configuring Retries in ApifyClient

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/05_retries.mdx

Explains how to configure retry logic using the ApifyClient constructor options.

```APIDOC
## Configuring Retries

### Description
The Apify client automatically retries requests that fail due to network errors, internal server errors (5xx), or rate limits (429). The default retry count is 8, using an exponential backoff strategy.

### Constructor Parameters
- **max_retries** (int) - Optional - The maximum number of retry attempts. Defaults to 8.
- **min_delay_between_retries_millis** (int) - Optional - The minimum delay between retries in milliseconds.

### Request Example
```python
from apify_client import ApifyClient

# Initialize client with custom retry settings
client = ApifyClient(
    token='MY_API_TOKEN',
    max_retries=5,
    min_delay_between_retries_millis=1000
)
```
```

--------------------------------

### Pass Input to Apify Actor (Python)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/03_guides/01_passing_input_to_actor.mdx

Demonstrates how to pass input to an Apify Actor using the `call` method. This method takes the Actor's input, starts the run, and waits for completion. It supports both asynchronous and synchronous Python clients.
```python
from apify_client import ApifyClient, ApifyClientAsync


# Example using the asynchronous client
async def run_actor_async():
    client = ApifyClientAsync("YOUR_APIFY_API_TOKEN")
    actor_id = "apify/instagram-hashtag-scraper"
    input_data = {"hashtags": ["example"]}
    run = await client.actor(actor_id).call(run_input=input_data)
    print(f"Actor run finished with status: {run['status']}")


# Example using the synchronous client
def run_actor_sync():
    client = ApifyClient("YOUR_APIFY_API_TOKEN")
    actor_id = "apify/instagram-hashtag-scraper"
    input_data = {"hashtags": ["example"]}
    run = client.actor(actor_id).call(run_input=input_data)
    print(f"Actor run finished with status: {run['status']}")


# To run the async example:
# import asyncio
# asyncio.run(run_actor_async())

# To run the sync example:
# run_actor_sync()
```

--------------------------------

### Build Apify Python Package

Source: https://github.com/apify/apify-client-python/blob/master/CONTRIBUTING.md

Builds the Apify Python client package using the `poe build` task, run through `uv`. The result is a distributable package suitable for uploading to repositories such as PyPI.

```sh
uv run poe build
```

--------------------------------

### Python: Retrieve Dataset Items with Apify Client

Source: https://context7.com/apify/apify-client-python/llms.txt

Demonstrates how to retrieve items from an Apify dataset using the Python client. Supports basic retrieval, pagination, filtering, and field selection. Returns a DatasetItemsPage object containing items and pagination metadata.
```python
from apify_client import ApifyClient

client = ApifyClient(token='MY-APIFY-TOKEN')
dataset_client = client.dataset('dataset-id')

# Basic retrieval
items_page = dataset_client.list_items()
print(f"Total items: {items_page.total}")
print(f"Items in page: {items_page.count}")
for item in items_page.items:
    print(item)

# With pagination and filtering
items_page = dataset_client.list_items(
    offset=0,
    limit=1000,
    clean=True,  # Skip empty items and hidden fields
    fields=['title', 'url', 'price'],  # Only return these fields
    desc=True,  # Reverse order
)

# Paginate through all items
all_items = []
offset = 0
limit = 1000
while True:
    page = dataset_client.list_items(limit=limit, offset=offset)
    all_items.extend(page.items)
    if offset + limit >= page.total:
        break
    offset += limit
print(f"Fetched {len(all_items)} total items")
```

--------------------------------

### Run Code Quality Checks

Source: https://github.com/apify/apify-client-python/blob/master/CONTRIBUTING.md

Executes the full suite of code quality tools, including linting, type checking, and unit tests.

```shell
uv run poe check-code
```

--------------------------------

### Run Type Checking

Source: https://github.com/apify/apify-client-python/blob/master/CONTRIBUTING.md

Verifies the codebase against type annotations using the ty tool.

```shell
uv run poe type-check
```

--------------------------------

### Configure Python Logging for Apify Client

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/06_logging.mdx

This snippet shows how to configure the Python 'apify_client' logger to output debug information. It adds a handler that writes log messages to the console, which is useful for monitoring API requests.
```python
import logging

# Get the apify_client logger
logger = logging.getLogger('apify_client')

# Set the logging level to DEBUG to capture detailed information
logger.setLevel(logging.DEBUG)

# Create a console handler
console_handler = logging.StreamHandler()

# Set the logging level for the handler
console_handler.setLevel(logging.DEBUG)

# Add the handler to the logger (only once)
if not logger.handlers:
    logger.addHandler(console_handler)

print("Apify client logger configured for debug output.")
```

--------------------------------

### Python: Manage Actor Runs with Nested Clients (Async)

Source: https://github.com/apify/apify-client-python/blob/master/website/versioned_docs/version-2.5/02_concepts/03_nested_clients.mdx

This asynchronous Python code snippet shows how to use nested clients to manage Actor runs. It uses the `ActorClient.last_run` method to get a client for the Actor's last run, which in turn gives direct access to that run's Dataset client.

```python
from apify_client import ApifyClientAsync

# Initialize the asynchronous client
client = ApifyClientAsync("YOUR_API_TOKEN")

# Get an actor client
actor_client = client.actor("some-actor-id")

# Get a client for the last run of the actor
last_run = actor_client.last_run()

# Access the dataset client for the last run
dataset_client = last_run.dataset()
```

--------------------------------

### Run Specific Pytest Unit Tests

Source: https://github.com/apify/apify-client-python/blob/master/GEMINI.md

How to execute specific unit tests or test suites using pytest, including running individual test files or specific test functions.

```bash
uv run pytest tests/unit/test_file.py
uv run pytest tests/unit/test_file.py::test_name
```
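--------------------------------

### Illustrative Exponential Backoff Schedule for Retries

A minimal sketch, not taken from the client's source, of how an exponential backoff schedule grows for the `max_retries` and `min_delay_between_retries_millis` settings described in the retries section. The helper `backoff_delays_millis` is hypothetical, and it assumes the delay simply doubles after each failed attempt; the real client may add jitter or cap the delay.

```python
def backoff_delays_millis(max_retries: int, min_delay_millis: int) -> list[int]:
    """Return an illustrative wait time (in ms) before each retry attempt.

    Assumption: the delay starts at min_delay_millis and doubles after
    every failed attempt. This is a hypothetical helper, not library code.
    """
    if max_retries < 0 or min_delay_millis < 0:
        raise ValueError("max_retries and min_delay_millis must be non-negative")
    return [min_delay_millis * 2**attempt for attempt in range(max_retries)]


# Schedule for the custom settings used in the retries example
delays = backoff_delays_millis(max_retries=5, min_delay_millis=1000)
print(delays)  # [1000, 2000, 4000, 8000, 16000]
print(f"Worst-case total wait: {sum(delays) / 1000:.0f} s")  # 31 s
```

Under this assumption, each extra retry roughly doubles the worst-case wait, so a short-lived script may prefer a lower `max_retries` than a long-running job.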