### Install Development Dependencies

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Install the library with development and testing dependencies. This includes setting up pre-commit hooks for code quality.

```bash
pip install mediacloud[dev]
pip install mediacloud[test]
flit install
pre-commit install
```

--------------------------------

### Install MediaCloud Python API Client

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Install the mediacloud package using pip. This is the first step to using the client library.

```bash
pip install mediacloud
```

--------------------------------

### Offset-based Pagination Example

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Example demonstrating offset-based pagination for directory APIs.

```python
offset = 0
page = directory.source_list(..., limit=100, offset=offset)
while page['next']:
    offset += 100
    page = directory.source_list(..., limit=100, offset=offset)
```

--------------------------------

### Token-based Pagination Example

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Example demonstrating token-based pagination for story lists.

```python
stories, next_token = search.story_list(..., pagination_token=None)
while next_token:
    stories, next_token = search.story_list(..., pagination_token=next_token)
```

--------------------------------

### MediaCloud DirectoryApi - Collection and Source Examples

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Demonstrates how to retrieve information about collections and sources using the DirectoryApi. Requires API key initialization.
```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('your-api-key')

# Get collections
collection = directory.collection(34412234)
all_collections = directory.collection_list(platform='online_news')

# Get sources
source = directory.source(12345)
sources = directory.source_list(collection_id=34412234)

# Get feeds
feeds = directory.feed_list(source_id=12345)
```

--------------------------------

### Create Collection and Add Sources

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_management_api.md

Example of creating a new collection with specified name, notes, and public status, and then potentially adding sources to it.

```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')

# Create collection
collection = mgmt.collection_create(
    name='Technology News',
    notes='Collection of tech news sources',
    public=True
)
```

--------------------------------

### MediaCloud SearchApi - Story Count and List Examples

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Shows how to get the total count of stories matching a query and retrieve a list of stories using the SearchApi. Requires API key initialization.

```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

# Get total count
count = search.story_count(
    'query',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234]
)

# Get stories
stories, token = search.story_list(
    'query',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234]
)
```

--------------------------------

### Get MediaCloud API Client Version

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Shows how to retrieve the installed version of the MediaCloud API client library. This is useful for debugging and ensuring compatibility.
```python
import importlib.metadata

try:
    VERSION = "v" + importlib.metadata.version('mediacloud')
except importlib.metadata.PackageNotFoundError:
    VERSION = "dev"
```

```python
import mediacloud.api

user_agent = mediacloud.api.USER_AGENT_STRING
print(user_agent)  # e.g., "mediacloud v5.1.0"
```

--------------------------------

### MediaCloud SearchApi - Analytics Examples

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Illustrates fetching various analytics data like daily counts, weekly source breakdowns, and language distributions using the SearchApi. Requires API key initialization.

```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

# Get analytics
daily = search.story_count_over_time(...)
weekly = search.stories_by_source_week(...)
sources = search.sources(...)
languages = search.languages(...)
```

--------------------------------

### Elasticsearch Query Syntax Examples

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Demonstrates various ways to construct search queries using Elasticsearch syntax for advanced filtering and targeted searches within the MediaCloud platform.

```text
# Simple text search
climate change
```

```text
# Boolean operators
(climate OR global warming) AND policy
```

```text
# Phrase search
"artificial intelligence" AND regulation
```

```text
# Field-specific search
title:"climate change" AND author:Smith
```

```text
# Exclude terms
climate NOT hoax
```

--------------------------------

### Internal Query Method Examples

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/base_api.md

Demonstrates making direct HTTP requests to the MediaCloud API using the internal _query method. This method supports GET, POST, and DELETE operations with specified endpoints and parameters.
```python
import mediacloud.api

api = mediacloud.api.BaseApi('your-api-key')

# GET request
response = api._query('version')

# POST request with JSON body
response = api._query(
    'sources/collections/',
    params={'name': 'My Collection'},
    method='POST'
)

# DELETE request
response = api._query(
    'sources/collections/123/',
    method='DELETE'
)
```

--------------------------------

### Browse Directory Content

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Examples for retrieving collections, sources, and feeds using the DirectoryApi. Includes methods for fetching single items and paginated lists.

```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('key')

# Collections
collection = directory.collection(34412234)
collections = directory.collection_list(platform='online_news', limit=100, offset=0)

# Sources
source = directory.source(12345)
sources = directory.source_list(collection_id=34412234, limit=100, offset=0)

# Feeds
feeds = directory.feed_list(source_id=12345, limit=100, offset=0)
```

--------------------------------

### Initialize Multiple API Clients

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Shows how to initialize the SearchApi, DirectoryApi, and DirectoryManagementApi clients using a single API key. This is a common setup for applications interacting with various parts of the MediaCloud API.

```python
import mediacloud.api
import mediacloud.mgmt

api_key = 'your-api-key'

# Search API
search = mediacloud.api.SearchApi(api_key)

# Directory browsing
directory = mediacloud.api.DirectoryApi(api_key)

# Directory management (admin only)
mgmt = mediacloud.mgmt.DirectoryManagementApi(api_key)
```

--------------------------------

### Handling MCException

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/errors.md

Example of catching MCException, typically raised during initialization or configuration errors.
It prints the error message and status code.

```python
import mediacloud.api
import mediacloud.error

try:
    mc = mediacloud.api.SearchApi(None)  # No API key
except mediacloud.error.MCException as e:
    print(f"Error: {e.message}")
    print(f"Status: {e.status_code}")
```

--------------------------------

### Perform Story Searches and Get Counts

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Demonstrates common search operations including getting the total story count, listing stories with pagination, and retrieving time-series analytics like daily counts and weekly source breakdowns.

```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('key')

# Example date range (the same one used elsewhere in this document)
start_date = datetime.date(2023, 1, 1)
end_date = datetime.date(2023, 12, 31)

# Get count
count = search.story_count('query', start_date, end_date, collection_ids=[...])

# Get stories
stories, next_token = search.story_list('query', start_date, end_date, collection_ids=[...])

# Get analytics
daily = search.story_count_over_time('query', start_date, end_date, ...)
weekly = search.stories_by_source_week('query', start_date, end_date, ...)
sources = search.sources('query', start_date, end_date, ...)
languages = search.languages('query', start_date, end_date, ...)
words = search.words('query', start_date, end_date, ...)

# Get single story
story = search.story('story-id')

# Sample
sample = search.story_sample('query', start_date, end_date, ...)
```

--------------------------------

### Get a Sample of Stories

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Retrieves a non-paginated sample of stories that match a query. Use this when you need a small, random set of results without managing pagination.
```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

# Get a sample of stories
sample = search.story_sample(
    'vaccination',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234],
    limit=50
)

for story in sample:
    print(f"{story['media_name']}: {story['title']}")
```

--------------------------------

### Search Story Count with Default Provider

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Example of searching for story counts using the default provider. This is useful for general content analysis.

```python
import datetime

# mc_search is an already-initialized mediacloud.api.SearchApi instance
results = mc_search.story_count(
    'climate',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234]
)
```

--------------------------------

### Customize API Client Class-Level Configuration

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Example of creating a custom API client class to set configuration options like timeout and rate limits at the class level. Use this for defining default configurations for multiple instances.

```python
import mediacloud.api

class CustomSearchApi(mediacloud.api.SearchApi):
    TIMEOUT_SECS = 120
    RATE_LIMIT_PER_MINUTE = 5
    BASE_API_URL = "http://localhost:8000/api/"

api = CustomSearchApi('your-api-key')
```

--------------------------------

### Get Sources in a Collection

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Retrieves all sources within a specified collection, handling pagination. Fetches sources in batches of 100.
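
The termination logic of such a loop can be checked without network access. The following is a minimal sketch and not part of the library: `FakeDirectory` is a hypothetical stand-in that returns pages shaped like the `results`/`next` dictionaries used in this document, and `iter_sources` is an illustrative helper name.

```python
from typing import Any, Dict, Iterator, List, Optional


class FakeDirectory:
    """Hypothetical stand-in for DirectoryApi; serves pages from a local list."""

    def __init__(self, items: List[Dict[str, Any]]) -> None:
        self._items = items

    def source_list(self, collection_id: Optional[int] = None,
                    limit: int = 100, offset: int = 0) -> Dict[str, Any]:
        chunk = self._items[offset:offset + limit]
        has_more = offset + limit < len(self._items)
        # 'next' is None on the last page, matching the page shape assumed here
        return {"results": chunk, "next": "more" if has_more else None}


def iter_sources(directory: Any, collection_id: int,
                 limit: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every source in a collection, one page at a time."""
    offset = 0
    while True:
        page = directory.source_list(
            collection_id=collection_id, limit=limit, offset=offset
        )
        yield from page["results"]
        # Stop on the last page, or on an empty page to avoid looping forever
        if page["next"] is None or not page["results"]:
            break
        # Advance by what was actually returned, not by the requested limit
        offset += len(page["results"])


fake = FakeDirectory([{"id": i} for i in range(250)])
all_sources = list(iter_sources(fake, collection_id=34412234))
print(len(all_sources))  # prints 250
```

Written directly against the real client, the same loop looks like this:
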
```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('your-api-key')

all_sources = []
offset = 0
while True:
    page = directory.source_list(
        collection_id=34412234,
        limit=100,
        offset=offset
    )
    all_sources.extend(page['results'])
    if page['next'] is None:
        break
    offset += len(page['results'])

print(f"Collection contains {len(all_sources)} sources")
```

--------------------------------

### Customizing Rate Limits for Search API

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/base_api.md

Demonstrates how to subclass SearchApi to increase the rate limit per minute. This is useful for authorized users who require higher throughput. The example shows setting RATE_LIMIT_PER_MINUTE to 10 and instantiating the custom API with an API key.

```python
import mediacloud.api

# Increase rate limit for authorized users
class HighRateLimitSearchApi(mediacloud.api.SearchApi):
    RATE_LIMIT_PER_MINUTE = 10

api = HighRateLimitSearchApi('your-api-key')
```

--------------------------------

### Search Story Count with Explicit Provider Override

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Example of overriding the default provider to specifically query a different platform, such as YouTube. Use this when targeting content from a specific media source.

```python
import datetime

# mc_search is an already-initialized mediacloud.api.SearchApi instance
results = mc_search.story_count(
    'climate',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234],
    platform='youtube'
)
```

--------------------------------

### Get API Version Information

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/base_api.md

Fetch version details of the MediaCloud API server, including its Git revision, current timestamp, and version string. Requires an initialized BaseApi instance.
```python
import mediacloud.api

api = mediacloud.api.BaseApi('your-api-key')

version_info = api.version()
print(f"API Version: {version_info['version']}")
print(f"Git Revision: {version_info['GIT_REV']}")
print(f"Server Time: {version_info['now']}")
```

--------------------------------

### Get Weekly Source Attention

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Retrieve weekly attention metrics for sources based on a search query and date range. Useful for analyzing trends over time.

```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

weekly_attention = search.stories_by_source_week(
    'artificial intelligence',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234]
)

for item in weekly_attention[:5]:
    print(f"{item['media_name']} (week {item['week']}): {item['matching_stories']} stories")
```

--------------------------------

### Initialize API Client

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Shows how to initialize API client classes with an authentication token.

```python
import mediacloud.api

# From https://search.mediacloud.org/
api_key = 'your-api-key'

# Initialize any API class
search = mediacloud.api.SearchApi(api_key)
directory = mediacloud.api.DirectoryApi(api_key)
```

--------------------------------

### MediaCloud DirectoryManagementApi - Create Collection and Source

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Shows how to create a new collection and a new source using the DirectoryManagementApi. This requires admin permissions and an API key.
```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')

# Create collection
collection = mgmt.collection_create(name='My Collection', public=True)

# Create source
source = mgmt.source_create(
    name='News Source',
    homepage='https://example.com'
)
```

--------------------------------

### Initialize SearchApi

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Basic initialization of the SearchApi class requires an API key obtained from the MediaCloud website.

```python
import mediacloud.api

# Basic initialization
api_key = 'your-api-key-from-mediacloud'
mc_search = mediacloud.api.SearchApi(api_key)
```

--------------------------------

### List All Feeds

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_api.md

Fetches a list of all available RSS/content feeds. Useful for getting an overview of available feeds or for initial data exploration.

```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('your-api-key')

# List all feeds
all_feeds = directory.feed_list(limit=100)
print(f"Total feeds: {all_feeds['count']}")
```

--------------------------------

### Initialize DirectoryManagementApi

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_management_api.md

Instantiate the DirectoryManagementApi with your MediaCloud API key. Ensure the key has management permissions.

```python
import mediacloud.mgmt

api_key = 'your-admin-api-key-from-mediacloud'
mgmt = mediacloud.mgmt.DirectoryManagementApi(api_key)
```

--------------------------------

### Create, Update, and Delete Sources

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Illustrates how to manage sources using the DirectoryManagementApi. This includes creating new sources, updating existing ones, and deleting them.
```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('key')

# Sources
source = mgmt.source_create(name='Source', homepage='https://...')
mgmt.source_update(source_id=123, name='Updated')
mgmt.source_delete(123)
```

--------------------------------

### Basic MediaCloud API Search Usage

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Demonstrates initializing the SearchApi and performing a story search with date and collection filters. It also shows how to paginate through results.

```python
import mediacloud.api
import datetime

# Initialize with your API key
api_key = 'your-api-key-from-mediacloud'
search = mediacloud.api.SearchApi(api_key)

# Search for stories
stories, next_token = search.story_list(
    'climate change',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234],
    page_size=50
)

# Process results
for story in stories:
    print(f"{story['media_name']}: {story['title']}")

# Page through results
while next_token:
    stories, next_token = search.story_list(
        'climate change',
        start_date=datetime.date(2023, 1, 1),
        end_date=datetime.date(2023, 12, 31),
        collection_ids=[34412234],
        pagination_token=next_token,
        page_size=50
    )
    for story in stories:
        print(f"{story['media_name']}: {story['title']}")
```

--------------------------------

### Get Attention Over Interval

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Retrieve attention metrics for sources over a specified time interval (e.g., month, day). Useful for detailed trend analysis.
```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

interval_attention = search.stories_by_source_over_interval(
    'pandemic',
    start_date=datetime.date(2020, 1, 1),
    end_date=datetime.date(2020, 12, 31),
    collection_ids=[34412234],
    interval='month'
)

for item in interval_attention:
    print(f"{item['media_name']} ({item['bucket']}): {item['matching_stories']} stories")
```

--------------------------------

### Initialize Search and Directory APIs

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Initialize the SearchApi and DirectoryApi classes with your API key. These are used for searching stories and browsing directory content respectively.

```python
import mediacloud.api

# Initialize search API
search = mediacloud.api.SearchApi('your-api-key')

# Initialize directory API
directory = mediacloud.api.DirectoryApi('your-api-key')
```

--------------------------------

### Initialize DirectoryApi

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Initialize the DirectoryApi class with your MediaCloud API key.

```python
import mediacloud.api

api_key = 'your-api-key-from-mediacloud'
mc_directory = mediacloud.api.DirectoryApi(api_key)
```

--------------------------------

### Get User Profile

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/base_api.md

Retrieve basic information about the currently authenticated user, including their roles and email. Ensure you have initialized BaseApi with a valid API key.
```python
import mediacloud.api

api = mediacloud.api.BaseApi('your-api-key')

profile = api.user_profile()
print(f"User roles: {profile.get('roles')}")
print(f"Email: {profile.get('email')}")
```

--------------------------------

### Get All Sources in a Specific Collection

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_api.md

Retrieve all sources belonging to a particular collection by specifying the collection ID and using pagination. This is useful for analyzing the content of a single collection.

```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('your-api-key')

collection_id = 34412118
sources = []
offset = 0
while True:
    response = directory.source_list(
        collection_id=collection_id,
        limit=100,
        offset=offset
    )
    sources += response['results']
    if response['next'] is None:
        break
    offset += len(response['results'])

print(f"Collection has {len(sources)} sources")
```

--------------------------------

### Use Type Hints for Story and Collection

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Illustrates how to use type hints for Story and Collection objects from the mediacloud.types module. These are not instantiated directly but used for static analysis.

```python
from mediacloud.types import Story, Collection

# These are type hints - not instantiated directly
def process_story(story: Story) -> None:
    print(story['title'])
```

--------------------------------

### Create a New Source

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_management_api.md

Creates a new news source with specified details. Requires 'name' and 'homepage' parameters. Other optional parameters like 'label', 'media_type', and 'primary_language' can also be provided.
```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')

# Create a new source
source = mgmt.source_create(
    name='Tech News Daily',
    homepage='https://technewsdaily.example.com',
    label='TND',
    media_type='newspaper',
    primary_language='en',
    pub_country='US'
)
print(f"Created source ID: {source['id']}")
```

--------------------------------

### Analyze Language Distribution in Articles

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Use the `languages` method to get a breakdown of languages present in stories matching your search criteria. This helps in understanding the linguistic diversity of coverage.

```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

languages = search.languages(
    'climate',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234],
    limit=10
)

for lang in languages:
    print(f"{lang['language']}: {lang['ratio']:.2%}")
```

--------------------------------

### SearchApi Constructor

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Initializes the SearchApi with an authentication token. This is required to authenticate requests to the MediaCloud API.

```APIDOC
## SearchApi Constructor

### Description
Initialize SearchApi with authentication token.

### Parameters
- **auth_token** (str) - Required - API token from https://search.mediacloud.org/

### Returns
SearchApi instance

### Throws
- `mediacloud.error.MCException` - If auth_token not provided

### Example
```python
import mediacloud.api

api_key = 'your-api-key-from-mediacloud'
search = mediacloud.api.SearchApi(api_key)
```
```

--------------------------------

### Initialize DirectoryManagementApi

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Initialize the DirectoryManagementApi class using your API key.
```python
import mediacloud.mgmt

api_key = 'your-api-key-from-mediacloud'
mc_mgmt = mediacloud.mgmt.DirectoryManagementApi(api_key)
```

--------------------------------

### Manage Source in Multiple Collections

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_management_api.md

Demonstrates how to list all collections a source belongs to, add it to a new collection, and remove it from an old one. Requires the source ID and the IDs of the collections involved.

```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')

source_id = 12345

# Get all collections containing source
collections = mgmt.source_collection_list(source_id=source_id)

# Add to a new collection
mgmt.source_collection_create(
    source_id=source_id,
    collection_id=34412235  # Different collection
)

# Remove from an old collection
mgmt.source_collection_delete(
    source_id=source_id,
    collection_id=34412234  # Old collection
)
```

--------------------------------

### Get Story Count

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Retrieve the total count of stories matching a query within a specified date range and optional collections or sources. Useful for understanding the volume of coverage on a topic.

```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

count = search.story_count(
    'climate change',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234]
)

print(f"Matching stories: {count['relevant']}")
print(f"Total stories: {count['total']}")
```

--------------------------------

### BaseApi Constructor

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/base_api.md

Initializes a BaseApi instance, setting up authentication and an HTTP session with rate limiting.
```APIDOC
## BaseApi Constructor

### Description
Initializes a BaseApi instance with authentication credentials, configuring an HTTP session with rate limiting and necessary headers.

### Method
__init__

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Example
```python
import mediacloud.api

api_key = 'your-api-key-from-mediacloud'
base_api = mediacloud.api.BaseApi(api_key)
```

### Returns
BaseApi instance with authenticated session configured.

### Throws
- `mediacloud.error.MCException` - If `auth_token` is None or empty
```

--------------------------------

### Get Recent Feed Changes

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_api.md

Retrieve a list of feeds that have been modified within a specified time frame, such as the last 24 hours. The `return_details` parameter can be set to True to include more information about each feed.

```python
import mediacloud.api
import datetime

directory = mediacloud.api.DirectoryApi('your-api-key')

# Feeds modified in the last 24 hours
yesterday = datetime.datetime.now() - datetime.timedelta(days=1)
recent_feeds = directory.feed_list(
    modified_since=yesterday,
    limit=100,
    return_details=True
)
print(f"Feeds modified in last 24h: {len(recent_feeds['results'])}")
```

--------------------------------

### Resolving Missing API Key Error

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/errors.md

Shows how to correctly initialize the SearchApi with a valid API key to avoid MCException errors related to missing authentication.

```python
import mediacloud.api

api_key = 'your-api-key-from-mediacloud'
mc_search = mediacloud.api.SearchApi(api_key)
```

--------------------------------

### API Key Authentication from Environment Variable

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Demonstrates how to authenticate with the MediaCloud API by reading the API key from an environment variable.
This is the recommended approach for security.

```python
import os
import mediacloud.api

# Option 1: Read from environment
api_key = os.getenv('MEDIACLOUD_API_KEY')
if not api_key:
    raise ValueError("MEDIACLOUD_API_KEY environment variable not set")

mc_search = mediacloud.api.SearchApi(api_key)
```

```python
import os

import mediacloud.api
from dotenv import load_dotenv  # provided by the python-dotenv package

# Load variables from a local .env file into the environment
load_dotenv()
api_key = os.getenv('MEDIACLOUD_API_KEY')
mc_search = mediacloud.api.SearchApi(api_key)
```

--------------------------------

### Create, Copy, Update, and Delete Collections

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Demonstrates the basic CRUD operations for collections using the DirectoryManagementApi. Ensure you have the necessary API key and permissions.

```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('key')

# Collections
collection = mgmt.collection_create(name='My Collection')
copied = mgmt.collection_copy(collection_id=123, name='Copy')
mgmt.collection_update(collection_id=123, name='Updated')
mgmt.collection_delete(123)
```

--------------------------------

### Get Story Count Over Time

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Obtain a daily breakdown of story counts for a given query and date range, optionally filtered by collections or sources. This helps in analyzing trends and the temporal distribution of coverage.
```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

daily_counts = search.story_count_over_time(
    'election',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234]
)

for point in daily_counts:
    print(f"{point['date']}: {point['count']} stories ({point['ratio']:.2%})")
```

--------------------------------

### version()

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/base_api.md

Fetches version information about the MediaCloud API server, including its Git revision and current timestamp.

```APIDOC
## version()

### Description
Get version information about the MediaCloud API server.

### Method
version

### Parameters
None

### Returns
VersionInfo - Dictionary containing: GIT_REV (git revision), now (current epoch timestamp), version (version string)

### Throws
- `mediacloud.error.APIResponseError` - If API request fails

### Example
```python
import mediacloud.api

api = mediacloud.api.BaseApi('your-api-key')

version_info = api.version()
print(f"API Version: {version_info['version']}")
print(f"Git Revision: {version_info['GIT_REV']}")
print(f"Server Time: {version_info['now']}")
```
```

--------------------------------

### Customize API Client Instance-Level Configuration

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/configuration.md

Demonstrates how to modify configuration settings like timeout on an existing API client instance after it has been initialized. Note that rate limits cannot be changed post-initialization.
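
Why one setting can be changed later and the other cannot comes down to when each value is read. The toy illustration below assumes, as the note above suggests, that the rate limit is consumed once when the session is built while the timeout is looked up on each request. `ToyClient` is hypothetical and is not the library's implementation, and its default values are arbitrary.

```python
class ToyClient:
    """Hypothetical client, for illustration only."""

    TIMEOUT_SECS = 60          # arbitrary default for this sketch
    RATE_LIMIT_PER_MINUTE = 2  # arbitrary default for this sketch

    def __init__(self) -> None:
        # Read once: the "session" captures the limit in effect at construction
        self._session_rate_limit = self.RATE_LIMIT_PER_MINUTE

    def request_settings(self) -> dict:
        # Read on every call: later changes to TIMEOUT_SECS are picked up
        return {"timeout": self.TIMEOUT_SECS, "rate_limit": self._session_rate_limit}


class FasterToyClient(ToyClient):
    RATE_LIMIT_PER_MINUTE = 10  # class-level override, seen by __init__


client = ToyClient()
client.TIMEOUT_SECS = 120          # takes effect on the next request
client.RATE_LIMIT_PER_MINUTE = 10  # too late: the session already captured 2
print(client.request_settings())   # {'timeout': 120, 'rate_limit': 2}

print(FasterToyClient().request_settings())  # {'timeout': 60, 'rate_limit': 10}
```

On the real client, the adjustable setting is changed like this:
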
```python
import mediacloud.api

api = mediacloud.api.SearchApi('your-api-key')

# These can be modified after creation
api.TIMEOUT_SECS = 120

# But RATE_LIMIT_PER_MINUTE cannot be changed after __init__
# (session is already created with the default rate limit)
```

--------------------------------

### Initialize BaseApi

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/base_api.md

Instantiate the BaseApi class with your MediaCloud API token. This sets up authentication and rate limiting for subsequent API calls.

```python
import mediacloud.api

api_key = 'your-api-key-from-mediacloud'
base_api = mediacloud.api.BaseApi(api_key)
```

--------------------------------

### Keyword-Only Parameters Usage

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_management_api.md

Demonstrates the correct and incorrect ways to pass arguments to API methods. All methods in DirectoryManagementApi use keyword-only parameters for safety, requiring arguments to be passed by name.

```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')

# ✓ Correct: keyword arguments
collection = mgmt.collection_create(name='My Collection', notes='Test')

# ✗ Wrong: positional arguments (will raise TypeError)
# collection = mgmt.collection_create('My Collection', 'Test')

# ✓ Correct: keyword-only parameters in other methods
mgmt.collection_update(collection_id=123, name='Updated')

# ✗ Wrong: positional collection_id (will raise TypeError)
# mgmt.collection_update(123, name='Updated')
```

--------------------------------

### Paginate Source List

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_api.md

Retrieve all sources from the directory using a while loop and offset. This method is suitable for iterating through large datasets.
```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('your-api-key')

all_results = []
offset = 0
limit = 100

while True:
    page = directory.source_list(limit=limit, offset=offset)
    all_results.extend(page['results'])
    if page['next'] is None:  # Last page
        break
    offset += limit

print(f"Retrieved {len(all_results)} total sources")
```

--------------------------------

### Manage Source-Collection Associations

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Shows how to list, create, and delete associations between sources and collections using the DirectoryManagementApi. These operations link specific sources to collections.

```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('key')

# Associations
collections = mgmt.source_collection_list(source_id=123)
mgmt.source_collection_create(source_id=123, collection_id=456)
mgmt.source_collection_delete(source_id=123, collection_id=456)
```

--------------------------------

### List Sources with Pagination

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_api.md

Fetches all sources within a specific collection, handling pagination by iterating through results until no more pages are available. Useful for retrieving complete lists of sources.

```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('your-api-key')

# Fetch all sources in a collection
INDIA_NATIONAL_COLLECTION = 34412118
all_sources = []
offset = 0
while True:
    response = directory.source_list(
        collection_id=INDIA_NATIONAL_COLLECTION,
        limit=100,
        offset=offset
    )
    all_sources += response['results']
    # Check if there are more pages
    if response['next'] is None:
        break
    offset += len(response['results'])
```

--------------------------------

### Configure API Client Rate Limit

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Illustrates subclassing to set a custom rate limit for API requests.
```python
import mediacloud.api

# Subclass to set rate limit
class HighRateLimitSearchApi(mediacloud.api.SearchApi):
    RATE_LIMIT_PER_MINUTE = 10

fast_search = HighRateLimitSearchApi('your-api-key')
```

--------------------------------

### Configure API Client Timeout

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Demonstrates how to increase the default request timeout for API clients.

```python
import mediacloud.api

# Increase timeout
search = mediacloud.api.SearchApi('your-api-key')
search.TIMEOUT_SECS = 120
```

--------------------------------

### Comprehensive Error Handling Pattern

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/errors.md

Illustrates a robust error handling strategy for the MediaCloud API client, catching specific MediaCloud exceptions and general exceptions.

```python
import mediacloud.api
import mediacloud.error
import datetime

try:
    mc_search = mediacloud.api.SearchApi('your-api-key')
    stories, next_token = mc_search.story_list(
        'climate change',
        start_date=datetime.date(2023, 1, 1),
        end_date=datetime.date(2023, 12, 31),
        collection_ids=[34412234]
    )
except mediacloud.error.MCException as e:
    # Handle configuration/initialization errors
    print(f"Configuration error: {e.message}")
except mediacloud.error.APIResponseError as e:
    # Handle API server errors
    if e.response.status_code == 401:
        print("Authentication failed - check your API key")
    elif e.response.status_code == 400:
        print(f"Invalid request: {e.data.get('note')}")
    else:
        print(f"Server error ({e.response.status_code}): {e.data.get('note')}")
except Exception as e:
    # Handle other exceptions (network errors, etc.)
    print(f"Unexpected error: {e}")
```

--------------------------------

### source_create

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_management_api.md

Creates a new news source with the provided details. Requires at least a name and homepage.
```APIDOC
## source_create(**kwargs)

### Description
Creates a new news source. Requires at least a name and homepage.

### Parameters
All parameters are keyword-only:

#### Query Parameters
- **name** (str) - Yes - Display name of the source
- **homepage** (str) - Yes - URL of the source's homepage
- **label** (str) - No - Short label or abbreviation
- **platform** (str) - No - Platform identifier
- **url_search_string** (str) - No - Search string for finding content
- **notes** (str) - No - Descriptive notes
- **media_type** (str) - No - Type of media (e.g., "newspaper", "blog")
- **pub_state** (str) - No - Publication state/region
- **pub_country** (str) - No - Publication country code
- **primary_language** (str) - No - Primary language code

### Returns
- **dict** - Created source object with id and all fields

### Throws
- `ValueError` - If 'name' or 'homepage' is missing
- `mediacloud.error.APIResponseError` - If creation fails

### Example
```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')

# Create a new source
source = mgmt.source_create(
    name='Tech News Daily',
    homepage='https://technewsdaily.example.com',
    label='TND',
    media_type='newspaper',
    primary_language='en',
    pub_country='US'
)

print(f"Created source ID: {source['id']}")
```
```

--------------------------------

### Initialize Directory Management API

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Initialize the DirectoryManagementApi for managing collections and sources. This API is typically used for administrative tasks.

```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')
collection = mgmt.collection_create(name='My Collection')
```

--------------------------------

### Run Tests

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Execute the project's test suite using pytest. Ensure all tests pass to verify the library's functionality.
```bash
pytest
```

--------------------------------

### Search for Stories with Pagination

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/search_api.md

Performs a basic search for stories matching a query within a specified date range and collection. It demonstrates how to retrieve the first page of results and how to paginate through subsequent pages using a token.

```python
import mediacloud.api
import datetime

search = mediacloud.api.SearchApi('your-api-key')

# Basic search
stories, next_token = search.story_list(
    'climate',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234],
    page_size=50
)

print(f"Retrieved {len(stories)} stories")
for story in stories:
    print(f"  {story['title']} ({story['publish_date']})")

# Page through results
all_stories = []
pagination_token = None
more_stories = True

while more_stories:
    page, pagination_token = search.story_list(
        'election',
        start_date=datetime.date(2023, 1, 1),
        end_date=datetime.date(2023, 12, 31),
        collection_ids=[34412234],
        pagination_token=pagination_token,
        page_size=100
    )
    all_stories.extend(page)
    more_stories = pagination_token is not None

print(f"Retrieved {len(all_stories)} total stories")

# Get full text for specific stories
stories_with_text, _ = search.story_list(
    'technology',
    start_date=datetime.date(2023, 1, 1),
    end_date=datetime.date(2023, 12, 31),
    collection_ids=[34412234],
    expanded=True,
    page_size=10
)
```

--------------------------------

### DirectoryManagementApi Methods

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Methods for managing collections and sources. These functions require admin permissions and allow for the creation of new collections and sources.

```APIDOC
## DirectoryManagementApi Methods

### Description
Methods for managing collections and sources. These functions require admin permissions and allow for the creation of new collections and sources.
### Methods
- `collection_create()`: Create a new collection.
- `source_create()`: Create a new source.
```

--------------------------------

### Platform Constants for Filtering

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_api.md

Defines and demonstrates the use of platform constants for filtering collections and sources. Using these constants ensures correct and consistent platform identification.

```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('your-api-key')

# Using constants
youtube_collections = directory.collection_list(
    platform=directory.PLATFORM_YOUTUBE
)

reddit_sources = directory.source_list(
    platform=directory.PLATFORM_REDDIT
)
```

--------------------------------

### DirectoryManagementApi - Sources

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Methods for creating, updating, and deleting sources.

```APIDOC
## DirectoryManagementApi - Sources

### Description
Methods for managing sources, including creation, updating, and deletion.

### Methods
- `source_create(name: str, homepage: str)`: Creates a new source.
- `source_update(source_id: int, name: str)`: Updates an existing source.
- `source_delete(source_id: int)`: Deletes a source.
```

--------------------------------

### DirectoryManagementApi Method Signatures

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Lists the method signatures available in the DirectoryManagementApi for managing directory resources. This API is used for creating, updating, and deleting collections and sources.

```python
```

--------------------------------

### Platform Constants

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Constants for different platforms supported by the DirectoryApi.
```python
PLATFORM_ONLINE_NEWS = "online_news"
PLATFORM_YOUTUBE = "youtube"
PLATFORM_TWITTER = "twitter"
PLATFORM_REDDIT = "reddit"
```

--------------------------------

### Find Sources by Name

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_api.md

Retrieves a list of sources whose names partially match the provided query. Useful for finding specific sources when the exact name is unknown.

```python
import mediacloud.api

directory = mediacloud.api.DirectoryApi('your-api-key')

# Find sources by name
bbc_sources = directory.source_list(name='BBC', limit=50)
for source in bbc_sources['results']:
    print(f"{source['name']} ({source['id']})")
```

--------------------------------

### Fetch all Sources in a Collection

Source: https://github.com/mediacloud/api-client/blob/main/README.md

Retrieves all sources within a specified collection, fetching them in pages of a defined size. It continues fetching until no more pages are available.

```python
import mediacloud.api

INDIA_NATIONAL_COLLECTION = 34412118
SOURCES_PER_PAGE = 100  # the number of sources retrieved per page

mc_directory = mediacloud.api.DirectoryApi(YOUR_MC_API_KEY)
sources = []
offset = 0  # offset for paging through
while True:
    # grab a page of sources in the collection
    response = mc_directory.source_list(collection_id=INDIA_NATIONAL_COLLECTION, limit=SOURCES_PER_PAGE, offset=offset)
    # add it to our running list of all the sources in the collection
    sources += response['results']
    # if there is no next page then we're done so bail out
    if response['next'] is None:
        break
    # otherwise setup to fetch the next page of sources
    offset += len(response['results'])
print("India National Collection has {} sources".format(len(sources)))
```

--------------------------------

### DirectoryApi - Source List

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Retrieves a list of sources, with options for filtering by platform, name, or collection ID.
```APIDOC
## DirectoryApi - Source List

### Description
Retrieves a list of sources. Supports filtering by platform, name, and collection ID, with options for pagination.

### Method Signature
`source_list(platform: str | None = None, name: str | None = None, collection_id: int | None = None, limit: int = 0, offset: int = 0) -> OffsetPage`
```

--------------------------------

### Handle API Errors

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Demonstrates how to catch and handle MediaCloud API exceptions, including configuration and response errors.

```python
import mediacloud.api
import mediacloud.error

try:
    search = mediacloud.api.SearchApi('your-api-key')
    results = search.story_list(...)
except mediacloud.error.MCException as e:
    print(f"Configuration error: {e.message}")
except mediacloud.error.APIResponseError as e:
    print(f"API error {e.response.status_code}: {e.data.get('note')}")
```

--------------------------------

### DirectoryApi Method Signatures

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Outlines the method signatures for the DirectoryApi, used for retrieving information about collections and sources. These methods facilitate browsing and filtering directory data.
```python
def collection(collection_id: int) -> Collection
def collection_list(platform: str | None = None, name: str | None = None, limit: int = 0, offset: int = 0, source_id: int | None = None) -> OffsetPage
def source(source_id: int) -> Source
def source_list(platform: str | None = None, name: str | None = None, collection_id: int | None = None, limit: int = 0, offset: int = 0) -> OffsetPage
def feed_list(source_id: int | None = None, modified_since: dt.datetime | int | float | None = None, modified_before: dt.datetime | int | float | None = None, limit: int = 0, offset: int = 0, return_details: bool = False) -> JSONObj
```

--------------------------------

### DirectoryApi Methods

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/README.md

Methods for browsing the MediaCloud directory, including collections, sources, and feeds. These functions enable you to retrieve information about existing directory entities.

```APIDOC
## DirectoryApi Methods

### Description
Methods for browsing the MediaCloud directory, including collections, sources, and feeds. These functions enable you to retrieve information about existing directory entities.

### Methods
- `collection()`: Get collection by ID.
- `collection_list()`: List collections with filtering.
- `source()`: Get source by ID.
- `source_list()`: List sources with filtering.
- `feed_list()`: List feeds with filtering.
```

--------------------------------

### SearchApi - Story Sample

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Retrieves a sample of stories matching a query within a specified date range, with options for limit and expansion.

```APIDOC
## SearchApi - Story Sample

### Description
Retrieves a sample of stories that match the given query within the specified date range. Supports limiting the number of results and expanding the returned story objects.
### Method Signature
`story_sample(query: str, start_date: dt.date, end_date: dt.date, collection_ids: List[int] = [], source_ids: List[int] = [], platform: str | None = None, limit: int | None = None, expanded: bool = False) -> List[Story]`
```

--------------------------------

### source_list

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_api.md

Fetches a list of sources with various filtering options. Supports filtering by platform, name, collection ID, and pagination.

```APIDOC
## source_list

### Description
Fetches a list of sources with various filtering options. Supports filtering by platform, name, collection ID, and pagination.

### Method
GET

### Endpoint
/sources

### Parameters
#### Query Parameters
- **platform** (str | None) - Optional - Filter by platform: "online_news", "youtube", "twitter", "reddit"
- **name** (str | None) - Optional - Filter by source name (partial match)
- **collection_id** (int | None) - Optional - Filter sources in a specific collection
- **limit** (int) - Optional - Number of results per page (0 = no limit)
- **offset** (int) - Optional - Starting position for pagination

### Response
#### Success Response (200)
- **count** (int) - Total number of results
- **next** (str | None) - URL for the next page of results
- **previous** (str | None) - URL for the previous page of results
- **results** (list) - List of Source objects

### Response Example
{
  "count": 100,
  "next": "/v2/sources?limit=100&offset=100",
  "previous": null,
  "results": [
    {
      "id": 123,
      "name": "Example Source",
      "url": "http://example.com",
      "platform": "online_news"
    }
  ]
}

### Throws
- `mediacloud.error.APIResponseError` - If API request fails
```

--------------------------------

### collection_create()

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_management_api.md

Creates a new collection with specified attributes.
Requires a name and can include optional notes, visibility, and feature flags.

```APIDOC
## collection_create()

### Description
Create a new collection.

### Parameters
All parameters are keyword-only:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| name | str | Yes | — | Display name for the collection |
| notes | str | No | None | Descriptive notes about the collection |
| public | bool | No | None | Whether collection is publicly visible |
| featured | bool | No | None | Whether to feature the collection |
| managed | bool | No | None | Whether collection is staff-managed |
| monitored | bool | No | None | Whether collection is actively monitored |

### Returns

| Type | Description |
|------|-------------|
| dict | Created collection object with id and all provided fields |

### Throws
- `ValueError` - If 'name' is missing or empty
- `mediacloud.error.APIResponseError` - If creation fails

### Example
```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')

# Create a basic collection
collection = mgmt.collection_create(
    name='My News Collection',
    notes='A test collection for analysis',
    public=True
)

print(f"Created collection: {collection['id']}")
```
```

--------------------------------

### Source Management

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/index.md

Methods for creating, updating, and deleting sources, and listing their associated collections.

```APIDOC
## Source Management

### Description
Methods for creating, updating, and deleting sources, and listing their associated collections.

### Methods

#### `source_create(**kwargs)`
Creates a new source.

#### `source_update(*, source_id: int, **kwargs)`
Updates an existing source.

#### `source_delete(source_id: int)`
Deletes a source.

#### `source_collection_list(*, source_id: int)`
Lists all collections associated with a given source.
### Return Types
- `source_create`: `dict`
- `source_update`: `dict`
- `source_delete`: `dict`
- `source_collection_list`: `list[dict]`
```

--------------------------------

### Create a New Collection

Source: https://github.com/mediacloud/api-client/blob/main/_autodocs/api-reference/directory_management_api.md

Use this method to create a new collection with specified details. The 'name' parameter is mandatory.

```python
import mediacloud.mgmt

mgmt = mediacloud.mgmt.DirectoryManagementApi('your-api-key')

# Create a basic collection
collection = mgmt.collection_create(
    name='My News Collection',
    notes='A test collection for analysis',
    public=True
)

print(f"Created collection: {collection['id']}")
```

--------------------------------

### Page Through Stories Matching a Query

Source: https://github.com/mediacloud/api-client/blob/main/README.md

Fetches stories matching a keyword query within a collection and handles pagination using a token. This is useful for iterating through results page by page.

```python
import mediacloud.api

INDIA_NATIONAL_COLLECTION = 34412118

mc_search = mediacloud.api.SearchApi(YOUR_MC_API_KEY)
all_stories = []
pagination_token = None
more_stories = True
while more_stories:
    page, pagination_token = mc_search.story_list('modi AND biden', collection_ids=[INDIA_NATIONAL_COLLECTION], pagination_token=pagination_token)
    all_stories += page
    more_stories = pagination_token is not None
print(f"Retrieved {len(all_stories)} matching stories")
```
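
The token-paging loop above can also be wrapped in a small generator so callers iterate over stories without tracking the token themselves. The following is a minimal sketch, not part of the library: `iter_stories`, `fake_fetch`, and `_FAKE_PAGES` are hypothetical names introduced here, and the only assumption about the client is the one the example above shows, namely that a page fetch returns a `(stories, next_token)` pair with `next_token` set to `None` on the last page. A stand-in fetch function is used so the sketch runs without an API key.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is (list of story dicts, token for the next page or None)
Page = Tuple[List[dict], Optional[str]]


def iter_stories(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield stories one at a time, following pagination tokens until exhausted."""
    token: Optional[str] = None
    while True:
        page, token = fetch_page(token)
        yield from page
        if token is None:  # no further pages
            break


# Hypothetical stand-in for the real API so the sketch runs offline:
# three fake pages keyed by the token that requests them.
_FAKE_PAGES = {
    None: ([{"id": 1}, {"id": 2}], "t1"),
    "t1": ([{"id": 3}], "t2"),
    "t2": ([{"id": 4}], None),
}


def fake_fetch(token: Optional[str]) -> Page:
    return _FAKE_PAGES[token]


story_ids = [story["id"] for story in iter_stories(fake_fetch)]
print(story_ids)
```

With the real client, `fetch_page` could presumably be a lambda that forwards the token, for example `lambda token: mc_search.story_list('modi AND biden', collection_ids=[INDIA_NATIONAL_COLLECTION], pagination_token=token)`, mirroring the call in the example above.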