### Run Auth Helper Example Application

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/auth/README.md

Executes the main authentication helper example script using the Python interpreter. This will log in, display the access token, and start a background token refresh process.

```sh
python -m example.auth.auth
```

--------------------------------

### Install Python Dependencies

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/auth/README.md

Installs the necessary Python packages listed in the 'requirements.txt' file using pip. This ensures all required libraries are available for the application.

```sh
pip install -r requirements.txt
```

--------------------------------

### Get Batch Download Headers (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

This example demonstrates how to retrieve header information for a batch download using a HEAD request. This is useful for getting details like Last-Modified and Content-Length without downloading the entire batch.

```bash
HEAD https://api.enterprise.wikimedia.com/v2/batches/2025-07-16/05/afwikibooks_namespace_0/download
```

--------------------------------

### Download a Batch (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

This example illustrates how to download a batch using a GET request. It also shows how to use the 'Range' header to download specific byte ranges of the batch file, enabling parallel downloads.
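The Range-based splitting described above can be sketched in Python. This is a minimal illustration, not part of the SDK: `split_byte_ranges` and `range_header` are hypothetical helper names, and the total size is assumed to come from the `Content-Length` value of a prior HEAD request.

```python
from typing import Dict, List, Tuple


def split_byte_ranges(total_size: int, parts: int) -> List[Tuple[int, int]]:
    """Split a file of total_size bytes into contiguous, inclusive (start, end) ranges."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    parts = min(parts, total_size)  # never produce an empty range
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` ranges carry one additional byte each.
        length = base + (1 if i < extra else 0)
        end = start + length - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start: int, end: int) -> Dict[str, str]:
    """Build the Range header for one inclusive byte range."""
    return {"Range": f"bytes={start}-{end}"}
```

Each range can then be fetched by a separate worker and the pieces written back at their offsets. Byte ranges are inclusive on both ends, so a 37-byte file split in two yields `bytes=0-18` and `bytes=19-36`.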
```bash
GET https://api.enterprise.wikimedia.com/v2/batches/2025-07-16/05/afwikibooks_namespace_0/download
```

```json
{ "Range": "bytes=0-20" }
```

```bash
GET https://api.enterprise.wikimedia.com/v2/batches/2025-07-16/05/afwikibooks_namespace_0/download
```

```json
{ "Range": "bytes=21-36" }
```

--------------------------------

### Get All SC Snapshot Metadata (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/structured-contents-snapshots/README.MD

Retrieves metadata for all available Structured-Contents snapshots. This is a basic POST request to the snapshots endpoint, with no request body.

```bash
POST https://api.enterprise.wikimedia.com/v2/snapshots/structured-contents
```

--------------------------------

### Run WME SDK Metadata Example Script

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/snapshots/README.md

This command executes the metadata example script from the wme-sdk-python project. It assumes you are running it within the project's virtual environment.

```bash
python -m example.metadata.metadata
```

--------------------------------

### Get All Available Batches Metadata (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

This example shows how to retrieve metadata for all available batches for a specific day and hour using a POST request to the Batches API. It returns a list of batch objects, each containing details like identifier, version, and size.

```bash
POST https://api.enterprise.wikimedia.com/v2/batches/2025-07-16/05
```

--------------------------------

### Get Single Batch Metadata (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

This example shows how to fetch the metadata for a single, specific batch using its identifier. The request targets the v2 endpoint and includes the date, hour, and the batch's unique identifier.
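As a rough sketch of how the date, hour, and identifier combine into the request URL, the hypothetical helper below (not an SDK function) assembles the path. It assumes the `YYYY-MM-DD` date and zero-padded `HH` hour formats used in the examples of this document.

```python
from datetime import date

# Base URL as it appears in the examples of this document.
BASE_URL = "https://api.enterprise.wikimedia.com/v2"


def batch_metadata_url(day: date, hour: int, identifier: str) -> str:
    """Build the metadata URL for one batch (hypothetical helper)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    # date.isoformat() yields YYYY-MM-DD; the hour is zero-padded to two digits.
    return f"{BASE_URL}/batches/{day.isoformat()}/{hour:02d}/{identifier}"
```

For example, `batch_metadata_url(date(2025, 7, 16), 5, "enwiki_namespace_0")` produces the URL used in the request that follows.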
```bash
POST https://api.enterprise.wikimedia.com/v2/batches/2025-07-16/05/enwiki_namespace_0
```

--------------------------------

### Login: Get Access and Refresh Tokens

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/login/README.md

Obtain an access token and refresh token by submitting a username and password to the authentication API. The access token is necessary for accessing data and metadata APIs and is valid for 24 hours. This example demonstrates the request and response structure.

```bash
POST https://auth.enterprise.wikimedia.com/v1/login

with request parameter:
{
  "username" : "usernamexyz",
  "password": "passwordxyz"
}

Response:
{
  "id_token": "abc....",
  "access_token": "abc...",
  "refresh_token": "abc..",
  "expires_in": 86400
}
```

--------------------------------

### Python: Get Access Token and Make API Call with WME SDK

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/README.md

This Python example shows how to obtain an access token using the WME SDK's Helper and AuthClient, and then use that token to make a request to the Structured Contents endpoint. It includes setting up API clients, defining request parameters, and handling responses. Dependencies include modules.auth.helper, modules.auth.auth_client, and modules.api.api_client.
```python
import time
import logging
import json
import sys

from modules.auth.helper import Helper
from modules.auth.auth_client import AuthClient
from modules.api.api_client import Client, Request, Filter

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def main():
    auth_client = AuthClient()
    try:
        helper = Helper(auth_client)
    except Exception as e:
        logger.fatal(f"Failed to create helper: {e}")
        return

    try:
        token = helper.get_access_token()
        logger.info(f"Access token: {token}")

        # Use the token to get articles from the API
        api_client = Client()
        api_client.set_access_token(token)

        request = Request(
            fields=["name", "abstract", "url", "version"],
            filters=[Filter(field="in_language.identifier", value="en")]
        )

        try:
            articles = api_client.get_articles("Montreal", request)
        except Exception as e:
            logger.error(f"Failed to get articles: {e}")
            return

        for article in articles:
            try:
                art_json = json.dumps(article, indent=2)
                print(art_json)
            except Exception as e:
                logger.error(f"Failed to serialize article: {e}")
    except Exception as e:
        logger.error(f"Failed to get access token: {e}")
    finally:
        # Stop the background token refresh before exiting
        helper.stop()
        logger.info("Exiting")
        time.sleep(1)


def usage():
    print(f"Usage: {sys.argv[0]}")


if __name__ == "__main__":
    main()
```

--------------------------------

### Download Snapshot (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/snapshots/README.md

Demonstrates downloading a snapshot using the GET HTTP method. It shows how to perform partial downloads using the 'Range' header for parallel downloading, specifying byte ranges.
```bash
GET https://api.enterprise.wikimedia.com/v2/snapshots/afwikibooks_namespace_0/download
{ "Range": "bytes=0-99000" }
```

```bash
GET https://api.enterprise.wikimedia.com/v2/snapshots/afwikibooks_namespace_0/download
{ "Range": "bytes=99001-215077" }
```

--------------------------------

### Get Metadata of All SC Snapshots

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/structured-contents-snapshots/README.MD

Retrieves metadata for all available structured-content snapshots.

```APIDOC
## POST /v2/snapshots/structured-contents

### Description
Retrieves metadata for all available structured-content snapshots.

### Method
POST

### Endpoint
https://api.enterprise.wikimedia.com/v2/snapshots/structured-contents

### Parameters
#### Query Parameters
None

#### Request Body
None

### Request Example
```json
{ "example": "No request body required for this endpoint." }
```

### Response
#### Success Response (200)
- **snapshots** (array) - A list of snapshot metadata objects.

#### Response Example
```json
{ "example": "[Response object structure based on ./response_i.json]" }
```
```

--------------------------------

### Run Authentication Examples Application

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/login/README.md

Execute the authentication example application from the project's root directory using the provided Python command. This script likely demonstrates the usage of the authentication APIs described in this document.

```sh
python -m example.login.login
```

--------------------------------

### Run Structured Contents Application (Shell)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/structured_contents/README.md

This command executes the structured contents example application from the project's root directory using Python. It's a prerequisite for running the API interaction examples.
```shell
python -m example.structured_contents.structuredcontents
```

--------------------------------

### Get All Project Codes (Metadata API)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/metadata/README.md

Retrieves all available project codes (types) from the Metadata API. This call does not require any parameters and returns a list of project code objects, each containing an identifier, name, and description. It serves as a baseline to understand the available projects.

```bash
GET https://api.enterprise.wikimedia.com/v2/codes
```

--------------------------------

### GET /v2/projects

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/metadata/README.md

Fetches metadata for all supported Wikimedia projects. Supports filtering and field selection. Can also be used to query a single project.

```APIDOC
## GET /v2/projects

### Description
Gets information on all supported projects. Supports filtering and field selection, and can also query a single project.

### Method
GET

### Endpoint
https://api.enterprise.wikimedia.com/v2/projects

### Parameters
#### Query Parameters
- **filter** (string) - Optional - Filters projects by the specified criteria.
- **fields** (string) - Optional - Selects the fields returned in the response.

### Request Example
```bash
GET https://api.enterprise.wikimedia.com/v2/projects?filter=in_language.identifier:en&fields=name,url
```

### Response
#### Success Response (200)
- **name** (string) - The name of the project.
- **identifier** (string) - A unique identifier for the project.
- **url** (string) - The base URL for the project.
- **code** (string) - A code representing the type of project (e.g., "wiki", "wiktionary").
- **in_language** (object) - An object containing information about the language of the project.
  - **identifier** (string) - The identifier for the language (e.g., "en").

#### Response Example
```json
[
  {
    "name": "Wikipedia",
    "identifier": "enwiki",
    "url": "https://en.wikipedia.org",
    "code": "wiki",
    "in_language": { "identifier": "en" }
  }
]
```
```

--------------------------------

### Get Metadata of All Available Batches

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

Retrieve metadata for all available batches for a specific day and hour. This is useful for understanding the data available for a given time.

```APIDOC
## POST /v2/batches/{date}/{hour}

### Description
Retrieves metadata for all available batches for a given date and hour.

### Method
POST

### Endpoint
`https://api.enterprise.wikimedia.com/v2/batches/{date}/{hour}`

### Parameters
#### Path Parameters
- **date** (string) - Required - The date in YYYY-MM-DD format.
- **hour** (string) - Required - The hour in HH format (00-23).

#### Request Body
- **filters** (array) - Optional - An array of filter objects to narrow down the results.
  - **field** (string) - Required - The field to filter on (e.g., `in_language.identifier`).
  - **value** (string) - Required - The value to filter by (e.g., `en`).

### Request Example
```json
{
  "filters": [
    { "field": "in_language.identifier", "value": "en" }
  ]
}
```

### Response
#### Success Response (200)
- **identifier** (string) - Unique identifier for the batch.
- **version** (string) - Version hash of the batch.
- **date_modified** (string) - Timestamp when the batch was last modified.
- **is_part_of** (object) - Information about the project the batch belongs to.
  - **identifier** (string) - Project identifier (e.g., `abwiki`).
- **in_language** (object) - Information about the language of the articles.
  - **identifier** (string) - Language code (e.g., `ab`).
- **namespace** (object) - Information about the namespace.
  - **identifier** (integer) - Namespace identifier (e.g., 0).
- **size** (object) - Size of the batch.
  - **value** (number) - The size value.
  - **unit_text** (string) - The unit of size (e.g., `MB`).

#### Response Example
```json
[
  {
    "identifier": "abwiki_namespace_0",
    "version": "34462a47ee37113b765e59936d8fd7c8",
    "date_modified": "2024-07-16T06:06:16.892227533Z",
    "is_part_of": { "identifier": "abwiki" },
    "in_language": { "identifier": "ab" },
    "namespace": { "identifier": 0 },
    "size": { "value": 0.027e0, "unit_text": "MB" }
  }
]
```
```

--------------------------------

### Login Example

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/login/README.md

Obtain an access token and refresh token by providing your username and password.

```APIDOC
## POST /v1/login

### Description
Get an access token and a refresh token by sending a username and password.

### Method
POST

### Endpoint
https://auth.enterprise.wikimedia.com/v1/login

### Parameters
#### Request Body
- **username** (string) - Required - The user's username.
- **password** (string) - Required - The user's password.

### Request Example
```json
{
  "username" : "usernamexyz",
  "password": "passwordxyz"
}
```

### Response
#### Success Response (200)
- **id_token** (string) - JWT token containing user information.
- **access_token** (string) - Token used for authenticating subsequent API requests. Valid for 24 hours.
- **refresh_token** (string) - Token used to obtain a new access token.
- **expires_in** (integer) - The duration in seconds for which the access token is valid.

#### Response Example
```json
{
  "id_token": "abc....",
  "access_token": "abc...",
  "refresh_token": "abc..",
  "expires_in": 86400
}
```
```

--------------------------------

### Get Metadata of a Single Batch

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

Retrieve detailed metadata for a specific batch identified by its date, hour, and identifier.

```APIDOC
## POST /v2/batches/{date}/{hour}/{batch_identifier}

### Description
Retrieves metadata for a single, specific batch.

### Method
POST

### Endpoint
`https://api.enterprise.wikimedia.com/v2/batches/{date}/{hour}/{batch_identifier}`

### Parameters
#### Path Parameters
- **date** (string) - Required - The date in YYYY-MM-DD format.
- **hour** (string) - Required - The hour in HH format (00-23).
- **batch_identifier** (string) - Required - The identifier of the specific batch (e.g., `enwiki_namespace_0`).

### Response
#### Success Response (200)
- **identifier** (string) - Unique identifier for the batch.
- **version** (string) - Version hash of the batch.
- **date_modified** (string) - Timestamp when the batch was last modified.
- **is_part_of** (object) - Information about the project the batch belongs to.
  - **identifier** (string) - Project identifier (e.g., `enwiki`).
- **in_language** (object) - Information about the language of the articles.
  - **identifier** (string) - Language code (e.g., `en`).
- **namespace** (object) - Information about the namespace.
  - **identifier** (integer) - Namespace identifier (e.g., 0).
- **size** (object) - Size of the batch.
  - **value** (number) - The size value.
  - **unit_text** (string) - The unit of size (e.g., `MB`).

#### Response Example
```json
{
  "identifier": "enwiki_namespace_0",
  "version": "4464d27eb28f52d69850ebd3f6f5f224",
  "date_modified": "2024-07-16T06:12:20.744959737Z",
  "is_part_of": { "identifier": "enwiki" },
  "in_language": { "identifier": "en" },
  "namespace": { "identifier": 0 },
  "size": { "value": 5551.221e0, "unit_text": "MB" }
}
```
```

--------------------------------

### Get Batch Download Headers

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

Retrieve HTTP header information for a batch download, such as Last-Modified and Content-Length, which is useful for managing downloads.
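One way to use these two headers is to decide whether a local copy is out of date. The sketch below is illustrative only: `summarize_head` and `is_stale` are hypothetical helpers, the headers are assumed to arrive as a plain dict with exactly these key spellings, and `Last-Modified` is assumed to use the standard HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Tuple


def summarize_head(headers: Mapping[str, str]) -> Tuple[int, datetime]:
    """Extract the size in bytes and the last-modified time from HEAD response headers."""
    size = int(headers["Content-Length"])
    modified = parsedate_to_datetime(headers["Last-Modified"])
    return size, modified


def is_stale(local_size: int, local_modified: datetime, headers: Mapping[str, str]) -> bool:
    """Return True when the remote file differs in size or is newer than the local copy."""
    remote_size, remote_modified = summarize_head(headers)
    return local_size != remote_size or local_modified < remote_modified
```

The `local_modified` argument should be a timezone-aware `datetime`, since an HTTP date ending in `GMT` parses to an aware value and Python refuses to compare aware and naive datetimes.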
```APIDOC
## HEAD /v2/batches/{date}/{hour}/{batch_identifier}/download

### Description
Retrieves only the HTTP header information for a batch file, such as `Last-Modified` and `Content-Length`, without downloading the file content. This is useful for checking file metadata before initiating a full download.

### Method
HEAD

### Endpoint
`https://api.enterprise.wikimedia.com/v2/batches/{date}/{hour}/{batch_identifier}/download`

### Parameters
#### Path Parameters
- **date** (string) - Required - The date in YYYY-MM-DD format.
- **hour** (string) - Required - The hour in HH format (00-23).
- **batch_identifier** (string) - Required - The identifier of the specific batch (e.g., `afwikibooks_namespace_0`).

### Response
#### Success Response (200)
Headers include information like:
- **Last-Modified**: Timestamp of the last modification.
- **Content-Length**: Size of the batch file in bytes.
```

--------------------------------

### GET /v2/snapshots/{snapshot_id}/download

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/snapshots/README.md

Downloads a snapshot. This endpoint supports parallel downloads using the `Range` header to specify byte ranges.

```APIDOC
## GET /v2/snapshots/{snapshot_id}/download

### Description
Downloads a snapshot. Parts of the file can be downloaded in parallel using the `Range` header.

### Method
GET

### Endpoint
`https://api.enterprise.wikimedia.com/v2/snapshots/{snapshot_id}/download`

### Parameters
#### Path Parameters
- **snapshot_id** (string) - Required - The unique identifier of the snapshot.

#### Query Parameters
None

#### Request Body
None

### Headers
#### Range
- **Range** (string) - Optional - Specifies the byte range to download. Format: `bytes=start-end`, with both offsets inclusive. Example: `bytes=0-99000`.

### Request Example
Download the first 99001 bytes (offsets 0 through 99000):
```bash
GET https://api.enterprise.wikimedia.com/v2/snapshots/afwikibooks_namespace_0/download
Headers: { "Range": "bytes=0-99000" }
```

Download the remaining bytes:
```bash
GET https://api.enterprise.wikimedia.com/v2/snapshots/afwikibooks_namespace_0/download
Headers: { "Range": "bytes=99001-215077" }
```

### Response
#### Success Response (200 or 206 Partial Content)
- The response body contains the snapshot data for the requested byte range.

#### Response Example (Partial Content)
```
[Binary data for the specified byte range]
```
```

--------------------------------

### Get All Snapshot Metadata

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/snapshots/README.md

Retrieves metadata for all available snapshots across different projects and namespaces. This endpoint is useful for understanding the available data and its structure. No specific filters are applied, returning all snapshot information.

```bash
POST https://api.enterprise.wikimedia.com/v2/snapshots
```

--------------------------------

### Token Refresh Example

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/login/README.md

Generate a new access token using a valid refresh token.

```APIDOC
## POST /v1/token-refresh

### Description
Get a new access token by providing your username and refresh token.

### Method
POST

### Endpoint
https://auth.enterprise.wikimedia.com/v1/token-refresh

### Parameters
#### Request Body
- **username** (string) - Required - The user's username.
- **refresh_token** (string) - Required - The token used to obtain a new access token.

### Request Example
```json
{
  "username" : "usernamexyz",
  "refresh_token": "abc.."
}
```

### Response
#### Success Response (200)
- **id_token** (string) - JWT token containing user information.
- **access_token** (string) - A new access token. Valid for 24 hours.
- **expires_in** (integer) - The duration in seconds for which the new access token is valid. #### Response Example ```json { "id_token": "xyz..", "access_token": "xyz...", "expires_in": 86400 } ``` ``` -------------------------------- ### Retrieve Project Data Examples Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/metadata/README.md This snippet demonstrates the structure of project data returned by the Wikimedia Enterprise API, including identifiers, URLs, and language codes. It serves as an example of how project information is organized. ```json { "identifier": "chwikibooks", "url": "https://ch.wikibooks.org", "code": "wikibooks", "in_language": { "identifier": "ch" } }, { "name": "വിക്കിപീഡിയ", "identifier": "mlwiki", "url": "https://ml.wikipedia.org", "code": "wiki", "in_language": { "identifier": "ml" } }, { "name": "വിക്കിനിഘണ്ടു", "identifier": "mlwiktionary", "url": "https://ml.wiktionary.org", "code": "wiktionary", "in_language": { "identifier": "ml" } }, { "name": "വിക്കിപാഠശാല", "identifier": "mlwikibooks", "url": "https://ml.wikibooks.org", "code": "wikibooks", "in_language": { "identifier": "ml" } }, { "name": "വിക്കിചൊല്ലുകൾ", "identifier": "mlwikiquote", "url": "https://ml.wikiquote.org", "code": "wikiquote", "in_language": { "identifier": "ml" } }, { "name": "വിക്കിഗ്രന്ഥശാല", "identifier": "mlwikisource", "url": "https://ml.wikisource.org", "code": "wikisource", "in_language": { "identifier": "ml" } }, { "name": "Wikipedia", "identifier": "nrmwiki", "url": "https://nrm.wikipedia.org", "code": "wiki", "in_language": { "identifier": "nrm" } }, { "name": "Wikipedia", "identifier": "nvwiki", "url": "https://nv.wikipedia.org", "code": "wiki", "in_language": { "identifier": "nv" } }, { "name": "Wikipedia", "identifier": "scnwiki", "url": "https://scn.wikipedia.org", "code": "wiki", "in_language": { "identifier": "scn" } }, { "name": "Wikizziunariu", "identifier": "scnwiktionary", "url": "https://scn.wiktionary.org", 
"code": "wiktionary", "in_language": { "identifier": "scn" } }, { "name": "װיקיפּעדיע", "identifier": "yiwiki", "url": "https://yi.wikipedia.org", "code": "wiki", "in_language": { "identifier": "yi" } }, { "name": "װיקיװערטערבוך", "identifier": "yiwiktionary", "url": "https://yi.wiktionary.org", "code": "wiktionary", "in_language": { "identifier": "yi" } }, { "name": "װיקיביבליאָטעק", "identifier": "yiwikisource", "url": "https://yi.wikisource.org", "code": "wikisource", "in_language": { "identifier": "yi" } }, { "name": "Wikipedia", "identifier": "shiwiki", "url": "https://shi.wikipedia.org", "code": "wiki", "in_language": { "identifier": "shi" } }, { "name": "Wikipedia", "identifier": "alswiki", "url": "https://als.wikipedia.org", "code": "wiki", "in_language": { "identifier": "als" } }, { "name": "ويكيپيديا", "identifier": "arywiki", "url": "https://ary.wikipedia.org", "code": "wiki", "in_language": { "identifier": "ary" } }, { "name": "Wikikamus", "identifier": "btmwiktionary", "url": "https://btm.wiktionary.org", "code": "wiktionary", "in_language": { "identifier": "btm" } }, { "name": "Wikipedia", "identifier": "cdowiki", "url": "https://cdo.wikipedia.org", "code": "wiki", "in_language": { "identifier": "cdo" } }, { "name": "Wikipedia", "identifier": "jawiki", "url": "https://ja.wikipedia.org", "code": "wiki", "in_language": { "identifier": "ja" } }, { "name": "Wiktionary", "identifier": "jawiktionary", "url": "https://ja.wiktionary.org", "code": "wiktionary", "in_language": { "identifier": "ja" } }, { "name": "Wikibooks", "identifier": "jawikibooks", "url": "https://ja.wikibooks.org", "code": "wikibooks", "in_language": { "identifier": "ja" } }, { "name": "ウィキニュース", "identifier": "jawikinews", "url": "https://ja.wikinews.org", "code": "wikinews", "in_language": { "identifier": "ja" } }, { "name": "Wikiquote", "identifier": "jawikiquote", "url": "https://ja.wikiquote.org", "code": "wikiquote", "in_language": { "identifier": "ja" } }, { "name": "Wikisource", 
```

--------------------------------

### Forgot Password Example

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/login/README.md

Initiate the password reset process by sending your username.

```APIDOC
## POST /v1/forgot-password

### Description
Starts the password reset flow: a code for setting a new password is sent to the email address associated with the username.

### Method
POST

### Endpoint
https://auth.enterprise.wikimedia.com/v1/forgot-password

### Parameters
#### Request Body
- **username** (string) - Required - The user's username.

### Request Example
```json
{ "username": "abc.." }
```
```

--------------------------------

### Clone wme-sdk-python Repository

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/auth/README.md

Clones the wme-sdk-python repository from GitHub and navigates into the project directory. This is the initial step to set up the project locally.

```sh
git clone https://github.com/wikimedia-enterprise/wme-sdk-python.git
cd wme-sdk-python
```

--------------------------------

### Configure Environment Variables

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/auth/README.md

Sets up environment variables by renaming a sample file and editing it to include WME username and password. These credentials are required for authentication.

```sh
mv sample.env .env
```

```bash
WME_USERNAME=your_username
WME_PASSWORD=your_password
```

--------------------------------

### Get SC Snapshot Metadata by Language (JSON)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/structured-contents-snapshots/README.MD

Fetches metadata for Structured-Contents snapshots filtered by a specific language identifier. This example uses a JSON payload for filtering.
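A filter payload of this shape can be generated programmatically rather than written by hand. The sketch below uses a hypothetical helper, `build_filter_payload`, that is not part of the SDK. It assumes each filter is a `field`/`value` pair, as in the payloads shown in this document.

```python
import json
from typing import Dict


def build_filter_payload(conditions: Dict[str, str]) -> str:
    """Serialize field/value conditions into the JSON filter payload format."""
    filters = [{"field": field, "value": value} for field, value in conditions.items()]
    return json.dumps({"filters": filters})
```

Calling `build_filter_payload({"in_language.identifier": "en"})` yields a JSON string equivalent to the payload that follows.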
```json
{
  "filters": [
    { "field": "in_language.identifier", "value": "en" }
  ]
}
```

--------------------------------

### Get Language Metadata

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/metadata/README.md

Retrieves metadata for a specific language using its identifier.

```APIDOC
## GET /v2/languages/{identifier}

### Description
Retrieves metadata for a specific language, including its name, alternate name, and text direction.

### Method
GET

### Endpoint
`/v2/languages/{identifier}`

### Parameters
#### Path Parameters
- **identifier** (string) - Required - The unique identifier of the language (e.g., 'fr' for French).

### Request Example
```bash
GET https://api.enterprise.wikimedia.com/v2/languages/fr
```

### Response
#### Success Response (200)
- **identifier** (string) - The language identifier.
- **name** (string) - The common name of the language.
- **alternate_name** (string) - An alternative name for the language.
- **direction** (string) - The text direction ('ltr' for left-to-right, 'rtl' for right-to-left).

#### Response Example
```json
{
  "identifier": "fr",
  "name": "French",
  "alternate_name": "français",
  "direction": "ltr"
}
```
```

--------------------------------

### Get Metadata of a Single SC Snapshot

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/structured-contents-snapshots/README.MD

Retrieves metadata for a specific structured-content snapshot using its identifier.

```APIDOC
## POST /v2/snapshots/structured-contents/{snapshot_identifier}

### Description
Retrieves metadata for a single structured-content snapshot identified by its unique identifier.

### Method
POST

### Endpoint
https://api.enterprise.wikimedia.com/v2/snapshots/structured-contents/{snapshot_identifier}

### Parameters
#### Path Parameters
- **snapshot_identifier** (string) - Required - The unique identifier of the snapshot (e.g., "enwiki_namespace_0").

#### Query Parameters
None

#### Request Body
None

### Request Example
```json
{ "example": "No request body required for this endpoint." }
```

### Response
#### Success Response (200)
- **snapshot** (object) - Metadata object for the specified snapshot.

#### Response Example
```json
{ "example": "[Response object structure based on ./response_iii.json]" }
```
```

--------------------------------

### Get Metadata of All SC Snapshots in English

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/structured-contents-snapshots/README.MD

Retrieves metadata for all available structured-content snapshots, filtered for the English language.

```APIDOC
## POST /v2/snapshots/structured-contents (Filtered by Language)

### Description
Retrieves metadata for all available structured-content snapshots, filtered to include only those in the English language.

### Method
POST

### Endpoint
https://api.enterprise.wikimedia.com/v2/snapshots/structured-contents

### Parameters
#### Query Parameters
None

#### Request Body
- **filters** (array) - Required - An array of filter objects.
  - **field** (string) - Required - The field to filter on (e.g., "in_language.identifier").
  - **value** (string) - Required - The value to filter by (e.g., "en").

### Request Example
```json
{
  "filters": [
    { "field": "in_language.identifier", "value": "en" }
  ]
}
```

### Response
#### Success Response (200)
- **snapshots** (array) - A list of snapshot metadata objects matching the filter.

#### Response Example
```json
{ "example": "[Response object structure based on ./response_ii.json]" }
```
```

--------------------------------

### Python Auth Client Token Management

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/auth/README.md

Demonstrates the usage of the AuthClient and Helper classes in Python to obtain and manage access tokens. It includes token retrieval, logging, and error handling, with a loop to periodically fetch tokens.
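To illustrate the general idea of reusing a token until it is close to expiry, here is a minimal sketch. It is not how the SDK's `Helper` is implemented: `TokenCache`, its `fetch` callable, and the `margin` parameter are all hypothetical, and `fetch` is assumed to return an `(access_token, expires_in_seconds)` pair such as the `access_token` and `expires_in` fields of a login response.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Cache an access token and fetch a new one shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, int]],
        margin: int = 300,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # returns (token, lifetime in seconds)
        self._margin = margin    # refresh this many seconds before expiry
        self._clock = clock      # injectable so the logic can be tested
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Fetch on first use, or once the remaining lifetime drops to the margin.
        if self._token is None or now >= self._expires_at - self._margin:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

With a 24-hour token (`expires_in` of 86400) and the default margin, `get()` returns the cached token for just under 24 hours and then triggers a single new fetch.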
```python3
import time
import logging

from auth_client import AuthClient
from helper import Helper

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def main():
    auth_client = AuthClient()
    try:
        helper = Helper(auth_client)
    except Exception as e:
        logger.fatal(f"Failed to create helper: {e}")
        return

    try:
        while True:
            with helper.lock:
                try:
                    token = helper.get_access_token()
                    logger.info(f"Access token: {token}")
                    ### Do something with the token here :)
                except Exception as e:
                    logger.fatal(f"Failed to get token: {e}")
                    return
            # Wait an hour before fetching the token again
            time.sleep(3600)
    finally:
        helper.stop()


if __name__ == "__main__":
    main()
```

--------------------------------

### GET /v2/namespaces

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/metadata/README.md

Retrieves metadata for all supported namespaces. This endpoint allows for filtering and field selection, as well as querying for a single namespace.

```APIDOC
## GET /v2/namespaces

### Description
Retrieves metadata for all supported namespaces. Supports filtering and field selection, and can also query a single namespace.

### Method
GET

### Endpoint
https://api.enterprise.wikimedia.com/v2/namespaces

### Parameters
#### Query Parameters
- **filter** (string) - Optional - Filters namespaces by the specified criteria.
- **fields** (string) - Optional - Specifies which fields to include in the response (e.g., 'name,identifier').

### Request Example
```json
{ "example": "GET https://api.enterprise.wikimedia.com/v2/namespaces?fields=name,identifier" }
```

### Response
#### Success Response (200)
- **name** (string) - The name of the namespace.
- **identifier** (integer) - The unique identifier for the namespace.
- **description** (string) - A detailed description of the namespace.

#### Response Example
```json
{
  "example": [
    {
      "name": "Articles",
      "identifier": 0,
      "description": "The main namespace, article namespace, or mainspace is the namespace of Wikipedia that contains the encyclopedia proper, that is, where 'live' Wikipedia articles reside."
    },
    {
      "name": "File",
      "identifier": 6,
      "description": "The File namespace is a namespace consisting of administration pages in which all of Wikipedia's media content resides. On Wikipedia, all media filenames begin with the prefix File:, including data files for images, video clips, or audio clips, including document length clips; or MIDI files (a small file of computer music instructions)."
    }
  ]
}
```
```

--------------------------------

### Set Up Environment Variables for WME SDK

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/snapshots/README.md

This snippet shows how to configure the necessary environment variables for authenticating with the Wikimedia Enterprise API. It requires setting WME_USERNAME and WME_PASSWORD in a .env file.

```bash
WME_USERNAME="your_username"
WME_PASSWORD="your_password"
```

--------------------------------

### Get Batches Metadata Filtered by Language (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

This example demonstrates how to filter batch metadata to retrieve only those batches for a specific language (e.g., English) for a given day and hour. It uses a POST request with a JSON payload specifying the filter condition.

```bash
POST https://api.enterprise.wikimedia.com/v2/batches/2025-07-16/05
```

```json
{
  "filters": [
    { "field": "in_language.identifier", "value": "en" }
  ]
}
```

--------------------------------

### Get and Download Snapshots API

Source: https://context7.com/wikimedia-enterprise/wme-sdk-python/llms.txt

This section details how to retrieve, filter, and download snapshot data using the Python SDK.
It includes examples for fetching all snapshots, filtering by language, retrieving a single snapshot, getting snapshot metadata (HEAD request), and downloading and processing snapshot content.

```APIDOC
## Get and Download Snapshots API

### Description
This endpoint allows you to retrieve, filter, and download snapshot data. You can get all snapshots, filter them by specific criteria, fetch individual snapshots, and download their content for processing.

### Method
GET, HEAD, POST

### Endpoint
- `/v2/snapshots` (for listing/filtering snapshots)
- `/v2/snapshots/{snapshot_id}` (for getting a single snapshot or its metadata)
- `/v2/snapshots/{snapshot_id}/download` (for downloading a snapshot)

### Parameters
#### Query Parameters for List/Filter Snapshots
- **fields** (string) - Optional - Comma-separated list of fields to include in the response.
- **filters** (object) - Optional - Filters to apply to the snapshot list (e.g., `{"in_language.identifier": "en"}`).

#### Path Parameters for Single Snapshot/Download
- **snapshot_id** (string) - Required - The unique identifier of the snapshot.
#### Request Body (for `get_snapshots` or `get_snapshot` with specific fields)
```json
{
  "fields": ["identifier", "version", "date_modified", "is_part_of", "in_language", "namespace", "size"]
}
```

### Request Example
```python
import io
import json
import logging
import time

# `api_client`, `Request`, and `Snapshot` come from the SDK and are assumed to be
# imported and initialized (with a valid access token) before this point.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Fields to request, matching the request body shown above
snapshot_fields = ["identifier", "version", "date_modified", "is_part_of", "in_language", "namespace", "size"]

# Get all snapshots (limited to 3)
all_snapshots = api_client.get_snapshots(Request(fields=snapshot_fields))
limited = all_snapshots[:3]
logger.info(json.dumps([Snapshot.to_json(s) for s in limited], indent=2))

# Get filtered snapshots (English)
ss_filter = {"in_language.identifier": "en"}
en_snapshots = api_client.get_snapshots(Request(fields=snapshot_fields, filters=ss_filter))
logger.info(f"Found {len(en_snapshots)} English snapshots")

# Get single snapshot
es_snapshot = api_client.get_snapshot("eswiki_namespace_0", Request(fields=snapshot_fields))
logger.info(json.dumps(Snapshot.to_json(es_snapshot), indent=2))

# Get HEAD metadata
target_id = "eswikibooks_namespace_0"
headers = api_client.head_snapshot(target_id)
content_length = headers.get('Content-Length', 0)
logger.info(f"Snapshot size: {content_length} bytes")

# Download snapshot into memory and report size and elapsed time
start_time = time.time()
with io.BytesIO() as buffer:
    api_client.download_snapshot(target_id, buffer)
    size_mb = buffer.getbuffer().nbytes / (1024 * 1024)
    logger.info(f"Downloaded {size_mb:.2f} MB in {time.time() - start_time:.2f}s")
```

### Response
#### Success Response (200)
- **snapshot** (object) - Details of the snapshot.
  - **identifier** (string) - Unique identifier of the snapshot.
  - **version** (string) - Version of the snapshot.
  - **date_modified** (string) - Date and time the snapshot was last modified.
  - **is_part_of** (string) - Identifier of the larger dataset this snapshot belongs to.
  - **in_language** (object) - Information about the language of the snapshot.
    - **identifier** (string) - Language code (e.g., "en").
  - **namespace** (string) - The namespace the snapshot pertains to.
  - **size** (integer) - The size of the snapshot in bytes.
#### Response Example (Single Snapshot)
```json
{
  "identifier": "eswiki_namespace_0",
  "version": "2023-01-01",
  "date_modified": "2023-01-01T12:00:00Z",
  "is_part_of": "wikidataset",
  "in_language": {"identifier": "es"},
  "namespace": "main",
  "size": 104857600
}
```

#### Response Example (HEAD Metadata)
```json
{
  "Content-Length": "104857600",
  "Content-Type": "application/x-ndjson"
}
```
```

--------------------------------

### Connect to Realtime Streaming API with Filters and Field Selection (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/streaming/README.md

This example demonstrates how to connect to the Realtime Streaming API using a POST request with field selection and filters. The JSON payload requests the `name`, `is_part_of.*`, and `event.*` fields and keeps only events whose `namespace.identifier` is 0.

```bash
POST https://realtime.enterprise.wikimedia.com/v2/articles
{
  "fields": [
    "name",
    "is_part_of.*",
    "event.*"
  ],
  "filters": [
    {
      "field": "namespace.identifier",
      "value": 0
    }
  ]
}
```

--------------------------------

### Reconnecting using Start Date per Partition ('since_per_partition')

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/streaming/README.md

This method is not recommended because of performance concerns, but it allows specifying a separate RFC3339-formatted date-time for each partition from which to resume streaming. Events are received starting from the specified time for each partition. As with 'offsets', irrelevant partitions in the map are ignored, and the total number of specified partitions cannot exceed 50.
```json
{
  "parts": [0],
  "since_per_partition": {"1": "2023-06-05T12:00:00Z", "5": "2023-06-05T12:00:00Z"}
}
```

--------------------------------

### Get Specific SC Snapshot Metadata (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/structured-contents-snapshots/README.MD

Retrieves metadata for a single Structured-Contents snapshot using its unique identifier. The identifier typically includes the project and namespace.

```bash
POST https://api.enterprise.wikimedia.com/v2/snapshots/structured-contents/enwiki_namespace_0
```

--------------------------------

### Download Single SC Snapshot (Bash)

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/structured-contents-snapshots/README.MD

Initiates the download of a specific Structured-Contents snapshot. The request uses the snapshot's identifier and appends the '/download' path to the URL.

```bash
GET https://api.enterprise.wikimedia.com/v2/snapshots/structured-contents/enwiki_namespace_0/download
```

--------------------------------

### GET /v2/languages

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/metadata/README.md

Retrieves a list of all languages supported by the Wikimedia Enterprise API, including their metadata. This endpoint supports filtering and field selection for customized responses.

```APIDOC
## GET /v2/languages

### Description
Retrieves a list of all supported languages with their metadata. Supports filtering and field selection.

### Method
GET

### Endpoint
https://api.enterprise.wikimedia.com/v2/languages

### Parameters
#### Query Parameters
- **filter** (string) - Optional - Allows filtering the results based on specific criteria.
- **fields** (string) - Optional - Specifies which fields to include in the response.
### Request Example
```bash
GET https://api.enterprise.wikimedia.com/v2/languages
```

### Response
#### Success Response (200)
- **identifier** (string) - The unique identifier for the language.
- **name** (string) - The common name of the language.
- **alternate_name** (string) - An alternative name for the language.
- **direction** (string) - The text directionality of the language (e.g., 'ltr' for left-to-right, 'rtl' for right-to-left).

#### Response Example
```json
[
  {
    "identifier": "cv",
    "name": "Chuvash",
    "alternate_name": "чӑвашла",
    "direction": "ltr"
  },
  {
    "identifier": "id",
    "name": "Indonesian",
    "alternate_name": "Bahasa Indonesia",
    "direction": "ltr"
  }
]
```
```

--------------------------------

### Download a Batch File

Source: https://github.com/wikimedia-enterprise/wme-sdk-python/blob/main/example/batches/README.md

Download the content of a batch file. Supports parallel downloads using the Range header for efficient data transfer.

```APIDOC
## GET /v2/batches/{date}/{hour}/{batch_identifier}/download

### Description
Downloads the content of a specific batch file. You can use the `Range` header to download parts of the file in parallel, which is highly recommended for large files.

### Method
GET

### Endpoint
`https://api.enterprise.wikimedia.com/v2/batches/{date}/{hour}/{batch_identifier}/download`

### Parameters
#### Path Parameters
- **date** (string) - Required - The date in YYYY-MM-DD format.
- **hour** (string) - Required - The hour in HH format (00-23).
- **batch_identifier** (string) - Required - The identifier of the specific batch (e.g., `afwikibooks_namespace_0`).

#### Request Headers
- **Range** (string) - Optional - Specifies the byte range to download (e.g., `bytes=0-20`). If omitted, the entire file is downloaded.
### Request Example (Partial Download)
```json
{
  "Range": "bytes=0-20"
}
```

### Response
#### Success Response (200 or 206 Partial Content)
- The response body contains the binary content of the batch file or the specified byte range.
```
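
The parallel download described above relies on splitting the file into non-overlapping byte ranges. The following is a minimal sketch of that planning step only, using the standard library; `plan_ranges` is a hypothetical helper name that is not part of the SDK, and the total size is assumed to come from the `Content-Length` header of a prior HEAD request.

```python
from typing import Dict, List


def plan_ranges(content_length: int, chunk_size: int) -> List[Dict[str, str]]:
    """Build one Range header per chunk, covering bytes 0..content_length-1.

    Hypothetical helper for illustration; not part of the WME SDK.
    """
    if content_length < 0:
        raise ValueError("content_length must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    headers: List[Dict[str, str]] = []
    start = 0
    while start < content_length:
        # HTTP byte ranges are inclusive on both ends, hence the -1.
        end = min(start + chunk_size, content_length) - 1
        headers.append({"Range": f"bytes={start}-{end}"})
        start = end + 1
    return headers


if __name__ == "__main__":
    # A 37-byte file split into chunks of at most 21 bytes.
    for header in plan_ranges(37, 21):
        print(header)
```

Each returned header could then be sent with a separate GET request to the download endpoint, and the partial bodies written back at their starting offsets. Retry handling and the actual HTTP calls are left out of this sketch.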