### Sample Logout HTTP GET Request

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-sessions_and_authentication-authorization_code_grant_flow

This is an example of an HTTP GET request to log out a user. It requires the access token and optionally accepts a return URI for redirection after logout. This action invalidates the session.

```http
GET https://accounts.dowjones.com/oauth2/v1/logout?
  access_token=AUTHZ_ACCESS_TOKEN_RECEIVED&
  return_uri=RETURN_URI
```

--------------------------------

### Python: Create and Monitor Explain Query and Snapshot

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-data_selection_samples-international_travel

This Python script demonstrates how to create an 'explain' query to validate a search and estimate document count, then proceeds to create a 'snapshot' for data extraction. It includes logic to poll the status of asynchronous jobs until completion and download resulting files. Dependencies include the 'requests', 'json', and 'os' libraries.

```python
import requests
import json
import os
from time import sleep

# Assuming headers, url, and request_body are defined elsewhere
# Example placeholders:
headers = {"Authorization": "Bearer YOUR_API_KEY"}
url = "https://documents.dowjones.com/api/v1/explain"
request_body = {"query": "example query"}

# Check the Explain to verify that the query was valid and see how many documents would be returned
response = requests.post(url, data=json.dumps(request_body), headers=headers)

if response.status_code != 201:
    print("ERROR: An error occurred creating an explain: " + response.text)
else:
    explain = response.json()
    print("Explain Created. Job ID: " + explain["data"]["id"])
    state = explain["data"]["attributes"]["current_state"]

    # Wait for Explain job to complete
    while state != "JOB_STATE_DONE":
        self_link = explain["links"]["self"]
        response = requests.get(self_link, headers=headers)
        explain = response.json()
        state = explain["data"]["attributes"]["current_state"]
        print(f"Current state: {state}. Waiting...")
        sleep(10)  # Wait for 10 seconds before checking again

    print("Explain Completed Successfully.")
    doc_count = explain["data"]["attributes"]["counts"]
    print("Number of documents returned: " + str(doc_count))
    print("Proceed with the Snapshot? (Y/N)")
    proceed = input('> ')

    if proceed.lower() != 'y' and proceed.lower() != "yes":
        print("Not proceeding with extraction")
    else:
        # Create a Snapshot with the given query
        snapshot_url = "https://documents.dowjones.com/api/v1/snapshots"
        print("Creating the Snapshot: " + json.dumps(request_body))
        response = requests.post(snapshot_url, data=json.dumps(request_body), headers=headers)

        # Verify that the response from creating an extraction is OK
        if response.status_code != 201:
            print("ERROR: An error occurred creating an extraction: " + response.text)
        else:
            extraction = response.json()
            print("Extraction Created. Job ID: " + extraction['data']['id'])
            self_link = extraction["links"]["self"]
            print("Waiting for extraction job to complete...")
            sleep(30)  # Initial wait before first status check

            while True:
                # Call the second endpoint, which will verify if the extraction is ready.
                status_response = requests.get(self_link, headers=headers)

                # Verify that the response from the self_link is OK
                if status_response.status_code != 200:
                    print("ERROR: an error occurred getting the details for the extraction: " + status_response.text)
                    break
                else:
                    status = status_response.json()

                    if 'current_state' in status['data']['attributes']:
                        currentState = status['data']['attributes']['current_state']
                        print(f"Current state is: {currentState}")

                        # Job is still running, sleep for 30 seconds
                        if currentState in ["JOB_STATE_RUNNING", "JOB_VALIDATING", "JOB_QUEUED", "JOB_CREATED"]:
                            print("Sleeping for 30 seconds...")
                            sleep(30)
                        # If currentState is JOB_STATE_DONE, then the process completed successfully.
                        elif currentState == "JOB_STATE_DONE":
                            print("Job completed successfully")
                            print("Downloading Snapshot files to current directory")
                            if 'files' in status['data']['attributes']:
                                for file in status['data']['attributes']['files']:
                                    filepath = file['uri']
                                    parts = filepath.split('/')
                                    filename = parts[len(parts) - 1]
                                    r = requests.get(file['uri'], stream=True, headers=headers)
                                    dir_path = os.path.dirname(os.path.realpath(__file__))
                                    full_filepath = os.path.join(dir_path, filename)
                                    with open(full_filepath, 'wb') as fd:
                                        for chunk in r.iter_content(chunk_size=128):
                                            fd.write(chunk)
                                    print(f"Downloaded: {filename}")
                            break
                        # If job has another state, that means it was not successful.
                        else:
                            print("An error occurred with the job. Final state is: " + currentState)
                            break
                    else:
                        # If current_state does not yet exist in the response, we will sleep for 30 seconds
                        print("Current state not found. Sleeping for 30 seconds...")
                        sleep(30)
```

--------------------------------

### Sample AuthZ Access Token Response

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-sessions_and_authentication-authorization_code_grant_flow

This is an example of a successful response when requesting an AuthZ Access Token.
It returns a JSON object containing the access_token, token_type, and expires_in duration.

```json
{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6Ik0wRkNRakJUaENOek5CTVRGRU56TkVPQSJ9.eyJwaWIiOnsibWFzdGVyX2FwcF9pZCI6IkpiOEU0UGxiaGRCMmFOWURRMEE4NFVyQVQ0TmRqRjA4Iiwic2Vzc2lvbl9pZCI6IjI3MTQ0WHhYX2NkYzVkNzE0LWQ1MGEtNDAzMy1lOGM2LWNjMGZiNDU5Njc3NCIsImFsbG93X2F1dG9fb9naW4iOiJ0cnVlIiwiYXBjIjoiOSIsImN0YyI6IkQifSwiaXNzIjoiaHR0cHM6Ly9hdXRoLmludC5hY2vdW50cy5kb3dqb25lcy5jb20vIiwic3ViIjoiYXV0aDB8Vk5KUUVQSlYzUUJEMk5QWSIsImF1ZCI6IkpiOEU0UGxiaGRCMmFOWURRMEE4NFVyQVQ0TmRqRjA4IiiZXhwIjoxNTAzODQ2NTU3LCJpYXQiOjE1MDM1ODczNTcsImF6cCI6IkpiOEU0UGxiaGRCMmFOWURRMEE4NFVyQVQ0TmRqRjA4In0.AdGy4iNtRnB1sEUOdx8iqWaCiJS0MkGOrRCt6SsDl3HyxLa5SoNczb2rCu9x7fYbyDnjKUn0ZkLHDS_DDyio6JrJ5qXF9p07IGhKhouDW1ouX6GEZ_LyTsJ7gFK0830N_VjBMFJcDiTOQ89Pz8QwaNlrkKgjq11bEVOxSsiWFzjDAhB23fUiIN6Fn8ABezySZhDzWOM87H7fG2t8gOlC0aPRwAHGZvyrUopApyK2G7v6ODyvD6S5ghqAmqB_BsgAyr4urvGg2euH5MNCCepclK09BMgb9KqoNoFQe0Q34H9wzjFlu1FPWP-GZm3cgJZYqhx7G4ih12FVOMA",
  "token_type": "Bearer",
  "expires_in": 259200
}
```

--------------------------------

### Create and Monitor Explain Request - Python

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-data_selection_samples-2020_us_presidential_election

This Python code snippet demonstrates how to create an 'explain' request to verify a query and check the number of documents that would be returned. It includes polling the job status until completion and printing the document count. Dependencies: `requests`, `json`, `time.sleep`, `os`.

```python
import requests
import json
from time import sleep
import os

# Assuming 'headers', 'url', and 'request_body' are defined elsewhere, and that
# 'response' holds the result of the POST request that created the Explain

# Check the Explain to verify that the query was valid and see how many documents would be returned
if response.status_code != 201:
    print("ERROR: An error occurred creating an explain: " + response.text)
else:
    explain = response.json()
    print("Explain Created. Job ID: " + explain["data"]["id"])
    state = explain["data"]["attributes"]["current_state"]

    # Wait for Explain job to complete
    while state != "JOB_STATE_DONE":
        self_link = explain["links"]["self"]
        response = requests.get(self_link, headers=headers)
        explain = response.json()
        state = explain["data"]["attributes"]["current_state"]
        sleep(10)  # Pause between status checks instead of polling in a tight loop

    print("Explain Completed Successfully.")
    doc_count = explain["data"]["attributes"]["counts"]
    print("Number of documents returned: " + str(doc_count))
    print("Proceed with the Snapshot? (Y/N)")
    proceed = input('> ')

    if proceed.lower() != 'y' and proceed.lower() != "yes":
        print("Not proceeding with extraction")
    else:
        # Create a Snapshot with the given query
        print("Creating the Snapshot: " + json.dumps(request_body))
        response = requests.post(url, data=json.dumps(request_body), headers=headers)
        print(response.text)

        # Verify that the response from creating an extraction is OK
        if response.status_code != 201:
            print("ERROR: An error occurred creating an extraction: " + response.text)
        else:
            extraction = response.json()
            print(extraction)
            print("Extraction Created. Job ID: " + extraction['data']['id'])
            self_link = extraction["links"]["self"]
            sleep(30)
            print("Checking state of the job.")

            while True:
                # Call the second endpoint, which will verify if the extraction is ready.
                status_response = requests.get(self_link, headers=headers)

                # Verify that the response from the self_link is OK
                if status_response.status_code != 200:
                    print("ERROR: an error occurred getting the details for the extraction: " + status_response.text)
                    break  # Stop polling instead of retrying immediately in a tight loop
                else:
                    # There is an edge case where the job does not have a current_state yet.
                    # If current_state does not yet exist in the response, we will sleep for 30 seconds
                    status = status_response.json()

                    if 'current_state' in status['data']['attributes']:
                        currentState = status['data']['attributes']['current_state']
                        print("Current state is: " + currentState)

                        # Job is still running, sleep for 30 seconds
                        if currentState == "JOB_STATE_RUNNING":
                            print("Sleeping for 30 seconds... Job state running")
                            sleep(30)
                        elif currentState == "JOB_VALIDATING":
                            print("Sleeping for 30 seconds... Job validating")
                            sleep(30)
                        elif currentState == "JOB_QUEUED":
                            print("Sleeping for 30 seconds... Job queued")
                            sleep(30)
                        elif currentState == "JOB_CREATED":
                            print("Sleeping for 30 seconds... Job created")
                            sleep(30)
                        else:
                            # If currentState is JOB_STATE_DONE, then the process completed successfully.
                            if currentState == "JOB_STATE_DONE":
                                print("Job completed successfully")
                                print("Downloading Snapshot files to current directory")
                                for file in status['data']['attributes']['files']:
                                    filepath = file['uri']
                                    parts = filepath.split('/')
                                    filename = parts[len(parts) - 1]
                                    r = requests.get(file['uri'], stream=True, headers=headers)
                                    dir_path = os.path.dirname(os.path.realpath(__file__))
                                    filename = os.path.join(dir_path, filename)
                                    with open(filename, 'wb') as fd:
                                        for chunk in r.iter_content(chunk_size=128):
                                            fd.write(chunk)
                            # If job has another state, that means it was not successful.
                            else:
                                print("An error occurred with the job. Final state is: " + currentState)
                            break
                    else:
                        print("Sleeping for 30 seconds...")
                        sleep(30)
```

--------------------------------

### GET Request Example (Dow Jones Calendar API)

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-api_essentials-getting_a_response

This example demonstrates how to make a GET request to the Dow Jones Calendar API to search for events, including query parameters for filtering.

```APIDOC
## GET /calendar-events/search

### Description
Retrieves a list of calendar events from the Dow Jones Calendar API with specified filters.

### Method
GET

### Endpoint
`https://api.dowjones.com/calendar-events/search`

### Query Parameters
- **filter[has_confirmed_events_only]** (boolean) - Optional - Filters for events that have confirmed occurrences.
- **filter[region]** (string) - Optional - Filters events by a specific region (e.g., 'AS' for Asia).
- **filter[event_class]** (string) - Optional - Filters events by their class (e.g., 'IEP_STAT').

### Request Example
```bash
curl -X GET --header 'Accept: application/json' 'https://api.dowjones.com/calendar-events/search?filter[has_confirmed_events_only]=true&filter[region]=AS&filter[event_class]=IEP_STAT'
```

### Response
#### Success Response (200)
- **(structure depends on API response)** - Description of the event data returned.
```

--------------------------------

### Create and Monitor Snapshot - Python

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-data_selection_samples-artificial_intelligence

This Python snippet demonstrates how to create a 'snapshot' using the API after a successful explain request. It includes polling the job status, handling different job states, and downloading the resulting files upon completion.
```python
import requests
import json
from time import sleep
import os

# Assuming headers, url, request_body, proceed are defined elsewhere
# Example placeholder values:
headers = {"Content-Type": "application/json", "Authorization": "Bearer YOUR_API_KEY"}
url = "https://documents.dowjones.com/api/v1/snapshots"
request_body = {"query": "your_query_here", "explain_id": "explain_job_id_here"}  # explain_id from previous step
proceed = 'y'  # Assume user proceeded

if proceed.lower() != 'y' and proceed.lower() != "yes":
    print("Not proceeding with extraction")
else:
    # Create a Snapshot with the given query
    print("Creating the Snapshot: " + json.dumps(request_body))
    response = requests.post(url, data=json.dumps(request_body), headers=headers)

    # Verify that the response from creating an extraction is OK
    if response.status_code != 201:
        print("ERROR: An error occurred creating an extraction: " + response.text)
    else:
        extraction = response.json()
        print(extraction)
        print("Extraction Created. Job ID: " + extraction['data']['id'])
        self_link = extraction["links"]["self"]
        sleep(30)  # Initial wait before first status check
        print("Checking state of the job.")

        while True:
            # Call the second endpoint, which will verify if the extraction is ready.
            status_response = requests.get(self_link, headers=headers)

            # Verify that the response from the self_link is OK
            if status_response.status_code != 200:
                print("ERROR: an error occurred getting the details for the extraction: " + status_response.text)
                break
            else:
                status = status_response.json()

                if 'current_state' in status['data']['attributes']:
                    currentState = status['data']['attributes']['current_state']
                    print("Current state is: " + currentState)

                    # Job is still running, sleep for 30 seconds
                    if currentState in ["JOB_STATE_RUNNING", "JOB_VALIDATING", "JOB_QUEUED", "JOB_CREATED"]:
                        print(f"Sleeping for 30 seconds... Job state {currentState}")
                        sleep(30)
                    else:
                        # If currentState is JOB_STATE_DONE, then the process completed successfully.
                        if currentState == "JOB_STATE_DONE":
                            print("Job completed successfully")
                            print("Downloading Snapshot files to current directory")
                            for file in status['data']['attributes']['files']:
                                filepath = file['uri']
                                parts = filepath.split('/')
                                filename = parts[len(parts) - 1]
                                r = requests.get(file['uri'], stream=True, headers=headers)
                                dir_path = os.path.dirname(os.path.realpath(__file__))
                                filename = os.path.join(dir_path, filename)
                                with open(filename, 'wb') as fd:
                                    for chunk in r.iter_content(chunk_size=128):
                                        fd.write(chunk)
                            print("All files downloaded.")
                        # If job has another state, that means it was not successful.
                        else:
                            print("An error occurred with the job. Final state is: " + currentState)
                        break
                else:
                    # Handle cases where 'current_state' might not be immediately available
                    print("Current state not yet available. Sleeping for 30 seconds...")
                    sleep(30)
```

--------------------------------

### Python: Create and Monitor Explain Request

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-data_selection_samples-commodities

This snippet shows how to create an 'explain' request to verify a query and check the number of documents that would be returned. It includes logic to poll the API until the job is complete and then prints the document count. Dependencies include the 'requests' library.
```python
import requests
import json
from time import sleep
import os

# Assuming 'headers', 'request_body', 'url' are defined elsewhere
# Example placeholders:
headers = {"Content-Type": "application/json", "Authorization": "Bearer YOUR_API_KEY"}
request_body = {"query": "your_query_here"}
url = "https://documents.dowjones.com/api/v1/explain"

# Check the Explain to verify that the query was valid and see how many documents would be returned
response = requests.post(url, data=json.dumps(request_body), headers=headers)

if response.status_code != 201:
    print("ERROR: An error occurred creating an explain: " + response.text)
else:
    explain = response.json()
    print("Explain Created. Job ID: " + explain["data"]["id"])
    state = explain["data"]["attributes"]["current_state"]

    # Wait for Explain job to complete
    while state != "JOB_STATE_DONE":
        self_link = explain["links"]["self"]
        response = requests.get(self_link, headers=headers)
        explain = response.json()
        state = explain["data"]["attributes"]["current_state"]
        print(f"Current state: {state}. Waiting...")
        sleep(10)  # Wait for 10 seconds before polling again

    print("Explain Completed Successfully.")
    doc_count = explain["data"]["attributes"]["counts"]
    print("Number of documents returned: " + str(doc_count))
    print("Proceed with the Snapshot? (Y/N)")
    proceed = input('> ')

    if proceed.lower() != 'y' and proceed.lower() != "yes":
        print("Not proceeding with extraction")
    else:
        # Create a Snapshot with the given query
        snapshot_url = "https://documents.dowjones.com/api/v1/snapshots"
        print("Creating the Snapshot: " + json.dumps(request_body))
        response = requests.post(snapshot_url, data=json.dumps(request_body), headers=headers)
        print(response.text)

        # Verify that the response from creating an extraction is OK
        if response.status_code != 201:
            print("ERROR: An error occurred creating an extraction: " + response.text)
        else:
            extraction = response.json()
            print(extraction)
            print("Extraction Created. Job ID: " + extraction['data']['id'])
            self_link = extraction["links"]["self"]
            sleep(30)
            print("Checking state of the job.")

            while True:
                # Call the second endpoint, which will verify if the extraction is ready.
                status_response = requests.get(self_link, headers=headers)

                # Verify that the response from the self_link is OK
                if status_response.status_code != 200:
                    print("ERROR: an error occurred getting the details for the extraction: " + status_response.text)
                    break  # Stop polling instead of retrying immediately in a tight loop
                else:
                    status = status_response.json()

                    if 'current_state' in status['data']['attributes']:
                        currentState = status['data']['attributes']['current_state']
                        print("Current state is: " + currentState)

                        if currentState in ["JOB_STATE_RUNNING", "JOB_VALIDATING", "JOB_QUEUED", "JOB_CREATED"]:
                            print("Sleeping for 30 seconds... Job state is " + currentState)
                            sleep(30)
                        elif currentState == "JOB_STATE_DONE":
                            print("Job completed successfully")
                            print("Downloading Snapshot files to current directory")
                            for file in status['data']['attributes']['files']:
                                filepath = file['uri']
                                parts = filepath.split('/')
                                filename = parts[len(parts) - 1]
                                r = requests.get(file['uri'], stream=True, headers=headers)
                                dir_path = os.path.dirname(os.path.realpath(__file__))
                                filename = os.path.join(dir_path, filename)
                                with open(filename, 'wb') as fd:
                                    for chunk in r.iter_content(chunk_size=128):
                                        fd.write(chunk)
                            print("Download complete.")
                            break
                        else:
                            print("An error occurred with the job. Final state is: " + currentState)
                            break
                    else:
                        print("Current state not found in response. Sleeping for 30 seconds...")
                        sleep(30)
```

--------------------------------

### Successful Authorization Response - Redirect URI

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-sessions_and_authentication-implicit_grant_flow

A successful OAuth 2.0 authorization response redirects the user to the specified `redirect_uri` with an authorization code or access token as a URL fragment.
This example shows the format of a successful response containing an access token and an ID token.

```url
REDIRECT_URI#access_token=7yaShH44VvmOAGYEKM7I5uPuWJqA&id_token=eyJ0eXAiOiJKV1QiLJhbGciOiJSUzI1iIsImtpZCI6Ik0wRkNRakJEUVRNVFUTTNSREUwTWpGRE5VWkNNekE0UWoR1JUaENOek5CTVRGRU56TkVPQSJ9.eyJwaWIiOnsic2lkIjoiQjRXd3lOYzOcDFlbWxmaGU2em5naFUtdjRVaDNkREMiLCJrbWxpIjoiMCIsImlsIjoiZW4ifSwiaXNzIjoiaHR0cHM6Ly9hdXRoLmludC5hY2NvdW5cy5kb3dqb25lcy5jb20vIiwic3ViIjoiYXV0aDB8Vk5KUUVQSlYzUUJEMk5QWSIsImF1ZCI6IkpiOEU0UGxiaGRCMmFOWURRMEE4NFVyQVQ0TmRqRjA4IiwiZXhwIjoxNTA5MDEyNTY0LCJpYXQiOjE1MDg3NTMzNjQsIm5vbmNlIjoiYWJjNTA2MTciLCJhdF9oYXNoIjoieHI4eldFbXFNOTMwaWZndW9jbDNfQSJ9.Vw86vK1fH75RBgLbHr3QlxilVvAKB5mfTKixXO2m9QmS2f3k9lDliGlu7aGQ1CKGLTXcqjdFHXLe2TicKvZkCKooac5iIEGcsTnL-2wMfIaRVU9juqJas24GUzt2EkzOVtzSfVtalIzoROnSQWaQ9N5NaJ9909r9sD59La9yDPuXAEWH97FQYTw1FjNn8t6rJbKhuwaSS2sMO9b4qjd1_HH_S-HhBWdfVTmYS0pC2LuMhSCpVUWTFmwQaMSJvF1iF9VIbAjc2nHoC5ELt4Ifjz_Pd2UiM1owxVS4mtHbgzevdTMJCySKJGhFGijnwEkrTmIS6eaW71d4ncKqBlQ&token_type=Bearer&state=abc
```

--------------------------------

### Searching DistDoc Metadata Fields and Attributes (UQL)

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-building_queries-unified_search

Demonstrates how to query DistDoc metadata fields and attributes using UQL. It shows how to refer to XML elements as paths and attributes using the '@' symbol, providing examples for various search criteria.
```plaintext
Example XML Structure:
free text

Example Queries:
* `Code:(@value:c181 and codeI:@org:categoriser not codeI:@org:rbc)`
* `Code:(@value:att and codeI:@org:map-isin and @subcat:com)`
* `CodeSet:(@codeCat:in not codeI:(@org:categoriser or @org:rbc))`
* `Code:(@value="canral" and @subcat="com" and @why="about")`
* `Code:(@rs:"88" and @rr:"high")`
* `CodeSet:(@codeCat=("fpco" or "fppe") and code:@rr:"veryhigh")`
* `code:(@value:att and codeI:@org:map-isin) and company`
* `code:(@value:att and codeI:@org:map-isin) and la:en`
* `code:(@value:att and codeI:@org:map-isin) and hlp:company`
```

--------------------------------

### Logout Endpoint (HTTP GET)

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-sessions_and_authentication-implicit_grant_flow

Use the GET /logout endpoint to invalidate the PIB session ID and delete the browser auto-login session. This requires the AuthZ Access token and should be performed on a user agent.

```HTTP
GET https://accounts.dowjones.com/oauth2/v1/logout?access_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6Ik0wRkNRakJEUVRNeVFUTTNSREUwTWpGRE5VWkNNekE0UWpoR1JUaENOek5CTVRGRU56TkVPQSJ9.eyJwaWIiOnsibWFzdGVyX2FwcF9pZCI6IkpiOEU0UGxiaGRCMmFOWURRMEE4NFVyQVQ0TmRqRjA4Iiwic2Vzc2lvbl9pZCI6IjI3MTQzWHhYXFmMjNjNjY1LTE5YzEtY2VkYy1hMmFkLWE1NDI5MTQ2ZmFmZiIsImFsbG93X2F1dG9fbG9naW4iOiJ0cnVlIiwiYXBjIjoiOSIsImN0YyI6IkQifSwiaXNzIjoiaHR0cHM6y9hdXRoLmludC5hY2NvdW50cy5kb3dqb25lcy5jb20vIiwic3ViIjoiYXV0aDB8Vk5KUUVQSlYzUUJEMk5QWSIsImF1ZCI6IkpiOEU0UGxiaGRCMmFOWURRMEE4NFVyQVQ0TmRqRjA4IiwiZXhwIjoxNA0NTM0NDg5LCJpYXQiOjE1MDQyNzUyODksImF6cCI6IkpiOEU0UGiaGRCMmFOWn0.R01ncqftu4amc95cleTKPkWH4AtxzcLMvHmZzLxO062Mjn3Z0E5QVvVXCBPfvmD8KACE5mqNisvVEZQfI35h0Nm63veRG49LuCL9HwNYTZOt4oVrID4Z2tik0ChkSy7eZ5b1hO26VbU7J5s1Ih3yvdhyaXyvfLec4fZpuFszyooOLTHr_agk2Pdddkuacr2kQJLGSTOfAiNkR0jTR2kn0hUz_Yf6RaqGd6Y3leByzyEVMojjvY7zMiRMtKZ5Q_FbtIlFt1Ps2tcTf0PtAaZDahpSNyb7_DzRZRnw2dbnG8wBYuZV0b8lCK1maxELhJ8I_d1GzgzB5eyIqAqdZEw&return_uri=RETURN_URI
```
-------------------------------- ### Query Construction Examples Source: https://developer.dowjones.com/documents/site-docs-getting_started/-data_selection_samples-financial_advisors Demonstrates two different query constructions using the 'OR' and 'AND' operators to retrieve varying sets of results. ```APIDOC ## Query Construction Examples ### Description This section illustrates how to construct queries for the Factiva Snapshots API, showcasing the difference between using `OR` and `AND` operators. ### Query 1: Using `OR` Operator This query retrieves articles that match at least one of the specified search terms, subject codes, OR industry codes, combined with other criteria. ```json { "query": { "where": "( ( REGEXP_CONTAINS( CONCAT(title, ' ', IFNULL(snippet, ''), ' ', IFNULL(section, ''), ' ', IFNULL(body, '')), r'(?i)(\b)(financial\W+services|private\W+banking|wealth\W+management|risk\W+management|investing|securities|financial\W+investments|investment\W+advice|financial\W+performance|financial\W+advisors)(\b)') OR REGEXP_CONTAINS(LOWER(industry_codes), r'(^|,)(ifinal|iwealth|i83109|iinv|i831|i83108)($|,)' ) ) OR REGEXP_CONTAINS(LOWER(subject_codes), r'(^|,)(c15|gfnpl)($|,)' ) ) AND REGEXP_CONTAINS(LOWER(restrictor_codes), r'(^|,)(wsjo|bon|mrkwc)($|,)') AND language_code='en' AND publication_date >= '2019-01-01 00:00:00' ) } } ``` ### Query 2: Using `AND` Operator This query is more specific, retrieving articles that match at least one of the search terms, at least one of the subject codes, AND at least one of the industry codes, combined with other criteria. 
```json { "query": { "where": "( ( REGEXP_CONTAINS( CONCAT(title, ' ', IFNULL(snippet, ''), ' ', IFNULL(section, ''), ' ', IFNULL(body, '')), r'(?i)(\b)(financial\W+services|private\W+banking|wealth\W+management|risk\W+management|investing|securities|financial\W+investments|investment\W+advice|financial\W+performance|financial\W+advisors)(\b)') AND REGEXP_CONTAINS(LOWER(industry_codes), r'(^|,)(ifinal|iwealth|i83109|iinv|i831|i83108)($|,)' ) ) AND REGEXP_CONTAINS(LOWER(subject_codes), r'(^|,)(c15|gfnpl)($|,)' ) ) AND REGEXP_CONTAINS(LOWER(restrictor_codes), r'(^|,)(wsjo|bon|mrkwc)($|,)') AND language_code= 'en' AND publication_date >= '2019-01-01 00:00:00' ) } } ``` ### Explanation Query 1 yields more results due to the `OR` operator, while Query 2 is more restrictive because of the `AND` operator. ``` -------------------------------- ### POST Request Example (Factiva Streams API) Source: https://developer.dowjones.com/documents/site-docs-getting_started/-api_essentials-getting_a_response This example shows how to make a POST request to the Factiva Streams API using Python, including authentication with a user key. ```APIDOC ## POST /alpha/streams ### Description Sends data to the Factiva Streams API, likely for creating or updating stream configurations or sending stream data. ### Method POST ### Endpoint `https://api.dowjones.com/alpha/streams` ### Parameters #### Request Body - **(structure depends on API requirements)** - Required - The payload containing data for the streams API. #### Headers - **content-type**: `application/json` - **user-key**: `{USER_KEY}` - Required - Your API key for authentication. ### Request Example ```python import requests import json streams_url = "https://api.dowjones.com/alpha/streams" request_body = { ... 
} # Define your request body here response = requests.post(streams_url, data=json.dumps(request_body), headers={'content-type': 'application/json', 'user-key': '{USER_KEY}'}) ``` ### Response #### Success Response (200/201) - **(structure depends on API response)** - Description of the success response, which might confirm the operation or return created/updated data. ``` -------------------------------- ### Check Explain Query and Create Snapshot - Python Source: https://developer.dowjones.com/documents/site-docs-getting_started/-data_selection_samples-earnings_reports This Python code snippet demonstrates how to check the result of an 'explain' query to verify its validity and the number of documents it would return. It then proceeds to create a 'snapshot' for data extraction if the user confirms. The code handles API requests, response parsing, and user input for confirmation. ```python if response.status_code != 201: print("ERROR: An error occurred creating an explain: " + response.text) else: explain = response.json() print("Explain Created. Job ID: " + explain["data"]["id"]) state = explain["data"]["attributes"]["current_state"] # Wait for Explain job to complete while state != "JOB_STATE_DONE": self_link = explain["links"]["self"] response = requests.get(self_link, headers=headers) explain = response.json() state = explain["data"]["attributes"]["current_state"] print("Explain Completed Successfully.") doc_count = explain["data"]["attributes"]["counts"] print("Number of documents returned: " + str(doc_count)) print("Proceed with the Snapshot? 
(Y/N)") proceed = input('> ') if proceed.lower() != 'y' and proceed.lower() != "yes": print("Not proceeding with extraction") else: # Create a Snapshot with the given query print("Creating the Snapshot: " + json.dumps(request_body)) response = requests.post(url, data=json.dumps(request_body), headers=headers) print(response.text) # Verify the response from creating an extraction is OK if response.status_code != 201: print("ERROR: An error occurred creating an extraction: " + response.text) else: extraction = response.json() print(extraction) print("Extraction Created. Job ID: " + extraction['data']['id']) self_link = extraction["links"]["self"] sleep(30) print("Checking state of the job.") while True: # Call the second endpoint, which will verify if the extraction is ready. status_response = requests.get(self_link, headers=headers) # Verify that the response from the self_link is OK if status_response.status_code != 200: print("ERROR: an error occurred getting the details for the extraction: " + status_response.text) else: # There is an edge case where the job does not have a current_state yet. If current_state does not yet exist in the response, we will sleep for 10 seconds status = status_response.json() if 'current_state' in status['data']['attributes']: currentState = status['data']['attributes']['current_state'] print("Current state is: " + currentState) # Job is still running, sleep for 10 seconds if currentState == "JOB_STATE_RUNNING": print("Sleeping for 30 seconds... Job state running") sleep(30) elif currentState == "JOB_VALIDATING": print("Sleeping for 30 seconds... Job validating") sleep(30) elif currentState == "JOB_QUEUED": print("Sleeping for 30 seconds... Job queued") sleep(30) elif currentState == "JOB_CREATED": print("Sleeping for 30 seconds... Job created") sleep(30) else: # If currentState is JOB_STATE_DONE, then the process completed successfully. 
if currentState == "JOB_STATE_DONE": print("Job completed successfully") print("Downloading Snapshot files to current directory") for file in status['data']['attributes']['files']: filepath = file['uri'] parts = filepath.split('/') filename = parts[len(parts) - 1] r = requests.get(file['uri'], stream=True, headers=headers) dir_path = os.path.dirname(os.path.realpath(__file__)) filename = os.path.join(dir_path, filename) with open(filename, 'wb') as fd: for chunk in r.iter_content(chunk_size=128): fd.write(chunk) # If job has another state, that means it was not successful. else: print("An error occurred with the job. Final state is: " + currentState) break else: print("Sleeping for 30 seconds...") sleep(30) ``` -------------------------------- ### GET /authorize - Retrieve Authorization Code Source: https://developer.dowjones.com/documents/site-docs-getting_started/-sessions_and_authentication-authorization_code_grant_flow This endpoint is used to initiate the Authorization Code Grant flow by requesting an authorization code. This step must be performed in a user agent (browser). ```APIDOC ## GET /authorize ### Description Retrieves an authorization code by initiating the OAuth 2.0 Authorization Code Grant flow. This process involves user authentication and redirection to a specified URI with the code. ### Method GET ### Endpoint `https://accounts.dowjones.com/oauth2/v1/authorize` ### Parameters #### Query Parameters - **client_id** (string) - Required - Specifies the unique client identifier. - **connection** (string) - Optional - Specifies the custom name of the connection used to identify the user. Default value: `DJPIB`. - **device** (string) - Optional - Specifies if a Refresh token is requested. When requesting a Refresh token, set its value to any string (e.g., unique mobile device identifier). If not specified, the Refresh token is preserved under an `ignore` device string. 
- **redirect_uri** (string) - Required - Specifies the URL where, after successful authorization, the server sends the authorization code. This value must be whitelisted for the specified `client_id`. - **response_type** (string) - Required - Specifies the type of flow to execute. Use `code` for the Authorization Code grant. - **scope** (string) - Required - Specifies the scope returned in the AuthN ID token. Use `openid pib` to retrieve AuthN tokens, or `openid pib offline_access` to retrieve AuthN and Refresh tokens. - **state** (string) - Optional - Specifies the caller state that is propagated back to the `redirect_uri` on successful authentication. ### Request Example ``` GET https://accounts.dowjones.com/oauth2/v1/authorize?client_id=CLIENT_ID&redirect_uri=REDIRECT_URI&response_type=code&scope=openid%20pib%20offline_access&state=abc ``` ### Response #### Success Response (302 Found - Redirect) - The server redirects the user agent to the `redirect_uri` specified in the request, appending the authorization code as a query parameter. #### Response Example (Redirection to `REDIRECT_URI` with `code` and `state` query parameters) ``` -------------------------------- ### Factiva Snapshots API - Example Script Source: https://developer.dowjones.com/documents/site-docs-getting_started/-data_selection_samples-commodities This Python script demonstrates how to query the Factiva Snapshots API and download data. It requires user configuration for API key and desired download format. ```APIDOC ## POST /alpha/extractions/documents/_explain ### Description This endpoint is used to create an 'explain' for a given query, which is a prerequisite for executing the query and downloading data using the Factiva Snapshots API. The provided script uses this to validate and prepare the query before data extraction. 
### Method
POST

### Endpoint
https://api.dowjones.com/alpha/extractions/documents/_explain

### Parameters
#### Query Parameters
None

#### Request Body
- **query** (object) - Required - The query object containing search criteria.
  - **where** (string) - Required - The search query string using Factiva's query language.
  - **format** (string) - Optional - The desired format for the downloaded data. Allowed values: `avro` (default), `csv`, `json`.

### Request Example
```json
{
  "query": {
    "where": "(REGEXP_CONTAINS(LOWER(CONCAT(title,' ',snippet,' ',body)), r'(?i)(\\b)(investing|securities|online\\+brokers|after\\+hours\\+trading|post\\+trade\\+services|commodity|financial\\+market\\+news|commodity\\+markets|agricultural\\+commodity\\+markets|commodities\\+asset\\+class\\+news)(\\b)') OR REGEXP_CONTAINS(LOWER(industry_codes), r'(^|,) (iinv|i83105|icusto)($|,)') OR REGEXP_CONTAINS(LOWER(subject_codes), r'(^|,) (mcat|m14|m141|ncmac)($|,)')) AND REGEXP_CONTAINS(LOWER(restrictor_codes), r'(^|,) (wsjo|bon|mrkwc)($|,)') AND language_code='en' AND publication_date >= '2019-01-01 00:00:00'",
    "format": "csv"
  }
}
```

### Response
#### Success Response (200)
- **explain_id** (string) - An identifier for the created explain.
- **status** (string) - The status of the explain creation.

#### Response Example
```json
{
  "explain_id": "some_explain_id",
  "status": "success"
}
```

### Notes
- Ensure `USER_KEY` is set correctly in the script.
- The `format` variable determines the output file type (`avro`, `csv`, or `json`).
- After running the script, the data will be downloaded to the directory where the script is executed.
```

--------------------------------

### Logout Endpoint

Source: https://developer.dowjones.com/documents/site-docs-getting_started/-sessions_and_authentication-implicit_grant_flow

Use the GET or POST /logout endpoint to invalidate the PIB session ID and delete the browser auto-login session. This action must be performed on a user agent (browser).
```APIDOC
## POST /logout

### Description
Invalidates the PIB session ID contained in the AuthZ Access token and deletes the browser auto-login session established.

### Method
POST

### Endpoint
https://accounts.dowjones.com/oauth2/v1/logout

### Parameters
#### Query Parameters
- **access_token** (string) - Required - Specifies the AuthZ Access token returned in Step 2.
- **productname** (string) - Optional - Specifies the PIB product. Default value: `dna`.
- **return_uri** (string) - Optional - Specifies the URL to return to after logout. If not provided, the user is redirected to www.dowjones.com.

### Request Example
```
POST https://accounts.dowjones.com/oauth2/v1/logout?access_token=YOUR_ACCESS_TOKEN&return_uri=YOUR_RETURN_URI
```

### Response
#### Success Response (200)
A successful response redirects the user to the URL specified in the `return_uri` parameter, or to www.dowjones.com if `return_uri` is not provided.

## GET /logout

### Description
Invalidates the PIB session ID contained in the AuthZ Access token and deletes the browser auto-login session established.

### Method
GET

### Endpoint
https://accounts.dowjones.com/oauth2/v1/logout

### Parameters
#### Query Parameters
- **access_token** (string) - Required - Specifies the AuthZ Access token returned in Step 2.
- **productname** (string) - Optional - Specifies the PIB product. Default value: `dna`.
- **return_uri** (string) - Optional - Specifies the URL to return to after logout. If not provided, the user is redirected to www.dowjones.com.

### Request Example
```
GET https://accounts.dowjones.com/oauth2/v1/logout?access_token=YOUR_ACCESS_TOKEN&return_uri=YOUR_RETURN_URI
```

### Response
#### Success Response (200)
A successful response redirects the user to the URL specified in the `return_uri` parameter, or to www.dowjones.com if `return_uri` is not provided.
```
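
As a rough illustration of the GET /logout request described above, the following Python sketch assembles the logout URL from its query parameters. The helper name `build_logout_url`, the placeholder token, and the `https://example.com/signed-out` return URI are assumptions made for this example; only the endpoint and the parameter names (`access_token`, `productname`, `return_uri`) come from the reference above. Because logout must take place in a user agent, the sketch only builds the URL and does not send the request.

```python
from urllib.parse import urlencode

# Endpoint taken from the reference above.
LOGOUT_ENDPOINT = "https://accounts.dowjones.com/oauth2/v1/logout"


def build_logout_url(access_token, return_uri=None, productname=None):
    """Hypothetical helper: build a logout URL, omitting optional parameters that are None."""
    params = {"access_token": access_token}
    if productname is not None:
        params["productname"] = productname
    if return_uri is not None:
        params["return_uri"] = return_uri
    # urlencode percent-encodes the values, so a full URL is safe to pass as return_uri.
    return LOGOUT_ENDPOINT + "?" + urlencode(params)


# Placeholder values; substitute the AuthZ Access token obtained earlier in the flow.
url = build_logout_url("YOUR_ACCESS_TOKEN", return_uri="https://example.com/signed-out")
print(url)
```

One way to hand the resulting URL to a browser from a desktop script is the standard library's `webbrowser.open(url)`; in a web application, the more usual approach would be to redirect the user's browser to it.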