### Batch Request Examples

Source: https://api.semanticscholar.org/api-docs

Example URLs and payloads for batch paper retrieval with different field configurations.

```text
https://api.semanticscholar.org/graph/v1/paper/batch
{"ids":["649def34f8be52c8b66281af98ae884c09aef38b", "ARXIV:2106.15928"]}
```

```text
https://api.semanticscholar.org/graph/v1/paper/batch?fields=title,isOpenAccess,openAccessPdf,authors
{"ids":["649def34f8be52c8b66281af98ae884c09aef38b", "ARXIV:2106.15928"]}
```

--------------------------------

### Batch paper request examples

Source: https://api.semanticscholar.org/api-docs/graph

Example URLs and payloads for batch paper retrieval.

```http
https://api.semanticscholar.org/graph/v1/paper/batch
{"ids":["649def34f8be52c8b66281af98ae884c09aef38b", "ARXIV:2106.15928"]}
```

```http
https://api.semanticscholar.org/graph/v1/paper/batch?fields=title,isOpenAccess,openAccessPdf,authors
{"ids":["649def34f8be52c8b66281af98ae884c09aef38b", "ARXIV:2106.15928"]}
```

--------------------------------

### POST /recommendations/v1/papers

Source: https://api.semanticscholar.org/api-docs/recommendations

Get recommended papers based on a list of positive and negative example paper IDs.

```APIDOC
## POST /recommendations/v1/papers

### Description
Retrieves a list of recommended papers based on provided positive and negative example paper IDs. Returns up to a specified limit of recommendations.

### Method
POST

### Endpoint
https://api.semanticscholar.org/recommendations/v1/papers

### Parameters
#### Query Parameters
- **limit** (integer) - Optional - How many recommendations to return. Default: 100, Maximum: 500.
- **fields** (string) - Optional - A comma-separated list of the fields to be returned. If omitted, only paperId and title are returned.

#### Request Body
- **positivePaperIds** (Array of strings) - Required - List of paper IDs to use as positive examples.
- **negativePaperIds** (Array of strings) - Required - List of paper IDs to use as negative examples.

### Request Example
{
  "positivePaperIds": ["649def34f8be52c8b66281af98ae884c09aef38b"],
  "negativePaperIds": ["ArXiv:1805.02262"]
}

### Response
#### Success Response (200)
- **recommendedPapers** (Array) - List of recommended papers with requested fields.

#### Response Example
{
  "recommendedPapers": [
    {
      "paperId": "5c5751d45e298cea054f32b392c12c61027d2fe7",
      "title": "Construction of the Literature Graph in Semantic Scholar"
    }
  ]
}
```

--------------------------------

### Filter by Start Date

Source: https://api.semanticscholar.org/api-docs

Example of filtering papers published on or after a specific date.

```http
https://api.semanticscholar.org/graph/v1/author/1741101/papers?publicationDateOrYear=1981-08-25:
```

--------------------------------

### Field selection examples

Source: https://api.semanticscholar.org/api-docs/graph

Examples of the fields query parameter for customizing returned paper data.

```text
fields=title,url
fields=title,embedding.specter_v2
fields=title,authors,citations.title,citations.abstract
```

--------------------------------

### GET /recommendations/v1/papers/forpaper/{paper_id}

Source: https://api.semanticscholar.org/api-docs/recommendations

Retrieves a list of recommended papers based on a single positive example paper. You can specify query parameters to filter the recommendations.

```APIDOC
## GET /recommendations/v1/papers/forpaper/{paper_id}

### Description
Get recommended papers for a single positive example paper.

### Method
GET

### Endpoint
https://api.semanticscholar.org/recommendations/v1/papers/forpaper/{paper_id}

### Parameters
#### Path Parameters
- **paper_id** (string) - Required - The ID of the paper for which to get recommendations.

#### Query Parameters
- **from** (string) - Optional - Which pool of papers to recommend from. Default: "recent". Enum: "recent", "all-cs".
- **limit** (integer) - Optional - How many recommendations to return. Maximum 500. Default: 100. - **fields** (string) - Optional - A comma-separated list of the fields to be returned. If omitted, only `paperId` and `title` will be returned. Example: `title,url,authors`. ### Response #### Success Response (200) List of recommendations with default or requested fields. #### Error Response - **400**: Bad query parameters - **404**: Input papers not found ### Response Example (200) ```json { "recommendedPapers": [ { "paperId": "5c5751d45e298cea054f32b392c12c61027d2fe7", "corpusId": 215416146, "externalIds": { "MAG": "3015453090", "DBLP": "conf/acl/LoWNKW20", "ACL": "2020.acl-main.447", "DOI": "10.18653/V1/2020.ACL-MAIN.447", "CorpusId": 215416146 }, "url": "https://www.semanticscholar.org/paper/5c5751d45e298cea054f32b392c12c61027d2fe7", "title": "Construction of the Literature Graph in Semantic Scholar", "abstract": "We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery.", "venue": "Annual Meeting of the Association for Computational Linguistics", "publicationVenue": { "id": "1e33b3be-b2ab-46e9-96e8-d4eb4bad6e44", "name": "Annual Meeting of the Association for Computational Linguistics", "type": "conference", "alternate_names": [ "Annu Meet Assoc Comput Linguistics", "Meeting of the Association for Computational Linguistics", "ACL", "Meet Assoc Comput Linguistics" ], "url": "https://www.aclweb.org/anthology/venues/acl/" }, "year": 1997, "referenceCount": 59, "citationCount": 453, "influentialCitationCount": 90, "isOpenAccess": true, "openAccessPdf": { "url": "https://www.acl.org/anthology/2020.acl-main.447.pdf", "status": "HYBRID", "license": "CCBY", "disclaimer": "Notice: This snippet is extracted from the open access paper or abstract available at https://aclanthology.org/2020.acl-main.447, which is subject to the license by the author or copyright owner 
provided with this content. Please go to the source to verify the license and copyright information for your use." }, "fieldsOfStudy": [ "Computer Science" ], "s2FieldsOfStudy": [ { "category": "Computer Science", "source": "external" }, { "category": "Computer Science", "source": "s2-fos-model" }, { "category": "Mathematics", "source": "s2-fos-model" } ], "publicationTypes": [ "Journal Article", "Review" ], "publicationDate": "2024-04-29", "journal": { "volume": "40", "pages": "116 - 135", "name": "IETE Technical Review" }, "citationStyles": { "bibtex": "@['JournalArticle', 'Conference']{Ammar2018ConstructionOT,\n author = {Waleed Ammar and Dirk Groeneveld and Chandra Bhagavatula and Iz Beltagy and Miles Crawford and Doug Downey and Jason Dunkelberger and Ahmed Elgohary and Sergey Feldman and Vu A. Ha and Rodney Michael Kinney and Sebastian Kohlmeier and Kyle Lo and Tyler C. Murray and Hsu-Han Ooi and Matthew E. Peters and Joanna L. Power and Sam Skjonsberg and Lucy Lu Wang and Christopher Wilhelm and Zheng Yuan and Madeleine van Zuylen and Oren Etzioni},\n booktitle = {NAACL},\n pages = {84-91},\n title = {Construction of the Literature Graph in Semantic Scholar},\n year = {2018}\n}" }, "authors": [ { "authorId": "1741101", "name": "Oren Etzioni" } ] } ] }
```

--------------------------------

### Retrieve Available Releases

Source: https://api.semanticscholar.org/api-docs/datasets

Example JSON response for the list of available data releases.

```json
[
  "2022-01-17"
]
```

--------------------------------

### GET /datasets/v1/diffs/{start_release_id}/to/{end_release_id}/{dataset_name}

Source: https://api.semanticscholar.org/api-docs/datasets

Fetches download links for a dataset, showing changes between a start release and an end release.

```APIDOC
## GET /datasets/v1/diffs/{start_release_id}/to/{end_release_id}/{dataset_name}

### Description
Retrieves a list of download links for a specific dataset, detailing the differences between two release versions.
### Method
GET

### Endpoint
/datasets/v1/diffs/{start_release_id}/to/{end_release_id}/{dataset_name}

### Parameters
#### Path Parameters
- **start_release_id** (string) - Required - ID of the release held by the client.
- **end_release_id** (string) - Required - ID of the release the client wishes to update to, or 'latest' for the most recent release.
- **dataset_name** (string) - Required - Name of the dataset.

### Response
#### Success Response (200)
- **dataset** (string) - The name of the dataset.
- **start_release** (string) - The start release version.
- **end_release** (string) - The end release version.
- **diffs** (array) - A list of differences between releases.
  - **from_release** (string) - The starting release version for this diff entry.
  - **to_release** (string) - The ending release version for this diff entry.
  - **update_files** (array) - List of URLs for files that were updated.
  - **delete_files** (array) - List of URLs for files that were deleted.

### Response Example
```json
{
  "dataset": "papers",
  "start_release": "2023-08-01",
  "end_release": "2023-08-29",
  "diffs": [
    {
      "from_release": "2023-08-01",
      "to_release": "2023-08-07",
      "update_files": [
        "http://..."
      ],
      "delete_files": [
        "http://..."
      ]
    }
  ]
}
```
```

--------------------------------

### Retrieve Release Metadata

Source: https://api.semanticscholar.org/api-docs/datasets

Example JSON response for metadata describing a specific release.

```json
{
  "release_id": "2022-01-17",
  "README": "Subject to the following terms ...",
  "datasets": [
    {
      "name": "papers",
      "description": "Core paper metadata",
      "README": "This dataset contains ..."
    }
  ]
}
```

--------------------------------

### Response Samples

Source: https://api.semanticscholar.org/api-docs/graph

Examples of successful and error responses from the API.

```APIDOC
## Response Samples

### Success Response (200)
This is an example of a successful response containing author data.
#### Response Example ```json { "offset": 0, "next": 0, "data": [ { "contexts": [ "SciBERT (Beltagy et al., 2019) follows the BERT’s masking strategy to pre-train the model from scratch using a scientific corpus composed of papers from Semantic Scholar (Ammar et al., 2018).", "27M articles from the Semantic Scholar dataset (Ammar et al., 2018)." ], "intents": [ "methodology" ], "contextsWithIntent": [ { "context": "SciBERT (Beltagy et al., 2019) follows the BERT’s ...", "intents": [ "methodology" ] } ], "isInfluential": false, "citedPaper": { "paperId": "5c5751d45e298cea054f32b392c12c61027d2fe7", "corpusId": 215416146, "externalIds": { "MAG": "3015453090", "DBLP": "conf/acl/LoWNKW20", "ACL": "2020.acl-main.447", "DOI": "10.18653/V1/2020.ACL-MAIN.447", "CorpusId": 215416146 }, "url": "https://www.semanticscholar.org/paper/5c5751d45e298cea054f32b392c12c61027d2fe7", "title": "Construction of the Literature Graph in Semantic Scholar", "abstract": "We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery.", "venue": "Annual Meeting of the Association for Computational Linguistics", "publicationVenue": { "id": "1e33b3be-b2ab-46e9-96e8-d4eb4bad6e44", "name": "Annual Meeting of the Association for Computational Linguistics", "type": "conference", "alternate_names": [ "Annu Meet Assoc Comput Linguistics", "Meeting of the Association for Computational Linguistics", "ACL", "Meet Assoc Comput Linguistics" ], "url": "https://www.aclweb.org/anthology/venues/acl/" }, "year": 1997, "referenceCount": 59, "citationCount": 453, "influentialCitationCount": 90, "isOpenAccess": true, "openAccessPdf": { "url": "https://www.aclweb.org/anthology/2020.acl-main.447.pdf", "status": "HYBRID", "license": "CCBY", "disclaimer": "Notice: This snippet is extracted from the open access paper or abstract available at https://aclanthology.org/2020.acl-main.447, which is subject to the license 
by the author or copyright owner provided with this content. Please go to the source to verify the license and copyright information for your use." }, "fieldsOfStudy": [ "Computer Science" ], "s2FieldsOfStudy": [ { "category": "Computer Science", "source": "external" }, { "category": "Computer Science", "source": "s2-fos-model" }, { "category": "Mathematics", "source": "s2-fos-model" } ], "publicationTypes": [ "Journal Article", "Review" ], "publicationDate": "2024-04-29", "journal": { "volume": "40", "pages": "116 - 135", "name": "IETE Technical Review" }, "citationStyles": { "bibtex": "@['JournalArticle', 'Conference']{Ammar2018ConstructionOT,\n author = {Waleed Ammar and Dirk Groeneveld and Chandra Bhagavatula and Iz Beltagy and Miles Crawford and Doug Downey and Jason Dunkelberger and Ahmed Elgohary and Sergey Feldman and Vu A. Ha and Rodney Michael Kinney and Sebastian Kohlmeier and Kyle Lo and Tyler C. Murray and Hsu-Han Ooi and Matthew E. Peters and Joanna L. Power and Sam Skjonsberg and Lucy Lu Wang and Christopher Wilhelm and Zheng Yuan and Madeleine van Zuylen and Oren Etzioni},\n booktitle = {NAACL},\n pages = {84-91},\n title = {Construction of the Literature Graph in Semantic Scholar},\n year = {2018}\n}" }, "authors": [ { "authorId": "1741101", "name": "Oren Etzioni" } ] } } ] } ``` ### Error Responses - **400 Bad Request**: Indicates an issue with the request parameters. - **404 Not Found**: Indicates that the requested resource could not be found. ``` -------------------------------- ### Paper Data Structure Example Source: https://api.semanticscholar.org/api-docs/graph Example JSON structure representing paper metadata including publication details, citation styles, and embeddings. 
```json
{ "publicationTypes": [ "Journal Article", "Review" ], "publicationDate": "2024-04-29", "journal": { "volume": "40", "pages": "116 - 135", "name": "IETE Technical Review" }, "citationStyles": { "bibtex": "@['JournalArticle', 'Conference']{Ammar2018ConstructionOT,\n author = {Waleed Ammar and Dirk Groeneveld and Chandra Bhagavatula and Iz Beltagy and Miles Crawford and Doug Downey and Jason Dunkelberger and Ahmed Elgohary and Sergey Feldman and Vu A. Ha and Rodney Michael Kinney and Sebastian Kohlmeier and Kyle Lo and Tyler C. Murray and Hsu-Han Ooi and Matthew E. Peters and Joanna L. Power and Sam Skjonsberg and Lucy Lu Wang and Christopher Wilhelm and Zheng Yuan and Madeleine van Zuylen and Oren Etzioni},\n booktitle = {NAACL},\n pages = {84-91},\n title = {Construction of the Literature Graph in Semantic Scholar},\n year = {2018}\n}\n" }, "authors": [ { "authorId": "1741101", "name": "Oren Etzioni" } ] } ], "embedding": { "model": "specter@v0.1.1", "vector": [ -8.82082748413086, -2.6610865592956543 ] }, "tldr": { "model": "tldr@v2.0.0", "text": "This paper reduces literature graph construction into familiar NLP tasks, point out research challenges due to differences from standard formulations of these tasks, and report empirical results for each task." }, "textAvailability": "string" }
```

--------------------------------

### Retrieve Dataset Download Links

Source: https://api.semanticscholar.org/api-docs/datasets

Example JSON response for dataset download links within a specific release.

```json
{
  "name": "papers",
  "description": "Core paper metadata",
  "README": "Subject to terms of use as follows ...",
  "files": [
    "https://..."
  ]
}
```

--------------------------------

### Paginate Search Results

Source: https://api.semanticscholar.org/api-docs/index

This example demonstrates how to use the 'offset' and 'limit' parameters for paginating through search results.
'offset' specifies the starting position, and 'limit' sets the maximum number of results.

```HTTP
GET https://api.semanticscholar.org/graph/v1/paper/search?query=ai&offset=100&limit=50
```

--------------------------------

### GET /release/

Source: https://api.semanticscholar.org/api-docs/datasets

Retrieves a list of all available data releases identified by date stamps.

```APIDOC
## GET /release/

### Description
Returns a list of available data releases, where each release is identified by a date stamp (e.g., "2023-08-01").

### Method
GET

### Endpoint
https://api.semanticscholar.org/datasets/v1/release/

### Response
#### Success Response (200)
- **releases** (array of strings) - List of release date stamps, returned as a top-level JSON array.

#### Response Example
[
  "2022-01-17"
]
```

--------------------------------

### GET /release/{release_id}/dataset/{dataset_name}

Source: https://api.semanticscholar.org/api-docs/datasets

Retrieves description and pre-signed download links for a specific dataset within a release.

```APIDOC
## GET /release/{release_id}/dataset/{dataset_name}

### Description
Returns the description and a list of pre-signed S3 download URLs for the partitions of a specific dataset.

### Method
GET

### Endpoint
https://api.semanticscholar.org/datasets/v1/release/{release_id}/dataset/{dataset_name}

### Parameters
#### Path Parameters
- **release_id** (string) - Required - ID of the release.
- **dataset_name** (string) - Required - Name of the dataset.

### Response
#### Success Response (200)
- **name** (string) - Name of the dataset.
- **description** (string) - Description of the dataset.
- **README** (string) - Terms of use.
- **files** (array of strings) - List of download URLs.

#### Response Example
{
  "name": "papers",
  "description": "Core paper metadata",
  "README": "Subject to terms of use as follows ...",
  "files": [
    "https://..."
] } ``` -------------------------------- ### Response Samples Source: https://api.semanticscholar.org/api-docs Examples of successful and error responses from the Semantic Scholar API. ```APIDOC ## Response Samples ### Success Response (200) This is an example of a successful response, typically containing a list of data entries. #### Response Example ```json { "offset": 0, "next": 0, "data": [ { "contexts": [ "SciBERT (Beltagy et al., 2019) follows the BERT’s masking strategy to pre-train the model from scratch using a scientific corpus composed of papers from Semantic Scholar (Ammar et al., 2018).", "27M articles from the Semantic Scholar dataset (Ammar et al., 2018)." ], "intents": [ "methodology" ], "contextsWithIntent": [ { "context": "SciBERT (Beltagy et al., 2019) follows the BERT’s ...", "intents": [ "methodology" ] } ], "isInfluential": false, "citedPaper": { "paperId": "5c5751d45e298cea054f32b392c12c61027d2fe7", "corpusId": 215416146, "externalIds": { "MAG": "3015453090", "DBLP": "conf/acl/LoWNKW20", "ACL": "2020.acl-main.447", "DOI": "10.18653/V1/2020.ACL-MAIN.447", "CorpusId": 215416146 }, "url": "https://www.semanticscholar.org/paper/5c5751d45e298cea054f32b392c12c61027d2fe7", "title": "Construction of the Literature Graph in Semantic Scholar", "abstract": "We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery.", "venue": "Annual Meeting of the Association for Computational Linguistics", "publicationVenue": { "id": "1e33b3be-b2ab-46e9-96e8-d4eb4bad6e44", "name": "Annual Meeting of the Association for Computational Linguistics", "type": "conference", "alternate_names": [ "Annu Meet Assoc Comput Linguistics", "Meeting of the Association for Computational Linguistics", "ACL", "Meet Assoc Comput Linguistics" ], "url": "https://www.aclweb.org/anthology/venues/acl/" }, "year": 1997, "referenceCount": 59, "citationCount": 453, 
"influentialCitationCount": 90, "isOpenAccess": true, "openAccessPdf": { "url": "https://www.aclweb.org/anthology/2020.acl-main.447.pdf", "status": "HYBRID", "license": "CCBY", "disclaimer": "Notice: This snippet is extracted from the open access paper or abstract available at https://aclanthology.org/2020.acl-main.447, which is subject to the license by the author or copyright owner provided with this content. Please go to the source to verify the license and copyright information for your use." }, "fieldsOfStudy": [ "Computer Science" ], "s2FieldsOfStudy": [ { "category": "Computer Science", "source": "external" }, { "category": "Computer Science", "source": "s2-fos-model" }, { "category": "Mathematics", "source": "s2-fos-model" } ], "publicationTypes": [ "Journal Article", "Review" ], "publicationDate": "2024-04-29", "journal": { "volume": "40", "pages": "116 - 135", "name": "IETE Technical Review" }, "citationStyles": { "bibtex": "@['JournalArticle', 'Conference']{Ammar2018ConstructionOT,\n author = {Waleed Ammar and Dirk Groeneveld and Chandra Bhagavatula and Iz Beltagy and Miles Crawford and Doug Downey and Jason Dunkelberger and Ahmed Elgohary and Sergey Feldman and Vu A. Ha and Rodney Michael Kinney and Sebastian Kohlmeier and Kyle Lo and Tyler C. Murray and Hsu-Han Ooi and Matthew E. Peters and Joanna L. Power and Sam Skjonsberg and Lucy Lu Wang and Christopher Wilhelm and Zheng Yuan and Madeleine van Zuylen and Oren Etzioni},\n booktitle = {NAACL},\n pages = {84-91},\n title = {Construction of the Literature Graph in Semantic Scholar},\n year = {2018}\n}" }, "authors": [ { "authorId": "1741101", "name": "Oren Etzioni" } ] } } ] } ``` ### Error Responses The API may return error codes such as 400 (Bad Request) or 404 (Not Found) depending on the request. - **400**: Indicates a problem with the request parameters or format. - **404**: Indicates that the requested resource could not be found. 
```

--------------------------------

### GET /release/{release_id}

Source: https://api.semanticscholar.org/api-docs/datasets

Retrieves metadata for a specific release, including a list of available datasets.

```APIDOC
## GET /release/{release_id}

### Description
Returns metadata describing a particular release, including a list of datasets available within that release.

### Method
GET

### Endpoint
https://api.semanticscholar.org/datasets/v1/release/{release_id}

### Parameters
#### Path Parameters
- **release_id** (string) - Required - ID of the release (date stamp).

### Response
#### Success Response (200)
- **release_id** (string) - The ID of the release.
- **README** (string) - Terms and information about the release.
- **datasets** (array) - List of available datasets.

#### Response Example
{
  "release_id": "2022-01-17",
  "README": "Subject to the following terms ...",
  "datasets": [
    {
      "name": "papers",
      "description": "Core paper metadata",
      "README": "This dataset contains ..."
    }
  ]
}
```

--------------------------------

### Filter by Publication Date Range

Source: https://api.semanticscholar.org/api-docs

Example of filtering papers within a specific date range.

```http
https://api.semanticscholar.org/graph/v1/author/1741101/papers?publicationDateOrYear=2016-03-05:2020-06-06
```

--------------------------------

### GET /paper/search/match

Source: https://api.semanticscholar.org/api-docs/graph

Retrieves a single paper based on the closest title match to the provided query.

```APIDOC
## GET /paper/search/match

### Description
Retrieves a single paper that is the closest title match to the given query string. Returns a 404 error if no match is found.

### Method
GET

### Endpoint
https://api.semanticscholar.org/graph/v1/paper/search/match

### Parameters
#### Query Parameters
- **query** (string) - Required - The title string to search for.

### Response
#### Success Response (200)
- **paperId** (string) - Unique identifier for the paper.
- **title** (string) - Title of the paper.
- **matchScore** (float) - The confidence score of the match.
```

--------------------------------

### Batch Author Request Payload Example

Source: https://api.semanticscholar.org/api-docs

The JSON structure required for the request body when querying multiple author IDs.

```json
{
  "ids": [
    "1741101"
  ]
}
```

--------------------------------

### GET /graph/v1/paper/autocomplete

Source: https://api.semanticscholar.org/api-docs/graph

Returns minimal information about papers matching a partial query string to support interactive query-completion.

```APIDOC
## GET /graph/v1/paper/autocomplete

### Description
Returns minimal information about papers matching a partial query string to support interactive query-completion.

### Method
GET

### Endpoint
https://api.semanticscholar.org/graph/v1/paper/autocomplete

### Parameters
#### Query Parameters
- **query** (string) - Required - Plain-text partial query string. Will be truncated to first 100 characters.

### Response
#### Success Response (200)
- **matches** (array) - Batch of papers with default or requested fields

#### Response Example
{
  "matches": [
    {
      "id": "649def34f8be52c8b66281af98ae884c09aef38b",
      "title": "SciBERT: A Pretrained Language Model for Scientific Text",
      "authorsYear": "Beltagy et al., 2019"
    }
  ]
}
```

--------------------------------

### Paper Recommendations Response Sample

Source: https://api.semanticscholar.org/api-docs/recommendations

This is a sample successful response (200 OK) for the paper recommendations endpoint. It includes a list of recommended papers with various fields.
```json { "recommendedPapers": [ { "paperId": "5c5751d45e298cea054f32b392c12c61027d2fe7", "corpusId": 215416146, "externalIds": { "MAG": "3015453090", "DBLP": "conf/acl/LoWNKW20", "ACL": "2020.acl-main.447", "DOI": "10.18653/V1/2020.ACL-MAIN.447", "CorpusId": 215416146 }, "url": "https://www.semanticscholar.org/paper/5c5751d45e298cea054f32b392c12c61027d2fe7", "title": "Construction of the Literature Graph in Semantic Scholar", "abstract": "We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery.", "venue": "Annual Meeting of the Association for Computational Linguistics", "publicationVenue": { "id": "1e33b3be-b2ab-46e9-96e8-d4eb4bad6e44", "name": "Annual Meeting of the Association for Computational Linguistics", "type": "conference", "alternate_names": [ "Annu Meet Assoc Comput Linguistics", "Meeting of the Association for Computational Linguistics", "ACL", "Meet Assoc Comput Linguistics" ], "url": "https://www.aclweb.org/anthology/venues/acl/" }, "year": 1997, "referenceCount": 59, "citationCount": 453, "influentialCitationCount": 90, "isOpenAccess": true, "openAccessPdf": { "url": "https://www.aclweb.org/anthology/2020.acl-main.447.pdf", "status": "HYBRID", "license": "CCBY", "disclaimer": "Notice: This snippet is extracted from the open access paper or abstract available at https://aclanthology.org/2020.acl-main.447, which is subject to the license by the author or copyright owner provided with this content. Please go to the source to verify the license and copyright information for your use." 
}, "fieldsOfStudy": [ "Computer Science" ], "s2FieldsOfStudy": [ { "category": "Computer Science", "source": "external" }, { "category": "Computer Science", "source": "s2-fos-model" }, { "category": "Mathematics", "source": "s2-fos-model" } ], "publicationTypes": [ "Journal Article", "Review" ], "publicationDate": "2024-04-29", "journal": { "volume": "40", "pages": "116 - 135", "name": "IETE Technical Review" }, "citationStyles": { "bibtex": "@['JournalArticle', 'Conference']{Ammar2018ConstructionOT,\n author = {Waleed Ammar and Dirk Groeneveld and Chandra Bhagavatula and Iz Beltagy and Miles Crawford and Doug Downey and Jason Dunkelberger and Ahmed Elgohary and Sergey Feldman and Vu A. Ha and Rodney Michael Kinney and Sebastian Kohlmeier and Kyle Lo and Tyler C. Murray and Hsu-Han Ooi and Matthew E. Peters and Joanna L. Power and Sam Skjonsberg and Lucy Lu Wang and Christopher Wilhelm and Zheng Yuan and Madeleine van Zuylen and Oren Etzioni},\n booktitle = {NAACL},\n pages = {84-91},\n title = {Construction of the Literature Graph in Semantic Scholar},\n year = {2018}\n} " }, "authors": [ { "authorId": "1741101", "name": "Oren Etzioni" } ] } ] } ``` -------------------------------- ### Get Recommended Papers Request Body Source: https://api.semanticscholar.org/api-docs/recommendations This is the schema for the request body used to get paper recommendations. It requires lists of positive and negative paper IDs. ```json { "positivePaperIds": [ "649def34f8be52c8b66281af98ae884c09aef38b" ], "negativePaperIds": [ "ArXiv:1805.02262" ] } ``` -------------------------------- ### Get Author Details (Default Fields) Source: https://api.semanticscholar.org/api-docs/index Retrieves basic author information, including authorId and name, by making a GET request to the author endpoint. No specific fields are requested, so default fields are returned. 
```HTTP
GET https://api.semanticscholar.org/graph/v1/author/1741101
```

--------------------------------

### Batch Paper Retrieval

Source: https://api.semanticscholar.org/api-docs

Python example using the requests library to fetch details for multiple papers via a POST request. The sample output is shown as comments after the call.

```python
import json

import requests

# Fetch selected fields for several papers in a single POST request.
r = requests.post(
    'https://api.semanticscholar.org/graph/v1/paper/batch',
    params={'fields': 'referenceCount,citationCount,title'},
    json={"ids": ["649def34f8be52c8b66281af98ae884c09aef38b", "ARXIV:2106.15928"]}
)
print(json.dumps(r.json(), indent=2))

# Sample output:
# [
#   {
#     "paperId": "649def34f8be52c8b66281af98ae884c09aef38b",
#     "title": "Construction of the Literature Graph in Semantic Scholar",
#     "referenceCount": 27,
#     "citationCount": 299
#   },
#   {
#     "paperId": "f712fab0d58ae6492e3cdfc1933dae103ec12d5d",
#     "title": "Reinfection and low cross-immunity as drivers of epidemic resurgence under high seroprevalence: a model-based approach with application to Amazonas, Brazil",
#     "referenceCount": 13,
#     "citationCount": 0
#   }
# ]
```

--------------------------------

### Get Author Details (Default Fields)

Source: https://api.semanticscholar.org/api-docs/graph

Retrieves basic author information, including authorId and name, by making a GET request to the author endpoint. No additional query parameters are needed for default fields.

```HTTP
GET https://api.semanticscholar.org/graph/v1/author/1741101
```

--------------------------------

### Access S2AG Datasets via Python

Source: https://api.semanticscholar.org/api-docs/datasets

Demonstrates retrieving release lists, latest release metadata, and specific dataset download links using the requests library.
```python
import json

import requests

# List all releases and show the three most recent.
r1 = requests.get('https://api.semanticscholar.org/datasets/v1/release').json()
print(r1[-3:])
# ['2023-03-14', '2023-03-21', '2023-03-28']

# Fetch metadata for the latest release.
r2 = requests.get('https://api.semanticscholar.org/datasets/v1/release/latest').json()
print(r2['release_id'])
# 2023-03-28

print(json.dumps(r2['datasets'][0], indent=2))
# {
#   "name": "abstracts",
#   "description": "Paper abstract text, where available. 100M records in 30 1.8GB files.",
#   "README": "Semantic Scholar Academic Graph Datasets The "abstracts" dataset provides..."
# }

# Fetch the download links for one dataset in the latest release.
r3 = requests.get('https://api.semanticscholar.org/datasets/v1/release/latest/dataset/abstracts').json()
print(json.dumps(r3, indent=2))
# {
#   "name": "abstracts",
#   "description": "Paper abstract text, where available. 100M records in 30 1.8GB files.",
#   "README": "Semantic Scholar Academic Graph Datasets The "abstracts" dataset provides...",
#   "files": [
#     "https://ai2-s2ag.s3.amazonaws.com/dev/staging/2023-03-28/abstracts/20230331_0..."
#   ]
# }
```

--------------------------------

### Filter by End Date

Source: https://api.semanticscholar.org/api-docs

Example of filtering papers published on or before a specific date.

```http
https://api.semanticscholar.org/graph/v1/author/1741101/papers?publicationDateOrYear=:2015-01
```

--------------------------------

### Filter by Publication Year

Source: https://api.semanticscholar.org/api-docs

Example of filtering papers published in a specific year.

```http
https://api.semanticscholar.org/graph/v1/author/1741101/papers?publicationDateOrYear=2019
```

--------------------------------

### Process Incremental Dataset Updates

Source: https://api.semanticscholar.org/api-docs/datasets

Demonstrates applying incremental diffs to a local datastore or using Spark for large-scale dataset updates.
```python
import json

import requests

# `datastore` is a placeholder for your own store, keyed by corpus ID.
difflist = requests.get('https://api.semanticscholar.org/datasets/v1/diffs/2023-08-01/to/latest/papers').json()
for diff in difflist['diffs']:
    # Insert or overwrite every record listed in the update files.
    for url in diff['update_files']:
        for json_line in requests.get(url).iter_lines():
            record = json.loads(json_line)
            datastore.upsert(record['corpusid'], record)
    # Remove every record listed in the delete files.
    for url in diff['delete_files']:
        for json_line in requests.get(url).iter_lines():
            record = json.loads(json_line)
            datastore.delete(record['corpusid'])
```

```python
import json

# `sc` is an existing SparkContext, for example the one provided by the pyspark shell.
current = sc.textFile('s3://curr-dataset-location').map(json.loads).keyBy(lambda x: x['corpusid'])
updates = sc.textFile('s3://diff-updates-location').map(json.loads).keyBy(lambda x: x['corpusid'])
deletes = sc.textFile('s3://diff-deletes-location').map(json.loads).keyBy(lambda x: x['corpusid'])

# Prefer the updated record when one exists; otherwise keep the current record.
updated = current.fullOuterJoin(updates).mapValues(lambda x: x[1] if x[1] is not None else x[0])
# Drop every record whose key appears in the delete set.
updated = updated.fullOuterJoin(deletes).mapValues(lambda x: None if x[1] is not None else x[0]).filter(lambda x: x[1] is not None)
updated.values().map(json.dumps).saveAsTextFile('s3://updated-dataset-location')
```

--------------------------------

### Request Author Details with Specific Fields

Source: https://api.semanticscholar.org/api-docs/graph

This example demonstrates fetching author details including their URL, name, paper count, and papers with titles and open access PDF links. The 'fields' parameter is used to specify these details.

```http
https://api.semanticscholar.org/graph/v1/author/batch?fields=url,name,paperCount,papers,papers.title,papers.openAccessPdf
```

```json
{"ids":["1741101", "1780531", "48323507"]}
```

--------------------------------

### GET /paper/search/bulk

Source: https://api.semanticscholar.org/api-docs/graph

Retrieves a batch of papers, optionally restricted by fields of study.

```APIDOC
## GET /paper/search/bulk

### Description
Retrieves a batch of papers. Results can be restricted to specific fields of study using a comma-separated list.
### Method
GET

### Endpoint
https://api.semanticscholar.org/graph/v1/paper/search/bulk

### Parameters
#### Query Parameters
- **fieldsOfStudy** (string) - Optional - Restricts results to papers in the given fields of study (e.g., "Physics,Mathematics").

### Response
#### Success Response (200)
- **total** (integer) - Total number of papers found.
- **token** (string) - Pagination or session token.
- **data** (array) - List of paper objects.

#### Response Example
{
  "total": 15117,
  "token": "SDKJFHSDKFHWIEFSFSGHEIURYC",
  "data": [
    {
      "paperId": "5c5751d45e298cea054f32b392c12c61027d2fe7",
      "title": "Construction of the Literature Graph in Semantic Scholar"
    }
  ]
}
```

--------------------------------

### Fetch Author Papers with Fields Parameter Example

Source: https://api.semanticscholar.org/api-docs/index

Provides an example of using the `fields` parameter to request specific paper attributes like 'title', 'fieldsOfStudy', and 'references'. It also shows how to access subfields like 'citations.url' and 'citations.venue'.

```HTTP
GET https://api.semanticscholar.org/graph/v1/author/{author_id}/papers?fields=title,fieldsOfStudy,references
```

```HTTP
GET https://api.semanticscholar.org/graph/v1/author/{author_id}/papers?fields=abstract,citations.url,citations.venue
```
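
--------------------------------

The `fields` and `publicationDateOrYear` query strings shown in the examples above can also be assembled in code. What follows is a minimal sketch and not part of the official documentation: `author_papers_url` and `BASE_URL` are hypothetical names introduced here for illustration, and sending both parameters in one request is an assumption that the examples above do not confirm. The helper only builds a URL and makes no network call.

```python
from urllib.parse import urlencode

# Base path taken from the example URLs above.
BASE_URL = "https://api.semanticscholar.org/graph/v1"


def author_papers_url(author_id, fields=None, date_range=None):
    """Build a URL for /author/{author_id}/papers (hypothetical helper).

    fields: list of field names, e.g. ["title", "citations.url"].
    date_range: value for publicationDateOrYear, e.g. "2019" or ":2015-01".
    """
    params = {}
    if fields:
        # The API expects a single comma-separated string.
        params["fields"] = ",".join(fields)
    if date_range:
        params["publicationDateOrYear"] = date_range
    # Keep "," and ":" unescaped so the result matches the documented URLs.
    query = urlencode(params, safe=",:")
    url = f"{BASE_URL}/author/{author_id}/papers"
    return f"{url}?{query}" if query else url


print(author_papers_url("1741101", fields=["title", "fieldsOfStudy", "references"]))
# https://api.semanticscholar.org/graph/v1/author/1741101/papers?fields=title,fieldsOfStudy,references
```

Passing the resulting string to an HTTP client of your choice should reproduce the requests shown in the examples; the `safe=",:"` argument is there only to keep the URL readable, since the percent-encoded form is equivalent.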