### Install refextract

Source: http://pythonhosted.org/refextract

Install the refextract library with pip before using any of its functions.

```bash
pip install refextract

```

--------------------------------

### Extract Journal Reference

Source: http://pythonhosted.org/refextract

Use `extract_journal_reference` to parse a journal reference string into structured fields such as title, volume, and page. The input should be a single reference in a recognizable journal format.

```python
from refextract import extract_journal_reference
reference = extract_journal_reference("J.Phys.,A39,13445")
print(reference)

```

```python
{
    'extra_ibids': [],
    'is_ibid': False,
    'misc_txt': u'',
    'page': u'13445',
    'title': u'J. Phys.',
    'type': 'JOURNAL',
    'volume': u'A39',
    'year': ''
}
```
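The returned dictionary can be post-processed like any other Python dict. The following is a minimal sketch, assuming the output shape shown above; `format_journal_reference` is a hypothetical helper, not part of refextract, and the sample dict is copied from the example output rather than produced by calling the library.

```python
def format_journal_reference(ref):
    """Join the non-empty journal fields of a parsed reference into one string.

    Hypothetical helper; `ref` is assumed to look like the dict shown above.
    """
    parts = [ref.get("title", ""), ref.get("volume", "")]
    if ref.get("year"):
        parts.append("({})".format(ref["year"]))
    parts.append(ref.get("page", ""))
    # Skip fields that are missing or empty, such as 'year' in the sample.
    return " ".join(p for p in parts if p)


# Sample copied from the example output above (not a live call to refextract).
sample = {
    "extra_ibids": [],
    "is_ibid": False,
    "misc_txt": "",
    "page": "13445",
    "title": "J. Phys.",
    "type": "JOURNAL",
    "volume": "A39",
    "year": "",
}

print(format_journal_reference(sample))  # prints: J. Phys. A39 13445
```

Dropping empty fields keeps the string clean when the parser could not find a year, as in this sample.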

--------------------------------

### Extract references from a file with custom format

Source: http://pythonhosted.org/refextract

Pass `reference_format` to change how each extracted journal reference is rendered. The default format is `{title} {volume} ({year}) {page}`.

```python
>>> from refextract import extract_references_from_file
>>> extract_references_from_file(path, reference_format="{title},{volume},{page}")
```
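To see what a format template does to the fields of one reference, the sketch below fills the default template (the one documented for `extract_references_from_file` on this page) and the custom one with plain `str.format`. This is only an illustration: it assumes the placeholders behave like `str.format` fields, `render` is a hypothetical helper, and the field values are taken from the example output on this page rather than from a live call.

```python
DEFAULT_FORMAT = "{title} {volume} ({year}) {page}"
CUSTOM_FORMAT = "{title},{volume},{page}"

# Field values taken from the first reference in the example output on this page.
fields = {"title": "Phys.Rev.Lett.", "volume": "13", "year": "1964", "page": "321"}


def render(template, fields):
    # Assumption: placeholders are filled the way str.format fills them.
    # Fields the template does not mention (here 'year') are simply ignored.
    return template.format(**fields)


print(render(DEFAULT_FORMAT, fields))  # prints: Phys.Rev.Lett. 13 (1964) 321
print(render(CUSTOM_FORMAT, fields))   # prints: Phys.Rev.Lett.,13,321
```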

--------------------------------

### Override knowledge bases for file extraction

Source: http://pythonhosted.org/refextract

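Pass `override_kbs_files` to replace built-in knowledge bases, such as the journal-name list, with your own files. A knowledge base tuned to your corpus can improve extraction accuracy.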

```python
>>> from refextract import extract_references_from_file
>>> extract_references_from_file(path, override_kbs_files={'journals': 'my/path/to.kb'})
```

--------------------------------

### Override knowledge bases for string extraction

Source: http://pythonhosted.org/refextract

Provides custom paths for knowledge bases when extracting from a string.

```python
>>> from refextract import extract_references_from_string
>>> extract_references_from_string(source, override_kbs_files={'journals': 'my/path/to.kb'})
```

--------------------------------

### Extract references from a string with custom format

Source: http://pythonhosted.org/refextract

Customizes the output format of references extracted from a raw string.

```python
>>> from refextract import extract_references_from_string
>>> extract_references_from_string(source, reference_format="{title},{volume},{page}")
```

--------------------------------

### Extract References from URL

Source: http://pythonhosted.org/refextract

Use `extract_references_from_url` to extract references from a PDF at a given URL. The URL must be reachable; a 404 response raises `FullTextNotAvailable`.

```python
from refextract import extract_references_from_url
reference = extract_references_from_url("http://arxiv.org/pdf/1503.07589v1.pdf")
print(reference)

```

```python
{
    'references': [
            {'author': [u'F. Englert and R. Brout'],
             'doi': [u'10.1103/PhysRevLett.13.321'],
             'journal_page': [u'321'],
             'journal_reference': ['Phys.Rev.Lett.,13,1964'],
             'journal_title': [u'Phys.Rev.Lett.'],
             'journal_volume': [u'13'],
             'journal_year': [u'1964'],
             'linemarker': [u'1'],
             'title': [u'Broken symmetry and the mass of gauge vector mesons'],
             'year': [u'1964']}, ...
       ],
    'stats': {
          'author': 15,
          'date': '2016-01-12 10:52:58',
          'doi': 1,
          'misc': 0,
          'old_stats_str': '0-1-1-15-0-1-0',
          'reportnum': 1,
          'status': 0,
          'title': 1,
          'url': 0,
          'version': u'0.1.0.dev20150722'
    }
}

```

--------------------------------

### Extract References from File

Source: http://pythonhosted.org/refextract

Use `extract_references_from_file` to extract references from a local full-text PDF. Pass the path to the file; a path that does not exist raises `FullTextNotAvailable`.

```python
from refextract import extract_references_from_file
reference = extract_references_from_file("some/fulltext/1503.07589v1.pdf")
print(reference)

```

```python
{
    'references': [
            {'author': [u'F. Englert and R. Brout'],
             'doi': [u'10.1103/PhysRevLett.13.321'],
             'journal_page': [u'321'],
             'journal_reference': ['Phys.Rev.Lett.,13,1964'],
             'journal_title': [u'Phys.Rev.Lett.'],
             'journal_volume': [u'13'],
             'journal_year': [u'1964'],
             'linemarker': [u'1'],
             'title': [u'Broken symmetry and the mass of gauge vector mesons'],
             'year': [u'1964']}, ...
       ],
    'stats': {
          'author': 15,
          'date': '2016-01-12 10:52:58',
          'doi': 1,
          'misc': 0,
          'old_stats_str': '0-1-1-15-0-1-0',
          'reportnum': 1,
          'status': 0,
          'title': 1,
          'url': 0,
          'version': u'0.1.0.dev20150722'
    }
}

```
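The result is a plain dictionary with a `references` list and a `stats` summary, so it can be walked with ordinary Python. The sketch below assumes the shape shown in the example output, where every field of a reference is a list that may be missing. `summarize_references` is a hypothetical helper, and the sample is a trimmed copy of the output above, not a live call to refextract.

```python
def summarize_references(result):
    """Return one (linemarker, journal_reference, doi) tuple per reference.

    Hypothetical helper. Each field of a reference is assumed to be a list,
    as in the example output, and may be absent.
    """
    def first(ref, key):
        # Take the first value of a list field, or None when it is missing or empty.
        values = ref.get(key) or [None]
        return values[0]

    return [
        (first(ref, "linemarker"), first(ref, "journal_reference"), first(ref, "doi"))
        for ref in result.get("references", [])
    ]


# Trimmed copy of the example output above (not a live call to refextract).
sample_result = {
    "references": [
        {
            "author": ["F. Englert and R. Brout"],
            "doi": ["10.1103/PhysRevLett.13.321"],
            "journal_reference": ["Phys.Rev.Lett.,13,1964"],
            "linemarker": ["1"],
            "year": ["1964"],
        }
    ],
    "stats": {"status": 0, "doi": 1},
}

for marker, journal, doi in summarize_references(sample_result):
    print(marker, journal, doi)
```

Returning `None` for absent fields avoids a `KeyError` on references where the extractor found no DOI or journal information.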

--------------------------------

### extract_references_from_file

Source: http://pythonhosted.org/refextract

Extracts references from a local PDF file.

```APIDOC
## extract_references_from_file

### Description
Extracts references from a local PDF file. Raises FullTextNotAvailable if the file does not exist.

### Parameters
#### Request Body
- **path** (string) - Required - Path to the local PDF file.
- **recid** (any) - Optional - Record ID.
- **reference_format** (string) - Optional - Format string for references (default: '{title} {volume} ({year}) {page}').
- **linker_callback** (function) - Optional - Callback function executed for every reference element found.
- **override_kbs_files** (dict) - Optional - Dictionary to override knowledge base files.

### Response
- **result** (dict) - Returns a dictionary with extracted references and stats.
```

--------------------------------

### extract_references_from_string

Source: http://pythonhosted.org/refextract

Extracts references from a raw string.

```APIDOC
## extract_references_from_string

### Description
Extracts references from a raw string. Raises FullTextNotAvailable if the source is invalid.

### Parameters
#### Request Body
- **source** (string) - Required - The raw string to extract references from.
- **is_only_references** (boolean) - Optional - Set to False when the string contains more than just references; this improves accuracy.
- **recid** (any) - Optional - Record ID.
- **reference_format** (string) - Optional - Format string for references.
- **linker_callback** (function) - Optional - Callback function for reference elements.
- **override_kbs_files** (dict) - Optional - Dictionary to override knowledge base files.
```

--------------------------------

### extract_references_from_url

Source: http://pythonhosted.org/refextract

Extracts references from a PDF located at a URL.

```APIDOC
## extract_references_from_url

### Description
Extracts references from the PDF specified in the URL. Raises FullTextNotAvailable if the URL returns a 404.

### Parameters
#### Request Body
- **url** (string) - Required - The URL of the PDF file.
- **headers** (dict) - Optional - HTTP headers for the request.
- **chunk_size** (int) - Optional - Chunk size for downloading (default: 1024).
- **kwargs** (dict) - Optional - Additional keyword arguments.
```
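The `chunk_size` parameter suggests that the PDF is downloaded in pieces rather than in one read. The sketch below shows the general idea of chunked copying, with an in-memory stream standing in for the HTTP response. It is not refextract's implementation; `copy_in_chunks` and the stand-in payload are hypothetical.

```python
import io


def copy_in_chunks(source, destination, chunk_size=1024):
    """Copy a binary stream piece by piece and return the number of chunks read."""
    chunks = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            # An empty read means the stream is exhausted.
            break
        destination.write(chunk)
        chunks += 1
    return chunks


payload = b"%PDF-1.4 " + b"x" * 3000  # 3009 stand-in bytes, not a real PDF
target = io.BytesIO()
count = copy_in_chunks(io.BytesIO(payload), target, chunk_size=1024)

print(count)                         # prints: 3
print(target.getvalue() == payload)  # prints: True
```

A larger chunk size means fewer reads per file at the cost of more memory held at once.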

--------------------------------

### extract_journal_reference

Source: http://pythonhosted.org/refextract

Extracts journal reference information from a given string.

```APIDOC
## extract_journal_reference

### Description
Extracts the journal reference from a string and parses for specific journal information.

### Parameters
#### Request Body
- **line** (string) - Required - The input string containing the journal reference.
- **override_kbs_files** (dict) - Optional - Dictionary to override knowledge base files for journal names.
```
