### Install QuartzBio from Git

Source: https://quartzbio.github.io/quartzbio-python/index.html

Install the quartzbio package directly from its GitHub repository using pip. This is useful for development or for running the latest unreleased version.

```bash
pip install -e git+https://github.com/quartzbio/quartzbio-python.git#egg=quartzbio
```

--------------------------------

### Install QuartzBio Python Library

Source: https://quartzbio.github.io/quartzbio-python/_sources/authentication.md.txt

Install or upgrade the QuartzBio Python client using pip. Ensure you are using a recent version of Python.

```bash
pip install --upgrade quartzbio
```

--------------------------------

### Install QuartzBio Python Package

Source: https://quartzbio.github.io/quartzbio-python/index.html

Install the quartzbio package using pip. For interactive use, also install IPython and gnureadline.

```bash
pip install quartzbio
```

```bash
pip install ipython
pip install gnureadline
```

--------------------------------

### Develop QuartzBio Python Package

Source: https://quartzbio.github.io/quartzbio-python/index.html

Set up the QuartzBio Python package for development. Clone the repository and install it in development mode using setup.py or pip.

```bash
git clone https://github.com/quartzbio/quartzbio-python.git
cd quartzbio-python/
python setup.py develop
```

```bash
pip install -e .
```

--------------------------------

### Log in to QuartzBio CLI

Source: https://quartzbio.github.io/quartzbio-python/index.html

Log in to your QuartzBio account using the command-line interface. Ensure you have installed the package first.

```bash
quartzbio login
```

--------------------------------

### Run Tests with Tox

Source: https://quartzbio.github.io/quartzbio-python/index.html

Install tox and run it to execute tests for the QuartzBio Python package. Tox automates testing in different environments.

```bash
pip install tox
tox
```

--------------------------------

### Flattening Algorithm Example

Source: https://quartzbio.github.io/quartzbio-python/_sources/exporting_data.md.txt

Illustrates how list fields are expanded into multiple columns during CSV or XLSX export.

```Python
{"a": "a", "b": ["x"]}
{"a": "a", "b": ["x", "y"]}
{"a": "a", "b": ["x", "y", "z"]}
```

--------------------------------

### Querying Files

Source: https://quartzbio.github.io/quartzbio-python/querying_datasets_and_files.html

Demonstrates how to query and filter file objects, retrieve a specified number of records, and get all fields from a file.

```APIDOC
## Querying Files

File objects can be queried and filtered on one or more fields. The query results are returned in pages. Text files such as CSV, TXT, TSV, or BED must be uploaded with headers.

### Basic Query

```python
clinvar = Object.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/ClinVar-5-2-0-20210110-Variants-GRCH37-1425664822266145048-20221110194518.json.gz')
clinvar.query()
```

### Query with Limit

Retrieve a specified number of records by setting the `limit` parameter.

```python
clinvar = Object.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/ClinVar-5-2-0-20210110-Variants-GRCH37-1425664822266145048-20221110194518.json.gz')
q = clinvar.query(limit=50)
```

### Retrieving All Fields

Get all fields from the file by calling the `fields` method.

```python
fields = Object.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/ClinVar-5-2-0-20210110-Variants-GRCH37-1425664822266145048-20221110194518.json.gz').query().fields()
```

### Loading Files into Readers (e.g., Pandas)

Use the `download_url()` method to load files into readers.

```python
from quartzbio import *
import pandas

# Get the file by ID...
f = Object.retrieve("ID")
# ...or by its full path
f = Object.get_by_full_path("vault/path/to/file.csv")

# Get file download URL and load into reader
url = f.download_url()
pandas.read_csv(url)
```

### Supported File Extensions and Compressions

File querying is supported for the following file extensions and compressions:

**File Extension** | **Compression**
---|---
txt | GZIP, BZIP2
csv | GZIP, BZIP2
tsv | GZIP, BZIP2
bed | GZIP, BZIP2
json | GZIP, BZIP2
parquet | GZIP

The only supported encoding is UTF-8.

### Output Format

The output format of the query can be specified using the `output_format` parameter.

**Output Format** | **Description**
---|---
json (default) | applicable to all file extensions
csv | applicable only to **csv**, **txt**, **tsv**, or **bed** file extensions
tsv | applicable only to **csv**, **txt**, **tsv**, or **bed** file extensions

**Example:**

```python
clinvar = Object.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/ClinVar-5-2-0-20210110-Variants-GRCH37-1425664822266145048-20221110194518.json.gz')
clinvar.query(output_format='json')
```
```

--------------------------------

### Checking Dataset Availability

Source: https://quartzbio.github.io/quartzbio-python/dataset_versioning.html

Provides examples of how to check dataset availability before querying.

```APIDOC
## Supporting Archived Datasets

### Description
Checks the availability status of datasets before attempting to query them, or catches query errors.

### Method
Accessing the `availability` attribute and using try-except blocks for `query()`.

### Endpoint
N/A (Attribute access and method calls on dataset objects)

### Request Example
```python
# Explicitly check availability
datasets = vault.datasets()
for dataset in datasets:
    if dataset.availability != 'available':
        print("Dataset {} availability is {}. Not querying.".format(dataset.id, dataset.availability))
        continue

    print(dataset.query())


# Catch errors
datasets = vault.datasets()
for dataset in datasets:
    try:
        print(dataset.query())
    except errors.SolveError as e:
        print("Dataset can not be queried: {}".format(e))
```
```

--------------------------------

### Illustrate Flattening Algorithm

Source: https://quartzbio.github.io/quartzbio-python/exporting_data.html

Example showing how nested list fields are transformed into flattened columns during CSV or XLSX export.

```text
{"a": "a", "b": ["x"]}
{"a": "a", "b": ["x", "y"]}
{"a": "a", "b": ["x", "y", "z"]}

```

```text
a,b.0,b.1,b.2
a,x,,
a,x,y,
a,x,y,z

```
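The expansion shown above can be reproduced locally with a short sketch. The helper names `flatten_records` and `to_csv` are hypothetical and not part of the quartzbio package; the sketch assumes each record is a flat dict whose values are scalars or lists, and that shorter lists are padded with empty cells.

```python
import csv
import io


def flatten_records(records):
    """Expand list fields into numbered columns (b.0, b.1, ...)."""
    # widths[key] is None for scalar fields, else the longest list length seen.
    widths = {}
    for record in records:
        for key, value in record.items():
            if isinstance(value, list):
                widths[key] = max(widths.get(key) or 0, len(value))
            else:
                widths.setdefault(key, None)

    # Build the header: one column per scalar field, one per list position.
    header = []
    for key, width in widths.items():
        if width is None:
            header.append(key)
        else:
            header.extend("{}.{}".format(key, i) for i in range(width))

    # Build the rows, padding short lists with empty strings.
    rows = []
    for record in records:
        row = []
        for key, width in widths.items():
            value = record.get(key, "")
            if width is None:
                row.append(value)
            else:
                value = list(value) if isinstance(value, list) else [value]
                row.extend(value + [""] * (width - len(value)))
        rows.append(row)
    return header, rows


def to_csv(header, rows):
    """Serialize the flattened header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


records = [
    {"a": "a", "b": ["x"]},
    {"a": "a", "b": ["x", "y"]},
    {"a": "a", "b": ["x", "y", "z"]},
]
header, rows = flatten_records(records)
print(to_csv(header, rows))
```

The column count for `b` is driven by the longest list in the input, which is why every row ends up with three `b.N` cells.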

--------------------------------

### Create Dataset from Template

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_templates.md.txt

Create a DatasetTemplate object from the constructed dictionary and then use it to create or get a dataset. The dataset will be initialized with the non-transient fields from the template.

```Python
from quartzbio import Dataset, DatasetTemplate

# `template` is the template dictionary assembled beforehand
template = DatasetTemplate.create(**template)
dataset = Dataset.get_or_create_by_full_path('your dataset path', fields=template.fields)
# Dataset will now have the non-transient fields from the template
# with desired titles/descriptions and expressions
print(dataset.fields())
# But no records
print(dataset.documents_count)
```

--------------------------------

### Example JSONL file structure

Source: https://quartzbio.github.io/quartzbio-python/_sources/import_parameters.md.txt

Each line in a JSONL file must be a complete, valid JSON object without internal line breaks.

```json
{"field": 1}
{"field": 2}
{"field": 3}
```
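A file can be checked against this rule before import. The following is a minimal sketch using only the standard library; `parse_jsonl` is a hypothetical helper, not a quartzbio function, and skipping blank lines is an assumption of the sketch rather than documented import behavior.

```python
import json


def parse_jsonl(text):
    """Parse JSONL text, requiring one JSON object per non-empty line."""
    records = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are skipped (an assumption of this sketch)
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError("line {}: invalid JSON ({})".format(number, exc.msg))
        if not isinstance(record, dict):
            raise ValueError("line {}: expected a JSON object".format(number))
        records.append(record)
    return records


sample = '{"field": 1}\n{"field": 2}\n{"field": 3}\n'
print(parse_jsonl(sample))
```

An object that is pretty-printed across several lines fails this check, because each physical line is parsed on its own.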

--------------------------------

### Initialize Global Search

Source: https://quartzbio.github.io/quartzbio-python/data_discovery.html

Initialize the GlobalSearch object to start a search. By default, it returns all objects.

```python
from quartzbio import GlobalSearch

# Search returns all objects by default
results = GlobalSearch()
```

--------------------------------

### Genomic Coordinate Filter Setup

Source: https://quartzbio.github.io/quartzbio-python/filters.html

Initialize a query for a dataset with a GRCh37 build to enable genomic coordinate filtering. Requires importing the Dataset class.

```python
from quartzbio import Dataset

# GRCh37
q = Dataset.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/Variants-GRCH37').query()
```

--------------------------------

### Define Dataset Path Formats

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Examples of path strings used to identify datasets within domains, vaults, or personal directories.

```text
<domain>:<vault>:<path>
```

```text
myDomain:MyVault:/folder/dataset
```

```text
~/folder/dataset
```
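To make the three forms concrete, here is a small sketch that splits a path string into its parts. `split_full_path` is a hypothetical helper written for illustration; the quartzbio client resolves paths itself, and its exact validation rules may differ from what is assumed here.

```python
def split_full_path(full_path):
    """Split '<domain>:<vault>:<path>' or '~/path' into named parts."""
    if full_path.startswith("~/"):
        # Personal-vault shorthand: the domain and vault are resolved elsewhere.
        return {"domain": None, "vault": "~", "path": full_path[1:]}
    parts = full_path.split(":", 2)
    if len(parts) != 3 or not parts[2].startswith("/"):
        raise ValueError(
            "expected <domain>:<vault>:<path>, got {!r}".format(full_path)
        )
    domain, vault, path = parts
    return {"domain": domain, "vault": vault, "path": path}


print(split_full_path("myDomain:MyVault:/folder/dataset"))
print(split_full_path("~/folder/dataset"))
```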

--------------------------------

### Advanced Search with Regex and Glob Patterns in Python

Source: https://quartzbio.github.io/quartzbio-python/vaults_and_objects.html

Python examples for advanced searching within vaults using regular expressions and glob patterns for flexible file matching.

```python
from quartzbio import Vault, Object

# Get the public vault
public_vault = Vault.get_by_full_path('quartzbio:public')
# Find datasets using regex
clinvar_v2 = public_vault.datasets(regex='/ClinVar/2.*')
```

```python
xml_files = [i.filename for i in folder.search('*.xml.gz AND type:file')]
```

```python
# List the dataset ids of every dataset that has Outcome somewhere in the path
all_outcomes = [d.id for d in Object.all(regex=".*Outcome.*", type='dataset')]
```

```python
# List the filenames of all json.gz files within a specific path
path = 'quartzbio:Public:/MEDLINE/2.3.4-2018'
folder = Object.get_by_full_path(path)
json_files = [i.filename for i in folder.files(regex="{}.*.json.gz".format(folder.path))]
```

```python
# Unix style wildcards are supported too
json_files = [i.filename for i in folder.files(glob="{}*.json.gz".format(folder.path))]
```
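The difference between the `regex` and `glob` pattern shapes can be tried locally without a connection. This sketch uses only Python's `re` and `fnmatch` modules on made-up paths; the file names are hypothetical, and server-side matching may differ in details such as case sensitivity.

```python
import fnmatch
import re

# Hypothetical object paths used only for illustration.
paths = [
    "/MEDLINE/2.3.4-2018/medline-0001.json.gz",
    "/MEDLINE/2.3.4-2018/medline-0001.xml.gz",
    "/MEDLINE/2.3.4-2018/README.txt",
]
folder_path = "/MEDLINE/2.3.4-2018"

# Regex form: ".*" means "any run of characters".
regex = re.compile("{}.*.json.gz".format(folder_path))
by_regex = [p for p in paths if regex.match(p)]

# Glob form: "*" alone plays the same role.
glob_pattern = "{}*.json.gz".format(folder_path)
by_glob = [p for p in paths if fnmatch.fnmatchcase(p, glob_pattern)]

print(by_regex)
print(by_glob)
```

Both patterns select the same single `.json.gz` path here; the glob is shorter to write, while the regex allows alternation and character classes.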

--------------------------------

### quartzbio.cli.tutorial Module

Source: https://quartzbio.github.io/quartzbio-python/_sources/quartzbio.cli.tutorial.rst.txt

Overview of the quartzbio.cli.tutorial module, which provides tutorial-related functionality for the QuartzBio command-line interface.

```APIDOC
## Module: quartzbio.cli.tutorial

### Description
This module contains the tutorial implementation for the QuartzBio CLI. It provides interactive or guided steps to help users understand how to utilize the QuartzBio platform via the command line.

### Members
- The module includes all public members defined within `quartzbio.cli.tutorial`.
- Inherits from standard Python module structures for CLI integration.
```

--------------------------------

### Get Extended Statistics for a Numerical Field

Source: https://quartzbio.github.io/quartzbio-python/_sources/joining_and_aggregating_datasets.md.txt

Use the 'stats' facet type with 'extended': True to retrieve comprehensive statistical information for a numerical field. This example calculates extended statistics for 'info.ALLELEID'.

```python
from quartzbio import Dataset

query = Dataset.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/Variants-GRCH37').query()

# Get extended statistics for a numerical field.
query.facets(
    **{'info.ALLELEID': {
        'facet_type': 'stats', 'extended': True}})

```

--------------------------------

### Import File using Command Line

Source: https://quartzbio.github.io/quartzbio-python/_sources/importing_data.md.txt

Import a data file using the quartzbio CLI. The --create-dataset flag will create the dataset if it does not exist. --follow tracks progress.

```bash
# Import a file (create the dataset if necessary):
quartzbio import --create-dataset --follow ~/test-dataset data.vcf.gz

# Import files in upsert mode (create the dataset from a template if necessary):
quartzbio import --create-dataset --template-file template.json --commit-mode=upsert --follow ~/test-dataset data.vcf.gz
```

--------------------------------

### get

Source: https://quartzbio.github.io/quartzbio-python/_sources/expression_functions.md.txt

Get the value at any depth of a nested object.

```APIDOC
## get

### Description
Get the value at any depth of a nested object based on the path described by path.

### Parameters
- **obj** (list|dict) - Required - The object to process
- **path** (str|list) - Required - List or '.' delimited string of path
- **default** (any) - Optional - Default value to return if path doesn't exist
```
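The lookup described above can be approximated in plain Python. This is a minimal sketch of the documented behavior, not the platform's implementation; treating path segments as integer indices when the current value is a list is an assumption.

```python
def get(obj, path, default=None):
    """Return the value found by walking `path` through nested dicts/lists."""
    # Accept either a '.'-delimited string or a list of keys.
    keys = path.split(".") if isinstance(path, str) else list(path)
    current = obj
    for key in keys:
        try:
            if isinstance(current, dict):
                current = current[key]
            elif isinstance(current, list):
                current = current[int(key)]  # assumed: list steps are indices
            else:
                return default
        except (KeyError, IndexError, ValueError, TypeError):
            return default
    return current


data = {"a": {"b": [10, 20]}}
print(get(data, "a.b.1"))
print(get(data, ["a", "b", 0]))
print(get(data, "a.c", default="missing"))
```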

--------------------------------

### Retrieve Template and Create Dataset

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_templates.md.txt

Retrieve an existing template by its ID and use its fields to create or get a dataset. This ensures the dataset structure matches the specified template.

```Python
from quartzbio import Dataset, DatasetTemplate

template = DatasetTemplate.retrieve('id of your template')
dataset = Dataset.get_or_create_by_full_path('your dataset path', fields=template.fields)
# Dataset will now have the non-transient fields from the template
# with desired titles/descriptions and expressions
print(dataset.fields())
# But no records
print(dataset.documents_count)
```

--------------------------------

### Translate variant examples

Source: https://quartzbio.github.io/quartzbio-python/expression_functions.html

Various usage examples for the translate_variant function with different optional parameters.

```python
translate_variant("GRCH38-7-117559590-117559593-A")
```

```python
translate_variant("GRCH38-7-117559590-117559593-A", gene_model="ensembl")
```

```python
translate_variant("GRCH38-7-117559590-117559593-A", transcript="NM_000492.3")
```

```python
translate_variant("GRCH38-7-117559590-117559593-A", include_effects=True)
```
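The identifier passed to `translate_variant` in these examples appears to follow a build-chromosome-start-stop-allele layout. The sketch below splits such a string locally; `parse_variant_id` and the `Variant` tuple are hypothetical names, and the field layout is an assumption inferred from the example string rather than something stated in this section.

```python
from collections import namedtuple

# Assumed layout: BUILD-CHROMOSOME-START-STOP-ALLELE
Variant = namedtuple("Variant", "build chromosome start stop allele")


def parse_variant_id(variant_id):
    """Split a variant identifier into its assumed components."""
    build, chromosome, start, stop, allele = variant_id.split("-", 4)
    return Variant(build, chromosome, int(start), int(stop), allele)


print(parse_variant_id("GRCH38-7-117559590-117559593-A"))
```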

--------------------------------

### Create a Dataset in Python

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Initializes a new, empty dataset in a personal vault using the Dataset.get_or_create_by_full_path method.

```python
from quartzbio import Dataset

# Create a new, empty dataset in your personal vault (represented by "~/")
dataset = Dataset.get_or_create_by_full_path('~/my_dataset')
```

--------------------------------

### GET /v2/vaults/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/vaults_and_objects.md.txt

Retrieve metadata for a specific vault.

```APIDOC
## GET https://<EDP_API_HOST>/v2/vaults/{ID}

### Description
Retrieve a vault's metadata. This request requires an authorized user with "read" permission or higher on the vault.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/vaults/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the vault.

### Response
#### Success Response (200)
- **vault** (object) - The Vault resource.
```

--------------------------------

### Upload Local Folder using Command Line

Source: https://quartzbio.github.io/quartzbio-python/_sources/importing_data.md.txt

Upload local files and folders to the QuartzBio vault using the quartzbio CLI. Only missing files and folders are uploaded.

```bash
# Upload local_folder to the root of the personal vault:
quartzbio upload ./local_folder
```

--------------------------------

### GET /v2/vaults

Source: https://quartzbio.github.io/quartzbio-python/_sources/vaults_and_objects.md.txt

List all available vaults accessible to the user.

```APIDOC
## GET https://<EDP_API_HOST>/v2/vaults

### Description
List all available vaults. All public vaults are included in this response. If the request is sent by an authenticated user, vaults which the user has "read" permission or higher on are also returned.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/vaults

### Response
#### Success Response (200)
- **vaults** (array) - A list of vaults matching the provided filters.
```

--------------------------------

### Create Dataset with Fields

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Demonstrates how to create a new dataset and define its fields at creation time by passing a list of field definitions. Each definition sets properties such as the field's name, description, and data type.

```APIDOC
## POST /datasets

### Description
Creates a new dataset with specified fields and properties.

### Method
POST

### Endpoint
`/datasets`

### Parameters
#### Request Body
- **fields** (list of objects) - Required - A list of field definitions, where each field is an object with properties like `name`, `description`, `data_type`, etc.
- **capacity** (string) - Optional - The capacity setting for the dataset (e.g., 'small').

### Request Example
```python
from quartzbio import Dataset, DatasetField

fields = [
    {
        "name": "my_string_field",
        "description": "Just a string",
        "data_type": "string",
        "is_list": False,
        "is_hidden": False,
        "ordering": 0
    },
    {
        "name": "gene_symbol",
        "description": "HUGO gene symbol",
        "data_type": "string",
        "entity_type": "gene"
    }
]

dataset_full_path = '~/python_examples/my_fields_dataset'

dataset = Dataset.get_or_create_by_full_path(
    dataset_full_path,
    fields=fields,
    capacity='small'
)
```

### Response
#### Success Response (200)
- **dataset** (object) - The created dataset object.

#### Response Example
(Dataset object details)
```

--------------------------------

### Get Dataset Migration

Source: https://quartzbio.github.io/quartzbio-python/creating_and_migrating_datasets.html

Retrieves metadata about a specific dataset migration.

```APIDOC
## GET /v2/dataset_migrations/{ID}

### Description
Retrieve metadata about a dataset migration.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_migrations/{ID}

### Authorization
This request requires an authorized user with permission.

### Response
#### Success Response (200)
- **DatasetMigration** (object) - The response contains a DatasetMigration resource.

### Path Parameters
- **ID** (string) - Required - The ID of the dataset migration to retrieve.
```

--------------------------------

### Create Dataset with Template

Source: https://quartzbio.github.io/quartzbio-python/dataset_templates.html

Instantiate a DatasetTemplate object from the prepared template dictionary and then create a new Dataset using this template. The dataset will be initialized with the non-transient fields defined in the template.

```python
from quartzbio import Dataset, DatasetTemplate

# `template` is the template dictionary prepared beforehand
template = DatasetTemplate.create(**template)
dataset = Dataset.get_or_create_by_full_path('your dataset path', fields=template.fields)
# Dataset will now have the non-transient fields from the template
# with desired titles/descriptions and expressions
print(dataset.fields())
# But no records
print(dataset.documents_count)
```

--------------------------------

### launch_ipython_shell

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.ipython.html

Opens the standard QuartzBio shell (IPython wrapper).

```APIDOC
## launch_ipython_shell

### Description
Opens the QuartzBio shell (IPython wrapper).

### Parameters
- **args** (object) - Required - Arguments to pass to the shell initialization.
```

--------------------------------

### GET /beacons/query

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.resource.beacon.html

Executes a query against a Beacon to perform entity-based searches.

```APIDOC
## GET /beacons/query

### Description
Performs a search query against the Beacon resource.

### Method
GET

### Endpoint
/beacons/query

### Parameters
#### Query Parameters
- **query** (string) - Required - The search query string.
- **entity_type** (string) - Optional - The type of entity to filter the search by.
```

--------------------------------

### GET /v2/saved_queries

Source: https://quartzbio.github.io/quartzbio-python/_sources/querying_datasets_and_files.md.txt

Retrieves a list of all Saved Queries available to the user.

```APIDOC
## GET /v2/saved_queries

### Description
Retrieves all Saved Queries available to a user.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/saved_queries

### Authorization
This request requires an authorized user.

### Response
#### Success Response (200)
- **list of SavedQuery resources** (array) - The response contains a list of SavedQuery resources.

#### Response Example
{
  "example": "List of SavedQuery resources"
}
```

--------------------------------

### Build Documentation Locally

Source: https://quartzbio.github.io/quartzbio-python/index.html

Build the project documentation locally using Sphinx and Sphinx Autobuild. This command allows you to view the documentation in a browser as you make changes.

```bash
make sphinx-autobuild
```

--------------------------------

### Create a Dataset Template

Source: https://quartzbio.github.io/quartzbio-python/dataset_templates.html

Assemble the prepared fields into a template dictionary, including essential metadata like name, version, description, and template type. The 'template_type' must be set to 'dataset'. This structure is then used to create a DatasetTemplate object.

```python
template = {
    "name": "My Variant Template",
    "version": '1.2.0',
    "description": 'Import a special CSV file. Genome is assumed to be GRCh38, also has variant entity for GRCh37.',
    "template_type": "dataset",
    "is_public": False,
    "entity_params": {
        'disable': True
    },
    "fields": fields
}
```

--------------------------------

### GET /v2/saved_queries/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/querying_datasets_and_files.md.txt

Retrieves a specific Saved Query by its ID.

```APIDOC
## GET /v2/saved_queries/{ID}

### Description
Retrieve a Saved Query.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/saved_queries/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the Saved Query.

### Authorization
This request requires an authorized user with permission.

### Response
#### Success Response (200)
- **SavedQuery resource** (object) - The response contains a SavedQuery resource.

#### Response Example
{
  "example": "SavedQuery resource"
}
```

--------------------------------

### GET /v2/dataset_exports/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/exporting_data.md.txt

Retrieve metadata about a specific dataset export.

```APIDOC
## GET /v2/dataset_exports/{ID}

### Description
Retrieve metadata about an export.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_exports/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the export.

### Response
#### Success Response (200)
- **DatasetExport** (object) - The resource containing export metadata.
```

--------------------------------

### GET /v2/dataset_snapshot_tasks

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

Retrieve a list of all available dataset snapshot tasks.

```APIDOC
## GET /v2/dataset_snapshot_tasks

### Description
Retrieve a list of available dataset snapshot tasks.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_snapshot_tasks

### Response
#### Success Response (200)
- **List** (array) - The response contains a list of DatasetSnapshotTask resources.
```

--------------------------------

### GET /v2/dataset_restore_tasks

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

Retrieve a list of all available dataset restore tasks.

```APIDOC
## GET /v2/dataset_restore_tasks

### Description
Retrieve a list of available dataset restore tasks.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_restore_tasks
```

--------------------------------

### Download Files and Folders using QuartzBio CLI

Source: https://quartzbio.github.io/quartzbio-python/vaults_and_objects.html

Command-line interface for downloading files and folders from QuartzBio vaults. Supports recursive downloads, exclusions, and dry runs.

```bash
quartzbio download "~/path/to/file.txt" .
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*/.*"
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*/.DS_store"
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*" --include "*.pdf"
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder --delete --dry-run
```

```bash
quartzbio download --help
```

--------------------------------

### GET /v2/dataset_imports/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/import_parameters.md.txt

Retrieves details for a specific dataset import.

```APIDOC
## GET /v2/dataset_imports/{ID}

### Description
Retrieves the details of a specific dataset import resource.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_imports/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the dataset import.

### Response
#### Success Response (200)
- **Status** - HTTP 200 OK
```

--------------------------------

### Import File with Template Fields

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_templates.md.txt

Retrieve a template and create a dataset. Then, import a file into this dataset, specifying the template's fields as the target structure for the import. This process waits for the import to complete.

```Python
from quartzbio import Dataset, DatasetImport, DatasetTemplate, Object

template = DatasetTemplate.retrieve('id of your template')
dataset = Dataset.get_or_create_by_full_path('your dataset path')
# Only field should be "id"
print(dataset.fields())
file_object = Object.retrieve('id of file uploaded to EDP')
DatasetImport.create(
    dataset_id=dataset.id,
    object_id=file_object.id,
    target_fields=template.fields,
    commit_mode='append',
)
# Wait for import to finish
dataset.activity(follow=True)
# Should now see all the non-transient fields from the template!
print(dataset.fields())
```

--------------------------------

### GET /v2/dataset_commits/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

Retrieve metadata for a specific dataset commit.

```APIDOC
## GET /v2/dataset_commits/{ID}

### Description
Retrieve metadata about a dataset commit.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_commits/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the dataset commit.
```

--------------------------------

### QueryFile Class Initialization

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.query.html

Initializes a new QueryFile instance for object content queries.

```APIDOC
## Class: quartzbio.query.QueryFile

### Description
A QueryFile API request wrapper that generates a request for an object content query, and can iterate through streaming result sets.

### Parameters
- **file_id** (string) - Required - The ID of the file to query.
- **fields** (list) - Optional - List of fields to include.
- **exclude_fields** (list) - Optional - List of fields to exclude.
- **filters** (Filter) - Optional - Filter instance created using F class operators.
- **limit** (number) - Optional - Maximum number of results (default: inf).
- **page_size** (number) - Optional - Number of results per page (default: 1000).
- **output_format** (string) - Optional - Format of the output (default: 'json').
- **header** (boolean) - Optional - Whether to include a header (default: True).
```
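The interaction between `limit` and `page_size` can be illustrated with a small generator. This is a sketch of the general paging pattern, not the `QueryFile` implementation; `paged_results` and the offset-based `fetch_page` callback are hypothetical, and whether the service pages by offset is an assumption.

```python
import math


def paged_results(fetch_page, limit=math.inf, page_size=1000):
    """Yield up to `limit` records, requesting `page_size` records at a time."""
    returned = 0
    offset = 0
    while returned < limit:
        page = fetch_page(offset, page_size)
        if not page:
            break  # no more data
        for record in page:
            if returned >= limit:
                return
            yield record
            returned += 1
        offset += len(page)


# Stand-in data source: 25 records served from a local list.
rows = [{"n": i} for i in range(25)]
calls = []


def fetch_page(offset, size):
    calls.append(offset)  # remember which pages were requested
    return rows[offset:offset + size]


first_twelve = list(paged_results(fetch_page, limit=12, page_size=10))
print(len(first_twelve), calls)
```

With `limit=12` and `page_size=10`, only two pages are requested: `page_size` controls the size of each request, while `limit` caps the total number of records returned.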

--------------------------------

### GET /v2/objects

Source: https://quartzbio.github.io/quartzbio-python/_sources/vaults_and_objects.md.txt

Lists all available objects. Requires read permissions to the vaults.

```APIDOC
## GET /v2/objects

### Description
List all available objects. The response includes objects which exist inside vaults that the user has "read" permission or higher to.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/objects

### Parameters
#### Query Parameters
- **id** (integer) - Optional - The ID of an object.
- **vault_id** (integer) - Optional - The ID of the vault that will contain the object.
- **vault_name** (string) - Optional - The name of the vault containing objects.
- **vault_full_path** (text) - Optional - The full path of the vault containing objects.
- **parent_object_id** (integer) - Optional - The ID of the existing folder object to place the new object into. To place at "/", set this value to null.
- **filename** (string) - Optional - The filename of the object, not including its parent folder. This value cannot contain slashes.
- **path** (string) - Optional - The path of the object, including its parent folder.
- **object_type** (string) - Optional - The type of the object. Must be one of "file", "folder", or "dataset".
- **depth** (integer) - Optional - The depth of the object in the Vault. Objects at the root have depth = 0.
- **query** (string) - Optional - A string that matches any objects whose path contains that string.
- **regex** (regex) - Optional - A regular expression which searches objects for matching paths (case-insensitive).

### Response
#### Success Response (200)
- **List of objects** (array) - The response returns a list of objects matching the provided filters.

#### Response Example
[
  {
    "id": 789,
    "vault_id": 123,
    "parent_object_id": 456,
    "filename": "my_document.txt",
    "object_type": "file",
    "description": "This is a sample document.",
    "metadata": {
      "key1": "value1"
    },
    "tags": ["important", "report"],
    "storage_class": "standard",
    "created_at": "2023-10-27T10:00:00Z",
    "updated_at": "2023-10-27T10:00:00Z"
  },
  {
    "id": 987,
    "vault_id": 123,
    "parent_object_id": null,
    "filename": "my_folder",
    "object_type": "folder",
    "description": null,
    "metadata": {},
    "tags": [],
    "storage_class": null,
    "created_at": "2023-10-26T09:00:00Z",
    "updated_at": "2023-10-26T09:00:00Z"
  }
]
```
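The query parameters listed above are passed in the URL query string. The following sketch only builds such a URL with the standard library and does not send a request; `build_objects_url` is a hypothetical helper, `edp.example.com` is a placeholder host, and authentication is deliberately left out.

```python
from urllib.parse import urlencode


def build_objects_url(host, **filters):
    """Build a GET /v2/objects URL from keyword filters, skipping None values."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = "https://{}/v2/objects".format(host)
    return "{}?{}".format(url, query) if query else url


# Placeholder host; object_type and regex are parameters documented above.
print(build_objects_url("edp.example.com", object_type="dataset", regex=".*Outcome.*"))
```

Note that `urlencode` percent-encodes characters such as `*`, so a regular expression survives the trip through the query string intact.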

--------------------------------

### GET /v2/datasets/{DATASET_ID}/exports

Source: https://quartzbio.github.io/quartzbio-python/_sources/exporting_data.md.txt

List all exports associated with a specific dataset.

```APIDOC
## GET /v2/datasets/{DATASET_ID}/exports

### Description
List the exports associated with a dataset.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/datasets/{DATASET_ID}/exports

### Parameters
#### Path Parameters
- **DATASET_ID** (string) - Required - The unique identifier of the dataset.

### Response
#### Success Response (200)
- **List[DatasetExport]** (array) - A list of DatasetExport resources.
```

--------------------------------

### launch_ipython_legacy_shell

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.ipython.html

Opens the QuartzBio shell (IPython wrapper) for older IPython versions.

```APIDOC
## launch_ipython_legacy_shell

### Description
Open the QuartzBio shell (IPython wrapper) for older IPython versions.

### Parameters
- **args** (object) - Required - Arguments to pass to the shell initialization.
```

--------------------------------

### GET /v2/dataset_snapshot_tasks/{SNAPSHOT_TASK_ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

Retrieve metadata about a specific dataset snapshot task.

```APIDOC
## GET /v2/dataset_snapshot_tasks/{SNAPSHOT_TASK_ID}

### Description
Retrieve metadata about a dataset snapshot task.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_snapshot_tasks/{SNAPSHOT_TASK_ID}

### Parameters
#### Path Parameters
- **SNAPSHOT_TASK_ID** (string) - Required - The unique identifier of the snapshot task.

### Response
#### Success Response (200)
- **DatasetSnapshotTask** (object) - The response contains a DatasetSnapshotTask resource.
```

--------------------------------

### Download files and folders using the CLI

Source: https://quartzbio.github.io/quartzbio-python/_sources/vaults_and_objects.md.txt

CLI commands for downloading a single file or a folder recursively, with optional include and exclude patterns. Because --delete permanently removes local files, run it with --dry-run first.

```Shell
# Download a single file
quartzbio download "~/path/to/file.txt" .

# Download a folder
quartzbio download --recursive "~/path/to/folder" local_folder

# Download a folder, but exclude hidden files and folders
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*/.*"

# Download a folder, but exclude macOS .DS_Store files
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*/.DS_Store"

# Download only PDF files within a folder
# --include always supersedes --exclude
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*" --include "*.pdf"

# The --delete flag will delete local files that do not match
# those found in the vault. Always use the --dry-run mode first
# with this option as it will delete files permanently.
quartzbio download --recursive "~/path/to/folder" local_folder --delete --dry-run

# For full usage:
quartzbio download --help
```
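
The rule that `--include` supersedes `--exclude` can be illustrated with a rough sketch. It assumes shell-style glob matching via the standard `fnmatch` module; the CLI's actual matching logic may differ, and `should_download` is a hypothetical helper, not a QuartzBio function.

```python
from fnmatch import fnmatchcase


def should_download(path: str, include=(), exclude=()) -> bool:
    """Return True when `path` survives the include/exclude rules."""
    # An include match wins outright, mirroring "--include always supersedes --exclude".
    if any(fnmatchcase(path, pattern) for pattern in include):
        return True
    # Otherwise, any exclude match drops the path.
    return not any(fnmatchcase(path, pattern) for pattern in exclude)


# Made-up paths used only for illustration.
paths = ["folder/report.pdf", "folder/notes.txt", "folder/.hidden"]

# Similar in spirit to: --exclude "*" --include "*.pdf"
only_pdfs = [p for p in paths if should_download(p, include=["*.pdf"], exclude=["*"])]

# Similar in spirit to: --exclude "*/.*"
no_hidden = [p for p in paths if should_download(p, exclude=["*/.*"])]

print(only_pdfs)
print(no_hidden)
```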

--------------------------------

### GET /v2/dataset_restore_tasks/{RESTORE_TASK_ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

Retrieve metadata about a specific dataset restore task.

```APIDOC
## GET /v2/dataset_restore_tasks/{RESTORE_TASK_ID}

### Description
Retrieve metadata about a dataset restore task.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_restore_tasks/{RESTORE_TASK_ID}

### Parameters
#### Path Parameters
- **RESTORE_TASK_ID** (string) - Required - The unique identifier of the restore task.
```

--------------------------------

### launch_ipython_5_shell

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.ipython.html

Opens the QuartzBio shell (IPython wrapper) specifically for IPython 5+ versions.

```APIDOC
## launch_ipython_5_shell

### Description
Open the QuartzBio shell (IPython wrapper) with IPython 5+.

### Parameters
- **args** (object) - Required - Arguments to pass to the shell initialization.
```

--------------------------------

### GET /v2/datasets/{DATASET_ID}/imports

Source: https://quartzbio.github.io/quartzbio-python/_sources/import_parameters.md.txt

Lists all imports associated with a specific dataset.

```APIDOC
## GET /v2/datasets/{DATASET_ID}/imports

### Description
Returns a list of DatasetImport resources associated with the specified dataset.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/datasets/{DATASET_ID}/imports

### Parameters
#### Path Parameters
- **DATASET_ID** (string) - Required - The unique identifier of the dataset.

#### Query Parameters
- **limit** (integer) - Optional - The number of objects to return per page.
- **offset** (integer) - Optional - The offset within the list of available objects.

### Response
#### Success Response (200)
- **Body** - A list of DatasetImport resources.
```
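
Since `offset` is a position within the list of objects and `limit` is the page size, paging through results means advancing `offset` by `limit` on each request. The sketch below only builds the page URLs and sends nothing. The host, dataset ID, and the `total` count are placeholder assumptions, and `page_urls` is a hypothetical helper.

```python
from urllib.parse import urlencode


def page_urls(base_url: str, total: int, limit: int = 100):
    """Yield one URL per page, advancing `offset` by `limit` each time."""
    for offset in range(0, total, limit):
        yield f"{base_url}?{urlencode({'limit': limit, 'offset': offset})}"


# Placeholder host and dataset ID; `total` is simply assumed to be known here.
base = "https://example.api.host/v2/datasets/1234567890/imports"
urls = list(page_urls(base, total=250, limit=100))
for u in urls:
    print(u)
```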

--------------------------------

### Enable Global Beacon Indexing for Datasets

Source: https://quartzbio.github.io/quartzbio-python/metadata_and_global_beacons.html

Initiate the Global Beacon indexing process for a dataset. The status and progress can be monitored via the returned dictionary.

```python
from quartzbio import Dataset

# Get the dataset
dataset = Dataset.get_by_full_path('~/beacon-test-dataset')

# Enable Global Beacon on the dataset
dataset.enable_global_beacon()

# Example output:
# {'id': 125,
#  'datastore_id': 6,
#  'dataset_id': 1658666726768179211,
#  'status': 'indexing',
#  'progress_percent': 0,
#  'is_deleted': False}
```

--------------------------------

### Import Data from Manifest into Dataset

Source: https://quartzbio.github.io/quartzbio-python/_sources/importing_data.md.txt

Launch an import using a manifest into a new or existing dataset. Progress tracking is available through the API or web interface.

```python
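from quartzbio import Dataset, DatasetImport, Manifest

# Build the manifest that lists the remote file(s) to import
manifest = Manifest()
manifest.add_url("https://s3.amazonaws.com/downloads.quartzbio.com/demo/interesting-variants.json.gz")

dataset = Dataset.get_or_create_by_full_path('~/python_examples/manifest_dataset')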

# Launch the import
imp = DatasetImport.create(
    dataset_id=dataset.id,
    manifest=manifest.manifest,
    commit_mode='append'
)

# Follow the import status
dataset.activity(follow=True)
```

--------------------------------

### GET /v2/datasets/{DATASET_ID}/commits

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

List all commits associated with a specific dataset.

```APIDOC
## GET /v2/datasets/{DATASET_ID}/commits

### Description
Retrieve a list of dataset commits associated with a dataset.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/datasets/{DATASET_ID}/commits

### Parameters
#### Path Parameters
- **DATASET_ID** (string) - Required - The unique identifier of the dataset.
```

--------------------------------

### Create Dataset Import with Manifest

Source: https://quartzbio.github.io/quartzbio-python/importing_data.html

Create a dataset import from a manifest, which lists the remote files to import. Each file entry needs a valid URL; the other file details (name, format, size, checksums) are optional.

```json
{
    "files": [{
        "url": "https://example.com/file.json.gz",
        "name": "file.json.gz",
        "format": "json",
        "size": 100,
        "md5": "",
        "base64_md5": ""
    }]
}
```
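
A manifest entry like the one above could be assembled in Python when the file's bytes are available locally. This is a sketch under stated assumptions: it assumes `md5` is the hex digest and `base64_md5` is the base64 encoding of the same raw MD5 digest, which the source does not confirm. `manifest_entry` is a hypothetical helper and the payload is placeholder data.

```python
import base64
import hashlib
import json


def manifest_entry(url: str, name: str, fmt: str, payload: bytes) -> dict:
    """Build one manifest file entry, computing size and checksums from `payload`."""
    digest = hashlib.md5(payload).digest()
    return {
        "url": url,
        "name": name,
        "format": fmt,
        "size": len(payload),
        # Hex form of the MD5 digest.
        "md5": digest.hex(),
        # Assumption: base64 encoding of the same raw digest.
        "base64_md5": base64.b64encode(digest).decode("ascii"),
    }


# Placeholder content standing in for the bytes of the remote file.
payload = b"hello"
manifest = {
    "files": [
        manifest_entry("https://example.com/file.json.gz", "file.json.gz", "json", payload)
    ]
}
print(json.dumps(manifest, indent=4))
```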

--------------------------------

### Get Object Copy Task

Source: https://quartzbio.github.io/quartzbio-python/vaults_and_objects.html

Retrieves metadata for a specific object copy task.

```APIDOC
## GET /v2/object_copy_tasks/{ID}

### Description
Retrieve metadata about an object copy task.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/object_copy_tasks/{ID}

### Parameters
#### Path Parameters
- **ID** (integer) - Required - The ID of the object copy task to retrieve.

### Authorization
This request requires that the authorized user is also the user who created the object copy task being retrieved.

### Response
#### Success Response (200)
- **task** (object) - The response contains an object copy task resource.
```

--------------------------------

### POST /api/datasets/create

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.main.html

Creates a new QuartzBio dataset with specified configurations.

```APIDOC
## POST /api/datasets/create

### Description
This endpoint is used to create a new QuartzBio dataset. It supports options for creating the underlying vault if it does not exist, specifying a template, setting dataset capacity, adding tags, providing metadata, and performing a dry run.

### Method
POST

### Endpoint
/api/datasets/create

### Parameters
#### Command-Line Arguments
- **--create-vault** (boolean) - Optional - Create the vault if it doesn't exist
- **--template-id** (string) - Optional - The template ID used when creating a new dataset
- **--template-file** (string) - Optional - A local template file to be used when creating a new dataset
- **--capacity** (string) - Optional - Specifies the capacity of the dataset: small (default, <100M records), medium (<500M), large (>=500M)
- **--tag** (string) - Optional - A tag to be added. Tags are case insensitive strings. Example tags: --tag GRCh38 --tag Tissue --tag "Foundation Medicine"
- **--metadata** (KEY=VALUE) - Optional - Dataset metadata in the format KEY=VALUE
- **--metadata-json-file** (string) - Optional - Metadata key value pairs in JSON format
- **--dry-run** (boolean) - Optional - Dry run mode will not create the dataset
- **full_path** (string) - Required - The full path to the dataset in the format: "domain:vault:/path/dataset". Defaults to your personal vault if no vault is provided. Defaults to the vault root if no path is provided.

### Request Example
```json
{
  "full_path": "my-domain:my-vault:/data/new-dataset",
  "--template-id": "template-123",
  "--capacity": "large",
  "--tag": ["GRCh38", "WGS"],
  "--metadata": ["Project=ProjectX", "SampleID=Sample123"],
  "--dry-run": true
}
```

### Response
#### Success Response (200)
- **message** (string) - Confirmation message of dataset creation or dry run status.

#### Response Example
```json
{
  "message": "Dataset 'new-dataset' creation dry run successful."
}
```
```
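
The capacity tiers listed above (small below 100M records, medium below 500M, large at 500M and over) can be turned into a small lookup. This is an illustrative sketch only; `suggest_capacity` is a hypothetical helper and not part of the QuartzBio client or CLI.

```python
def suggest_capacity(expected_records: int) -> str:
    """Map an expected record count to the documented capacity tiers."""
    if expected_records < 100_000_000:
        return "small"   # default tier, <100M records
    if expected_records < 500_000_000:
        return "medium"  # <500M records
    return "large"       # >=500M records


for count in (5_000, 250_000_000, 2_000_000_000):
    print(count, suggest_capacity(count))
```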

--------------------------------

### Get Current User

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.main.html

Displays the email address of the currently authenticated QuartzBio user.

```APIDOC
## GET /whoami

### Description
Show your QuartzBio email address.

### Method
GET
```

--------------------------------

### GET /v2/dataset_commits/{ID}/rollback

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

Check if a commit can be reverted and identify blocking commits.

```APIDOC
## GET /v2/dataset_commits/{ID}/rollback

### Description
Reports whether a commit can be reverted, along with the reason and any commits that block the rollback.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_commits/{ID}/rollback

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the dataset commit.
```

--------------------------------

### Apply Field Filters

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.query.html

Example of applying multiple field-based filters to a dataset query.

```python
dataset.query(gene__in=['BRCA', 'GATA3'],
    chr='3', start__gt=10000, end__lte=20000)
```
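
The keyword arguments above follow a `field__action=value` naming convention. To make the convention concrete, here is a minimal sketch that splits such keywords into their parts. It is not how the library parses filters; `split_filters` and the `"exact"` label for action-less keywords are hypothetical, and a field name that itself contains `__` would be misread by this simple version.

```python
def split_filters(**filters):
    """Split `field__action=value` keyword arguments into (field, action, value) tuples."""
    parsed = []
    for key, value in filters.items():
        # rpartition splits on the last "__" only.
        field, sep, action = key.rpartition("__")
        if not sep:
            # No "__" present: treat it as a plain equality filter.
            field, action = key, "exact"
        parsed.append((field, action, value))
    return parsed


parts = split_filters(gene__in=['BRCA', 'GATA3'],
                      chr='3', start__gt=10000, end__lte=20000)
for field, action, value in parts:
    print(field, action, value)
```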

--------------------------------

### Dataset and Object Query Filtering

Source: https://quartzbio.github.io/quartzbio-python/filters.html

Demonstrates how to initialize a query on a dataset or object and apply basic filters.

```APIDOC
## Dataset.query().filter()

### Description
Applies a filter to a dataset or object query based on field values.

### Parameters
#### Query Parameters
- **field_name__action** (any) - Required - The field name appended with an optional filter action (e.g., clinical_significance='pathogenic').

### Request Example
```python
from quartzbio import Dataset
dataset = Dataset.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/Variants-GRCH37')
dataset.query().filter(clinical_significance='pathogenic')
```
```

--------------------------------

### GET /v2/dataset_migrations/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Retrieves metadata for a specific dataset migration. Requires general permissions.

```APIDOC
## GET /v2/dataset_migrations/{ID}

### Description
Retrieve metadata about a dataset migration.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_migrations/{ID}

### Parameters
#### Path Parameters
- **ID** (integer) - Required - The ID of the dataset migration to retrieve.

### Response
#### Success Response (200)
- **DatasetMigration resource** (object) - The response contains a DatasetMigration resource.

#### Response Example
```json
{
  "id": 789,
  "commit_mode": "commit",
  "include_errors": true,
  "source_id": 123,
  "source_params": {
    "limit": 100,
    "fields": "name,email"
  },
  "target_fields": {
    "name": "{{source.name}}",
    "email": "{{source.email}}"
  },
  "target_id": 456,
  "priority": 1,
  "status": "completed"
}
```
```

--------------------------------

### Get Task Queue

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.data.html

Retrieves all running and queued tasks for an account, grouped by user and status.

```APIDOC
## GET /api/queue

### Description
Retrieves and displays running and queued tasks, grouped by user and status.

### Method
GET

### Endpoint
/api/queue

### Parameters
#### Query Parameters
- **statuses** (list) - Optional - A list of statuses to filter tasks by (default: ['running', 'queued']).
```
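
Grouping tasks by user and status, as described above, can be sketched with nested dictionaries. The task records below are hypothetical: the field names `id`, `user`, and `status` are assumptions for illustration, and the real task resource may be shaped differently.

```python
from collections import defaultdict


def group_tasks(tasks, statuses=("running", "queued")):
    """Group task IDs by user, then by status, keeping only `statuses`."""
    grouped = defaultdict(lambda: defaultdict(list))
    for task in tasks:
        if task["status"] in statuses:
            grouped[task["user"]][task["status"]].append(task["id"])
    # Convert to plain dicts for predictable printing and comparison.
    return {user: dict(by_status) for user, by_status in grouped.items()}


# Hypothetical task records; the real task resource may use different field names.
tasks = [
    {"id": 1, "user": "alice@example.com", "status": "running"},
    {"id": 2, "user": "alice@example.com", "status": "queued"},
    {"id": 3, "user": "bob@example.com", "status": "completed"},
    {"id": 4, "user": "bob@example.com", "status": "queued"},
]
queue = group_tasks(tasks)
print(queue)
```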

--------------------------------

### Launch Dataset Import from Manifest

Source: https://quartzbio.github.io/quartzbio-python/importing_data.html

Launches an import task using a manifest to add data from remote URLs to a specified dataset. The commit_mode can be set to 'append' or 'overwrite'.

```python
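from quartzbio import Dataset, DatasetImport, Manifest

# Build the manifest that lists the remote file(s) to import
manifest = Manifest()
manifest.add_url("https://s3.amazonaws.com/downloads.quartzbio.com/demo/interesting-variants.json.gz")

dataset = Dataset.get_or_create_by_full_path('~/python_examples/manifest_dataset')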

# Launch the import
imp = DatasetImport.create(
    dataset_id=dataset.id,
    manifest=manifest.manifest,
    commit_mode='append'
)
```

--------------------------------

### Authenticate with QuartzBio API

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.html

Initializes a session with the QuartzBio EDP API using a specified host URL.

```python
import quartzbio
quartzbio.login(
    api_host="https://quartzbio.api.az.aws.quartz.bio",
)
```

--------------------------------

### GET /v2/dataset_fields/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Retrieve a specific dataset field by its ID. Requires read permission on the dataset.

```APIDOC
## GET /v2/dataset_fields/{ID}

### Description
Retrieve a dataset field.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/dataset_fields/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The ID of the dataset field to retrieve.

### Response
#### Success Response (200)
- **DatasetField** (object) - The retrieved DatasetField resource.
```

--------------------------------

### quartzbio.cli.ipython Module Overview

Source: https://quartzbio.github.io/quartzbio-python/_sources/quartzbio.cli.ipython.rst.txt

This module provides the IPython CLI interface for QuartzBio. It includes members and inheritance details for interactive shell usage.

```APIDOC
## Module: quartzbio.cli.ipython

### Description
This module provides the IPython CLI interface for the QuartzBio Python SDK. It is designed to facilitate interactive data analysis and workflow management within an IPython environment.

### Members
- The module exposes all public members defined within `quartzbio.cli.ipython`.

### Inheritance
- The module includes inheritance information for classes defined within the scope of the IPython CLI integration.
```

--------------------------------

### GET /v2/datasets/{DATASET_ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Retrieves metadata for a specific dataset. Requires authorization to view the target dataset.

```APIDOC
## GET /v2/datasets/{DATASET_ID}

### Description
Retrieve metadata about a dataset. This request requires an authorized user with permission to view the target dataset.

### Method
GET

### Endpoint
https://<EDP_API_HOST>/v2/datasets/{DATASET_ID}

### Parameters
#### Path Parameters
- **DATASET_ID** (string) - Required - The ID of the dataset to retrieve.

### Response
#### Success Response (200)
- **Dataset resource** - The response contains a Dataset resource.

#### Response Example
{
  "id": 123,
  "name": "MyDataset",
  "vault_id": 1,
  "vault_parent_object_id": 0,
  "fields": [],
  "metadata": {"key": "value"},
  "tags": ["tag1"],
  "capacity": "medium",
  "storage_class": "standard"
}
```

--------------------------------

### Log out and Log in with Credentials

Source: https://quartzbio.github.io/quartzbio-python/_sources/authentication.md.txt

Clear existing credentials and log in using a personal access token and API host. Replace TOKEN and DOMAIN with your actual values. The API host should not have a trailing slash.

```bash
# Clear your existing credentials
quartzbio logout

# Replace "TOKEN" with the Personal Access Token copied from the EDP web page
# Replace "DOMAIN" with your account's subdomain (i.e. your company name)
quartzbio login --access-token TOKEN --api-host https://DOMAIN.api.edp.aws.quartz.bio
```
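
Because the API host should not end with a slash, a script that reads the host from configuration may want to normalize it before logging in. A minimal sketch, where `normalize_api_host` is a hypothetical helper and `DOMAIN` remains a placeholder:

```python
def normalize_api_host(api_host: str) -> str:
    """Strip surrounding whitespace and any trailing slashes from an API host URL."""
    return api_host.strip().rstrip("/")


# "DOMAIN" is a placeholder for your account's subdomain.
host = normalize_api_host(" https://DOMAIN.api.edp.aws.quartz.bio/ ")
print(host)
```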

--------------------------------

### Create Manifest for URL Import

Source: https://quartzbio.github.io/quartzbio-python/_sources/importing_data.md.txt

Create a manifest object and add URLs to it. This manifest will be used to import data from remote servers.

```python
from quartzbio import Manifest

source_url = "https://s3.amazonaws.com/downloads.quartzbio.com/demo/interesting-variants.json.gz"

manifest = Manifest()
manifest.add_url(source_url)
```