### Install QuartzBio from Git

Source: https://quartzbio.github.io/quartzbio-python/index.html

Install the quartzbio package directly from its GitHub repository using pip. This is useful for development or using the latest unreleased version.

```bash
pip install -e git+https://github.com/quartzbio/quartzbio-python.git#egg=quartzbio
```

--------------------------------

### Install QuartzBio Python Library

Source: https://quartzbio.github.io/quartzbio-python/_sources/authentication.md.txt

Install or upgrade the QuartzBio Python client using pip. Ensure you are using a recent version of Python.

```bash
pip install --upgrade quartzbio
```

--------------------------------

### Install QuartzBio Python Package

Source: https://quartzbio.github.io/quartzbio-python/index.html

Install the quartzbio package using pip. For interactive use, also install IPython and gnureadline.

```bash
pip install quartzbio
```

```bash
pip install ipython
pip install gnureadline
```

--------------------------------

### Develop QuartzBio Python Package

Source: https://quartzbio.github.io/quartzbio-python/index.html

Set up the QuartzBio Python package for development. Clone the repository and install it in development mode using setup.py or pip.

```bash
git clone https://github.com/quartzbio/quartzbio-python.git
cd quartzbio-python/
python setup.py develop
```

```bash
pip install -e .
```

--------------------------------

### Log in to QuartzBio CLI

Source: https://quartzbio.github.io/quartzbio-python/index.html

Log in to your QuartzBio account using the command-line interface. Ensure you have installed the package first.

```bash
quartzbio login
```

--------------------------------

### Run Tests with Tox

Source: https://quartzbio.github.io/quartzbio-python/index.html

Install tox and run it to execute tests for the QuartzBio Python package. Tox automates testing in different environments.
```bash
pip install tox
tox
```

--------------------------------

### Flattening Algorithm Example

Source: https://quartzbio.github.io/quartzbio-python/_sources/exporting_data.md.txt

Illustrates how list fields are expanded into multiple columns during CSV or XLSX export.

```Python
{"a": "a", "b": ["x"]}
{"a": "a", "b": ["x", "y"]}
{"a": "a", "b": ["x", "y", "z"]}
```

--------------------------------

### Querying Files

Source: https://quartzbio.github.io/quartzbio-python/querying_datasets_and_files.html

Demonstrates how to query and filter file objects, retrieve a specified number of records, and get all fields from a file.

```APIDOC
## Querying Files

File objects can be queried and filtered on one or more fields. The query results are returned in pages. Text files such as CSV, TXT, TSV, or BED must be uploaded with headers.

### Basic Query

```python
clinvar = Object.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/ClinVar-5-2-0-20210110-Variants-GRCH37-1425664822266145048-20221110194518.json.gz')
clinvar.query()
```

### Query with Limit

Retrieve a specified number of records by setting the `limit` parameter.

```python
clinvar = Object.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/ClinVar-5-2-0-20210110-Variants-GRCH37-1425664822266145048-20221110194518.json.gz')
q = clinvar.query(limit=50)
```

### Retrieving All Fields

Get all fields from the file by calling the `fields` method.

```python
fields = Object.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/ClinVar-5-2-0-20210110-Variants-GRCH37-1425664822266145048-20221110194518.json.gz').query().fields()
```

### Loading Files into Readers (e.g., Pandas)

Use the `download_url()` method to load files into readers.
```python
from quartzbio import *
import pandas

# Get file using ID or full path
f = Object.retrieve("ID")
f = Object.get_by_full_path("vault/path/to/file.csv")

# Get file download URL and load into reader
url = f.download_url()
pandas.read_csv(url)
```

### Supported File Extensions and Compressions

File querying is supported for the following file extensions and compressions:

**File Extensions** | **Compression**
---|---
txt | GZIP, BZIP2
csv | GZIP, BZIP2
tsv | GZIP, BZIP2
bed | GZIP, BZIP2
json | GZIP, BZIP2
parquet | GZIP

The only supported encoding is UTF-8.

### Output Format

The output format of the query can be specified using the `output_format` parameter.

**Output Format** | **Description**
---|---
json (default) | applicable to all file extensions
csv | applicable only to **csv**, **txt**, **tsv**, or **bed** file extensions
tsv | applicable only to **csv**, **txt**, **tsv**, or **bed** file extensions

**Example:**

```python
clinvar = Object.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/ClinVar-5-2-0-20210110-Variants-GRCH37-1425664822266145048-20221110194518.json.gz')
clinvar.query(output_format='json')
```
```

--------------------------------

### Checking Dataset Availability

Source: https://quartzbio.github.io/quartzbio-python/dataset_versioning.html

Provides examples of how to check dataset availability before querying.

```APIDOC
## Supporting Archived Datasets

### Description
Checks the availability status of datasets before attempting to query them, or catches query errors.

### Method
Accessing the `availability` attribute and using try-except blocks for `query()`.

### Endpoint
N/A (Attribute access and method calls on dataset objects)

### Request Example
```python
# Explicitly check availability
datasets = vault.datasets()
for dataset in datasets:
    if dataset.availability != 'available':
        print("Dataset {} availability is {}. Not querying.".format(dataset.id, dataset.availability))
        continue
    print(dataset.query())

# Catch errors
datasets = vault.datasets()
for dataset in datasets:
    try:
        print(dataset.query())
    except errors.SolveError as e:
        print("Dataset can not be queried: {}".format(e))
```
```

--------------------------------

### Illustrate Flattening Algorithm

Source: https://quartzbio.github.io/quartzbio-python/exporting_data.html

Example showing how nested list fields are transformed into flattened columns during CSV or XLSX export.

```text
{"a": "a", "b": ["x"]}
{"a": "a", "b": ["x", "y"]}
{"a": "a", "b": ["x", "y", "z"]}
```

```text
a,b.0,b.1,b.2
a,x,,
a,x,y,
a,x,y,z
```

--------------------------------

### Create Dataset from Template

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_templates.md.txt

Create a DatasetTemplate object from the constructed dictionary and then use it to create or get a dataset. The dataset will be initialized with the non-transient fields from the template.

```Python
from quartzbio import Dataset, DatasetTemplate

template = DatasetTemplate.create(**template)
dataset = Dataset.get_or_create_by_full_path('your dataset path', fields=template.fields)

# Dataset will now have the non-transient fields from the template
# with desired titles/descriptions and expressions
print(dataset.fields())

# But no records
print(dataset.documents_count)
```

--------------------------------

### Example JSONL file structure

Source: https://quartzbio.github.io/quartzbio-python/_sources/import_parameters.md.txt

Each line in a JSONL file must be a complete, valid JSON object without internal line breaks.

```json
{"field": 1}
{"field": 2}
{"field": 3}
```

--------------------------------

### Initialize Global Search

Source: https://quartzbio.github.io/quartzbio-python/data_discovery.html

Initialize the GlobalSearch object to start a search. By default, it returns all objects.
```python
from quartzbio import GlobalSearch

# Search returns all objects by default
results = GlobalSearch()
```

--------------------------------

### Genomic Coordinate Filter Setup

Source: https://quartzbio.github.io/quartzbio-python/filters.html

Initialize a query for a dataset with a GRCh37 build to enable genomic coordinate filtering. Requires importing the Dataset class.

```python
from quartzbio import Dataset

# GRCh37
q = Dataset.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/Variants-GRCH37').query()
```

--------------------------------

### Define Dataset Path Formats

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Examples of path strings used to identify datasets within domains, vaults, or personal directories.

```text
::
```

```text
myDomain:MyVault:/folder/dataset
```

```text
~/folder/dataset
```

--------------------------------

### Advanced Search with Regex and Glob Patterns in Python

Source: https://quartzbio.github.io/quartzbio-python/vaults_and_objects.html

Python examples for advanced searching within vaults using regular expressions and glob patterns for flexible file matching.
```python
from quartzbio import Vault, Object

# Get the public vault
public_vault = Vault.get_by_full_path('quartzbio:public')

# Find datasets using regex
clinvar_v2 = public_vault.datasets(regex='/ClinVar/2.*')
```

```python
xml_files = [i.filename for i in folder.search('*.xml.gz AND type:file')]
```

```python
# List the dataset ids of every dataset that has Outcome somewhere in the path
all_outcomes = [d.id for d in Object.all(regex=".*Outcome.*", type='dataset')]
```

```python
# List the filenames of all JSON files within a specific path
path = 'quartzbio:Public:/MEDLINE/2.3.4-2018'
folder = Object.get_by_full_path(path)
json_files = [i.filename for i in folder.files(regex="{}.*.json.gz".format(folder.path))]
```

```python
# Unix style wildcards are supported too
json_files = [i.filename for i in folder.files(glob="{}*.json.gz".format(folder.path))]
```

--------------------------------

### quartzbio.cli.tutorial Module

Source: https://quartzbio.github.io/quartzbio-python/_sources/quartzbio.cli.tutorial.rst.txt

Overview of the quartzbio.cli.tutorial module, which provides tutorial-related functionality for the QuartzBio command-line interface.

```APIDOC
## Module: quartzbio.cli.tutorial

### Description
This module contains the tutorial implementation for the QuartzBio CLI. It provides interactive or guided steps to help users understand how to utilize the QuartzBio platform via the command line.

### Members
- The module includes all public members defined within `quartzbio.cli.tutorial`.
- Inherits from standard Python module structures for CLI integration.
```

--------------------------------

### Get Extended Statistics for a Numerical Field

Source: https://quartzbio.github.io/quartzbio-python/_sources/joining_and_aggregating_datasets.md.txt

Use the 'stats' facet type with 'extended': True to retrieve comprehensive statistical information for a numerical field. This example calculates extended statistics for 'info.ALLELEID'.
```python
from quartzbio import Dataset

query = Dataset.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/Variants-GRCH37').query()

# Get extended statistics for a numerical field.
query.facets(
    **{'info.ALLELEID': {
        'facet_type': 'stats',
        'extended': True}})
```

--------------------------------

### Import File using Command Line

Source: https://quartzbio.github.io/quartzbio-python/_sources/importing_data.md.txt

Import a data file using the quartzbio CLI. The --create-dataset flag will create the dataset if it does not exist. --follow tracks progress.

```bash
# Import a file (create the dataset if necessary):
quartzbio import --create-dataset --follow ~/test-dataset data.vcf.gz

# Import files in upsert mode (create the dataset from a template if necessary):
quartzbio import --create-dataset --template-file template.json --commit-mode=upsert --follow ~/test-dataset data.vcf.gz
```

--------------------------------

### get

Source: https://quartzbio.github.io/quartzbio-python/_sources/expression_functions.md.txt

Get the value at any depth of a nested object.

```APIDOC
## get

### Description
Get the value at any depth of a nested object based on the path described by path.

### Parameters
- **obj** (list|dict) - Required - The object to process
- **path** (str|list) - Required - List or '.' delimited string of path
- **default** (any) - Optional - Default value to return if path doesn't exist
```

--------------------------------

### Retrieve Template and Create Dataset

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_templates.md.txt

Retrieve an existing template by its ID and use its fields to create or get a dataset. This ensures the dataset structure matches the specified template.
```Python
from quartzbio import Dataset, DatasetTemplate

template = DatasetTemplate.retrieve('id of your template')
dataset = Dataset.get_or_create_by_full_path('your dataset path', fields=template.fields)

# Dataset will now have the non-transient fields from the template
# with desired titles/descriptions and expressions
print(dataset.fields())

# But no records
print(dataset.documents_count)
```

--------------------------------

### Translate variant examples

Source: https://quartzbio.github.io/quartzbio-python/expression_functions.html

Various usage examples for the translate_variant function with different optional parameters.

```python
translate_variant("GRCH38-7-117559590-117559593-A")
```

```python
translate_variant("GRCH38-7-117559590-117559593-A", gene_model="ensembl")
```

```python
translate_variant("GRCH38-7-117559590-117559593-A", transcript="NM_000492.3")
```

```python
translate_variant("GRCH38-7-117559590-117559593-A", include_effects=True)
```

--------------------------------

### Create a Dataset in Python

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Initializes a new, empty dataset in a personal vault using the Dataset.get_or_create_by_full_path method.

```python
from quartzbio import Dataset

# Create a new, empty dataset in your personal vault (represented by "~/")
dataset = Dataset.get_or_create_by_full_path('~/my_dataset')
```

--------------------------------

### GET /v2/vaults/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/vaults_and_objects.md.txt

Retrieve metadata for a specific vault.

```APIDOC
## GET https:///v2/vaults/{ID}

### Description
Retrieve a vault's metadata. This request requires an authorized user with "read" permission or higher on the vault.

### Method
GET

### Endpoint
https:///v2/vaults/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the vault.
### Response #### Success Response (200) - **vault** (object) - The Vault resource. ``` -------------------------------- ### Upload Local Folder using Command Line Source: https://quartzbio.github.io/quartzbio-python/_sources/importing_data.md.txt Upload local files and folders to the QuartzBio vault using the quartzbio CLI. Only missing files and folders are uploaded. ```bash # Upload local_folder to the root of the personal vault: quartzbio upload ./local_folder ``` -------------------------------- ### GET /v2/vaults Source: https://quartzbio.github.io/quartzbio-python/_sources/vaults_and_objects.md.txt List all available vaults accessible to the user. ```APIDOC ## GET https:///v2/vaults ### Description List all available vaults. All public vaults are included in this response. If the request is sent by an authenticated user, vaults which the user has "read" permission or higher on are also returned. ### Method GET ### Endpoint https:///v2/vaults ### Response #### Success Response (200) - **vaults** (array) - A list of vaults matching the provided filters. ``` -------------------------------- ### Create Dataset with Fields Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt Demonstrates how to create a new dataset and define its fields using a template during the creation process. This includes setting various properties for each field. ```APIDOC ## POST /datasets ### Description Creates a new dataset with specified fields and properties. ### Method POST ### Endpoint `/datasets` ### Parameters #### Request Body - **fields** (list of objects) - Required - A list of field definitions, where each field is an object with properties like `name`, `description`, `data_type`, etc. - **capacity** (string) - Optional - The capacity setting for the dataset (e.g., 'small'). 

### Request Example
```python
from quartzbio import Dataset, DatasetField

fields = [
    {
        "name": "my_string_field",
        "description": "Just a string",
        "data_type": "string",
        "is_list": False,
        "is_hidden": False,
        "ordering": 0
    },
    {
        "name": "gene_symbol",
        "description": "HUGO gene symbol",
        "data_type": "string",
        "entity_type": "gene"
    }
]

dataset_full_path = '~/python_examples/my_fields_dataset'
dataset = Dataset.get_or_create_by_full_path(
    dataset_full_path,
    fields=fields,
    capacity='small'
)
```

### Response
#### Success Response (200)
- **dataset** (object) - The created dataset object.

#### Response Example (Dataset object details)
```

--------------------------------

### Get Dataset Migration

Source: https://quartzbio.github.io/quartzbio-python/creating_and_migrating_datasets.html

Retrieves metadata about a specific dataset migration.

```APIDOC
## GET /v2/dataset_migrations/{ID}

### Description
Retrieve metadata about a dataset migration.

### Method
GET

### Endpoint
https:///v2/dataset_migrations/{ID}

### Authorization
This request requires an authorized user with permission.

### Response
#### Success Response (200)
- **DatasetMigration** (object) - The response contains a DatasetMigration resource.

### Path Parameters
- **ID** (string) - Required - The ID of the dataset migration to retrieve.
```

--------------------------------

### Create Dataset with Template

Source: https://quartzbio.github.io/quartzbio-python/dataset_templates.html

Instantiate a DatasetTemplate object from the prepared template dictionary and then create a new Dataset using this template. The dataset will be initialized with the non-transient fields defined in the template.
```python
from quartzbio import Dataset, DatasetTemplate

template = DatasetTemplate.create(**template)
dataset = Dataset.get_or_create_by_full_path('your dataset path', fields=template.fields)

# Dataset will now have the non-transient fields from the template
# with desired titles/descriptions and expressions
print(dataset.fields())

# But no records
print(dataset.documents_count)
```

--------------------------------

### launch_ipython_shell

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.ipython.html

Opens the standard QuartzBio shell (IPython wrapper).

```APIDOC
## launch_ipython_shell

### Description
Opens the QuartzBio shell (IPython wrapper).

### Parameters
- **args** (object) - Required - Arguments to pass to the shell initialization.
```

--------------------------------

### GET /beacons/query

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.resource.beacon.html

Executes a query against a Beacon to perform entity-based searches.

```APIDOC
## GET /beacons/query

### Description
Performs a search query against the Beacon resource.

### Method
GET

### Endpoint
/beacons/query

### Parameters
#### Query Parameters
- **query** (string) - Required - The search query string.
- **entity_type** (string) - Optional - The type of entity to filter the search by.
```

--------------------------------

### GET /v2/saved_queries

Source: https://quartzbio.github.io/quartzbio-python/_sources/querying_datasets_and_files.md.txt

Retrieves a list of all Saved Queries available to the user.

```APIDOC
## GET /v2/saved_queries

### Description
Retrieves all Saved Queries available to a user.

### Method
GET

### Endpoint
https:///v2/saved_queries

### Authorization
This request requires an authorized user.

### Response
#### Success Response (200)
- **list of SavedQuery resources** (array) - The response contains a list of SavedQuery resources.
#### Response Example { "example": "List of SavedQuery resources" } ``` -------------------------------- ### Build Documentation Locally Source: https://quartzbio.github.io/quartzbio-python/index.html Build the project documentation locally using Sphinx and Sphinx Autobuild. This command allows you to view the documentation in a browser as you make changes. ```bash make sphinx-autobuild ``` -------------------------------- ### Create a Dataset Template Source: https://quartzbio.github.io/quartzbio-python/dataset_templates.html Assemble the prepared fields into a template dictionary, including essential metadata like name, version, description, and template type. The 'template_type' must be set to 'dataset'. This structure is then used to create a DatasetTemplate object. ```python template = { "name": "My Variant Template", "version": '1.2.0', "description": 'Import a special CSV file. Genome is assumed to be GRCh38, also has variant entity for GRCh37.', "template_type": "dataset", "is_public": False, "entity_params": { 'disable': True }, "fields": fields } ``` -------------------------------- ### GET /v2/saved_queries/{ID} Source: https://quartzbio.github.io/quartzbio-python/_sources/querying_datasets_and_files.md.txt Retrieves a specific Saved Query by its ID. ```APIDOC ## GET /v2/saved_queries/{ID} ### Description Retrieve a Saved Query. ### Method GET ### Endpoint https:///v2/saved_queries/{ID} ### Parameters #### Path Parameters - **ID** (string) - Required - The unique identifier of the Saved Query. ### Authorization This request requires an authorized user with permission. ### Response #### Success Response (200) - **SavedQuery resource** (object) - The response contains a SavedQuery resource. #### Response Example { "example": "SavedQuery resource" } ``` -------------------------------- ### GET /v2/dataset_exports/{ID} Source: https://quartzbio.github.io/quartzbio-python/_sources/exporting_data.md.txt Retrieve metadata about a specific dataset export. 
```APIDOC ## GET /v2/dataset_exports/{ID} ### Description Retrieve metadata about an export. ### Method GET ### Endpoint https:///v2/dataset_exports/{ID} ### Parameters #### Path Parameters - **ID** (string) - Required - The unique identifier of the export. ### Response #### Success Response (200) - **DatasetExport** (object) - The resource containing export metadata. ``` -------------------------------- ### GET /v2/dataset_snapshot_tasks Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt Retrieve a list of all available dataset snapshot tasks. ```APIDOC ## GET /v2/dataset_snapshot_tasks ### Description Retrieve a list of available dataset snapshot tasks. ### Method GET ### Endpoint https:///v2/dataset_snapshot_tasks ### Response #### Success Response (200) - **List** (array) - The response contains a list of DatasetSnapshotTask resources. ``` -------------------------------- ### GET /v2/dataset_restore_tasks Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt Retrieve a list of all available dataset restore tasks. ```APIDOC ## GET /v2/dataset_restore_tasks ### Description Retrieve a list of available dataset restore tasks. ### Method GET ### Endpoint https:///v2/dataset_restore_tasks ``` -------------------------------- ### Download Files and Folders using QuartzBio CLI Source: https://quartzbio.github.io/quartzbio-python/vaults_and_objects.html Command-line interface for downloading files and folders from QuartzBio vaults. Supports recursive downloads, exclusions, and dry runs. ```bash quartzbio download "~/path/to/file.txt" . 
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*/.*"
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*/.DS_store"
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*" --include "*.pdf"
```

```bash
quartzbio download --recursive "~/path/to/folder" local_folder --delete --dry-run
```

```bash
quartzbio download --help
```

--------------------------------

### GET /v2/dataset_imports/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/import_parameters.md.txt

Retrieves details for a specific dataset import.

```APIDOC
## GET /v2/dataset_imports/{ID}

### Description
Retrieves the details of a specific dataset import resource.

### Method
GET

### Endpoint
https:///v2/dataset_imports/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the dataset import.

### Response
#### Success Response (200)
- **Status** - HTTP 200 OK
```

--------------------------------

### Import File with Template Fields

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_templates.md.txt

Retrieve a template and create a dataset. Then, import a file into this dataset, specifying the template's fields as the target structure for the import. This process waits for the import to complete.

```Python
from quartzbio import Dataset, DatasetImport, DatasetTemplate, Object

template = DatasetTemplate.retrieve('id of your template')
dataset = Dataset.get_or_create_by_full_path('your dataset path')

# Only field should be "id"
print(dataset.fields())

file_object = Object.retrieve('id of file uploaded to EDP')
DatasetImport.create(
    dataset_id=dataset.id,
    object_id=file_object.id,
    target_fields=template.fields,
    commit_mode='append',
)

# Wait for import to finish
dataset.activity(follow=True)

# Should now see all the non-transient fields from the template!
print(dataset.fields()) ``` -------------------------------- ### GET /v2/dataset_commits/{ID} Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt Retrieve metadata for a specific dataset commit. ```APIDOC ## GET /v2/dataset_commits/{ID} ### Description Retrieve metadata about a dataset commit. ### Method GET ### Endpoint https:///v2/dataset_commits/{ID} ### Parameters #### Path Parameters - **ID** (string) - Required - The unique identifier of the dataset commit. ``` -------------------------------- ### QueryFile Class Initialization Source: https://quartzbio.github.io/quartzbio-python/quartzbio.query.html Initializes a new QueryFile instance for object content queries. ```APIDOC ## Class: quartzbio.query.QueryFile ### Description A QueryFile API request wrapper that generates a request for an object content query, and can iterate through streaming result sets. ### Parameters - **file_id** (string) - Required - The ID of the file to query. - **fields** (list) - Optional - List of fields to include. - **exclude_fields** (list) - Optional - List of fields to exclude. - **filters** (Filter) - Optional - Filter instance created using F class operators. - **limit** (number) - Optional - Maximum number of results (default: inf). - **page_size** (number) - Optional - Number of results per page (default: 1000). - **output_format** (string) - Optional - Format of the output (default: 'json'). - **header** (boolean) - Optional - Whether to include a header (default: True). ``` -------------------------------- ### GET /v2/objects Source: https://quartzbio.github.io/quartzbio-python/_sources/vaults_and_objects.md.txt Lists all available objects. Requires read permissions to the vaults. ```APIDOC ## GET /v2/objects ### Description List all available objects. The response includes objects which exist inside vaults that the user has "read" permission or higher to. 
### Method GET ### Endpoint https:///v2/objects ### Parameters #### Query Parameters - **id** (integer) - Optional - The ID of an object. - **vault_id** (integer) - Optional - The ID of the vault that will contain the object. - **vault_name** (string) - Optional - The name of the vault containing objects. - **vault_full_path** (text) - Optional - The full path of the vault containing objects. - **parent_object_id** (integer) - Optional - The ID of the existing folder object to place the new object into. To place at "/", set this value to null. - **filename** (string) - Optional - The filename of the object, not including its parent folder. This value cannot contain slashes. - **path** (string) - Optional - The path of the object, including its parent folder. - **object_type** (string) - Optional - The type of the object. Must be one of "file", "folder", or "dataset". - **depth** (integer) - Optional - The depth of the object in the Vault. Objects at the root have depth = 0. - **query** (string) - Optional - A string that matches any objects whose path contains that string. - **regex** (regex) - Optional - A regular expression which searches objects for matching paths (case-insensitive). ### Response #### Success Response (200) - **List of objects** (array) - The response returns a list of objects matching the provided filters. 
#### Response Example [ { "id": 789, "vault_id": 123, "parent_object_id": 456, "filename": "my_document.txt", "object_type": "file", "description": "This is a sample document.", "metadata": { "key1": "value1" }, "tags": ["important", "report"], "storage_class": "standard", "created_at": "2023-10-27T10:00:00Z", "updated_at": "2023-10-27T10:00:00Z" }, { "id": 987, "vault_id": 123, "parent_object_id": null, "filename": "my_folder", "object_type": "folder", "description": null, "metadata": {}, "tags": [], "storage_class": null, "created_at": "2023-10-26T09:00:00Z", "updated_at": "2023-10-26T09:00:00Z" } ] ``` -------------------------------- ### GET /v2/datasets/{DATASET_ID}/exports Source: https://quartzbio.github.io/quartzbio-python/_sources/exporting_data.md.txt List all exports associated with a specific dataset. ```APIDOC ## GET /v2/datasets/{DATASET_ID}/exports ### Description List the exports associated with a dataset. ### Method GET ### Endpoint https:///v2/datasets/{DATASET_ID}/exports ### Parameters #### Path Parameters - **DATASET_ID** (string) - Required - The unique identifier of the dataset. ### Response #### Success Response (200) - **List[DatasetExport]** (array) - A list of DatasetExport resources. ``` -------------------------------- ### launch_ipython_legacy_shell Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.ipython.html Opens the QuartzBio shell (IPython wrapper) for older IPython versions. ```APIDOC ## launch_ipython_legacy_shell ### Description Open the QuartzBio shell (IPython wrapper) for older IPython versions. ### Parameters - **args** (object) - Required - Arguments to pass to the shell initialization. ``` -------------------------------- ### GET /v2/dataset_snapshot_tasks/{SNAPSHOT_TASK_ID} Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt Retrieve metadata about a specific dataset snapshot task. 
```APIDOC
## GET /v2/dataset_snapshot_tasks/{SNAPSHOT_TASK_ID}

### Description
Retrieve metadata about a dataset snapshot task.

### Method
GET

### Endpoint
https:///v2/dataset_snapshot_tasks/{SNAPSHOT_TASK_ID}

### Parameters
#### Path Parameters
- **SNAPSHOT_TASK_ID** (string) - Required - The unique identifier of the snapshot task.

### Response
#### Success Response (200)
- **DatasetSnapshotTask** (object) - The response contains a DatasetSnapshotTask resource.
```

--------------------------------

### Download files and folders using the CLI

Source: https://quartzbio.github.io/quartzbio-python/_sources/vaults_and_objects.md.txt

Command-line interface commands for downloading single files, entire folders recursively, and excluding specific file patterns. Use --dry-run with --delete for safety.

```Shell
# Download a single file
quartzbio download "~/path/to/file.txt" .

# Download a folder
quartzbio download --recursive "~/path/to/folder" local_folder

# Download a folder, but exclude hidden files and folders
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*/.*"

# Download a folder, but exclude DS_store files
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*/.DS_store"

# Download only PDF files within a folder
# --include always supersedes --exclude
quartzbio download --recursive "~/path/to/folder" local_folder --exclude "*" --include "*.pdf"

# The --delete flag will delete local files that do not match
# those found in the vault. Always use the --dry-run mode first
# with this option as it will delete files permanently.
quartzbio download --recursive "~/path/to/folder" local_folder --delete --dry-run

# For full usage:
quartzbio download --help
```

--------------------------------

### GET /v2/dataset_restore_tasks/{RESTORE_TASK_ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

Retrieve metadata about a specific dataset restore task.
```APIDOC
## GET /v2/dataset_restore_tasks/{RESTORE_TASK_ID}

### Description
Retrieve metadata about a dataset restore task.

### Method
GET

### Endpoint
https:///v2/dataset_restore_tasks/{RESTORE_TASK_ID}

### Parameters
#### Path Parameters
- **RESTORE_TASK_ID** (string) - Required - The unique identifier of the restore task.
```

--------------------------------

### launch_ipython_5_shell

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.ipython.html

Opens the QuartzBio shell (IPython wrapper) specifically for IPython 5+ versions.

```APIDOC
## launch_ipython_5_shell

### Description
Open the QuartzBio shell (IPython wrapper) with IPython 5+.

### Parameters
- **args** (object) - Required - Arguments to pass to the shell initialization.
```

--------------------------------

### GET /v2/datasets/{DATASET_ID}/imports

Source: https://quartzbio.github.io/quartzbio-python/_sources/import_parameters.md.txt

Lists all imports associated with a specific dataset.

```APIDOC
## GET /v2/datasets/{DATASET_ID}/imports

### Description
Returns a list of DatasetImport resources associated with the specified dataset.

### Method
GET

### Endpoint
https:///v2/datasets/{DATASET_ID}/imports

### Parameters
#### Path Parameters
- **DATASET_ID** (string) - Required - The unique identifier of the dataset.

#### Query Parameters
- **limit** (integer) - Optional - The number of objects to return per page.
- **offset** (integer) - Optional - The offset within the list of available objects.

### Response
#### Success Response (200)
- **Body** - A list of DatasetImport resources.
```

--------------------------------

### Enable Global Beacon Indexing for Datasets

Source: https://quartzbio.github.io/quartzbio-python/metadata_and_global_beacons.html

Initiate the Global Beacon indexing process for a dataset. The status and progress can be monitored via the returned dictionary.
```python
from quartzbio import Dataset

# Getting the dataset
dataset = Dataset.get_by_full_path('~/beacon-test-dataset')

# Enabling Global Beacon on dataset
dataset.enable_global_beacon()
# Example Output: {'id': 125, 'datastore_id': 6, 'dataset_id': 1658666726768179211, 'status': 'indexing', 'progress_percent': 0, 'is_deleted': False}
```

--------------------------------

### Import Data from Manifest into Dataset

Source: https://quartzbio.github.io/quartzbio-python/_sources/importing_data.md.txt

Launch an import using a manifest into a new or existing dataset. Progress tracking is available through the API or web interface.

```python
from quartzbio import Dataset, DatasetImport

dataset = Dataset.get_or_create_by_full_path('~/python_examples/manifest_dataset')

# `manifest` is a quartzbio Manifest to which URLs have already been added

# Launch the import
imp = DatasetImport.create(
    dataset_id=dataset.id,
    manifest=manifest.manifest,
    commit_mode='append'
)

# Follow the import status
dataset.activity(follow=True)
```

--------------------------------

### GET /v2/datasets/{DATASET_ID}/commits

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

List all commits associated with a specific dataset.

```APIDOC
## GET /v2/datasets/{DATASET_ID}/commits

### Description
Retrieve a list of dataset commits associated with a dataset.

### Method
GET

### Endpoint
https:///v2/datasets/{DATASET_ID}/commits

### Parameters
#### Path Parameters
- **DATASET_ID** (string) - Required - The unique identifier of the dataset.
```

--------------------------------

### Create Dataset Import with Manifest

Source: https://quartzbio.github.io/quartzbio-python/importing_data.html

Use this to create a dataset import by providing a manifest, which specifies remote files to import. Ensure the manifest includes a valid URL and optionally other file details.
```json
{
  "files": [{
    "url": "https://example.com/file.json.gz",
    "name": "file.json.gz",
    "format": "json",
    "size": 100,
    "md5": "",
    "base64_md5": ""
  }]
}
```

--------------------------------

### Get Object Copy Task

Source: https://quartzbio.github.io/quartzbio-python/vaults_and_objects.html

Retrieves metadata for a specific object copy task.

```APIDOC
## GET /v2/object_copy_tasks/{ID}

### Description
Retrieve metadata about an object copy task.

### Method
GET

### Endpoint
https:///v2/object_copy_tasks/{ID}

### Parameters
#### Path Parameters
- **ID** (integer) - Required - The ID of the object copy task to retrieve.

### Authorization
This request requires that the authorized user is also the user who created the object copy task being retrieved.

### Response
#### Success Response (200)
- **task** (object) - The response contains an object copy task resource.
```

--------------------------------

### POST /api/datasets/create

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.main.html

Creates a new QuartzBio dataset with specified configurations.

```APIDOC
## POST /api/datasets/create

### Description
This endpoint is used to create a new QuartzBio dataset. It supports options for creating the underlying vault if it does not exist, specifying a template, setting dataset capacity, adding tags, providing metadata, and performing a dry run.

### Method
POST

### Endpoint
/api/datasets/create

### Parameters
#### Query Parameters
- **--create-vault** (boolean) - Optional - Create the vault if it doesn't exist
- **--template-id** (string) - Optional - The template ID used when creating a new dataset
- **--template-file** (string) - Optional - A local template file to be used when creating a new dataset
- **--capacity** (string) - Optional - Specifies the capacity of the dataset: small (default, <100M records), medium (<500M), large (>=500M)
- **--tag** (string) - Optional - A tag to be added. Tags are case-insensitive strings.
Example tags: --tag GRCh38 --tag Tissue --tag "Foundation Medicine"
- **--metadata** (KEY=VALUE) - Optional - Dataset metadata in the format KEY=VALUE
- **--metadata-json-file** (string) - Optional - Metadata key value pairs in JSON format
- **--dry-run** (boolean) - Optional - Dry run mode will not create the dataset
- **full_path** (string) - Required - The full path to the dataset in the format: "domain:vault:/path/dataset". Defaults to your personal vault if no vault is provided. Defaults to the vault root if no path is provided.

### Request Example
```json
{
  "full_path": "my-domain:my-vault:/data/new-dataset",
  "--template-id": "template-123",
  "--capacity": "large",
  "--tag": ["GRCh38", "WGS"],
  "--metadata": ["Project=ProjectX", "SampleID=Sample123"],
  "--dry-run": true
}
```

### Response
#### Success Response (200)
- **message** (string) - Confirmation message of dataset creation or dry run status.

#### Response Example
```json
{
  "message": "Dataset 'new-dataset' creation dry run successful."
}
```
```

--------------------------------

### Get Current User

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.main.html

Displays the email address of the currently authenticated QuartzBio user.

```APIDOC
## GET /whoami

### Description
Show your QuartzBio email address.

### Method
GET
```

--------------------------------

### GET /v2/dataset_commits/{ID}/rollback

Source: https://quartzbio.github.io/quartzbio-python/_sources/dataset_versioning.md.txt

Check if a commit can be reverted and identify blocking commits.

```APIDOC
## GET /v2/dataset_commits/{ID}/rollback

### Description
Returns whether a commit can be reverted, along with the reason and any commits that are blocking the rollback.

### Method
GET

### Endpoint
https:///v2/dataset_commits/{ID}/rollback

### Parameters
#### Path Parameters
- **ID** (string) - Required - The unique identifier of the dataset commit.
```

--------------------------------

### Apply Field Filters

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.query.html

Example of applying multiple field-based filters to a dataset query.

```python
dataset.query(gene__in=['BRCA', 'GATA3'], chr='3', start__gt=10000, end__lte=20000)
```

--------------------------------

### Dataset and Object Query Filtering

Source: https://quartzbio.github.io/quartzbio-python/filters.html

Demonstrates how to initialize a query on a dataset or object and apply basic filters.

```APIDOC
## Dataset.query().filter()

### Description
Applies a filter to a dataset or object query based on field values.

### Parameters
#### Query Parameters
- **field_name__action** (any) - Required - The field name, optionally suffixed with a double underscore and a filter action (e.g., clinical_significance='pathogenic' or start__gt=10000).

### Request Example
```python
from quartzbio import Dataset

dataset = Dataset.get_by_full_path('quartzbio:Public:/ClinVar/5.2.0-20210110/Variants-GRCH37')
dataset.query().filter(clinical_significance='pathogenic')
```
```

--------------------------------

### GET /v2/dataset_migrations/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Retrieves metadata for a specific dataset migration. Requires general permissions.

```APIDOC
## GET /v2/dataset_migrations/{ID}

### Description
Retrieve metadata about a dataset migration.

### Method
GET

### Endpoint
https:///v2/dataset_migrations/{ID}

### Parameters
#### Path Parameters
- **ID** (integer) - Required - The ID of the dataset migration to retrieve.

### Response
#### Success Response (200)
- **DatasetMigration resource** (object) - The response contains a DatasetMigration resource.
#### Response Example
```json
{
  "id": 789,
  "commit_mode": "commit",
  "include_errors": true,
  "source_id": 123,
  "source_params": {
    "limit": 100,
    "fields": "name,email"
  },
  "target_fields": {
    "name": "{{source.name}}",
    "email": "{{source.email}}"
  },
  "target_id": 456,
  "priority": 1,
  "status": "completed"
}
```
```

--------------------------------

### Get Task Queue

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.cli.data.html

Retrieves all running and queued tasks for an account, grouped by user and status.

```APIDOC
## GET /api/queue

### Description
Retrieves and displays running and queued tasks, grouped by user and status.

### Method
GET

### Endpoint
/api/queue

### Parameters
#### Query Parameters
- **statuses** (list) - Optional - A list of statuses to filter tasks by (default: ['running', 'queued']).
```

--------------------------------

### Launch Dataset Import from Manifest

Source: https://quartzbio.github.io/quartzbio-python/importing_data.html

Launches an import task using a manifest to add data from remote URLs to a specified dataset. The commit_mode can be set to 'append' or 'overwrite'.

```python
from quartzbio import Dataset, DatasetImport

dataset = Dataset.get_or_create_by_full_path('~/python_examples/manifest_dataset')

# `manifest` is a quartzbio Manifest to which URLs have already been added

# Launch the import
imp = DatasetImport.create(
    dataset_id=dataset.id,
    manifest=manifest.manifest,
    commit_mode='append'
)
```

--------------------------------

### Authenticate with QuartzBio API

Source: https://quartzbio.github.io/quartzbio-python/quartzbio.html

Initializes a session with the QuartzBio EDP API using a specified host URL.

```python
import quartzbio

quartzbio.login(
    api_host="https://quartzbio.api.az.aws.quartz.bio",
)
```

--------------------------------

### GET /v2/dataset_fields/{ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Retrieve a specific dataset field by its ID. Requires read permission on the dataset.
```APIDOC
## GET /v2/dataset_fields/{ID}

### Description
Retrieve a dataset field.

### Method
GET

### Endpoint
https:///v2/dataset_fields/{ID}

### Parameters
#### Path Parameters
- **ID** (string) - Required - The ID of the dataset field to retrieve.

### Response
#### Success Response (200)
- **DatasetField** (object) - The retrieved DatasetField resource.
```

--------------------------------

### quartzbio.cli.ipython Module Overview

Source: https://quartzbio.github.io/quartzbio-python/_sources/quartzbio.cli.ipython.rst.txt

This module provides the IPython CLI interface for QuartzBio. It includes members and inheritance details for interactive shell usage.

```APIDOC
## Module: quartzbio.cli.ipython

### Description
This module provides the IPython CLI interface for the QuartzBio Python SDK. It is designed to facilitate interactive data analysis and workflow management within an IPython environment.

### Members
- The module exposes all public members defined within `quartzbio.cli.ipython`.

### Inheritance
- The module includes inheritance information for classes defined within the scope of the IPython CLI integration.
```

--------------------------------

### GET /v2/datasets/{DATASET_ID}

Source: https://quartzbio.github.io/quartzbio-python/_sources/creating_and_migrating_datasets.md.txt

Retrieves metadata for a specific dataset. Requires authorization to view the target dataset.

```APIDOC
## GET /v2/datasets/{DATASET_ID}

### Description
Retrieve metadata about a dataset. This request requires an authorized user with permission to view the target dataset.

### Method
GET

### Endpoint
https:///v2/datasets/{DATASET_ID}

### Parameters
#### Path Parameters
- **DATASET_ID** (string) - Required - The ID of the dataset to retrieve.

### Response
#### Success Response (200)
- **Dataset resource** - The response contains a Dataset resource.
#### Response Example
{
  "id": 123,
  "name": "MyDataset",
  "vault_id": 1,
  "vault_parent_object_id": 0,
  "fields": [],
  "metadata": {"key": "value"},
  "tags": ["tag1"],
  "capacity": "medium",
  "storage_class": "standard"
}
```

--------------------------------

### Log out and Log in with Credentials

Source: https://quartzbio.github.io/quartzbio-python/_sources/authentication.md.txt

Clear existing credentials and log in from the command line using a personal access token and API host. Replace TOKEN and DOMAIN with your actual values. The API host should not have a trailing slash.

```bash
# Clear your existing credentials
quartzbio logout

# Replace "TOKEN" with the Personal Access Token copied from the EDP web page
# Replace "DOMAIN" with your account's subdomain (i.e. your company name)
quartzbio login --access-token TOKEN --api-host https://DOMAIN.api.edp.aws.quartz.bio
```

--------------------------------

### Create Manifest for URL Import

Source: https://quartzbio.github.io/quartzbio-python/_sources/importing_data.md.txt

Create a manifest object and add URLs to it. This manifest will be used to import data from remote servers.

```python
from quartzbio import Manifest

source_url = "https://s3.amazonaws.com/downloads.quartzbio.com/demo/interesting-variants.json.gz"

manifest = Manifest()
manifest.add_url(source_url)
```
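To make the manifest structure concrete, the sketch below builds the same `{"files": [...]}` JSON shape shown in the manifest example, using only the Python standard library. It does not call the QuartzBio client: `manifest_entry` and `build_manifest` are hypothetical helpers introduced here for illustration, and the rule for deriving `format` from the file extension is an assumption, not documented behavior of `Manifest.add_url`. Optional fields such as `size` and `md5` are omitted.

```python
import json
import posixpath
from urllib.parse import urlparse

# Compression suffixes stripped before reading the format extension (assumed list).
COMPRESSION_SUFFIXES = (".gz", ".bz2", ".zip")


def manifest_entry(url):
    """Build one manifest "files" entry from a URL (hypothetical helper)."""
    name = posixpath.basename(urlparse(url).path)
    stem = name
    for suffix in COMPRESSION_SUFFIXES:
        if stem.endswith(suffix):
            stem = stem[: -len(suffix)]
            break
    # Assumption: the format is the remaining extension, without the dot.
    fmt = posixpath.splitext(stem)[1].lstrip(".")
    return {"url": url, "name": name, "format": fmt}


def build_manifest(urls):
    """Wrap the entries in the top-level "files" list used by the manifest JSON."""
    return {"files": [manifest_entry(u) for u in urls]}


manifest_dict = build_manifest([
    "https://s3.amazonaws.com/downloads.quartzbio.com/demo/interesting-variants.json.gz"
])
print(json.dumps(manifest_dict, indent=2))
```

In practice the client's `Manifest` object is the supported way to build this structure; the sketch only shows which keys an entry carries and how a file name relates to its URL.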