### Run BigQuery Quickstart Sample

Source: https://github.com/googleapis/python-bigquery/blob/main/samples/snippets/README.rst

Execute the BigQuery quickstart sample. This requires the sample code to be present and dependencies installed.

```bash
python quickstart.py
```

--------------------------------

### Install BigQuery Client in Virtual Environment

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/README.rst

Commands to set up a virtual environment and install the library. The first block applies to Mac/Linux and the second to Windows; replace `<your-env>` with the directory name of your environment.

```console
pip install virtualenv
virtualenv <your-env>
source <your-env>/bin/activate
<your-env>/bin/pip install google-cloud-bigquery
```

```console
pip install virtualenv
virtualenv <your-env>
<your-env>\Scripts\activate
<your-env>\Scripts\pip.exe install google-cloud-bigquery
```

--------------------------------

### Install google-cloud-bigquery without bqstorage extra

Source: https://github.com/googleapis/python-bigquery/blob/main/UPGRADING.md

The 'bqstorage' extra is now a no-op and should be omitted during installation. The first command shows the old form; the second installs the library directly.

```bash
pip install google-cloud-bigquery[bqstorage]
```

```bash
pip install google-cloud-bigquery
```

--------------------------------

### Install Sample Dependencies

Source: https://github.com/googleapis/python-bigquery/blob/main/scripts/readme-gen/templates/install_deps.tmpl.rst

Install the Python dependencies required to run the samples using pip. Ensure you have a requirements.txt file in your current directory.

```bash
pip install -r requirements.txt
```

--------------------------------

### Install BigQuery Library

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/UPGRADING.md

Updated installation commands for the BigQuery library following the removal of specific extras.
```bash $ pip install google-cloud-bigquery[bqstorage] ``` ```bash $ pip install google-cloud-bigquery ``` ```bash $ pip install google-cloud-bigquery[bignumeric_type] ``` -------------------------------- ### Install pandas for BigQuery Source: https://github.com/googleapis/python-bigquery/blob/main/docs/usage/pandas.md Install the pandas library to enable DataFrame conversions. Alternatively, install the BigQuery client with pandas support. ```bash pip install --upgrade pandas ``` ```bash pip install --upgrade 'google-cloud-bigquery[pandas]' ``` -------------------------------- ### Install PyAudio with PortAudio flags Source: https://github.com/googleapis/python-bigquery/blob/main/scripts/readme-gen/templates/install_portaudio.tmpl.rst If 'pip install' fails to find 'portaudio.h', use these flags to specify include and library paths. This is useful when PortAudio is installed in a non-standard location. ```bash pip install --global-option='build_ext' --global-option='-I/usr/local/include' --global-option='-L/usr/local/lib' pyaudio ``` -------------------------------- ### Install PortAudio on Debian/Ubuntu Linux Source: https://github.com/googleapis/python-bigquery/blob/main/scripts/readme-gen/templates/install_portaudio.tmpl.rst Use apt-get to install the PortAudio development package on Debian or Ubuntu systems. This ensures PyAudio can be built correctly. ```bash apt-get install portaudio19-dev python-all-dev ``` -------------------------------- ### Install pandas and pyarrow for BigQuery DataFrame loading Source: https://github.com/googleapis/python-bigquery/blob/main/docs/usage/pandas.md Install the BigQuery client library with pandas and pyarrow support to load DataFrames into BigQuery tables. 
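To see whether pandas and pyarrow are already importable before running the install command for this section, a standard-library check along these lines may help. This is a minimal sketch and not part of google-cloud-bigquery; the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


# pandas and pyarrow are the extras named in this section's install command.
missing = missing_modules(["pandas", "pyarrow"])
if missing:
    print("Missing optional dependencies:", ", ".join(missing))
else:
    print("pandas and pyarrow are available.")
```

The check only inspects top-level module names, so it does not confirm that the installed versions are compatible with the client library.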
```bash pip install --upgrade google-cloud-bigquery[pandas,pyarrow] ``` -------------------------------- ### DB-API Unnamed Parameters Example Source: https://github.com/googleapis/python-bigquery/blob/main/docs/dbapi.md This example demonstrates the qmark parameter style for unnamed/positional parameters in BigQuery DB-API queries. ```sql insert into people (name, income) values (?, ?) ``` -------------------------------- ### Run BigQuery Simple Application Sample Source: https://github.com/googleapis/python-bigquery/blob/main/samples/snippets/README.rst Execute the BigQuery simple application sample. This requires the sample code to be present and dependencies installed. ```bash python simple_app.py ``` -------------------------------- ### Get Help for BigQuery Sample Source: https://github.com/googleapis/python-bigquery/blob/main/scripts/readme-gen/templates/README.tmpl.rst This command displays the help message for a BigQuery Python sample, showing available arguments and options. ```bash {{get_help(sample.file)|indent}} ``` -------------------------------- ### DB-API Named Parameters Example Source: https://github.com/googleapis/python-bigquery/blob/main/docs/dbapi.md This example shows the pyformat parameter style for named parameters in BigQuery DB-API queries. ```sql insert into people (name, income) values (%(name)s, %(income)s) ``` -------------------------------- ### Install PortAudio on Mac OS X Source: https://github.com/googleapis/python-bigquery/blob/main/scripts/readme-gen/templates/install_portaudio.tmpl.rst Use Homebrew to install PortAudio on Mac OS X. This is a prerequisite for PyAudio. ```bash brew install portaudio ``` -------------------------------- ### GET /partitions Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Lists the partitions in a specific BigQuery table. ```APIDOC ## GET /partitions ### Description List the partitions in a table. 
### Parameters
#### Query Parameters
- **table** (Union[Table, TableReference, TableListItem, str]) - Required - The table or reference from which to get partition info.
- **retry** (Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.

### Response
#### Success Response (200)
- **partition_ids** (List[str]) - A list of the partition IDs present in the partitioned table.
```

--------------------------------

### load_table_from_uri

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Starts a job for loading data into a table from Cloud Storage.

```APIDOC
## load_table_from_uri

### Description
Starts a job for loading data into a table from Cloud Storage.

### Parameters
- **source_uris** (Union[str, Sequence[str]]) - Required - URIs of data files to be loaded, in the format `gs://<bucket_name>/<object_name_or_glob>`.
- **destination** (Union[Table, TableReference, TableListItem, str]) - Required - Table into which data is to be loaded.
- **job_id** (str) - Optional - Name of the job.
- **job_id_prefix** (str) - Optional - The user-provided prefix for a randomly generated job ID.
- **location** (str) - Optional - Location where to run the job.
- **project** (str) - Optional - ID of the project in which to run the job.
- **job_config** (LoadJobConfig) - Optional - Extra configuration options for the job.
- **retry** (Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.

### Response
- **Return Type** (LoadJob) - A new load job.
```

--------------------------------

### Clone Python Docs Samples Repository

Source: https://github.com/googleapis/python-bigquery/blob/main/scripts/readme-gen/templates/install_deps.tmpl.rst

Clone the repository containing Python documentation samples. This is the first step to access the code examples.
```bash
$ git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
```

--------------------------------

### Configure OpenTelemetry Tracing

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/README.rst

Setup code to initialize a TracerProvider and export trace data to Google Cloud Trace.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter

# Create one provider, attach the Cloud Trace exporter, then register it globally.
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(
    BatchSpanProcessor(CloudTraceSpanExporter())
)
trace.set_tracer_provider(tracer_provider)
```

--------------------------------

### GET /projects

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Lists projects associated with the current client.

```APIDOC
## GET /projects

### Description
List the projects accessible to this client.

### Parameters
#### Query Parameters
- **max_results** (int) - Optional - Maximum number of projects to return.
- **page_token** (str) - Optional - Token representing a cursor into the projects.
- **retry** (Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.
- **page_size** (int) - Optional - Maximum number of projects to return in each page.

### Response
#### Success Response (200)
- **projects** (Iterator) - Iterator of Project accessible to the current client.
```

--------------------------------

### Install OpenTelemetry Dependencies

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/README.rst

Required packages for enabling OpenTelemetry tracing in the BigQuery client.
```console pip install google-cloud-bigquery[opentelemetry] opentelemetry-exporter-gcp-trace ``` -------------------------------- ### Clone Python Docs Samples Repository Source: https://github.com/googleapis/python-bigquery/blob/main/samples/snippets/README.rst Clone the python-docs-samples repository to access BigQuery samples. Ensure you have Git installed. ```bash git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git ``` -------------------------------- ### Get Table Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Fetches a BigQuery table using its reference. ```APIDOC ## GET /tables/{tableId} ### Description Fetch the table referenced by `table`. ### Method GET ### Endpoint /tables/{tableId} ### Parameters #### Path Parameters - **table** (Table | TableReference | TableListItem | str) - Required - A reference to the table to fetch from the BigQuery API. If a string is passed in, this method attempts to create a table reference from a string using `google.cloud.bigquery.table.TableReference.from_string()`. #### Query Parameters - **retry** (google.api_core.retry.Retry) - Optional - How to retry the RPC. - **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport before using `retry`. ### Returns #### Success Response (200) - **Table** (google.cloud.bigquery.table.Table) - A `Table` instance. ### Response Example ```json { "example": "{\"table\": \"your_table_object\"}" } ``` ``` -------------------------------- ### String Startswith Check Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Method to check if a string starts with a specified prefix. ```APIDOC ## startswith(prefix) ### Description Return `True` if the string `S` starts with the specified `prefix`, `False` otherwise. With optional `start`, test `S` beginning at that position. With optional `end`, stop comparing `S` at that position. 
`prefix` can also be a tuple of strings to try. ### Method `startswith` ### Parameters #### Path Parameters - **prefix** (str or tuple) - Required - The prefix string or tuple of strings to check for. ### Request Example ```python 'hello world'.startswith('hello') 'hello world'.startswith(('hi', 'hello')) ``` ### Response #### Success Response (200) - **result** (bool) - `True` if the string starts with the prefix, `False` otherwise. #### Response Example ```json { "example": true } ``` ``` -------------------------------- ### GET /models Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Lists models within a specified BigQuery dataset. ```APIDOC ## GET /models ### Description List models in the dataset. ### Parameters #### Query Parameters - **dataset** (Union[Dataset, DatasetReference, DatasetListItem, str]) - Required - A reference to the dataset whose models to list. - **max_results** (int) - Optional - Maximum number of models to return. - **page_token** (str) - Optional - Token representing a cursor into the models. - **retry** (Retry) - Optional - How to retry the RPC. - **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport. - **page_size** (int) - Optional - Maximum number of models to return per page. ``` -------------------------------- ### Get Job Result Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Starts the job, waits for it to complete, and retrieves the results. Supports pagination and retry mechanisms for job execution. ```APIDOC ## POST /jobs/{jobId}/result ### Description Start the job and wait for it to complete and get the result. ### Method POST ### Endpoint /jobs/{jobId}/result ### Parameters #### Path Parameters - **jobId** (string) - Required - The ID of the job to retrieve results for. #### Query Parameters - **page_size** (int) - Optional - The maximum number of rows in each page of results from this request. 
Non-positive values are ignored. - **max_results** (int) - Optional - The maximum total number of rows from this request. - **retry** (google.api_core.retry.Retry) - Optional - How to retry the call that retrieves rows. This only applies to making RPC calls. It isn’t used to retry failed jobs. This has a reasonable default that should only be overridden with care. If the job state is `DONE`, retrying is aborted early even if the results are not available, as this will not change anymore. - **timeout** (Union[float, google.api_core.future.polling.PollingFuture._DEFAULT_VALUE]) - Optional - The number of seconds to wait for the underlying HTTP transport before using `retry`. If `None`, wait indefinitely unless an error is returned. If unset, only the underlying API calls have their default timeouts, but we still wait indefinitely for the job to finish. - **start_index** (int) - Optional - The zero-based index of the starting row to read. - **job_retry** (google.api_core.retry.Retry) - Optional - How to retry failed jobs. The default retries rate-limit-exceeded errors. Passing `None` disables job retry. Not all jobs can be retried. If `job_id` was provided to the query that created this job, then the job returned by the query will not be retryable, and an exception will be raised if non-`None` non-default `job_retry` is also provided. ### Request Example ```json { "page_size": 100, "max_results": 1000, "retry": "", "timeout": 300.0, "start_index": 0, "job_retry": "" } ``` ### Response #### Success Response (200) - **RowIterator** (google.cloud.bigquery.table.RowIterator) - Iterator of row data (`Row`-s). During each page, the iterator will have the `total_rows` attribute set, which counts the total number of rows **in the result set** (this is distinct from the total number of rows in the current page: `iterator.page.num_items`). If the query is a special query that produces no results, e.g. a DDL query, an `_EmptyRowIterator` instance is returned. 
#### Errors
- **google.api_core.exceptions.GoogleAPICallError**: If the job failed and retries aren’t successful.
- **concurrent.futures.TimeoutError**: If the job did not complete in the given timeout.
- **TypeError**: If a non-`None`, non-default `job_retry` is provided and the job is not retryable.
```

--------------------------------

### Run BigQuery User Credentials Sample

Source: https://github.com/googleapis/python-bigquery/blob/main/samples/snippets/README.rst

Execute the BigQuery user credentials sample. This sample requires a project ID and optionally a flag to launch a browser for authentication.

```bash
python user_credentials.py
```

--------------------------------

### GET /jobs/get

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Checks for the existence of a BigQuery job via a GET request.

```APIDOC
## GET /jobs/get

### Description
Tests for the existence of the job via a GET request.

### Method
GET

### Endpoint
/jobs/get

### Parameters
#### Query Parameters
- **client** (google.cloud.bigquery.client.Client) - Optional - The client to use.
- **retry** (google.api_core.retry.Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.

### Response
#### Success Response (200)
- **exists** (bool) - Boolean indicating existence of the job.
```

--------------------------------

### GET /jobs/get (exists)

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Checks for the existence of a BigQuery job using a GET request.

```APIDOC
## GET /jobs/get

### Description
Tests for the existence of a job via a GET request.

### Method
GET

### Endpoint
/jobs/get

### Parameters
#### Query Parameters
- **client** (google.cloud.bigquery.client.Client) - Optional - The client to use.
If not passed, falls back to the client stored on the current dataset. - **retry** (google.api_core.retry.Retry) - Optional - How to retry the RPC. - **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport before using retry. ### Response #### Success Response (200) - **exists** (bool) - Boolean indicating existence of the job. ``` -------------------------------- ### Run BigQuery Sample Source: https://github.com/googleapis/python-bigquery/blob/main/scripts/readme-gen/templates/README.tmpl.rst Use this command to execute a BigQuery Python sample. Ensure you have the necessary API enabled and role permissions. ```bash $ python {{sample.file}} ``` -------------------------------- ### Initialize a BigQuery Client Source: https://github.com/googleapis/python-bigquery/blob/main/docs/usage/client.md Instantiate a client by providing a specific project ID to override environment-inferred defaults. ```python from google.cloud import bigquery client = bigquery.Client(project='PROJECT_ID') ``` -------------------------------- ### GET /jobs/get Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Refreshes the properties of a BigQuery job by performing a GET request to the API. ```APIDOC ## GET /jobs/get ### Description Refresh job properties via a GET request to the BigQuery API. ### Method GET ### Endpoint /jobs/get ### Parameters #### Query Parameters - **client** (google.cloud.bigquery.client.Client) - Optional - The client to use. If not passed, falls back to the client stored on the current dataset. - **retry** (google.api_core.retry.Retry) - Optional - How to retry the RPC. - **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport before using retry. 
```

--------------------------------

### Create and Activate Virtual Environment

Source: https://github.com/googleapis/python-bigquery/blob/main/scripts/readme-gen/templates/install_deps.tmpl.rst

Create a Python virtual environment named 'env' and activate it. Samples are compatible with Python 3.9+.

```bash
virtualenv env
source env/bin/activate
```

--------------------------------

### Create Partitioned Tables

Source: https://context7.com/googleapis/python-bigquery/llms.txt

Configure time-partitioning and clustering for tables to optimize query performance.

```python
from google.cloud import bigquery

client = bigquery.Client()

schema = [
    bigquery.SchemaField("name", "STRING"),
    bigquery.SchemaField("value", "FLOAT64"),
    bigquery.SchemaField("event_date", "DATE"),
    bigquery.SchemaField("event_timestamp", "TIMESTAMP"),
]

table_id = f"{client.project}.my_dataset.events"
table = bigquery.Table(table_id, schema=schema)

# Partition by a DATE or TIMESTAMP column
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="event_date",  # Column to partition on
    expiration_ms=1000 * 60 * 60 * 24 * 90,  # 90 days retention
)

# Optionally add clustering
table.clustering_fields = ["name"]

# Require partition filter in queries
table.require_partition_filter = True

table = client.create_table(table)
print(f"Created partitioned table on column {table.time_partitioning.field}")
```

--------------------------------

### Install google-cloud-bigquery without bignumeric_type extra

Source: https://github.com/googleapis/python-bigquery/blob/main/UPGRADING.md

The 'bignumeric_type' extra has been removed as BIGNUMERIC type is now automatically supported. Install the library directly.
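If the package is pinned in a requirements file, the same cleanup can be applied to each requirement string. The sketch below is illustrative only: `drop_obsolete_extras` and `OBSOLETE_EXTRAS` are hypothetical names, and the simple pattern assumes plain `name[extras]specifier` strings rather than the full requirement grammar.

```python
import re

# Extras that the upgrade notes in this document describe as no longer needed.
OBSOLETE_EXTRAS = {"bqstorage", "bignumeric_type"}

# Matches "name[extra1,extra2]rest"; anything without brackets is left alone.
_REQUIREMENT = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)\[(?P<extras>[^\]]*)\](?P<rest>.*)$"
)


def drop_obsolete_extras(requirement):
    """Remove obsolete extras from a google-cloud-bigquery requirement string."""
    text = requirement.strip()
    match = _REQUIREMENT.match(text)
    if not match or match.group("name").lower() != "google-cloud-bigquery":
        return text
    kept = [
        extra.strip()
        for extra in match.group("extras").split(",")
        if extra.strip() and extra.strip() not in OBSOLETE_EXTRAS
    ]
    extras = "[" + ",".join(kept) + "]" if kept else ""
    return match.group("name") + extras + match.group("rest")


print(drop_obsolete_extras("google-cloud-bigquery[bignumeric_type]"))
```

Extras that are still meaningful, such as `pandas`, are kept, and requirements for other packages pass through unchanged.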
```bash pip install google-cloud-bigquery[bignumeric_type] ``` ```bash pip install google-cloud-bigquery ``` -------------------------------- ### Client Initialization Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Initializes the BigQuery Client to bundle configuration needed for API requests. ```APIDOC ## Client Initialization ### Description Initializes the Client object to manage connections to the BigQuery API and bundle configuration for requests. ### Parameters #### Request Body - **project** (str) - Optional - Project ID for the project which the client acts on behalf of. - **credentials** (google.auth.credentials.Credentials) - Optional - The OAuth2 Credentials to use for this client. - **location** (str) - Optional - Default location for jobs, datasets, and tables. - **default_query_job_config** (QueryJobConfig) - Optional - Default configuration for query jobs. - **default_load_job_config** (LoadJobConfig) - Optional - Default configuration for load jobs. - **client_info** (ClientInfo) - Optional - Client info used to send a user-agent string. - **client_options** (ClientOptions/Dict) - Optional - Client options used to set user options. - **default_job_creation_mode** (str) - Optional - Sets the default job creation mode. ``` -------------------------------- ### Initialize BigQuery Client Source: https://context7.com/googleapis/python-bigquery/llms.txt The Client class is the primary entry point for BigQuery operations, handling authentication and connection management. 
```python
from google.cloud import bigquery

# Basic initialization (uses default credentials from environment)
client = bigquery.Client()

# Initialize with specific project
client = bigquery.Client(project="my-project-id")

# Initialize with custom configuration
client = bigquery.Client(
    project="my-project-id",
    location="US",  # Default location for jobs/datasets
    default_query_job_config=bigquery.QueryJobConfig(
        use_legacy_sql=False,
        maximum_bytes_billed=10 * 1024 * 1024 * 1024,  # 10 GB limit
    ),
)

# Close client when done (optional, connections auto-reconnect)
client.close()
```

--------------------------------

### GET /jobs/get (reload)

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Refreshes the properties of an existing BigQuery job.

```APIDOC
## GET /jobs/get

### Description
Refresh job properties via a GET request.

### Method
GET

### Endpoint
/jobs/get

### Parameters
#### Query Parameters
- **client** (google.cloud.bigquery.client.Client) - Optional - The client to use. If not passed, falls back to the client stored on the current dataset.
- **retry** (google.api_core.retry.Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport before using retry.
```

--------------------------------

### GET /tables/list

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Lists all tables within a specified BigQuery dataset.

```APIDOC
## GET /tables/list

### Description
List tables in the specified dataset.

### Method
GET

### Parameters
#### Query Parameters
- **dataset** (Union[Dataset, DatasetReference, DatasetListItem, str]) - Required - A reference to the dataset whose tables to list.
- **max_results** (int) - Optional - Maximum number of tables to return.
- **page_token** (str) - Optional - Token representing a cursor into the tables.
- **page_size** (int) - Optional - Maximum number of tables to return per page.
### Response
#### Success Response (200)
- **Iterator** (google.api_core.page_iterator.Iterator) - An iterator of TableListItem objects.
```

--------------------------------

### Run benchmark with all flags

Source: https://github.com/googleapis/python-bigquery/blob/main/benchmark/README.md

Executes the benchmark script with various configuration flags including reruns, project ID, table destination, and custom tags.

```bash
python benchmark.py \
  --reruns 5 \
  --projectid test_project_id \
  --table logging_project_id.querybenchmarks.measurements \
  --create_table \
  --tag source:myhostname \
  --tag somekeywithnovalue \
  --tag experiment:special_environment_thing
```

--------------------------------

### Initialize TimePartitioning

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Instantiate a TimePartitioning object to configure time-based partitioning for a BigQuery table. Supports DAY, HOUR, MONTH, and YEAR partitioning types, and optionally a specific field for partitioning.

```python
from google.cloud.bigquery import TimePartitioning

# `table` is assumed to be an existing google.cloud.bigquery.Table instance.
time_partitioning = TimePartitioning()
table.time_partitioning = time_partitioning
table.time_partitioning.field = 'timecolumn'
```

--------------------------------

### GET list_rows

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Retrieves an iterator of row data from a specified BigQuery table.

```APIDOC
## GET list_rows

### Description
Retrieves an iterator of row data from a BigQuery table. If the provided table object lacks a schema and selected_fields are not supplied, the method automatically fetches the table schema.

### Parameters
#### Query Parameters
- **table** (Union[Table, TableListItem, TableReference, str]) - Required - The table to list, or a reference to it.
- **selected_fields** (Sequence[SchemaField]) - Optional - The fields to return. If not supplied, all columns are downloaded.
- **max_results** (int) - Optional - Maximum number of rows to return.
- **page_token** (str) - Optional - Token representing a cursor into the table’s rows. - **start_index** (int) - Optional - The zero-based index of the starting row to read. - **page_size** (int) - Optional - The maximum number of rows in each page of results. - **retry** (google.api_core.retry.Retry) - Optional - How to retry the RPC. - **timeout** (float) - Optional - Seconds to wait for the underlying HTTP transport. - **timestamp_precision** (enums.TimestampPrecision) - Optional - [Private Preview] Controls precision for timestamp columns. ### Response #### Success Response (200) - **RowIterator** (object) - An iterator of Row objects containing the requested data. ``` -------------------------------- ### GET /routines/{routine_ref} Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Retrieves a routine referenced by the provided routine reference. ```APIDOC ## GET /routines/{routine_ref} ### Description [Beta] Get the routine referenced by routine_ref. ### Method GET ### Endpoint /routines/{routine_ref} ### Parameters #### Query Parameters - **routine_ref** (Routine | RoutineReference | str) - Required - A reference to the routine to fetch. - **retry** (Retry) - Optional - How to retry the API call. - **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport. ### Response #### Success Response (200) - **routine** (Routine) - A Routine instance. ``` -------------------------------- ### Perform a BigQuery SQL Query Source: https://github.com/googleapis/python-bigquery/blob/main/docs/README.rst Basic usage pattern for initializing a client, executing a SQL query, and iterating over the results. ```python from google.cloud import bigquery client = bigquery.Client() # Perform a query. 
QUERY = ( 'SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` ' 'WHERE state = "TX" ' 'LIMIT 100') query_job = client.query(QUERY) # API request rows = query_job.result() # Waits for query to finish for row in rows: print(row.name) ``` -------------------------------- ### GET get_iam_policy Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Retrieves the access control policy for a specified BigQuery table. ```APIDOC ## GET get_iam_policy ### Description Return the access control policy for a table resource. ### Parameters #### Path Parameters - **table** (Union[Table, TableReference, TableListItem, str]) - Required - The table to get the access control policy for. #### Query Parameters - **requested_policy_version** (int) - Optional - The maximum policy version that will be used to format the policy. Only version 1 is currently supported. - **retry** (Retry) - Optional - How to retry the RPC. - **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport before using retry. ### Response #### Success Response (200) - **Policy** (Object) - The access control policy. ``` -------------------------------- ### GET get_dataset Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md Fetches a dataset resource from BigQuery based on the provided reference. ```APIDOC ## GET get_dataset ### Description Fetch the dataset referenced by dataset_ref. ### Parameters #### Path Parameters - **dataset_ref** (Union[DatasetReference, str]) - Required - A reference to the dataset to fetch from the BigQuery API. #### Query Parameters - **retry** (Retry) - Optional - How to retry the RPC. - **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport before using retry. - **dataset_view** (DatasetView) - Optional - Specifies the view that determines which dataset information is returned (ACL, FULL, METADATA, DATASET_VIEW_UNSPECIFIED). 
### Response
#### Success Response (200)
- **Dataset** (Object) - A Dataset instance.
```

--------------------------------

### TimePartitioning Configuration

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Details on configuring time-based partitioning for BigQuery tables.

```APIDOC
## Class google.cloud.bigquery.table.TimePartitioning

### Description
Configures time-based partitioning for a table.

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
* **type_** (Optional[google.cloud.bigquery.table.TimePartitioningType]) - Specifies the type of time partitioning to perform. Defaults to DAY. Supported values are: HOUR, DAY, MONTH, YEAR.
* **field** (Optional[str]) - If set, the table is partitioned by this field. If not set, the table is partitioned by pseudo column `_PARTITIONTIME`. The field must be a top-level TIMESTAMP, DATETIME, or DATE field. Its mode must be NULLABLE or REQUIRED.
* **expiration_ms** (Optional[int]) - Number of milliseconds for which to keep the storage for a partition.
* **require_partition_filter** (Optional[bool]) - DEPRECATED: Use `require_partition_filter` on the Table object instead.

### Request Example
```json
{
  "type": "DAY",
  "field": "timestamp_column",
  "expiration_ms": 86400000,
  "require_partition_filter": false
}
```

### Response
#### Success Response (200)
* **type_** (google.cloud.bigquery.table.TimePartitioningType) - The type of time partitioning.
* **field** (str) - The field used for partitioning.
* **expiration_ms** (int) - The number of milliseconds to keep partition storage.
* **require_partition_filter** (bool) - Specifies whether partition filters are required for queries.

#### Response Example
```json
{
  "type": "DAY",
  "field": "timestamp_column",
  "expiration_ms": 86400000,
  "require_partition_filter": false
}
```

## Class google.cloud.bigquery.table.TimePartitioningType

### Description
Specifies the type of time partitioning to perform.
#### DAY Generates one partition per day. * **Type:** str #### HOUR Generates one partition per hour. * **Type:** str #### MONTH Generates one partition per month. * **Type:** str #### YEAR Generates one partition per year. * **Type:** str ## Class Method google.cloud.bigquery.table.TimePartitioning.from_api_repr(api_repr: dict) -> TimePartitioning ### Description Return a `TimePartitioning` object deserialized from a dict. ### Parameters * **api_repr** (Mapping[str, str]) - The serialized representation of the TimePartitioning. ### Returns The `TimePartitioning` object. ### Return type google.cloud.bigquery.table.TimePartitioning ## Property google.cloud.bigquery.table.TimePartitioning.expiration_ms ### Description Number of milliseconds to keep the storage for a partition. * **Type:** int ## Property google.cloud.bigquery.table.TimePartitioning.field ### Description Field in the table to use for partitioning. * **Type:** str ## Property google.cloud.bigquery.table.TimePartitioning.require_partition_filter ### Description Specifies whether partition filters are required for queries. DEPRECATED: Use `require_partition_filter` on the Table object instead. * **Type:** bool ## Property google.cloud.bigquery.table.TimePartitioning.type_ ### Description The type of time partitioning to use. * **Type:** google.cloud.bigquery.table.TimePartitioningType ## Method google.cloud.bigquery.table.TimePartitioning.to_api_repr() -> dict ### Description Return a dictionary representing this object. ### Returns A dictionary representing the TimePartitioning object in serialized form. ### Return type dict ``` -------------------------------- ### Run benchmark with shell substitutions Source: https://github.com/googleapis/python-bigquery/blob/main/benchmark/README.md Executes the benchmark script using environment variables and shell command substitutions for dynamic tagging. 
```bash
python benchmark.py \
  --reruns 5 \
  --table $BENCHMARK_TABLE \
  --tag origin:$(hostname) \
  --tag branch:$(git branch --show-current) \
  --tag latestcommit:$(git log --pretty=format:'%H' -n 1)
```

--------------------------------

### GET /serviceAccount

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Retrieves the email address of the project's BigQuery service account.

```APIDOC
## GET /serviceAccount

### Description
Get the email address of the project’s BigQuery service account.

### Method
GET

### Endpoint
/serviceAccount

### Parameters
#### Query Parameters
- **project** (str) - Optional - ID of the project.
- **retry** (Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.

### Response
#### Success Response (200)
- **email** (str) - The email address of the service account.
```

--------------------------------

### GET /jobs/{job_id}

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Fetches a specific job by its ID from the project associated with the client.

```APIDOC
## GET /jobs/{job_id}

### Description
Fetch a job for the project associated with this client.

### Method
GET

### Endpoint
/jobs/{job_id}

### Parameters
#### Query Parameters
- **job_id** (str | Job) - Required - Job identifier.
- **project** (str) - Optional - ID of the project which owns the job.
- **location** (str) - Optional - Location where the job was run.
- **retry** (Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.

### Response
#### Success Response (200)
- **job** (Object) - Job instance, based on the resource returned by the API.
```

--------------------------------

### Create AccessEntry for user

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Initialize an AccessEntry for a user by email.
```pycon
>>> entry = AccessEntry('OWNER', 'userByEmail', 'user@example.com')
```

--------------------------------

### QueryApiMethod Enum

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/enums.md

API method used to start the query. The default value is INSERT.

```APIDOC
## QueryApiMethod Enum

### Description
API method used to start the query. The default value is [`INSERT`](#google.cloud.bigquery.enums.QueryApiMethod.INSERT).

### Values
- **INSERT** (*'INSERT'*): Submit a query job by using the [jobs.insert REST API method](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/insert). This supports all job configuration options.
- **QUERY** (*'QUERY'*): Submit a query job by using the [jobs.query REST API method](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query). Differences from `INSERT`:
  * Many parameters and job configuration options, including job ID and destination table, cannot be used with this API method. See the [jobs.query REST API documentation](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query) for the complete list of supported configuration options.
  * API blocks up to a specified timeout, waiting for the query to finish.
  * The full job resource (including job statistics) may not be available. Call [`reload()`](reference.md#google.cloud.bigquery.job.QueryJob.reload) or [`get_job()`](reference.md#google.cloud.bigquery.client.Client.get_job) to get full job statistics and configuration.
  * `query()` can raise API exceptions if the query fails, whereas the same errors don’t appear until calling [`result()`](reference.md#google.cloud.bigquery.job.QueryJob.result) when the `INSERT` API method is used.
```

--------------------------------

### Training Options

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/bigquery/legacy_proto_types.md

Configuration options for model training in BigQuery.

```APIDOC
## TrainingOptions

### Description
Options used in model training.
### Fields
- **max_iterations** (int) - The maximum number of iterations in training. Used only for iterative training algorithms.
- **loss_type** ([google.cloud.bigquery_v2.types.Model.LossType](#google.cloud.bigquery_v2.types.Model.LossType)) - Type of loss function used during training run.
- **learn_rate** (float) - Learning rate in training. Used only for iterative training algorithms.
- **l1_regularization** (google.protobuf.wrappers_pb2.DoubleValue) - L1 regularization coefficient.
- **l2_regularization** (google.protobuf.wrappers_pb2.DoubleValue) - L2 regularization coefficient.
- **min_relative_progress** (google.protobuf.wrappers_pb2.DoubleValue) - When early_stop is true, stops training when accuracy improvement is less than ‘min_relative_progress’. Used only for iterative training algorithms.
- **warm_start** (google.protobuf.wrappers_pb2.BoolValue) - Whether to train a model from the last checkpoint.
- **early_stop** (google.protobuf.wrappers_pb2.BoolValue) - Whether to stop early when the loss doesn’t improve significantly any more (compared to min_relative_progress). Used only for iterative training algorithms.
- **input_label_columns** (Sequence[str]) - Name of input label columns in training data.
- **data_split_method** ([google.cloud.bigquery_v2.types.Model.DataSplitMethod](#google.cloud.bigquery_v2.types.Model.DataSplitMethod)) - The data split type for training and evaluation, e.g. RANDOM.
- **data_split_eval_fraction** (float) - The fraction of evaluation data over the whole input data. The rest of data will be used as training data. The format should be double. Accurate to two decimal places. Default value is 0.2.
```

--------------------------------

### Retrieve Row Values with get()

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Access row values by key with optional default values.
```pycon
>>> Row(('a', 'b'), {'x': 0, 'y': 1}).get('x')
'a'
```

```pycon
>>> Row(('a', 'b'), {'x': 0, 'y': 1}).get('z')
None
```

```pycon
>>> Row(('a', 'b'), {'x': 0, 'y': 1}).get('z', '')
''
```

```pycon
>>> Row(('a', 'b'), {'x': 0, 'y': 1}).get('z', default = '')
''
```

--------------------------------

### Load BigQuery IPython Magics

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/magics.md

Run this magic command in a Jupyter notebook cell to enable the `%%bigquery` magic.

```python
%load_ext bigquery_magics
```

--------------------------------

### GET /jobs/get

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Refreshes job properties by making a GET request to the BigQuery jobs API.

```APIDOC
## GET /jobs/get

### Description
API call to refresh job properties via a GET request. See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

### Method
GET

### Endpoint
/jobs/get

### Parameters
#### Query Parameters
- **client** (Optional[google.cloud.bigquery.client.Client]) - The client to use. If not passed, falls back to the client stored on the current dataset.
- **retry** (Optional[google.api_core.retry.Retry]) - How to retry the RPC.
- **timeout** (Optional[float]) - The number of seconds to wait for the underlying HTTP transport before using `retry`.
```

--------------------------------

### GET /models/{model_ref}

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Fetches a machine learning model referenced by the provided model reference.

```APIDOC
## GET /models/{model_ref}

### Description
[Beta] Fetch the model referenced by model_ref.

### Method
GET

### Endpoint
/models/{model_ref}

### Parameters
#### Query Parameters
- **model_ref** (ModelReference | str) - Required - A reference to the model to fetch.
- **retry** (Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.
### Response
#### Success Response (200)
- **model** (Model) - A Model instance.
```

--------------------------------

### Create a BigQuery Dataset

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Use this method to create a new dataset in BigQuery. The dataset ID must be provided. If `exists_ok` is True, errors for existing datasets are ignored.

```python
from google.cloud import bigquery

client = bigquery.Client()
dataset = bigquery.Dataset('my_project.my_dataset')
dataset = client.create_dataset(dataset)
```

--------------------------------

### POST /jobs/load_table_from_dataframe

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Uploads the contents of a pandas DataFrame to a BigQuery table by creating and starting a LoadJob.

```APIDOC
## POST /jobs/load_table_from_dataframe

### Description
Upload the contents of a table from a pandas DataFrame.

### Method
POST

### Parameters
#### Request Body
- **dataframe** (DataFrame) - Required - The pandas DataFrame containing the data to upload.
- **destination** (Union[Table, TableReference, str]) - Required - The destination table.
- **num_retries** (int) - Optional - Number of retries.
- **job_id** (str) - Optional - Unique identifier for the job.
- **job_id_prefix** (str) - Optional - Prefix for the job ID.
- **location** (str) - Optional - The geographic location of the job.
- **project** (str) - Optional - The project ID.
- **job_config** (LoadJobConfig) - Optional - Configuration for the load job.

### Response
#### Success Response (200)
- **LoadJob** (google.cloud.bigquery.job.LoadJob) - The created load job.
```

--------------------------------

### GET /jobs/{jobId}

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Methods to check for the existence of a job or reload its current properties from the BigQuery API.
```APIDOC
## GET /jobs/{jobId}

### Description
Checks for the existence of a job or refreshes the job's properties by performing a GET request to the BigQuery API.

### Method
GET

### Endpoint
/projects/{projectId}/jobs/{jobId}

### Parameters
#### Query Parameters
- **client** (google.cloud.bigquery.client.Client) - Optional - The client to use for the request.
- **retry** (google.api_core.retry.Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.

### Response
#### Success Response (200)
- **exists** (bool) - Returns true if the job exists.
- **properties** (dict) - Returns the updated job representation when using reload().
```

--------------------------------

### POST /routines

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Creates a new routine in BigQuery.

```APIDOC
## POST /routines

### Description
[Beta] Create a routine via a POST request. The dataset that the routine belongs to must already exist.

### Method
POST

### Endpoint
/routines

### Parameters
#### Request Body
- **routine** (Routine) - Required - A Routine to create.
- **exists_ok** (bool) - Optional - Defaults to False. If True, ignore "already exists" errors.
- **retry** (Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport.

### Response
#### Success Response (200)
- **routine** (Routine) - A new Routine returned from the service.

### Errors
- **Conflict**: Raised if the routine already exists.
```

--------------------------------

### Get Service Account

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Retrieves the service account email address that BigQuery uses for managing KMS-encrypted tables.
```APIDOC
## Get Service Account

### Description
This method retrieves the service account email address that BigQuery uses to manage tables encrypted by a key in KMS.

### Parameters
#### Query Parameters
- **project** (str) - Optional - Project ID to use for retrieving service account email. Defaults to the client’s project.
- **retry** (google.api_core.retry.Retry) - Optional - How to retry the RPC.
- **timeout** (float) - Optional - The number of seconds to wait for the underlying HTTP transport before using `retry`.

### Returns
- **str** - The service account email address.
```

--------------------------------

### Create and Manage Routines

Source: https://context7.com/googleapis/python-bigquery/llms.txt

Define and manage scalar functions using SQL or JavaScript, and list or delete existing routines.

```python
from google.cloud import bigquery

# Uses Application Default Credentials and the default project.
client = bigquery.Client()

# Create a SQL UDF that multiplies two INT64 arguments
routine_id = f"{client.project}.my_dataset.multiply"
routine = bigquery.Routine(
    routine_id,
    type_="SCALAR_FUNCTION",
    language="SQL",
    body="x * y",
    arguments=[
        bigquery.RoutineArgument(
            name="x",
            data_type=bigquery.StandardSqlDataType(
                type_kind=bigquery.StandardSqlTypeNames.INT64
            ),
        ),
        bigquery.RoutineArgument(
            name="y",
            data_type=bigquery.StandardSqlDataType(
                type_kind=bigquery.StandardSqlTypeNames.INT64
            ),
        ),
    ],
    return_type=bigquery.StandardSqlDataType(
        type_kind=bigquery.StandardSqlTypeNames.INT64
    ),
)
routine = client.create_routine(routine, exists_ok=True)
print(f"Created routine {routine.reference}")

# Use the UDF in a query
query = f"""
SELECT `{client.project}.my_dataset.multiply`(5, 10) as result
"""
result = client.query_and_wait(query)
for row in result:
    print(f"Result: {row.result}")

# Create JavaScript UDF
routine_id = f"{client.project}.my_dataset.parse_json"
routine = bigquery.Routine(
    routine_id,
    type_="SCALAR_FUNCTION",
    language="JAVASCRIPT",
    body="return JSON.parse(json_string);",
    arguments=[
        bigquery.RoutineArgument(
            name="json_string",
            data_type=bigquery.StandardSqlDataType(
                type_kind=bigquery.StandardSqlTypeNames.STRING
            ),
        ),
    ],
    return_type=bigquery.StandardSqlDataType(
        type_kind=bigquery.StandardSqlTypeNames.JSON
    ),
)
routine = client.create_routine(routine, exists_ok=True)

# List routines
for routine in client.list_routines("my_dataset"):
    print(f"Routine: {routine.routine_id}")

# Delete routine
client.delete_routine(routine_id, not_found_ok=True)
```

--------------------------------

### Create a BigQuery Routine

Source: https://github.com/googleapis/python-bigquery/blob/main/docs/reference.md

Use this method to create a new routine (e.g., a stored procedure or function) in BigQuery. The dataset must already exist. If `exists_ok` is True, errors for existing routines are ignored.

```python
# Example for create_routine is not provided in the source text.
```

--------------------------------

### Create Integer Range Partitioned Table

Source: https://context7.com/googleapis/python-bigquery/llms.txt

Configures a table with range partitioning based on an integer field.

```python
from google.cloud import bigquery

# Uses Application Default Credentials and the default project.
client = bigquery.Client()

schema = [
    bigquery.SchemaField("customer_id", "INT64"),
    bigquery.SchemaField("name", "STRING"),
]

table_id = f"{client.project}.my_dataset.customers"
table = bigquery.Table(table_id, schema=schema)

# Partition customer_id into ranges of 10,000 values between 0 and 1,000,000.
table.range_partitioning = bigquery.RangePartitioning(
    field="customer_id",
    range_=bigquery.PartitionRange(start=0, end=1000000, interval=10000),
)

table = client.create_table(table)
```
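
To illustrate how the `start`, `end`, and `interval` values above split `customer_id` into partitions, here is a minimal pure-Python sketch of the arithmetic. It does not call BigQuery. The helper names `range_partition_label` and `partition_count` are hypothetical and not part of `google-cloud-bigquery`. The sketch assumes that `start` is inclusive and `end` is exclusive, that each partition is labeled by its lower bound, and that NULL and out-of-range values go to the special `__NULL__` and `__UNPARTITIONED__` partitions.

```python
from typing import Optional


def range_partition_label(
    value: Optional[int], start: int, end: int, interval: int
) -> str:
    """Return an illustrative label for the partition `value` would fall into."""
    if value is None:
        # Assumption: NULL partitioning values land in a dedicated partition.
        return "__NULL__"
    if value < start or value >= end:
        # Assumption: values outside [start, end) are kept in a catch-all partition.
        return "__UNPARTITIONED__"
    # Each partition covers [lower, lower + interval); label it by its lower bound.
    lower = start + ((value - start) // interval) * interval
    return str(lower)


def partition_count(start: int, end: int, interval: int) -> int:
    """Number of ranges between start (inclusive) and end (exclusive)."""
    # Ceiling division so that a final, shorter range is still counted.
    return -(-(end - start) // interval)


if __name__ == "__main__":
    # Same settings as the RangePartitioning example above.
    print(partition_count(0, 1000000, 10000))
    print(range_partition_label(12345, 0, 1000000, 10000))
    print(range_partition_label(1000000, 0, 1000000, 10000))
```

With the settings from the example, this arithmetic yields 100 ranges of 10,000 values each, so a query that filters on a narrow band of `customer_id` only needs to touch the few ranges that overlap it.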