### Create and Share Data using SQL

Source: https://databrickslabs.github.io/partner-architecture/data-collaboration/getting-started

This SQL snippet demonstrates the basic steps to create a share, add a table to it, create a recipient, and grant access. It assumes direct Databricks-to-Databricks (D2D) sharing where the recipient's sharing identifier is known. Ensure you have the necessary Unity Catalog setup and permissions.

```sql
-- 1. Create a share
CREATE SHARE my_first_share COMMENT 'Sample data for partners';

-- 2. Add a table
ALTER SHARE my_first_share ADD TABLE catalog.schema.my_table;

-- 3. Create a recipient (assumes D2D where sharing identifier is available)
CREATE RECIPIENT partner_acme USING ID '<sharing-identifier>';

-- 4. Grant access
GRANT SELECT ON SHARE my_first_share TO RECIPIENT partner_acme;
```

--------------------------------

### List Jobs via Databricks REST API

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/rest-apis

This example demonstrates how to make a GET request to the Databricks Jobs API to list available jobs. It includes setting the Authorization and User-Agent headers.

```shell
curl -X GET "$DATABRICKS_HOST/api/2.0/jobs/list" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "User-Agent: <isv-name>_<product-name>/<product-version>" \
  -H "Content-Type: application/json"
```

--------------------------------

### Partitioned Sharing Example (SQL)

Source: https://databrickslabs.github.io/partner-architecture/data-collaboration/sharing-patterns/d2d

This SQL snippet demonstrates how to share a table with specific partitions, allowing recipients to access only relevant data subsets. It uses the ALTER SHARE command to add a table and specify partition conditions.
```sql
ALTER SHARE my_share ADD TABLE catalog.schema.sales
  PARTITION (region = 'us-west', year >= 2024);
```

--------------------------------

### Configure Java SDK User-Agent with Partner and Product Identifiers

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/sdks

This Java code snippet illustrates how to set partner and product identifiers for the Databricks SDK for Java using `UserAgent.withProduct()` and `UserAgent.withPartner()`. It configures authentication from environment variables and creates a `WorkspaceClient` to interact with Databricks, including an example of listing catalogs.

```java
package com.example;

import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.core.DatabricksConfig;
import com.databricks.sdk.core.UserAgent;
import com.databricks.sdk.service.catalog.CatalogInfo;
import com.databricks.sdk.service.catalog.ListCatalogsRequest;

public class ListCatalogsExample {
    public static void main(String[] args) {
        // Load configuration from environment
        String host = System.getenv("DATABRICKS_HOST");
        String clientId = System.getenv("DATABRICKS_CLIENT_ID");
        String clientSecret = System.getenv("DATABRICKS_CLIENT_SECRET");

        // Set partner and product identifiers
        UserAgent.withProduct("<product-name>", "<product-version>");
        UserAgent.withPartner("<isv-name>");

        // Build configuration
        DatabricksConfig config = new DatabricksConfig()
            .setHost(host)
            .setClientId(clientId)
            .setClientSecret(clientSecret);

        // Create workspace client
        WorkspaceClient client = new WorkspaceClient(config);

        // Example: List catalogs starting with 's'
        for (CatalogInfo catalog : client.catalogs().list(new ListCatalogsRequest())) {
            String name = catalog.getName();
            if (name != null && name.toLowerCase().startsWith("s")) {
                System.out.println("Catalog: " + name);
            }
        }
    }
}
```

--------------------------------

### Bulk User Provisioning and Group Assignment using Databricks SDK

Source: https://databrickslabs.github.io/partner-architecture/built-on/operations/onboarding

This
code example shows how to onboard multiple users simultaneously and assign them to predefined groups. It iterates through a list of user definitions, creates each user, looks up each of the user's groups by display name, and patches the groups to add the new user as a member. It prints the onboarding status for each user.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import iam

w = WorkspaceClient()

# Onboard multiple users at once
new_users = [
    {"email": "john.smith@acmecorp.com", "name": "John Smith", "groups": ["customer-acme-analysts"]},
    {"email": "sarah.jones@acmecorp.com", "name": "Sarah Jones", "groups": ["customer-acme-engineers"]},
    {"email": "mike.wilson@acmecorp.com", "name": "Mike Wilson", "groups": ["customer-acme-scientists"]},
]

for user_def in new_users:
    # Create user
    user = w.users.create(
        user_name=user_def["email"],
        display_name=user_def["name"],
    )

    # Assign to groups
    for group_name in user_def["groups"]:
        # groups.get() takes a group ID, so look the group up by display name
        group = next(iter(w.groups.list(filter=f'displayName eq "{group_name}"')))
        w.groups.patch(
            id=group.id,
            operations=[
                iam.Patch(op=iam.PatchOp.ADD, path="members", value=[{"value": user.id}])
            ],
            schemas=[iam.PatchSchema.URN_IETF_PARAMS_SCIM_API_MESSAGES_2_0_PATCH_OP],
        )

    print(f"✓ Onboarded {user.display_name}")
```

--------------------------------

### User-Agent Format Example

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution

Demonstrates the required format for a User-Agent string in Databricks partner integrations, including the ISV name, product name, and product version.

```text
<isv-name>_<product-name>/<product-version>
```

```text
AcmePartner_DatEngProduct/3.5
```

--------------------------------

### Create Databricks Service Principal and Assign to Group

Source: https://databrickslabs.github.io/partner-architecture/built-on/operations/onboarding

This example shows how to create a service principal for automated workloads and assign it to a specific group, granting it the necessary permissions. It also includes a placeholder comment for generating an OAuth secret, which is crucial for service principal authentication.
```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import iam

# The original snippet assumed an initialized client named 'w'
w = WorkspaceClient()

# Create service principal for customer's ETL pipeline
sp = w.service_principals.create(
    display_name="acme-etl-pipeline",
    application_id="acme-etl-app",
)

# Add service principal to engineers group
# (groups.get() takes a group ID, so look the group up by display name)
engineers_group = next(
    iter(w.groups.list(filter='displayName eq "customer-acme-engineers"'))
)
w.groups.patch(
    id=engineers_group.id,
    operations=[
        iam.Patch(op=iam.PatchOp.ADD, path="members", value=[{"value": sp.id}])
    ],
    schemas=[iam.PatchSchema.URN_IETF_PARAMS_SCIM_API_MESSAGES_2_0_PATCH_OP],
)

# Generate OAuth secret for service principal authentication (commented out as it's a separate step)
```

--------------------------------

### Configure Go SDK User-Agent with Partner and Product Identifiers

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/sdks

This Go code snippet shows how to configure the User-Agent for the Databricks SDK by registering partner and product identifiers using `useragent.WithPartner()` and `useragent.WithProduct()`. A new WorkspaceClient is then created, ready for making API calls.

```go
package main

import (
	"github.com/databricks/databricks-sdk-go"
	"github.com/databricks/databricks-sdk-go/useragent"
)

func main() {
	useragent.WithPartner("<isv-name>")
	useragent.WithProduct("<product-name>", "<product-version>")

	// NewWorkspaceClient returns (client, error)
	w, err := databricks.NewWorkspaceClient()
	if err != nil {
		panic(err)
	}
	_ = w // Example API call...
}
```

--------------------------------

### Unity Catalog Structure Example (Single-Tenant)

Source: https://databrickslabs.github.io/partner-architecture/built-on/architecture/governance

Illustrates a directory structure for a single-tenant Unity Catalog implementation, featuring a partner hub catalog and individual customer spoke catalogs. This structure helps in organizing shared and customer-specific data assets.
```plaintext
metastore/
├── partner_hub/              # Hub: Partner-owned (read-only for customers)
│   ├── reference_data/
│   │   ├── dim_products
│   │   ├── dim_regions
│   │   └── dim_currency
│   └── shared_models/
│       └── ml_scoring_model
│
├── customer_a/               # Spoke: Customer writes and manages assets
│   ├── raw/
│   ├── curated/
│   └── analytics/
│
└── customer_b/               # Spoke: Customer writes and manages assets
    ├── raw/
    ├── curated/
    └── analytics/
```

--------------------------------

### Unity Catalog Structure Example (Schema per Tenant)

Source: https://databrickslabs.github.io/partner-architecture/built-on/architecture/governance

Shows a directory structure for a multi-tenant Unity Catalog deployment using the 'schema per tenant' pattern. It includes a shared common schema for read-only access and individual tenant schemas for specific data.

```plaintext
catalog/
├── common/                   # Shared schema (read-only for tenants)
│   ├── dim_products
│   ├── dim_regions
│   └── dim_currency
│
├── customer_a/               # Tenant schema
│   ├── fact_orders
│   ├── fact_inventory
│   └── dim_locations         # Tenant-specific dimension
│
└── customer_b/
    ├── fact_orders
    ├── fact_inventory
    └── dim_cost_centers      # Different tenant-specific dimension
```

--------------------------------

### Grant Catalog and Schema Permissions to Groups using SQL

Source: https://databrickslabs.github.io/partner-architecture/built-on/operations/onboarding

This SQL script demonstrates how to grant specific permissions on catalogs and schemas to different persona-based groups within Databricks Unity Catalog. It shows examples for Data Analysts (read-only), Data Engineers (ETL pipeline management), and Administrators (full control). This is crucial for enforcing the principle of group-based access management.
```sql
-- Data Analysts: Read-only access to production data
GRANT USE CATALOG ON CATALOG `customer_acme` TO `customer-acme-analysts`;
GRANT USE SCHEMA ON SCHEMA `customer_acme`.`sales` TO `customer-acme-analysts`;
GRANT SELECT ON SCHEMA `customer_acme`.`sales` TO `customer-acme-analysts`;

-- Data Engineers: Full access to manage ETL pipelines
GRANT USE CATALOG ON CATALOG `customer_acme` TO `customer-acme-engineers`;
GRANT CREATE SCHEMA, USE SCHEMA, CREATE TABLE ON CATALOG `customer_acme` TO `customer-acme-engineers`;

-- Admins: Full control
GRANT ALL PRIVILEGES ON CATALOG `customer_acme` TO `customer-acme-admins`;
```

--------------------------------

### Databricks SQL Connection Strings

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/lakehouse-patterns/access-auth/oauth-u2m

Examples of how to use the obtained OAuth access token to establish a connection to Databricks SQL using JDBC and ODBC drivers.

```APIDOC
## Databricks SQL Connection Configuration

### Description
Demonstrates how to configure JDBC and ODBC drivers to connect to Databricks SQL using an OAuth access token obtained via the U2M flow.

### JDBC Driver Configuration

#### Parameters
- **<server-hostname>** (string) - Your Databricks workspace hostname.
- **<http-path>** (string) - The HTTP path for your Databricks SQL endpoint.
- **AuthMech** (integer) - Set to `11` for OAuth token authentication.
- **Auth_Flow** (integer) - Set to `0` for token-based authentication.
- **Auth_AccessToken** (string) - The OAuth access token obtained from the token endpoint.

#### Connection String Example
```
jdbc:databricks://<server-hostname>:443/<http-path>;AuthMech=11;Auth_Flow=0;Auth_AccessToken=YOUR_OAUTH_ACCESS_TOKEN
```

### ODBC Driver Configuration

#### Parameters
- **Host** (string) - Your Databricks workspace hostname.
- **Port** (integer) - Typically `443`.
- **HTTPPath** (string) - The HTTP path for your Databricks SQL endpoint.
- **AuthMech** (integer) - Set to `11` for OAuth token authentication.
- **Auth_Flow** (integer) - Set to `0` for token-based authentication.
- **Auth_AccessToken** (string) - The OAuth access token obtained from the token endpoint.

#### Connection String Example
```
Host=<server-hostname>;Port=443;HTTPPath=<http-path>;AuthMech=11;Auth_Flow=0;Auth_AccessToken=YOUR_OAUTH_ACCESS_TOKEN
```
```

--------------------------------

### Configure Python SDK User-Agent with Partner and Product Identifiers

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/sdks

This Python code snippet demonstrates how to use the Databricks SDK to register partner and product identifiers in the User-Agent string. It utilizes `useragent.with_partner()` and `useragent.with_product()` for this purpose. Authentication is handled via environment variables, and a WorkspaceClient is instantiated for subsequent API calls.

```python
import os

from databricks.sdk import WorkspaceClient, useragent
from databricks.sdk.core import Config

# Set partner and product identifiers
useragent.with_partner("<isv-name>")
useragent.with_product("<product-name>", "<product-version>")

# Auth from env
cfg = Config(
    host=os.getenv("DATABRICKS_HOST"),
    client_id=os.getenv("DATABRICKS_CLIENT_ID"),
    client_secret=os.getenv("DATABRICKS_CLIENT_SECRET"),
    auth_type="oauth-m2m",
)

# Workspace client
w = WorkspaceClient(config=cfg)

# Quick test
for s in w.schemas.list(catalog_name="samples"):
    print("-", s.name)
```

--------------------------------

### Databricks Cost-Optimized Compute Policy Example

Source: https://databrickslabs.github.io/partner-architecture/built-on/architecture/cost-management

This JSON policy defines a cost-optimized compute configuration for Databricks clusters. It restricts instance types to cost-effective options, enforces auto-termination and autoscaling limits, utilizes spot instances with fallback, and mandates specific tags like 'customer_id' and 'environment'. This policy helps prevent cost overruns by governing cluster creation parameters.
```json
{
  "node_type_id": {
    "type": "allowlist",
    "values": ["i3.xlarge", "i3.2xlarge"],
    "defaultValue": "i3.xlarge"
  },
  "driver_node_type_id": {
    "type": "fixed",
    "value": "i3.xlarge",
    "hidden": true
  },
  "autotermination_minutes": {
    "type": "range",
    "minValue": 10,
    "maxValue": 60,
    "defaultValue": 30
  },
  "autoscale": {
    "type": "fixed",
    "value": {
      "min_workers": 1,
      "max_workers": 10
    }
  },
  "aws_attributes.availability": {
    "type": "fixed",
    "value": "SPOT_WITH_FALLBACK",
    "hidden": true
  },
  "aws_attributes.spot_bid_price_percent": {
    "type": "fixed",
    "value": 100,
    "hidden": true
  },
  "custom_tags.customer_id": {
    "type": "fixed",
    "value": "{{customer_id}}",
    "hidden": false
  },
  "custom_tags.environment": {
    "type": "fixed",
    "value": "production",
    "hidden": true
  }
}
```

--------------------------------

### Create Persona-Based Groups using Databricks Python SDK

Source: https://databrickslabs.github.io/partner-architecture/built-on/operations/onboarding

This Python script utilizes the Databricks SDK to programmatically create persona-based groups within Unity Catalog. It defines a list of group configurations and iterates through them to create each group, printing the display name and ID of the newly created group. This is useful for automating the initial setup of access control structures.
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create persona-based groups
groups = [
    {"name": "customer-acme-analysts", "display_name": "Acme Corp Data Analysts"},
    {"name": "customer-acme-engineers", "display_name": "Acme Corp Data Engineers"},
    {"name": "customer-acme-scientists", "display_name": "Acme Corp Data Scientists"},
    {"name": "customer-acme-admins", "display_name": "Acme Corp Administrators"},
]

for group_def in groups:
    group = w.groups.create(
        display_name=group_def["display_name"]
    )
    print(f"Created group: {group.display_name} (ID: {group.id})")
```

--------------------------------

### Create Databricks SQL Warehouse with Tags and Permissions using Python SDK

Source: https://databrickslabs.github.io/partner-architecture/built-on/operations/onboarding

This Python code snippet illustrates how to create a Databricks SQL warehouse with specified configurations, custom tags for cost attribution, and grant usage permissions to a group using the Databricks SDK. It requires a WorkspaceClient instance and defines parameters like cluster size, auto-stop, and serverless compute.
```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import iam, sql

w = WorkspaceClient()

# Create SQL warehouse for customer
warehouse = w.warehouses.create(
    name="customer-acme-analytics",
    cluster_size="Medium",
    min_num_clusters=1,
    max_num_clusters=3,
    auto_stop_mins=15,
    enable_serverless_compute=True,
    # The SDK expects typed tag objects rather than plain dicts
    tags=sql.EndpointTags(
        custom_tags=[
            sql.EndpointTagPair(key="customer_id", value="acme_corp"),
            sql.EndpointTagPair(key="environment", value="production"),
            sql.EndpointTagPair(key="service", value="analytics"),
        ]
    ),
)

# Grant warehouse access to analysts group
w.permissions.update(
    request_object_type="warehouses",
    request_object_id=warehouse.id,
    access_control_list=[
        iam.AccessControlRequest(
            group_name="customer-acme-analysts",
            permission_level=iam.PermissionLevel.CAN_USE,
        )
    ],
)
```

--------------------------------

### Make Iceberg REST API Call with User-Agent

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/rest-apis

This Java code snippet demonstrates how to make a GET request to the Iceberg REST API for table credentials. It shows how to set the Authorization and User-Agent headers for the HTTP connection.

```java
String v1CredsUrl = catalogApi + "/v1/catalogs/" + warehouse
    + "/namespaces/" + schema
    + "/tables/" + table + "/credentials";

HttpURLConnection conn = (HttpURLConnection) new URL(v1CredsUrl).openConnection();
conn.setRequestMethod("GET");
conn.setRequestProperty("Authorization", "Bearer " + accessToken);
conn.setRequestProperty("User-Agent", "<isv-name>_<product-name>/<product-version>");
```

--------------------------------

### Execute SQL Statement via Databricks REST API

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/rest-apis

This example shows how to make a POST request to the Databricks SQL Execution API to run a SQL statement. It includes setting the Authorization and User-Agent headers, along with the request body containing warehouse details and the SQL query.
```shell
curl -X POST "$DATABRICKS_HOST/api/2.0/sql/statements" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "User-Agent: <isv-name>_<product-name>/<product-version>" \
  -H "Content-Type: application/json" \
  -d "{
    \"warehouse_id\": \"$DATABRICKS_SQL_WAREHOUSE_ID\",
    \"catalog\": \"<catalog>\",
    \"schema\": \"<schema>\",
    \"statement\": \"<sql-statement>\"
  }"
```

--------------------------------

### Set up Databricks External Location and Grant Permissions using Python SDK

Source: https://databrickslabs.github.io/partner-architecture/built-on/operations/onboarding

This Python code snippet demonstrates setting up an external location in Databricks for cloud storage and granting specific permissions to a group using the Databricks SDK. It requires a WorkspaceClient instance and details about the location, credential, and principal.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import catalog

w = WorkspaceClient()

# Create external location for customer's cloud storage
external_location = w.external_locations.create(
    name="customer_acme_raw_data",
    url="s3://acme-databricks-data/raw/",
    credential_name="aws-acme-credential",
    comment="Acme Corp raw data location",
)

# Grant access to data engineers
# (the SDK expects typed PermissionsChange / Privilege values rather than plain dicts)
w.grants.update(
    securable_type="EXTERNAL_LOCATION",
    full_name="customer_acme_raw_data",
    changes=[
        catalog.PermissionsChange(
            principal="customer-acme-engineers",
            add=[
                catalog.Privilege.CREATE_EXTERNAL_TABLE,
                catalog.Privilege.READ_FILES,
                catalog.Privilege.WRITE_FILES,
            ],
        )
    ],
)
```

--------------------------------

### Python Package Configuration with setup.py

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/libraries

Configures a Python package's metadata using the legacy setup.py file. This method is still supported, but pyproject.toml is recommended for new projects.
```python
from setuptools import setup, find_packages

setup(
    name="isvname-datatool",
    version="1.0.0",
    packages=find_packages(where="src"),
    package_dir={"": "src"},
)
```

--------------------------------

### Apply Tags When Creating SQL Warehouses

Source: https://databrickslabs.github.io/partner-architecture/built-on/architecture/cost-management

Illustrates how to assign tags, including 'customer_id', 'environment', and 'service', to a SQL warehouse during its creation using the Python SDK.

```APIDOC
## POST /api/2.0/sql/warehouses

### Description
Creates a new Databricks SQL warehouse with specified configurations and custom tags.

### Method
POST

### Endpoint
/api/2.0/sql/warehouses

### Parameters
#### Request Body
- **name** (string) - Required - The name of the SQL warehouse.
- **cluster_size** (string) - Required - The size of the cluster (e.g., Small, Medium).
- **min_num_clusters** (integer) - Required - The minimum number of clusters.
- **max_num_clusters** (integer) - Required - The maximum number of clusters.
- **auto_stop_mins** (integer) - Required - The auto-stop time in minutes.
- **tags** (object) - Optional - A list of tags to apply to the SQL warehouse.
  - **custom_tags** (array) - Required - An array of tag objects.
    - **key** (string) - Required - The tag key.
    - **value** (string) - Required - The tag value.

### Request Example
```json
{
  "name": "customer-acme-warehouse",
  "cluster_size": "Small",
  "min_num_clusters": 1,
  "max_num_clusters": 3,
  "auto_stop_mins": 15,
  "tags": {
    "custom_tags": [
      {"key": "customer_id", "value": "acme_corp"},
      {"key": "environment", "value": "production"},
      {"key": "service", "value": "analytics_api"}
    ]
  }
}
```

### Response
#### Success Response (200)
- **id** (string) - The ID of the created SQL warehouse.
- **name** (string) - The name of the SQL warehouse.
- **state** (string) - The current state of the SQL warehouse.

#### Response Example
```json
{
  "id": "abcdef1234567890",
  "name": "customer-acme-warehouse",
  "state": "RUNNING"
}
```
```

--------------------------------

### Install Databricks SDK using pip

Source: https://databrickslabs.github.io/partner-architecture/data-collaboration/operations/automation

Installs the Databricks SDK, which provides programmatic access to Delta Sharing and Marketplace operations. This command is executed using pip, the standard package installer for Python.

```bash
pip install databricks-sdk
```

--------------------------------

### Databricks Python SDK for Marketplace Operations

Source: https://databrickslabs.github.io/partner-architecture/data-collaboration/operations/automation

Illustrates how to interact with Databricks Marketplace using the Python SDK. It covers provider operations such as listing and getting provider listings, listing access requests, and listing private exchanges. It also includes consumer operations for browsing and getting listing details.

```python
# Provider operations
w.provider_listings.list()                    # List your listings
w.provider_listings.get(id="...")             # Get listing details
w.provider_personalization_requests.list()    # List access requests
w.provider_exchanges.list()                   # List private exchanges

# Consumer operations
w.consumer_listings.list()                    # Browse listings
w.consumer_listings.get(id="...")             # Get listing details
```

--------------------------------

### Databricks Python SDK for Delta Sharing Operations

Source: https://databrickslabs.github.io/partner-architecture/data-collaboration/operations/automation

Demonstrates common Delta Sharing operations using the Databricks Python SDK. It shows how to list, get, create, update, delete shares, and manage their permissions. It also covers recipient operations like listing, getting, creating, rotating tokens, and deleting recipients.
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Shares
w.shares.list()                         # List all shares
w.shares.get(name="my_share")           # Get share details
w.shares.create(name="...", ...)        # Create share
w.shares.update(name="...", ...)        # Update share
w.shares.delete(name="...")             # Delete share
w.shares.share_permissions(name)        # Get share permissions
w.shares.update_permissions(...)

# Recipients
w.recipients.list()                     # List all recipients
w.recipients.get(name="...")            # Get recipient details
w.recipients.create(name="...", ...)    # Create recipient
w.recipients.rotate_token(name)         # Rotate token
w.recipients.delete(name="...")         # Delete recipient
```

--------------------------------

### Create Databricks Catalog and Schemas using Python SDK

Source: https://databrickslabs.github.io/partner-architecture/built-on/operations/onboarding

This Python code snippet demonstrates how to create a new catalog and multiple schemas within Databricks Unity Catalog using the Databricks SDK. It requires the 'databricks-sdk' package and a WorkspaceClient instance. The output includes the names of the created schemas.

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create customer catalog
catalog = w.catalogs.create(
    name="customer_acme",
    comment="Acme Corp customer catalog"
)

# Create schemas by data domain
schemas = ["sales", "marketing", "operations", "analytics"]

for schema_name in schemas:
    schema = w.schemas.create(
        name=schema_name,
        catalog_name="customer_acme",
        comment=f"Acme Corp {schema_name} data"
    )
    print(f"Created schema: {catalog.name}.{schema.name}")
```

--------------------------------

### Update ODBC Driver Connection

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/lakehouse-patterns/access-auth/oauth-u2m

This section provides code examples for updating the access token and refreshing the connection for an ODBC driver.
```APIDOC
## Update driver connection

### Description
Updates the access token and refreshes the connection for an ODBC driver.

### Method
N/A (Code snippet for client-side driver)

### Endpoint
N/A

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```cpp
// Update the access token
char *credentials = "Auth_AccessToken=$(new token)";
SQLSetConnectAttr(dbc, 122, credentials, SQL_NTS); // 122 = SQL_ATTR_CREDENTIALS

// Refresh current connection
__int32 refreshMode = -1; // Refresh now
SQLSetConnectAttr(dbc, 123, reinterpret_cast<SQLPOINTER>(refreshMode), SQL_IS_SMALLINT); // 123 = SQL_ATTR_REFRESH_CONNECTION
```

### Response
#### Success Response (200)
N/A

#### Response Example
N/A
```

--------------------------------

### Configure SQLAlchemy User-Agent

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/sql-connectors

This example demonstrates how to configure the User-Agent identifier for SQLAlchemy connections to Databricks. The `user_agent_entry` is set within the `connect_args` of the `create_engine()` function. The `sqlalchemy` and `pyodbc` libraries are required.

```python
from sqlalchemy import create_engine

engine = create_engine(
    "databricks+pyodbc://token:<access-token>@<server-hostname>:443",
    connect_args={
        "http_path": "<http-path>",
        "user_agent_entry": "<isv-name>_<product-name>/<product-version>"
    }
)
```

--------------------------------

### Create Databricks User and Assign to Groups

Source: https://databrickslabs.github.io/partner-architecture/built-on/operations/onboarding

This Python script demonstrates how to create a new user in Databricks and subsequently add them to specified groups. It retrieves group IDs and uses patch operations to modify group memberships, ensuring the user inherits permissions associated with those groups.
```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import iam

w = WorkspaceClient()

# Create user (or provision via SCIM from IdP)
user = w.users.create(
    user_name="jane.doe@acmecorp.com",
    display_name="Jane Doe",
)

# Add user to persona-based groups
groups_to_join = [
    "customer-acme-analysts",
    "customer-acme-users",
]

for group_name in groups_to_join:
    # Get group ID (groups.get() takes an ID, so filter the list by display name)
    group = next(iter(w.groups.list(filter=f'displayName eq "{group_name}"')))

    # Add user to group
    w.groups.patch(
        id=group.id,
        operations=[
            iam.Patch(op=iam.PatchOp.ADD, path="members", value=[{"value": user.id}])
        ],
        schemas=[iam.PatchSchema.URN_IETF_PARAMS_SCIM_API_MESSAGES_2_0_PATCH_OP],
    )
    print(f"Added {user.user_name} to {group_name}")
```

--------------------------------

### Set User-Agent for Databricks REST API Calls

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/rest-apis

This snippet demonstrates the basic format for the User-Agent header required for Databricks REST API calls. It helps identify your product and ISV.

```shell
-H "User-Agent: <isv-name>_<product-name>/<product-version>"
```

--------------------------------

### Databricks OAuth Token Response

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/lakehouse-patterns/access-auth/oauth-u2m

Example JSON response from the Databricks token endpoint after a successful authorization code exchange. It includes the access token, refresh token, token type, scope, and expiration time.

```json
{
  "access_token": "<access-token>",
  "refresh_token": "<refresh-token>",
  "token_type": "Bearer",
  "scope": "sql",
  "expires_in": 3600
}
```

--------------------------------

### Apply Tags When Creating Databricks Clusters (Python)

Source: https://databrickslabs.github.io/partner-architecture/built-on/architecture/cost-management

Demonstrates how to apply custom tags, such as 'customer_id', 'environment', and 'service', when creating a new Databricks cluster using the Databricks SDK for Python. This is useful for attributing costs and organizing resources.
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create cluster with customer tags
cluster = w.clusters.create(
    cluster_name="customer-acme-etl",
    spark_version="13.3.x-scala2.12",
    node_type_id="i3.xlarge",
    num_workers=3,
    autotermination_minutes=30,
    custom_tags={
        "customer_id": "acme_corp",
        "environment": "production",
        "service": "etl_pipeline"
    }
)
```

--------------------------------

### Configure Python SQL Connector User-Agent

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/sql-connectors

This snippet shows how to set the User-Agent identifier for the Databricks SQL Connector for Python. It uses the `_user_agent_entry` parameter when establishing the connection. Ensure the `databricks-sql-connector` package is installed.

```python
import os

from databricks import sql

# credential_provider is assumed to be defined elsewhere (for example, an OAuth provider)
with sql.connect(
    server_hostname=os.getenv("DATABRICKS_HOST"),
    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
    credentials_provider=credential_provider,
    _user_agent_entry="<isv-name>_<product-name>/<product-version>"
) as connection:
    # Your code here
    pass
```

--------------------------------

### Apply Tags When Creating Jobs

Source: https://databrickslabs.github.io/partner-architecture/built-on/architecture/cost-management

Shows how to associate tags, such as 'customer_id' and 'service', with a Databricks job during its creation via the Python SDK.

```APIDOC
## POST /api/2.1/jobs/create

### Description
Creates a new Databricks job with specified tasks and custom tags.

### Method
POST

### Endpoint
/api/2.1/jobs/create

### Parameters
#### Request Body
- **name** (string) - Required - The name of the job.
- **tasks** (array) - Required - A list of tasks for the job.
  - **task_key** (string) - Required - The key of the task.
  - **notebook_task** (object) - Optional - Configuration for a notebook task.
    - **notebook_path** (string) - Required - The path to the notebook.
  - **existing_cluster_id** (string) - Required - The ID of an existing cluster to use.
- **tags** (object) - Optional - A key-value map of custom tags to apply to the job.
  - **customer_id** (string) - Required - The primary billing entity tag.
  - **service** (string) - Required - The service tag.

### Request Example
```json
{
  "name": "customer-acme-daily-report",
  "tasks": [
    {
      "task_key": "generate_report",
      "notebook_task": {
        "notebook_path": "/Workspace/reports/daily_summary"
      },
      "existing_cluster_id": "1234-567890-abcdefg"
    }
  ],
  "tags": {
    "customer_id": "acme_corp",
    "service": "reporting"
  }
}
```

### Response
#### Success Response (200)
- **job_id** (string) - The ID of the created job.

#### Response Example
```json
{
  "job_id": "abcdef1234567890"
}
```
```

--------------------------------

### Configure Databricks Node.js Driver User-Agent

Source: https://databrickslabs.github.io/partner-architecture/isv-partners/telemetry-attribution/sql-drivers

This Node.js code snippet demonstrates how to set the 'userAgentEntry' property when initializing the DBSQLClient for the Databricks SQL Driver for Node.js. It shows how to pass this option during the client connection setup.

```javascript
const { DBSQLClient } = require('@databricks/sql');

const client = new DBSQLClient();

client.connect({
  host: process.env.DATABRICKS_SERVER_HOSTNAME,
  path: process.env.DATABRICKS_HTTP_PATH,
  // ... other connection options
  userAgentEntry: '<isv-name>_<product-name>/<product-version>',
});
```
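
--------------------------------

### Build a User-Agent String Once and Reuse It (Sketch)

Several snippets in this collection pass the same User-Agent value to different clients (REST headers, SQL connectors, drivers). The sketch below shows one way to build that value in a single place and sanity-check it before use. The helper name `build_user_agent` and the regular expression are assumptions for illustration only: the pattern is inferred from the `AcmePartner_DatEngProduct/3.5` example earlier in this collection and is not an official Databricks validation rule.

```python
import re

# Assumed pattern, inferred from the example "AcmePartner_DatEngProduct/3.5":
# alphanumeric ISV name, underscore, alphanumeric product name, slash,
# dotted numeric version. This is not an official Databricks rule.
_UA_PATTERN = re.compile(r"^[A-Za-z0-9]+_[A-Za-z0-9]+/\d+(\.\d+)*$")


def build_user_agent(isv_name: str, product_name: str, version: str) -> str:
    """Return '<isv>_<product>/<version>' or raise ValueError if malformed."""
    user_agent = f"{isv_name}_{product_name}/{version}"
    if not _UA_PATTERN.match(user_agent):
        raise ValueError(f"unexpected User-Agent format: {user_agent!r}")
    return user_agent


if __name__ == "__main__":
    # Reuses the example values shown in the format section
    print(build_user_agent("AcmePartner", "DatEngProduct", "3.5"))
```

The returned string can then be passed wherever a snippet above expects the User-Agent value, which keeps the identifier consistent across REST, SDK, and driver integrations.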