### Copy Test Environment Example

Source: https://github.com/databricks/dbt-databricks/blob/main/CONTRIBUTING.MD

Copy the example test environment file to your local directory. Update this file with your Databricks workspace details.

```bash
cp test.env.example test.env
```

--------------------------------

### Install and Create Hatch Environment

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

Installs Hatch globally and creates the default development environment. Hatch manages Python versions automatically.

```bash
pip install hatch

hatch env create
```

--------------------------------

### Install dbt-databricks Adapter

Source: https://github.com/databricks/dbt-databricks/blob/main/README.md

Install the dbt-databricks adapter using pip. Ensure you have Python 3.7 or above.

```bash
pip install dbt-databricks
```

--------------------------------

### Basic Macro Test Setup

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Sets up a basic test for Jinja2 macros using `MacroTestBase`. Requires `template_name` and `macro_folders_to_load` fixtures.

```python
from unittest.mock import Mock
import pytest
from tests.unit.macros.base import MacroTestBase

class TestCreateViewMacros(MacroTestBase):
    @pytest.fixture(scope="class")
    def template_name(self) -> str:
        """Required: Name of the template file to test"""
        return "create.sql"  # File in dbt/include/databricks/macros/

    @pytest.fixture(scope="class")
    def macro_folders_to_load(self) -> list:
        """Required: Macro directories to load relative to dbt/include/databricks/"""
        return ["macros", "macros/relations/view"]
```

--------------------------------

### Install Hatch for Linux

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

Installs Hatch, a recommended tool for managing Python environments and development workflows, on Linux systems. Ensure your PATH is updated to include the Hatch binary.

```bash
curl -Lo hatch.tar.gz https://github.com/pypa/hatch/releases/latest/download/hatch-x86_64-unknown-linux-gnu.tar.gz
tar -xzf hatch.tar.gz
mkdir -p $HOME/bin
mv hatch $HOME/bin/hatch
chmod +x $HOME/bin/hatch
echo 'export PATH="$HOME/bin:$PATH"' >> ~/.zshrc
export PATH="$HOME/bin:$PATH"

hatch env create
```

--------------------------------

### Jinja2 Whitespace Control Example

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

An example demonstrating the use of Jinja2 whitespace control characters (`-`) to prevent blank lines in generated SQL, ensuring cleaner output.

```jinja
select
    column1,
    column2
    {%- if config.get('extra_column') -%}
    , extra_column
    {%- endif %}
from table

```

--------------------------------

### Macro Test Example: Create Table

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

A macro test for the `create_table` macro, which generates SQL for creating a table. This test verifies the generated SQL against an expected output.

```python
from tests.unit.macros.base import MacroTestBase

class TestCreateTable(MacroTestBase):
    @pytest.fixture(scope="class")
    def template_name(self) -> str:
        return "create.sql"  # File in macros/relations/table/

    @pytest.fixture(scope="class")
    def macro_folders_to_load(self) -> list:
        return ["macros", "macros/relations/table"]

    def test_create_table_sql(self, template_bundle):
        result = self.run_macro(template_bundle.template, "create_table",
                               template_bundle.relation, "select 1")
        expected = "create table `database`.`schema`.`table` as (select 1)"
        self.assert_sql_equal(result, expected)
```

--------------------------------

### Databricks Notebook-Scoped Library Installation Commands

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/workflow-job-submission.md

Shows the Python commands prepended to model code when notebook-scoped libraries are enabled. This includes %pip install for specified packages and dbutils.library.restartPython() to ensure proper loading.

```python
%pip install -q pandas numpy==1.24.0 scikit-learn>=1.0
dbutils.library.restartPython()
# Your model code follows...
```

--------------------------------

### Cluster-Level Python Package Installation for dbt Models

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/workflow-job-submission.md

Installs Python packages at the cluster level for dbt models. Packages are installed when the cluster starts and are available for the entire cluster lifecycle. This is the default behavior.

```yaml
models:
  - name: my_model
    config:
      packages: ["pandas", "numpy==1.24.0", "scikit-learn>=1.0"]
      index_url: "https://pypi.org/simple"  # Optional: custom PyPI index
      notebook_scoped_libraries: false  # Default behavior
```

--------------------------------

### Notebook-Scoped Python Package Installation for dbt Models

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/workflow-job-submission.md

Installs Python packages at the notebook level using %pip install magic commands. This method is suitable for serverless compute and all-purpose clusters, allowing different models to use different package versions without cluster restarts.

```yaml
models:
  - name: my_model
    config:
      packages: ["pandas", "numpy==1.24.0", "scikit-learn>=1.0"]
      index_url: "https://pypi.org/simple"  # Optional: custom PyPI index
      notebook_scoped_libraries: true  # Enable notebook-scoped installation
```

--------------------------------

### dbt profiles.yml configuration for Databricks

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-jobs.md

Example configuration for a dbt profiles.yml file to connect to a Databricks cluster. It uses environment variables for sensitive information and specifies connection parameters.

```yaml
databricks_demo:
  target: databricks_cluster
  outputs:
    databricks_cluster:
      type: databricks
      connect_retries: 5
      connect_timeout: 180
      schema: "<your-schema>"
      threads: 8 # This can be increased or decreased to control the parallism
      host: "{{ env_var('DBT_HOST') }}"
      http_path: "sql/protocolv1/o/{{ env_var('DBT_ORG_ID') }}/{{ env_var('DBT_CLUSTER_ID') }}"
      token: "{{ env_var('DBT_ACCESS_TOKEN') }}"
```

--------------------------------

### Implementing Test Cleanup with Fixtures

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Use pytest fixtures with `yield` to define setup and cleanup logic for tests. Ensure resources like tables created during tests are properly dropped.

```python
class TestWithCleanup:
    @pytest.fixture(scope="class", autouse=True)
    def setUp(self, project):
        # Setup
        yield
        # Cleanup
        project.run_sql("DROP TABLE IF EXISTS my_test_table")
```

--------------------------------

### Unit Test Example: Redact Credentials

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

A unit test for the `redact_credentials` utility function, which masks sensitive values in SQL queries. This test ensures that credentials are correctly redacted.

```python
from dbt.adapters.databricks.utils import redact_credentials

def test_redact_credentials():
    sql = "WITH (credential ('KEY' = 'SECRET_VALUE'))"
    expected = "WITH (credential ('KEY' = '[REDACTED]'))"
    assert redact_credentials(sql) == expected
```

--------------------------------

### Configure Databricks CLI and get task details

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-jobs.md

Configure the Databricks CLI for API version 2.1 and retrieve task details for a specific job run. This is the first step to identify the dbt task's run ID.

```bash
$ databricks jobs configure --version=2.1
$ databricks runs get --run-id TASK_RUN_ID | jq .tasks
```

--------------------------------

### Reusable Job Cluster Configuration for dbt Models

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/workflow-job-submission.md

Defines a reusable job cluster configuration for dbt models, including task settings, post-hook tasks, and job cluster specifications. This allows for consistent environment setup for dbt jobs.

```yaml
models:
  - name: my_model

    config:
      python_job_config:
        additional_task_settings:
          { task_key: "task_a", job_cluster_key: "cluster_a" }
        post_hook_tasks:
          [
            {
              depends_on: [{ "task_key": "task_a" }],
              task_key: "OPTIMIZE_AND_VACUUM",
              job_cluster_key: "cluster_a",
              notebook_task:
                {
                  notebook_path: "/OPTIMIZE_AND_VACUUM",
                  source: "WORKSPACE",
                  base_parameters:
                    {
                      database: "{{ target.database }}",
                      schema: "{{ target.schema }}",
                      table_name: "my_model",
                    },
                },
            },
          ]
        job_clusters:
          [
            {
              job_cluster_key: "cluster_a",
              new_cluster:
                {
                  spark_version: "{{ var('dbr_versions')['lts_v14'] }}",
                  node_type_id: "{{ var('cluster_node_types')['large_job'] }}",
                  runtime_engine: "{{ var('job_cluster_defaults.runtime_engine') }}",
                  autoscale: { "min_workers": 1, "max_workers": 2 },
                },
            },
          ]
```
```

--------------------------------

### Running dbt-databricks Macros

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Demonstrates four methods for executing macros: basic execution with arguments, relation-first pattern, raw execution preserving whitespace, and a custom helper method for complex macros.

```python
def test_macro_execution_patterns(self, template_bundle):
    # Method 1: Basic macro execution with arguments
    # Use for simple macros that take explicit arguments
    result = self.run_macro(template_bundle.template, "macro_name", "arg1", "arg2")

    # Method 2: Relation-first pattern (very common in dbt-databricks)
    # Use for macros that follow the pattern macro(relation, other_args...)
    # Automatically passes the test relation as the first argument
    result = self.render_bundle(template_bundle, "create_table_as", "select 1")

    # Method 3: Raw execution (preserves whitespace/formatting)
    # Use when you need to test exact formatting, spacing, or newlines
    # Most tests should use run_macro() instead, which normalizes whitespace
    raw_result = self.run_macro_raw(template_bundle.template, "macro_name", "arg1")

    # Method 4: Custom helper method for complex macros
    # Use for macros with many arguments or complex setup
    # Makes tests more readable and reduces duplication
    result = self.render_create_view_as(template_bundle, sql="select 1")
```

```python
def render_create_view_as(self, template_bundle, sql="select 1"):
    """Helper method reduces complexity in individual test methods"""
    return self.run_macro(
        template_bundle.template,
        "databricks__create_view_as",  # Full macro name with adapter prefix
        template_bundle.relation,      # Relation as first argument
        sql                            # SQL to wrap in view
    )
```

--------------------------------

### V1 View Flow Diagram

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/view_flow.md

Illustrates the process for V1 view flow, including pre-hooks, dropping existing relations, creating or replacing views, applying grants, and post-hooks.

```mermaid
flowchart LR
    PRE[Run pre-hooks]
    DROP[Drop existing relation]
    CREATE[Create or replace view]
    GRANTS[Apply grants]
    TAGS[Apply tags via alter]
    POST[Run post-hooks]
    D1{Existing relation, and not view?}
    PRE-->D1
    D1--yes-->DROP
    D1--"no"-->CREATE
    DROP-->CREATE
    CREATE-->GRANTS
    GRANTS-->TAGS
    TAGS-->POST
```

--------------------------------

### Run Unit Tests

Source: https://github.com/databricks/dbt-databricks/blob/main/CONTRIBUTING.MD

Execute the project's unit tests using tox. This does not require a Databricks account and should be run before submitting a pull request.

```bash
tox -e unit
```

--------------------------------

### Run dbt-databricks Unit Tests with Hatch

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Executes only the unit tests, which are fast and do not require external dependencies. This is useful for quick checks during development.

```bash
# Unit tests only (fast)
hatch run unit
```

--------------------------------

### Configure dbt-databricks Test Environment Variables

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Sets up essential environment variables for connecting to Databricks test environments. Choose one authentication method and configure compute-specific paths and Unity Catalog settings as needed.

```bash
# Required for all environments
export DBT_DATABRICKS_HOST_NAME="your-workspace.databricks.com"

# Authentication (choose one method)
# Option 1: Personal Access Token
export DBT_DATABRICKS_TOKEN="your-token"

# Option 2: Databricks OAuth
export DBT_DATABRICKS_CLIENT_ID="your-client-id"
export DBT_DATABRICKS_CLIENT_SECRET="your-client-secret"

# Option 3: Azure AD OAuth
export DBT_DATABRICKS_AZURE_CLIENT_ID="your-azure-client-id"
export DBT_DATABRICKS_AZURE_CLIENT_SECRET="your-azure-client-secret"

# Compute-specific paths
export DBT_DATABRICKS_CLUSTER_HTTP_PATH="/sql/protocolv1/o/.../..."
export DBT_DATABRICKS_UC_CLUSTER_HTTP_PATH="/sql/protocolv1/o/.../..."
export DBT_DATABRICKS_UC_ENDPOINT_HTTP_PATH="/sql/warehouses/..."

# Unity Catalog settings
export DBT_DATABRICKS_UC_INITIAL_CATALOG="main"
export DBT_DATABRICKS_UC_INITIAL_SCHEMA="default_schema"
```

--------------------------------

### Get dbt artifact URL and download archive

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-jobs.md

Extract the dbt artifact URL from the task output and download the artifact archive using curl. Ensure you have the correct DBT_TASK_RUN_ID.

```bash
$ DBT_ARTIFACT_URL=$(databricks runs get-output --run-id DBT_TASK_RUN_ID | jq -r .dbt_output.artifacts_link)
$ curl $DBT_ARTIFACT_URL --output artifact.tar.gz
```

--------------------------------

### V1 Table Flow Diagram

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/table_flow.md

Illustrates the V1 table flow, detailing steps from pre-hooks to post-hooks, including decisions for relation existence, replaceability, and language.

```mermaid
flowchart LR
    PRE[Run pre-hooks]
    DROP[Drop existing relation]
    CSQL[create table...]
    CSQLD[create or replace table...]
    CPY[Create table with python]
    GRANTS[Apply grants]
    PYTBL[Apply tblproperties via alter]
    TAGS[Apply tags via alter]
    DOCS[Persist docs via alter]
    CONST[Apply constraints via alter]
    OPT[Run optimize]
    POST[Run post-hooks]
    D1{Existing relation?}
    D2{Replaceable?}
    D3{Language?}
    D4{Delta?}
    D5{Language?}
    PRE-->D1
    D1--yes-->D2
    D1--"no"-->D3
    D2--yes-->D3
    D2--"no"-->DROP
    DROP-->D3
    D3--SQL-->D4
    D3--Python-->CPY
    D4--yes-->CSQLD
    D4--"no"-->CSQL
    CPY-->GRANTS
    CSQLD-->GRANTS
    CSQL-->GRANTS
    GRANTS-->D5
    D5--Python-->PYTBL
    D5--SQL-->TAGS
    PYTBL-->TAGS
    TAGS-->DOCS
    DOCS-->CONST
    CONST-->OPT
    OPT-->POST
```

--------------------------------

### Build dbt-databricks Project

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

Use Hatch to build the project's distribution packages. This command also cleans the dist folder before building.

```bash
hatch build -c
```

--------------------------------

### dbt Functional Test Class Example

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

A Python test class for dbt functional tests. It defines models using fixtures and includes a test method to run dbt and verify results.

```python
from dbt.tests import util
from tests.functional.adapter.my_feature import fixtures

class TestIncrementalModel:
    @pytest.fixture(scope="class")
    def models(self):
        return {
            "my_model.sql": fixtures.my_model_sql,
            "schema.yml": fixtures.my_schema_yml,
        }

    def test_incremental_run(self, project):
        results = util.run_dbt(["run"])
        assert len(results) == 1
        # Verify table exists and has expected data
        results = project.run_sql("select count(*) from my_model", fetch="all")
        assert results[0][0] == 1

```

--------------------------------

### Run Integration Tests with All-Purpose Cluster (UC)

Source: https://github.com/databricks/dbt-databricks/blob/main/CONTRIBUTING.MD

Execute integration tests using an all-purpose cluster with Unity Catalog enabled. This requires a Databricks account and specific compute resources.

```bash
tox -e integration-databricks-uc-cluster
```

--------------------------------

### Invoke databricks_copy_into Macro

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-copy-into-macro-aws.md

Load a CSV file from S3 into a Databricks Delta table using the `databricks_copy_into` macro. This example specifies temporary AWS credentials, file format options like header and schema merging, and copy options.

```sql
dbt run-operation databricks_copy_into --args "
target_table: <target_table>
source: 's3://<your_bucket>/path'
file_format: csv
source_credential:
  AWS_ACCESS_KEY: '<temp_access_key_id>'
  AWS_SECRET_KEY: '<temp_secret_access_key>'
  AWS_SESSION_TOKEN: '<temp_session_token>'
format_options:
  mergeSchema: 'true'
  header: true
copy_options:
  mergeSchema: 'true'
"
```

--------------------------------

### Use Capability System for Databricks Version Checks

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

Prefer using the adapter's capability system over direct version comparisons for checking Databricks features. This ensures compatibility across different Databricks Runtime versions.

```python
if adapter.has_capability(DBRCapability.COMMENT_ON_COLUMN):
```

```jinja
{% if adapter.has_dbr_capability('comment_on_column') %}
```

--------------------------------

### Run Linters and Formatters with Tox

Source: https://github.com/databricks/dbt-databricks/blob/main/CONTRIBUTING.MD

Execute all linting and static type checks using Tox. This command should be run before committing changes to ensure code quality and style adherence.

```shell
tox -e linter
```

--------------------------------

### Run Integration Tests with SQL Warehouse (UC)

Source: https://github.com/databricks/dbt-databricks/blob/main/CONTRIBUTING.MD

Execute integration tests using a SQL Warehouse with Unity Catalog enabled. This requires a Databricks account and specific compute resources.

```bash
tox -e integration-databricks-uc-sql-endpoint
```

--------------------------------

### Run Code Quality Checks

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

Executes code formatting, linting, and type-checking using Hatch. This command should be run before committing changes.

```bash
hatch run code-quality
```

--------------------------------

### SQL Comparison and Assertion Methods

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Explains three methods for comparing generated SQL: exact string comparison (preferred for handling differences), manual normalization for debugging, and partial matching for testing specific elements.

```python
def test_sql_comparison_methods(self, template_bundle):
    result = self.run_macro(template_bundle.template, "create_table",
                           template_bundle.relation)

    # Method 1: Exact string comparison (after normalization) - PREFERRED
    # Use for most tests - handles whitespace, case, and formatting differences
    # Provides clear error messages showing exactly what differs
    expected = "create table `some_database`.`some_schema`.`some_table` as (select 1)"
    self.assert_sql_equal(result, expected)

    # Method 2: Manual normalization for debugging
    # Use when assert_sql_equal fails and you need to see the normalized versions
    # Helpful for understanding why a test is failing
    clean_result = self.clean_sql(result)
    clean_expected = self.clean_sql(expected)
    assert clean_result == clean_expected, f"Expected: {clean_expected}, Got: {clean_result}"

    # Method 3: Partial matching for complex queries
    # Use when testing specific SQL elements without caring about exact formatting
    # Good for testing that required elements are present
    assert "`some_database`.`some_schema`.`some_table`" in result
    assert "create table" in result.lower()

    # Method 4: Complex validation
    # Use for testing multiple specific requirements
    lines = result.split('\n')
    assert any('CREATE TABLE' in line.upper() for line in lines)
    assert any('TBLPROPERTIES' in line.upper() for line in lines)
```

--------------------------------

### Run dbt-databricks Tests with Hatch

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Use hatch commands to specify the correct test profile for different Databricks environments when running tests from the command line.

```bash
hatch run cluster-e2e-dev    # Tests with databricks_cluster
```

```bash
hatch run uc-cluster-e2e-dev # Tests with databricks_uc_cluster (CI uses sharded prepare-shards path)
```

```bash
hatch run sqlw-e2e-dev       # Tests with databricks_uc_sql_endpoint (CI uses sharded prepare-shards path)
```

--------------------------------

### Hatch Environment Management Commands

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

Provides commands for managing development environments with Hatch, including creation, removal, and entering the environment shell.

```bash
# Environment management
hatch env create
hatch env remove
hatch shell
```

--------------------------------

### Troubleshoot Hatch Environments

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

If Hatch is not reflecting changes in pyproject.toml, prune and recreate the environments to resolve issues.

```bash
hatch env prune
```

```bash
hatch env create
```

--------------------------------

### Run Relevant Tests

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

Executes unit tests and functional tests as needed. Unit tests should always be run, while functional tests are for adapter changes.

```bash
hatch run unit

hatch run cluster-e2e-dev
```

--------------------------------

### Run Integration Tests with All-Purpose Cluster

Source: https://github.com/databricks/dbt-databricks/blob/main/CONTRIBUTING.MD

Execute integration tests using a standard all-purpose cluster. This requires a Databricks account and specific compute resources.

```bash
tox -e integration-databricks-cluster
```

--------------------------------

### V2 Table Flow Diagram

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/table_flow.md

Illustrates the V2 table flow, featuring more detailed steps like intermediate materialization, staging tables, and specific constraint/tagging applications.

```mermaid
flowchart LR
    PRE[Run pre-hooks]
    CLEANUP[Remove existing staging]
    INT[Create intermediate materialization of model via SQL]
    INTPY[Create intermediate materialization of model via Python]
    STAGE[Create staging table by model schema]
    FINAL[Create or replace target table by model schema]
    DROP[Drop existing relation]
    CHECK[Add check constraints via alter to target table]
    CHECK2[Add check constraints via alter to staging table]
    INSERT[Insert intermediate materialization into target table]
    INSERT2[Insert intermediate materialization into staging table]
    RENAME[Rename existing to backup]
    RENAME2[Rename staging to target]
    DROP2[Drop backup]
    TAGS[Apply tags via alter to target]
    TAGS2[Apply tags via alter to staging]
    GRANTS[Apply grants]
    OPT[Run optimize]
    POST[Run post-hooks]
    F1{{Flag: CanRename}}
    D0{Language?}
    D1{Existing relation?}
    D2{Replaceable?}
    CLEANUP-->PRE
    PRE-->D0
    D0--SQL-->INT
    D0--Python-->INTPY
    INT-->D1
    INTPY-->D1
    D1--yes-->F1
    D1--"no"-->FINAL
    F1--yes-->STAGE
    F1--"no"-->D2
    D2--"yes"-->FINAL
    D2--"no"-->DROP
    DROP-->FINAL
    FINAL-->CHECK
    STAGE-->CHECK2
    CHECK-->TAGS
    TAGS-->INSERT
    CHECK2-->TAGS2
    TAGS2-->INSERT2
    INSERT2-->RENAME
    RENAME-->RENAME2
    RENAME2-->DROP2
    DROP2-->GRANTS
    INSERT-->GRANTS
    GRANTS-->OPT
    OPT-->POST
```

--------------------------------

### Check dbt-databricks Version

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

Display the current version of the dbt-databricks project managed by Hatch.

```bash
hatch version
```

--------------------------------

### Check Databricks Runtime Capabilities in Python

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

Demonstrates how to check for specific Databricks Runtime (DBR) capabilities in Python code using the adapter's `has_capability` method.

```python
# In Python code
if adapter.has_capability(DBRCapability.ICEBERG):
    # Use Iceberg features

```

--------------------------------

### Essential Hatch Commands for Development

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

Runs code quality checks, unit tests, and functional tests using Hatch. These commands ensure code integrity and functionality.

```bash
hatch run code-quality

hatch run unit

hatch run cluster-e2e-dev
```

--------------------------------

### Optional Macro Test Configuration

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Provides optional fixtures for loading additional Databricks or Spark templates, useful for macro dependencies.

```python
@pytest.fixture(scope="class")
def databricks_template_names(self) -> list:
    """Load additional Databricks templates for macro dependencies"""
    return ["adapters.sql", "relations/table.sql"]
```

```python
@pytest.fixture(scope="class")
def spark_template_names(self) -> list:
    """Load Spark templates when inheriting from dbt-spark macros"""
    return ["adapters.sql"]
```

--------------------------------

### Generate HTML Coverage Report

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Run the 'unit-with-cov' hatch command to generate an HTML report detailing test coverage. This report helps in identifying areas of the codebase that are not adequately tested.

```bash
hatch run unit-with-cov
# Opens HTML coverage report
```

--------------------------------

### Run dbt-databricks Development Commands

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

Execute common development tasks for the dbt-databricks adapter, including code quality checks, unit tests, and functional tests. For specific tests, use pytest directly.

```bash
hatch run code-quality           # Format, lint, type-check
hatch run unit                   # Run unit tests
hatch run cluster-e2e-dev        # Run functional tests

# For specific tests, use pytest directly:
hatch run pytest path/to/test_file.py::TestClass::test_method -v
```

--------------------------------

### Databricks Adapter Inheritance Chain

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

Illustrates the inheritance hierarchy of the DatabricksAdapter within the dbt ecosystem.

```text
DatabricksAdapter (impl.py)
  ↳ SparkAdapter (from dbt-spark)
    ↳ SQLAdapter (from dbt-core)
      ↳ BaseAdapter (from dbt-core)

```

--------------------------------

### Test Table Constraints with Arbitrary SQL

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

This test verifies table constraints by creating a table with constraints and querying the information schema to confirm their presence.

```python
    def test_table_constraints(self, project):
        # Create a table with constraints
        project.run_sql("""
            CREATE TABLE test_constraints (
                id INT NOT NULL,
                name STRING
            ) USING DELTA
        """)

        # Query information schema to verify constraints
        constraints = project.run_sql(
            f"""
            SELECT constraint_name, constraint_type
            FROM information_schema.table_constraints
            WHERE table_schema = '{project.test_schema}'
            AND table_name = 'test_constraints'
            """,
            fetch="all"
        )

        assert len(constraints) > 0
        assert any('NOT NULL' in str(constraint) for constraint in constraints)

```

--------------------------------

### Test SQL Generation Macro

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Test the SQL generation macro using string assertions without needing to mock adapter calls. This verifies that the macro produces the correct SQL string.

```python
def test_metadata_sql_generation(self, template_bundle):
    """Test SQL generation without mocking adapter calls"""
    result = self.run_macro(template_bundle.template, "get_table_metadata_sql",
                           template_bundle.relation)
    expected = "describe table extended `some_database`.`some_schema`.`some_table`"
    self.assert_sql_equal(result, expected)
```

--------------------------------

### Define SQL and YAML Fixtures for dbt Tests

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

Use a Python file to define SQL models and YAML schemas for dbt functional tests. This promotes reusability and keeps tests clean.

```python
my_model_sql = """
{{ config(materialized='incremental', unique_key='id') }}
select 1 as id, 'test' as name
"""

my_schema_yml = """
version: 2
models:
  - name: my_model
    columns:
      - name: id
        description: 'ID column'
"""

```

--------------------------------

### Define Catalog for a Source in dbt

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/uc.md

Specify the Unity Catalog for a dbt source by using the 'catalog' keyword. Ensure schema is defined separately.

```yaml
version: 2

sources:
  - name: jaffle_shop
    catalog: ecommerce
    schema: product
    tables:
      - name: items
      - name: users
```

--------------------------------

### V2 View Flow Diagram

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/view_flow.md

Details the V2 view flow, which includes altering existing views, creating new views, applying grants, and running post-hooks. It also highlights the option to use a 'replace flow' for certain scenarios.

```mermaid
flowchart LR
    PRE[Run pre-hooks]
    ALTER[Alter existing view]
    CREATE[Create view]
    GRANTS[Apply grants]
    TAGS[Apply tags via alter]
    POST[Run post-hooks]
    REPLACE_FLOW[Use replace flow]
    D1{Existing relation?}
    D2{{Existing relation is view and update_via_alter?}}
    D4{Matches project definition?}
    PRE-->D1
    D1--yes-->D2
    D1--"no"-->CREATE
    D2--yes-->D4
    D2--"no"-->REPLACE_FLOW
    REPLACE_FLOW-->TAGS
    D4--yes-->GRANTS
    D4--"no"-->ALTER
    ALTER-->GRANTS
    CREATE-->TAGS
    TAGS-->GRANTS
    GRANTS-->POST
```

--------------------------------

### Define SQL and CSV Fixtures Separately

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

This pattern separates SQL and CSV content into dedicated fixture files for improved readability and reusability across multiple test files.

```python
# tests/functional/adapter/incremental/fixtures.py
incremental_model_sql = """
{{ config(materialized='incremental', unique_key='id') }}
select 1 as id, 'Alice' as name, 'active' as status
union all
select 2 as id, 'Bob' as name, 'inactive' as status
union all
select 3 as id, 'Charlie' as name, 'active' as status
"""

expected_result_csv = """id,name,status
1,Alice,active
2,Bob,inactive
3,Charlie,active"""

```

```python
# tests/functional/adapter/incremental/test_incremental.py
from tests.functional.adapter.incremental import fixtures

class TestIncrementalModel:
    @pytest.fixture(scope="class")
    def seeds(self):
        return {"expected_result.csv": fixtures.expected_result_csv}

    @pytest.fixture(scope="class")
    def models(self):
        return {"incremental_model.sql": fixtures.incremental_model_sql}

    def test_incremental_behavior(self, project):
        # Test implementation same as above

```

--------------------------------

### Test Macro Execution Logic

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Test the execution macro by mocking the `run_query` function to verify that it correctly processes and returns results. This focuses on the logic within the execution macro.

```python
def test_metadata_execution(self, template_bundle):
    """Test execution logic with minimal mocking"""
    template_bundle.context["run_query"] = Mock(
        return_value=Mock(table=[["col1", "string"], ["col2", "int"]])
    )
    result = self.run_macro(template_bundle.template, "get_table_metadata",
                           template_bundle.relation)
    # Test that it returns the table results correctly
    assert len(result) == 2
```

--------------------------------

### Check Databricks Runtime Capabilities in Jinja Macros

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

Shows how to conditionally execute SQL based on Databricks Runtime (DBR) capabilities within Jinja macros.

```jinja
{% if adapter.has_dbr_capability('comment_on_column') %}
    COMMENT ON COLUMN ...
{% else %}
    ALTER TABLE ... ALTER COLUMN ...
{% endif %}

{% if adapter.has_dbr_capability('insert_by_name') %}
    INSERT INTO table BY NAME SELECT ...
{% else %}
    INSERT INTO table SELECT ... -- positional
{% endif %}

```

--------------------------------

### Manipulate Model Configuration for Macro Testing

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Use this when testing macros that read model configuration or adapter settings. It allows simulating different table properties, storage formats, or clustering configurations.

```python
def test_macro_with_config(self, config, template_bundle):
    """Test how macros respond to different model configurations"""
    config["tblproperties"] = {"key": "value"}        # Model-level table properties
    config["file_format"] = "delta"                   # Storage format setting
    config["liquid_clustering"] = ["col1", "col2"]    # Clustering configuration

    result = self.run_macro(template_bundle.template, "create_table_as",
                           template_bundle.relation, "select 1")
    # Verify the generated SQL includes the configuration
    assert "tblproperties" in result
    assert "delta" in result.lower()
```

--------------------------------

### Test Materialization Macro

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Pattern for testing materialization macros. This involves setting configuration options like 'materialized' and 'file_format' before rendering the macro.

```python
# Pattern 1: Testing materialization macros
def test_materialization_macro(self, config, template_bundle):
    config["materialized"] = "table"
    config["file_format"] = "delta"
    result = self.render_bundle(template_bundle, "materialization_macro")
```

--------------------------------

### Create Temporary Table for Testing

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Use the project's test schema to create temporary tables for testing purposes. This ensures that test data does not persist and is isolated to the test run.

```python
def test_creates_temp_table(self, project):
    table_name = f"{project.database}.{project.test_schema}.temp_table"
    project.run_sql(f"CREATE TABLE {table_name} (id INT)")
    # Test logic
```

--------------------------------

### Run dbt-databricks Functional Tests with Hatch

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Executes functional (integration) tests against specific Databricks environments for local development. These tests require a connection to a Databricks cluster or SQL Warehouse.

```bash
# Functional tests by environment (local development)
hatch run cluster-e2e-dev       # HMS cluster (CI runs this sharded)
hatch run uc-cluster-e2e-dev    # Unity Catalog cluster (CI runs this sharded)
hatch run sqlw-e2e-dev          # SQL Warehouse (CI runs this sharded)
```

--------------------------------

### Run dbt-databricks Unit Tests Across Python Versions with Hatch

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Executes unit tests across multiple Python versions (3.9-3.12) to ensure compatibility. This command is typically used to verify broader Python environment support.

```bash
# Unit tests across Python versions (for compatibility)
hatch run test:unit             # All Python versions (3.9-3.12)
```

--------------------------------

### Hatch Testing Commands

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/dbt-databricks-dev.md

Commands for running various types of tests using Hatch, including unit tests across different Python versions and functional tests for different environments.

```bash
# Testing
hatch run unit
hatch run test:unit
hatch run cluster-e2e-dev
hatch run uc-cluster-e2e-dev
hatch run sqlw-e2e-dev
```

--------------------------------

### Test Temporary AWS Credentials

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-copy-into-macro-aws.md

Verify that your temporary AWS credentials are valid by listing objects in a specified S3 bucket and path. Replace placeholders with your temporary credentials and bucket details.

```bash
$ AWS_ACCESS_KEY_ID=<temp_access_key_id> AWS_SECRET_ACCESS_KEY=<temp_secret_access_key> AWS_SESSION_TOKEN=<temp_session_token> aws s3 ls s3://<your_bucket>/path
```

--------------------------------

### Test Adapter Dispatch Macro

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Pattern for testing adapter-specific macros. This involves calling a macro that is expected to be dispatched by the adapter.

```python
# Pattern 2: Testing adapter dispatch
def test_adapter_specific_macro(self, template_bundle):
    result = self.run_macro(template_bundle.template, "databricks__specific_macro")
```

--------------------------------

### Test Macros with Database Operations

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Test macros that interact with the database during rendering by mocking adapter calls. This allows testing macro logic without actual database execution.

```python
class TestAdvancedMacroPatterns(MacroTestBase):
    @pytest.fixture(scope="class")
    def template_name(self) -> str:
        return "adapters.sql"

    @pytest.fixture(scope="class")
    def databricks_template_names(self) -> list:
        # Load dependencies when testing complex macros that call other macros
        # Essential for materialization macros that use table/view creation macros
        return ["relations/table.sql", "materializations/table.sql"]

    @pytest.fixture(autouse=True, scope="function")
    def setup_mocks(self, context):
        """Auto-setup common mocks for all tests in this class

        Use autouse fixtures when all tests in a class need the same mocks.
        Reduces repetition and ensures consistent test setup.
        """
        context["adapter"].get_columns_in_relation = Mock(return_value=[])
        context["load_result"] = Mock(return_value={"data": []})

    def test_macro_with_database_operations(self, template_bundle):
        """Test macros that make adapter calls during rendering

        Some macros call adapter.run_query() to get metadata or check table state.
        Mock these calls to test the macro logic without database dependencies.
        """
        # Mock adapter calls that the macro will make
        template_bundle.context["adapter"].run_query = Mock(
            return_value=Mock(table=[["result1"], ["result2"]])
        )

        result = self.run_macro(template_bundle.template, "get_table_metadata",
                               template_bundle.relation)

        # Verify both the SQL generated and adapter calls made
        expected_sql = "describe table extended `some_database`.`some_schema`.`some_table`"
        self.assert_sql_equal(result, expected_sql)

        # Verify adapter was called correctly (important for side-effect testing)
        template_bundle.context["adapter"].run_query.assert_called_once()
```

--------------------------------

### Upgrade dbt-databricks Adapter

Source: https://github.com/databricks/dbt-databricks/blob/main/README.md

Upgrade to the latest version of the dbt-databricks adapter using pip.

```bash
pip install --upgrade dbt-databricks
```

--------------------------------

### Jinja Macro for Multi-Statement SQL

Source: https://github.com/databricks/dbt-databricks/blob/main/AGENTS.md

Use this pattern to define a macro that returns a list of SQL statements for multi-statement execution. The incremental materialization automatically handles lists of statements.

```jinja
{% macro my_multi_statement_strategy(args) %}
  {%- set statements = [] -%}
  
  {#-- Build first statement --#}
  {%- set statement1 -%}
    DELETE FROM {{ target_relation }}
    WHERE some_condition
  {%- endset -%}
  {%- do statements.append(statement1) -%}
  
  {#-- Build second statement --#}
  {%- set statement2 -%}
    INSERT INTO {{ target_relation }}
    SELECT * FROM {{ source_relation }}
  {%- endset -%}
  {%- do statements.append(statement2) -%}
  
  {{- return(statements) -}}
{% endmacro %}
```

--------------------------------

### dbt-databricks Test Directory Structure

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Illustrates the standard organization for unit and functional tests within the dbt-databricks project. Includes conftest.py for global pytest configuration and profiles.py for test profile settings.

```directory
tests/
├── unit/                     # Unit tests (no external dependencies)
│   ├── api_client/           # API client functionality
│   ├── relation_configs/     # Relation configuration logic
│   ├── python/               # Python model handling
│   ├── macros/               # Macro testing framework
│   └── test_adapter.py       # Core adapter functionality
├── functional/               # Integration tests (requires Databricks)
│   └── adapter/              # Adapter-specific functionality
│       ├── basic/            # Basic model operations
│       ├── incremental/      # Incremental materialization
│       ├── python_model/     # Python model execution
│       ├── streaming_tables/ # Streaming table features
│       └── ...               # Feature-specific test suites
├── conftest.py               # Global pytest configuration, including the default compute used when running tests in your IDE
└── profiles.py               # Test profile configurations
```

--------------------------------

### Mock External Dependencies for Macro Testing

Source: https://github.com/databricks/dbt-databricks/blob/main/docs/testing.md

Use this when testing macros that call other dbt functions or adapter methods. It isolates the macro under test from external dependencies by mocking responses.

```python
def test_macro_with_context_mocks(self, template_bundle):
    """Mock external dependencies to isolate macro behavior"""
    # Mock dbt built-in functions that the macro calls
    template_bundle.context["get_columns_in_query"] = Mock(return_value=[])
    template_bundle.context["column_mask_exists"] = Mock(return_value=False)

    # Mock adapter methods to simulate database state
    template_bundle.context["adapter"].get_relation = Mock(return_value=None)
    template_bundle.context["adapter"].run_query = Mock(return_value=Mock(table=[]))

    result = self.run_macro(template_bundle.template, "my_macro",
                           template_bundle.relation)
    # Test that the macro handles the mocked responses correctly
```