### Install and Start DataJunction UI

Source: https://github.com/datajunction/dj/blob/main/datajunction-ui/README.md

Install project dependencies and start the DataJunction UI development server.

```bash
yarn install
yarn start
```

--------------------------------

### Start JupyterLab Server

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/running-locally.md

Navigate to the notebooks directory and start the JupyterLab server to access example notebooks.

```sh
cd notebooks
jupyter lab
```

--------------------------------

### Install JupyterLab

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/running-locally.md

Install JupyterLab using pip, which is required to run the example notebooks.

```sh
pip install jupyterlab
```

--------------------------------

### Clone Repository and Start Docker Compose

Source: https://github.com/datajunction/dj/blob/main/datajunction-query/README.rst

Clone the DJQS repository and start the Docker Compose environment to get started.

```sh
git clone https://github.com/DataJunction/djqs
cd djqs
docker compose up
```

--------------------------------

### Install DJ from Source

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

Install the DataJunction library from its source code.

```sh
pip install .
```

--------------------------------

### Launch Default Docker Compose Environment

Source: https://github.com/datajunction/dj/blob/main/datajunction-server/README.md

Starts the DataJunction UI with a minimal backend. Use this for basic setup.

```sh
docker compose up
```

--------------------------------

### Start Docker Compose Demo Environment

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/running-locally.md

Start the full DataJunction demo environment, including the UI, API, reflection service, query service, and JupyterLab.
```sh
cd dj/
docker compose --profile demo up
```

--------------------------------

### GraphQL GET Endpoint Example Response

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of a successful response from the GET /graphql endpoint, which serves the GraphiQL IDE.

```json
null
```

--------------------------------

### Get Node Example Response

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

This is an example of a successful response when retrieving node information using the GET /nodes/{name} endpoint. It details the structure and fields returned for a node.

```json
{ "namespace": "string", "node_revision_id": 0, "node_id": 0, "type": "source", "name": "string", "display_name": "string", "version": "string", "status": "valid", "mode": "published", "catalog": { "name": "string", "engines": [] }, "schema_": "string", "table": "string", "description": "", "query": "string", "availability": { "min_temporal_partition": [], "max_temporal_partition": [], "catalog": "string", "schema_": "string", "table": "string", "valid_through_ts": 0, "url": "string", "categorical_partitions": [], "temporal_partitions": [], "partitions": [] }, "columns": [ { "name": "string", "display_name": "string", "type": "string", "attributes": [ { "attribute_type": { "namespace": "string", "name": "string" } } ], "dimension": { "name": "string" }, "partition": { "type_": "temporal", "format": "string", "granularity": "string", "expression": "string" } } ], "updated_at": "2019-08-24T14:15:22Z", "materializations": [ { "name": "string", "config": {}, "schedule": "string", "job": "string", "backfills": [ { "spec": { "column_name": "string", "values": [ null ], "range": [ null ] }, "urls": [ "string" ] } ], "strategy": "string" } ], "parents": [ { "name": "string" } ], "metric_metadata": { "direction": "higher_is_better", "unit": { "name":
"string", "label": "string", "category": "string", "abbreviation": "string", "description": "string" } }, "dimension_links": [ { "dimension": { "name": "string" }, "join_type": "left", "join_sql": "string", "join_cardinality": "one_to_one", "role": "string", "foreign_keys": { "property1": "string", "property2": "string" } } ], "created_at": "2019-08-24T14:15:22Z", "tags": [], "current_version": "string", "missing_table": false }
```

--------------------------------

### DJ Server Environment Variables Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/deploying-dj/running-a-dj-server.md

Example of a .env file for configuring a DJ server. Demonstrates setting nested configuration options using double underscores.

```bash
QUERY_SERVICE=http://djqs:8001
SECRET=a-fake-secretkey
NODE_LIST_MAX=10000

# Writer DB (required)
WRITER_DB__URI=postgresql+psycopg://dj:dj@postgres_metadata:5432/dj
WRITER_DB__POOL_SIZE=20
WRITER_DB__MAX_OVERFLOW=20
WRITER_DB__POOL_TIMEOUT=10
WRITER_DB__CONNECT_TIMEOUT=5
WRITER_DB__POOL_PRE_PING=true
WRITER_DB__ECHO=false
WRITER_DB__KEEPALIVES=1
WRITER_DB__KEEPALIVES_IDLE=30
WRITER_DB__KEEPALIVES_INTERVAL=10
WRITER_DB__KEEPALIVES_COUNT=5

# Reader DB (optional)
READER_DB__URI=postgresql+psycopg://dj:dj@postgres_metadata:5432/dj
READER_DB__POOL_SIZE=10
READER_DB__MAX_OVERFLOW=10
READER_DB__POOL_TIMEOUT=5
READER_DB__CONNECT_TIMEOUT=5
READER_DB__POOL_PRE_PING=true
READER_DB__ECHO=false
READER_DB__KEEPALIVES=1
READER_DB__KEEPALIVES_IDLE=30
READER_DB__KEEPALIVES_INTERVAL=10
READER_DB__KEEPALIVES_COUNT=5
```

--------------------------------

### Real-Time Deployment Output Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/data-modeling/yaml.md

This example shows the typical real-time output observed in the terminal when running the `dj push` command. It includes a status table with UUID, Status, and Progress.
```text
Pushing project from: ./my-project

┌─────────────────────────────────────────────────────────────┐
│                      Deployment Status                      │
├──────────────────────┬─────────────┬────────────────────────┤
│ UUID                 │ Status      │ Progress               │
├──────────────────────┼─────────────┼────────────────────────┤
│ abc123-def456-...    │ RUNNING     │ Deploying nodes...     │
└──────────────────────┴─────────────┴────────────────────────┘

Deployment finished: SUCCESS
```

--------------------------------

### Install DataJunction with MCP Extra

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/getting-started/ai-assistants.md

Install the DataJunction Python client with the MCP extra to enable integration with AI assistants.

```bash
pip install datajunction[mcp]
```

--------------------------------

### Build Specific Documentation Version

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

Example of building the documentation for version 0.1.2 and setting it as the latest.

```sh
./build-docs.sh 0.1.2 true
Start building sites …
Total in 2496 ms
```

--------------------------------

### GenericMaterializationConfigInput Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the GenericMaterializationConfigInput schema, used for materialization configurations.

```json
{ "spark": {}, "lookback_window": "string" }
```

--------------------------------

### Get Help with DataJunction CLI Commands

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/getting-started/using-the-cli.md

Display all available CLI commands or get detailed help for a specific command.
```bash
# Show all available commands
dj --help

# Show help for specific command
dj push --help
dj delete-node --help
```

--------------------------------

### Example Dev Release Output (Python Client)

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/releasing-new-version.md

This example shows the output of `make dev-release` for the Python client, including versioning, build artifacts, and publish success messages.

```shell
% make dev-release
hatch version dev
Old: 0.0.1a20
New: 0.0.1a20.dev0
hatch build
[sdist]
dist/datajunction-0.0.1a20.dev0.tar.gz

[wheel]
dist/datajunction-0.0.1a20.dev0-py3-none-any.whl
hatch publish
dist/datajunction-0.0.1a20.dev0.tar.gz ... success
dist/datajunction-0.0.1a20.dev0-py3-none-any.whl ... success

[datajunction]
https://pypi.org/project/datajunction/0.0.1a20.dev0/
```

--------------------------------

### GenericCubeConfigInput Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the GenericCubeConfigInput schema, used for configuring cubes.

```json
{ "spark": {}, "lookback_window": "string", "dimensions": [ "string" ], "measures": { "property1": { "metric": "string", "measures": [ { "name": "string", "field_name": "string", "agg": "string", "type": "string" } ], "combiner": "string" }, "property2": { "metric": "string", "measures": [ { "name": "string", "field_name": "string", "agg": "string", "type": "string" } ], "combiner": "string" } }, "metrics": [ { "name": "string", "type": "string", "column": "string", "node": "string", "semantic_entity": "string", "semantic_type": "string" } ] }
```

--------------------------------

### List Catalogs Response Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

This is an example of a successful response when listing available catalogs.
It shows the expected structure for catalog information.

```json
[ { "name": "string", "engines": [] } ]
```

--------------------------------

### Install DataJunction JavaScript Client (CommonJS)

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/clients.md

Install the DataJunction JavaScript client using npm for use in Node.js projects.

```sh
npm install datajunction
```

--------------------------------

### Build Documentation Version (Not Latest)

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

Example of building the documentation for version 0.1.1 without setting it as the latest.

```sh
./build-docs.sh 0.1.1 false
Start building sites …
Total in 2472 ms
```

--------------------------------

### Install DataJunction Python Client

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/clients.md

Install the DataJunction Python client using pip. This is the first step to using the client library in your Python projects.

```sh
pip install datajunction
```

--------------------------------

### Custom Cache Implementation Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/deploying-dj/caching.md

An example of a custom cache class `MyCustomCache` that extends `CacheInterface`. It includes placeholders for custom initialization and caching logic.

```python
from typing import Any, Optional

from fastapi import Request

# Assumption: CacheInterface is importable from the sibling `interface` module;
# adjust this path if your datajunction_server version places it elsewhere.
from datajunction_server.internal.caching.interface import CacheInterface
from datajunction_server.internal.caching.noop_cache import noop_cache


class MyCustomCache(CacheInterface):
    """A custom cache implementation"""

    def __init__(self):
        # Initialize your custom cache here
        ...

    def get(self, key: str) -> Optional[Any]:
        # Implement the logic to retrieve a cached value
        ...

    def set(self, key: str, value: Any, timeout: int = 300) -> None:
        # Implement the logic to cache a value
        ...

    def delete(self, key: str) -> None:
        # Implement the logic to delete a cache key
        ...


def get_cache(request: Request) -> Optional[CacheInterface]:
    """Dependency for retrieving a custom cache implementation"""
    # Bypass the custom cache when the client sends Cache-Control: no-cache
    cache_control = request.headers.get("Cache-Control", "")
    skip_cache = "no-cache" in cache_control
    return noop_cache if skip_cache else MyCustomCache()
```

--------------------------------

### Build Older Documentation Version

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

Example of building the documentation for an older version, 0.1.0, without setting it as the latest.

```sh
./build-docs.sh 0.1.0 false
Start building sites …
Total in 2549 ms
```

--------------------------------

### Dimension Link YAML Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/data-modeling/yaml.md

Example demonstrating both join and reference type dimension links within a node's YAML configuration. Includes optional fields like default_value for join links.

```yaml
description: Hard hat dimension
display_name: Local Hard Hats
query: ...
primary_key: ...
dimension_links:
  - type: join
    node_column: state_id
    dimension_node: ${prefix}roads.us_state
    default_value: Unknown  # Optional: fallback for NULL values from LEFT JOIN
  - type: reference
    node_column: birth_date
    dimension: ${prefix}roads.date_dim.dateint
    role: birth_date
```

--------------------------------

### Add Engines To Catalog Request Body Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

Example of the request body format for adding one or more engines to a catalog. Each engine object requires a name, version, URI, and dialect.
```json
[ { "name": "string", "version": "string", "uri": "string", "dialect": "spark" } ]
```

--------------------------------

### Clone DataJunction Demo Notebooks Repository

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/running-locally.md

Clone the DataJunction demo repository to access example notebooks.

```sh
git clone git@github.com:DataJunction/dj-demo.git
```

--------------------------------

### Column Partition Configuration

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/data-modeling/yaml.md

Example of configuring partition settings for a column, including format, granularity, and type.

```yaml
columns:
  - name: utc_date
    partition:
      format: yyyyMMdd
      granularity: day
      type_: temporal
```

--------------------------------

### Install DataJunction Claude Integration

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/README.md

Installs the DataJunction skill and configures the Claude Code integration, including the DJ MCP server.

```bash
dj setup-claude
```

--------------------------------

### Add Catalog Request Body Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

This is an example of the request body structure required to add a new catalog. It includes the name and engines for the catalog.

```json
{ "name": "string", "engines": [] }
```

--------------------------------

### Launch Full Docker Compose Environment with Demo Profile

Source: https://github.com/datajunction/dj/blob/main/datajunction-server/README.md

Starts the full suite of DataJunction services, including query and reflection services. Use the 'demo' profile for comprehensive testing.
```sh
docker compose --profile demo up
```

--------------------------------

### Install Only DataJunction Skill

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/README.md

Installs only the DataJunction skill for Claude Code, skipping the MCP server setup.

```bash
dj setup-claude --no-mcp
```

--------------------------------

### Push with Real-Time Output Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/data-modeling/yaml.md

This command pushes project files and demonstrates the real-time output you can expect during the deployment process. It helps in monitoring progress and identifying issues.

```sh
dj push ./example_project --namespace my.namespace
```

--------------------------------

### List Tags API Response

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

Example response for the GET /tags endpoint, detailing the structure of a successful tag list.

```json
[ { "description": "string", "display_name": "string", "tag_metadata": {}, "name": "string", "tag_type": "string" } ]
```

--------------------------------

### Example 200 Response for Get Data

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

Illustrates the structure of a successful response when retrieving data via a DJ SQL query. Includes query details, execution times, results, and pagination links.
```json
{ "id": "string", "engine_name": "string", "engine_version": "string", "submitted_query": "string", "executed_query": "string", "scheduled": "2019-08-24T14:15:22Z", "started": "2019-08-24T14:15:22Z", "finished": "2019-08-24T14:15:22Z", "state": "UNKNOWN", "progress": 0, "output_table": { "catalog": "string", "schema": "string", "table": "string" }, "results": [ { "sql": "string", "columns": [ { "name": "string", "type": "string", "column": "string", "node": "string", "semantic_entity": "string", "semantic_type": "string" } ], "rows": [ [ null ] ], "row_count": 0 } ], "next": "http://example.com", "previous": "http://example.com", "errors": [ "string" ], "links": [ "http://example.com" ] }
```

--------------------------------

### Get Cube Information (JSON Response)

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

This is an example of a successful response when retrieving information about a specific cube using the DataJunction API. It details the cube's metadata, elements, dimensions, metrics, and materialization configurations.
```json
{ "node_revision_id": 0, "node_id": 0, "type": "source", "name": "string", "display_name": "string", "version": "string", "status": "valid", "mode": "published", "description": "", "availability": { "min_temporal_partition": [], "max_temporal_partition": [], "catalog": "string", "schema_": "string", "table": "string", "valid_through_ts": 0, "url": "string", "categorical_partitions": [], "temporal_partitions": [], "partitions": [] }, "cube_elements": [ { "name": "string", "display_name": "string", "node_name": "string", "type": "string", "partition": { "type_": "temporal", "format": "string", "granularity": "string", "expression": "string" } } ], "cube_node_metrics": [ "string" ], "cube_node_dimensions": [ "string" ], "query": "string", "columns": [ { "name": "string", "display_name": "string", "type": "string", "attributes": [ { "attribute_type": { "namespace": "string", "name": "string" } } ], "dimension": { "name": "string" }, "partition": { "type_": "temporal", "format": "string", "granularity": "string", "expression": "string" } } ], "updated_at": "2019-08-24T14:15:22Z", "materializations": [ { "name": "string", "config": {}, "schedule": "string", "job": "string", "backfills": [ { "spec": { "column_name": "string", "values": [ null ], "range": [ null ] }, "urls": [ "string" ] } ], "strategy": "string" } ], "tags": [ { "description": "string", "display_name": "string", "tag_metadata": {}, "name": "string", "tag_type": "string" } ] }
```

--------------------------------

### Complete Development Workflow Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/data-modeling/yaml.md

This sequence demonstrates a typical development workflow: pulling existing data, making changes, performing a dry run validation, pushing to a development namespace, and finally deploying to production.

```sh
# 1. Export existing namespace to get started
dj pull production.analytics ./my-project

# 2. Make changes to YAML files
# ... edit files ...

# 3. Validate changes with dry run
dj deploy ./my-project --dryrun

# 4. Deploy to development namespace first
dj push ./my-project --namespace development.analytics

# 5. After testing, deploy to production
dj push ./my-project --namespace production.analytics
```

--------------------------------

### Install Only MCP Server

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/README.md

Installs only the DJ MCP server for Claude Code integration, skipping the skill installation.

```bash
dj setup-claude --no-skills
```

--------------------------------

### ErrorCode Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the ErrorCode schema, which is an integer.

```json
0
```

--------------------------------

### Launch Hugo Server Locally

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

Navigate to the docs directory and launch the Hugo server to preview the DataJunction specification page.

```sh
cd docs
hugo serve --contentDir=content/0.1.0
```

--------------------------------

### MaterializationJobTypeEnum Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of a MaterializationJobTypeEnum, detailing a Spark SQL materialization job.

```json
{ "name": "spark_sql", "label": "Spark SQL", "description": "Spark SQL materialization job", "allowed_node_types": [ "transform", "dimension", "cube" ], "job_class": "SparkSqlMaterializationJob" }
```

--------------------------------

### HealthCheck Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the HealthCheck schema, indicating the status of a service.
```json
{ "name": "string", "status": "ok" }
```

--------------------------------

### Start Docker Environment with Trino

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/getting-started/dj-and-trino.md

Use Docker Compose to launch the DataJunction query service and Trino containers. The `--profile demo` flag is required for the query service, and `--profile trino` is required for Trino.

```bash
docker compose --profile demo --profile trino up
```

--------------------------------

### HTTPValidationError Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the HTTPValidationError schema, used for validation errors.

```json
{ "detail": [ { "loc": [ "string" ], "msg": "string", "type": "string" } ] }
```

--------------------------------

### Initialize DJ Client and Login

Source: https://github.com/datajunction/dj/blob/main/datajunction-server/tests/api/files/client_test/include_client_setup.txt

Initializes the DataJunction client with a given URL and performs basic authentication.

```python
from datajunction import (
    DJBuilder,
    Source,
    Dimension,
    Transform,
    Metric,
    Namespace,
    MetricUnit,
    MetricDirection,
    ColumnAttribute,
)

DJ_URL = "http://test"
dj = DJBuilder(DJ_URL)
dj.basic_login("dj", "dj")
```

--------------------------------

### AttributeOutput Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the AttributeOutput schema, specifying an attribute type.

```json
{ "attribute_type": { "namespace": "string", "name": "string" } }
```

--------------------------------

### Install ANTLR4 Tools

Source: https://github.com/datajunction/dj/blob/main/CONTRIBUTING.rst

Installs the ANTLR generator tool required for generating parsers.
```sh
pip install antlr4-tools
```

--------------------------------

### Import an Existing Project

Source: https://github.com/datajunction/dj/blob/main/notebooks/DJ Projects Tutorial.ipynb

Import a DataJunction project from a specified namespace ('default') into a local directory ('./example_project'). Existing files will be ignored if 'ignore_existing_files' is set to True.

```python
Project.pull(client=dj, namespace="default", target_path="./example_project", ignore_existing_files=True)
```

--------------------------------

### Granularity Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the Granularity schema, representing a time dimension granularity.

```json
"second"
```

--------------------------------

### AttributeTypeIdentifier Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the AttributeTypeIdentifier schema, used for identifying attribute types.

```json
{ "namespace": "system", "name": "string" }
```

--------------------------------

### AttributeTypeBase Schema Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

An example of the AttributeTypeBase schema, defining properties for an attribute type.

```json
{ "namespace": "system", "name": "string", "description": "string", "allowed_node_types": [ "source" ], "uniqueness_scope": [ "node" ], "id": 0 }
```

--------------------------------

### Build Documentation Script Syntax

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

The build script requires a version and an optional flag to set it as the latest. Run without arguments for help.
```sh
./build-docs.sh
```

--------------------------------

### Navigate to Docs Directory

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

Change your current directory to the 'docs' folder within the cloned DJ repository.

```sh
cd dj/docs
```

--------------------------------

### Star Schema Dimensional Modeling Example

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/datajunction/skills/datajunction.md

Illustrates a star schema with a fact table, dimension links to user, product, and date dimensions, and how a metric inherits these links.

```text
[Fact Table: transactions]
├─ dimension_link: user_id → [Dimension: users]
│  ├─ Attributes: country, signup_date, tier
│  │  └─ dimension_link: signup_date → [Dimension: date]
│  │     └─ Attributes: month, quarter, year
│  └─ Attributes: country, signup_date, tier
├─ dimension_link: product_id → [Dimension: products]
│  └─ Attributes: category, price, brand
└─ dimension_link: date → [Dimension: dates]
   └─ Attributes: month, quarter, year

[Metric: revenue]
├─ Defined on: transactions
└─ Auto-inherits dimension links from transactions
   ├─ users.country (via user_id)
   ├─ products.category (via product_id)
   └─ dates.month (via date)
```

--------------------------------

### Example JSON Response for Linking Dimension

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

This is an example of a successful JSON response when linking a dimension to a node. It indicates the operation was successful.

```json
"string"
```

--------------------------------

### Get Data for a Specific Metric

Source: https://github.com/datajunction/dj/blob/main/CONTRIBUTING.rst

Fetch data for a particular metric by its name. This is a basic GET request to the /data/{metric_name}/ endpoint.
```bash
$ curl http://localhost:8000/data/avg_repair_price/ | jq
```

--------------------------------

### Create Base Metrics for Ratio Calculation

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/datajunction/skills/datajunction-api.md

Step 1 of creating a ratio metric: create the individual base metrics (e.g., clicks, impressions) using separate POST requests to the API.

```bash
curl -X POST $DJ_URL/nodes/metric/ -d '{
  "name": "finance.clicks",
  "query": "SELECT COUNT_IF(event = '\''click'\'') FROM finance.events",
  "owners": ["marketing@company.com"],
  "mode": "published"
}'
```

```bash
curl -X POST $DJ_URL/nodes/metric/ -d '{
  "name": "finance.impressions",
  "query": "SELECT COUNT_IF(event = '\''impression'\'') FROM finance.events",
  "owners": ["marketing@company.com"],
  "mode": "published"
}'
```

--------------------------------

### Initialize DJAdmin Client

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/README.md

Instantiate the DJAdmin client to interact with the DataJunction server. Ensure the URL is correct for your environment.

```python
from datajunction import DJAdmin

djadmin = DJAdmin("http://localhost:8000")
```

--------------------------------

### Load and Compile Project

Source: https://github.com/datajunction/dj/blob/main/notebooks/DJ Projects Tutorial.ipynb

Load a DataJunction project from a local directory and compile it. Compilation prepares the project for validation and deployment.

```python
project = Project.load("./example_project")
compiled_project = project.compile()
```

--------------------------------

### DataJunction Node Type Examples

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/datajunction/skills/datajunction.md

Examples of different node types used in DataJunction, including physical tables, transformations, dimensions, metrics, and cubes.
```text
catalog.finance.transactions_table
```

```text
events.clean.user_events
```

```text
core.users
```

```text
core.dates
```

```text
finance.total_revenue
```

```text
finance.revenue_cube
```

--------------------------------

### GraphQL GET Endpoint

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

Handles HTTP GET requests to the /graphql endpoint, typically used for interacting with the GraphiQL IDE or executing simple queries.

```APIDOC
## GET /graphql

### Description
Handles HTTP GET requests to the /graphql endpoint, providing access to the GraphiQL integrated development environment or executing queries via GET.

### Method
GET

### Endpoint
/graphql

### Responses
#### Success Response (200)
- **Description**: The GraphiQL integrated development environment.

#### Error Response (404)
- **Description**: Not found if GraphiQL or query via GET are not enabled.

### Response Example
```json
null
```
```

--------------------------------

### Serve Versions Page Locally

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

Start a local Hugo development server specifically for the 'Versions' page, which acts as a sub-site listing release notes.

```sh
hugo serve --contentDir content/versions/
```

--------------------------------

### List All Metrics (Python)

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/getting-started/listing-metrics.md

Use the DJClient in Python to list all available metrics. Ensure DJ_URL is set.

```python
from datajunction import DJClient

dj = DJClient(DJ_URL)
metrics = dj.metrics()
```

--------------------------------

### Spark Join Strategy Hint Example

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/data-modeling/dimension-links.md

Demonstrates how to apply a single Spark join strategy hint to a dimension link.
This hint is emitted immediately after SELECT using Spark's optimizer hint syntax.

```text
SELECT /*+ BROADCAST(t2) */
  t1.event_secs,
  t2.name AS user_name
FROM events t1
LEFT JOIN user t2 ON t1.user_id = t2.id
```

--------------------------------

### Get Table Columns for Reflection

Source: https://github.com/datajunction/dj/blob/main/datajunction-query/README.rst

Retrieve column names and types for a specific table using the GET /table/{table}/columns/ endpoint. This is useful for reflection services.

```sh
curl -X 'GET' \
  'http://localhost:8001/table/djdb.roads.repair_orders/columns/?engine=sqlalchemy-postgresql&engine_version=15.2' \
  -H 'accept: application/json'
```

--------------------------------

### Example Dimension Nodes ER Diagram

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/data-modeling/dimensions.md

This mermaid diagram illustrates the structure of three example dimension nodes: Country, Dispatcher, and Contractor. It shows their primary keys and attributes.

```mermaid
%%{init: { "theme": "base", 'themeVariables': { 'primaryColor': '#ffefd0'}}}%%
erDiagram
    "default.country | Country" {
        id int PK
        name str
        country_code str
        population long
    }
    "default.dispatcher | Dispatcher" {
        dispatcher_id int PK
        company_name str
        phone str
    }
    "default.contractor | Contractor" {
        contractor_id int PK
        company_name str
        contact_name str
        contact_title str
        address str
        city str
        state str
        postal_code str
        country str
    }
```

--------------------------------

### Initialize DataJunction Full Python Client

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/clients.md

Initialize the full DataJunction client in Python. This client allows for both querying and building/managing data pipelines.
```py
from datajunction import DJBuilder

dj = DJBuilder("http://localhost:8000")
```

--------------------------------

### Add Dependency with Version Constraint (Specific)

Source: https://github.com/datajunction/dj/blob/main/CONTRIBUTING.rst

When adding a new dependency with a version less than 1.0, pin the exact version in setup.cfg. This ensures stability if the dependency is expected to change frequently.

```config
some-package==0.0.1
```

--------------------------------

### Create Spark Engine

Source: https://github.com/datajunction/dj/blob/main/datajunction-query/README.rst

Configure a 'spark' engine with its version and URI. The URI 'spark://local[*]' specifies a local Spark cluster.

```bash
curl -X 'POST' \
  'http://localhost:8001/engines/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "name": "spark",
  "version": "3.3.2",
  "uri": "spark://local[*]"
}'
```

--------------------------------

### Initialize DJ Builder

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/README.md

Initializes the DJBuilder client. Use 'http://dj:8000' if running in the demo docker container.

```python
from datajunction import DJBuilder

djbuilder = DJBuilder("http://localhost:8000")
```

--------------------------------

### Setup Claude Code with Custom DJ Server URL

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/getting-started/ai-assistants.md

Configure Claude Code with a custom DataJunction server URL by setting the DJ_URL environment variable before running the setup command.

```bash
DJ_URL=https://dj.yourcompany.com dj setup-claude
```

--------------------------------

### Create DuckDB Engine

Source: https://github.com/datajunction/dj/blob/main/datajunction-query/README.rst

Configure a 'duckdb' engine with its version and URI. The URI 'duckdb://local[*]' indicates a local DuckDB instance.
```bash
curl -X 'POST' \
  'http://localhost:8001/engines/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "name": "duckdb",
  "version": "0.7.1",
  "uri": "duckdb://local[*]"
}'
```

--------------------------------

### Selective Installation of Claude Components

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/getting-started/ai-assistants.md

Perform selective installations of DataJunction components for Claude Code using command-line flags to include or exclude specific parts like MCP, skills, or agents.

```bash
dj setup-claude --no-mcp
```

```bash
dj setup-claude --no-skills
```

```bash
dj setup-claude --no-agents
```

--------------------------------

### Serve Specific Docs Version Locally

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/docs-development.md

Start a local Hugo development server to preview a specific version of the DJ documentation (e.g., 0.1.0). The site will be available at http://localhost:1313/ by default.

```sh
hugo serve --contentDir content/0.1.0/
```

--------------------------------

### Get A Metric

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/the-datajunction-api-specification.md

Retrieves a specific metric by its name.

```APIDOC
## GET /metrics/{name}

### Description
Return a metric by name.

### Method
GET

### Endpoint
/metrics/{name}

### Parameters
#### Path Parameters
- **name** (string) - Required - The name of the metric to retrieve.

### Response
#### Success Response (200)
- **id** (integer) - The metric's ID.
- **name** (string) - The metric's name.
- **display_name** (string) - The metric's display name.
- **current_version** (string) - The current version of the metric.
- **description** (string) - The metric's description.
- **created_at** (string) - The timestamp when the metric was created.
- **updated_at** (string) - The timestamp when the metric was last updated.
- **query** (string) - The metric's query.
- **upstream_node** (string) - The name of the upstream node.
- **expression** (string) - The metric's expression.
- **dimensions** (array) - A list of dimensions associated with the metric.
- **metric_metadata** (object) - Metadata about the metric's direction and unit.
- **required_dimensions** (array) - A list of dimensions required for the metric.

### Response Example
```json
{
  "id": 0,
  "name": "string",
  "display_name": "string",
  "current_version": "string",
  "description": "",
  "created_at": "2019-08-24T14:15:22Z",
  "updated_at": "2019-08-24T14:15:22Z",
  "query": "string",
  "upstream_node": "string",
  "expression": "string",
  "dimensions": [
    {
      "name": "string",
      "node_name": "string",
      "node_display_name": "string",
      "is_primary_key": true,
      "type": "string",
      "path": [
        "string"
      ]
    }
  ],
  "metric_metadata": {
    "direction": "higher_is_better",
    "unit": {
      "name": "string",
      "label": "string",
      "category": "string",
      "abbreviation": "string",
      "description": "string"
    }
  },
  "required_dimensions": [
    "string"
  ]
}
```
```

--------------------------------

### Get Engine

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/README.md

Retrieves details for a specific engine.

```APIDOC
## Get Engine

### Description
Retrieves details for a specific engine.

### Method
```python
get_engine(name: str)
```

### Endpoint
N/A (Client method)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
- **name** (str) - Required - The name of the engine to retrieve.

### Request Example
```python
djadmin.get_engine(name="Spark")
```

### Response
#### Success Response (200)
- **engine_details** (dict) - Details of the specified engine.
```

--------------------------------

### Make Dev Release

Source: https://github.com/datajunction/dj/blob/main/docs/content/0.1.0/docs/developers/releasing-new-version.md

Run this command in the component directory to create a development release.
It will output the version and target release URL.

```shell
make dev-release
```

--------------------------------

### Get Catalog

Source: https://github.com/datajunction/dj/blob/main/datajunction-clients/python/README.md

Retrieves details for a specific catalog.

```APIDOC
## Get Catalog

### Description
Retrieves details for a specific catalog.

### Method
```python
get_catalog(name: str)
```

### Endpoint
N/A (Client method)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
- **name** (str) - Required - The name of the catalog to retrieve.

### Request Example
```python
djadmin.get_catalog(name="my-new-catalog")
```

### Response
#### Success Response (200)
- **catalog_details** (dict) - Details of the specified catalog.
```
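
--------------------------------

### Build an Engine Registration Request in Python (Sketch)

The Spark and DuckDB engine examples above post a small JSON document to the query service's /engines/ endpoint with curl. The following is a minimal sketch of the same request built with Python's standard library, which may be handy when scripting setup. It is not part of the DataJunction clients: `DJQS_URL` and `build_engine_request` are hypothetical names introduced here, and the address `http://localhost:8001` is assumed from the curl examples. The sketch only constructs the request; sending it requires a running query service.

```python
import json
import urllib.request

# Assumption: the query service listens here, as in the curl examples.
DJQS_URL = "http://localhost:8001"


def build_engine_request(name: str, version: str, uri: str) -> urllib.request.Request:
    """Build (but do not send) a POST /engines/ request mirroring the curl examples."""
    payload = json.dumps({"name": name, "version": version, "uri": uri}).encode("utf-8")
    return urllib.request.Request(
        url=f"{DJQS_URL}/engines/",
        data=payload,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_engine_request("duckdb", "0.7.1", "duckdb://local[*]")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually send it (requires a running query service):
    # with urllib.request.urlopen(request) as response:
    #     print(response.status, response.read().decode("utf-8"))
```

Keeping construction separate from sending makes the payload easy to inspect or log before anything reaches the service.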