### Setup Python Environment and Install Dependencies

Source: https://github.com/open-metadata/openmetadata/blob/main/ingestion/src/metadata/ingestion/source/database/my_db/CONNECTOR_CONTEXT.md

Set up the Python virtual environment and install development dependencies, including code generation tools. Always activate the environment before running commands.

```bash
# From the root of the OpenMetadata project
python3.11 -m venv env
source env/bin/activate
make install_dev generate
```

```bash
source env/bin/activate
```

--------------------------------

### Prerequisites and Setup Commands

Source: https://github.com/open-metadata/openmetadata/blob/main/CLAUDE.md

Commands to check system requirements, activate virtual environments, install dependencies, and generate code.

```bash
make prerequisites  # Check system requirements
```

```bash
source env/bin/activate  # ALWAYS activate venv first
```

```bash
cd ingestion && make install_dev_env  # Install Python dev dependencies
```

```bash
make generate  # Generate Pydantic models from JSON schemas
```

```bash
make yarn_install_cache  # Install UI dependencies
```

--------------------------------

### Build and Run Local Docker Environment (Full)

Source: https://github.com/open-metadata/openmetadata/blob/main/skills/connector-building/GUIDE.md

This command builds all components and starts a full local OpenMetadata stack using Docker. Use this for initial setup or after significant Java/UI changes.

```bash
# Full build (first time or after Java/UI changes)
./docker/run_local_docker.sh -m ui -d mysql -s false -i true -r true
```

--------------------------------

### Full Build and Deploy Local Docker

Source: https://github.com/open-metadata/openmetadata/blob/main/skills/commands/test-locally.md

Builds the Java backend, UI, and ingestion Docker image, then starts all OpenMetadata services. Use this for the first-time setup or when Java/UI changes are made.
```bash
./docker/run_local_docker.sh -m ui -d mysql -s false -i true -r true
```

--------------------------------

### Start Frontend Development Server

Source: https://github.com/open-metadata/openmetadata/blob/main/AGENTS.md

Starts the React development server for the UI. Ensure you are in the correct directory.

```bash
cd openmetadata-ui/src/main/resources/ui
yarn start  # Start development server on localhost:3000
```

--------------------------------

### Leverage Fixtures for Reusable Data Setup

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-ui/src/main/resources/ui/playwright/PLAYWRIGHT_DEVELOPER_HANDBOOK.md

Define reusable data setup logic in fixtures. This example shows how to create a user via API in a fixture, use it in tests, and clean it up afterward.

```typescript
// In fixtures file
export const test = base.extend({
  testUser: async ({ apiContext }, use) => {
    const user = await apiContext.post('/api/v1/users', { data: userData });
    await use(user);
    await apiContext.delete(`/api/v1/users/${user.id}`);
  },
});
```

--------------------------------

### Start OpenMetadata Stack with Jupyter and Postgres

Source: https://github.com/open-metadata/openmetadata/blob/main/examples/python-sdk/data-quality/README.md

This command starts the OpenMetadata stack, including Jupyter and Postgres instances, for running the data quality examples.

```bash
./start [-v ]
```

--------------------------------

### Start Docker Development Services

Source: https://github.com/open-metadata/openmetadata/blob/main/CLAUDE.md

Start the necessary development services like MySQL and Elasticsearch using Docker Compose. Ensure Docker is installed and running.

```bash
docker compose -f docker/development/docker-compose.yml up -d
```

--------------------------------

### Install OpenMetadata Python SDK

Source: https://github.com/open-metadata/openmetadata/blob/main/ingestion/src/metadata/sdk/README.md

Install the SDK using pip.
Additional packages can be installed for specific workloads like pandas or database connectors.

```bash
pip install openmetadata-ingestion
```

```bash
pip install 'openmetadata-ingestion[pandas]'
pip install 'openmetadata-ingestion[mysql]'
pip install 'openmetadata-ingestion[postgres]'
```

--------------------------------

### Install Required Tools

Source: https://github.com/open-metadata/openmetadata/blob/main/docker/development/helm/README.md

Installs necessary tools like Helm, kubectl, and minikube. Use brew for macOS or snap/apt for Ubuntu/Debian.

```bash
# Install required tools
brew install helm kubectl minikube docker-compose

# Or on Ubuntu/Debian:
# sudo snap install helm kubectl minikube
# sudo apt-get install docker-compose
```

--------------------------------

### Start Development Environment

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-airflow-apis/README.md

Set the AIRFLOW_HOME environment variable and start the Airflow webserver for development.

```bash
export AIRFLOW_HOME=$(pwd)/openmetadata-airflow-managed-api/development/airflow
airflow webserver
```

--------------------------------

### Start Local Docker Environment

Source: https://github.com/open-metadata/openmetadata/blob/main/skills/pr-checklist/SKILL.md

Command to start a local Docker environment for testing, including the UI and MySQL database.

```bash
./docker/run_local_docker.sh -m ui -d mysql
```

--------------------------------

### Start OpenMetadata Server with Docker

Source: https://github.com/open-metadata/openmetadata/blob/main/ingestion/tests/integration/sdk/README.md

Use this command to start a local OpenMetadata server with a MySQL database for integration testing.
```bash
# Start OpenMetadata server
./docker/run_local_docker.sh -m ui -d mysql
```

--------------------------------

### Install Dev Tools and Format

Source: https://github.com/open-metadata/openmetadata/blob/main/skills/standards/code_style.md

If `make py_format` fails due to missing tools, install development dependencies and then run the formatting command.

```bash
source env/bin/activate
make install_dev generate
make py_format
```

--------------------------------

### Install Development Dependencies

Source: https://github.com/open-metadata/openmetadata/blob/main/AGENTS.md

Installs all necessary development dependencies for the project.

```bash
make prerequisites       # Check system requirements
make install_dev_env     # Install all development dependencies
make yarn_install_cache  # Install UI dependencies
```

--------------------------------

### Install UI Dependencies

Source: https://github.com/open-metadata/openmetadata/blob/main/CLAUDE.md

Install frontend dependencies using Yarn. This command should be run after setting up the development environment.

```bash
make yarn_install_cache
```

--------------------------------

### Install UI Dependencies

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-ui/src/main/resources/ui/README.md

Installs the necessary dependencies for the OpenMetadata UI. This is a one-time task unless package.json changes.

```shell
# installing dependencies
> make yarn_install_cache
```

--------------------------------

### Start OpenMetadata Server

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-ui/src/main/resources/ui/README.md

Starts the OpenMetadata server locally from the distribution directory. Ensure you have a 'target' directory within 'openmetadata-dist' containing the server distribution.
```shell
./bin/openmetadata-server-start.sh conf/openmetadata.yaml
```

--------------------------------

### Start UI Development Server

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-ui/src/main/resources/ui/README.md

Starts the OpenMetadata UI development server locally. Access the UI at http://localhost:3000.

```shell
# starting the UI locally
> make yarn_start_dev_ui
```

--------------------------------

### Start Distributed Test Environment

Source: https://github.com/open-metadata/openmetadata/blob/main/docker/development/distributed-test/README.md

Starts all services for the distributed test environment. Builds images on the first run. Use --build to force a rebuild.

```bash
cd docker/development/distributed-test
./scripts/start.sh

# Or force rebuild
./scripts/start.sh --build
```

--------------------------------

### Quick Start Load Test

Source: https://github.com/open-metadata/openmetadata/blob/main/bin/distributed-test/USAGE.md

A sequence of commands to start the environment, load test data, trigger reindexing, monitor logs, and stop the environment.

```bash
# 1. Start the environment
./scripts/start.sh

# 2. Load test data (~50K entities)
./scripts/perf-test.sh --scale small --server http://localhost:8585

# 3. Trigger reindex
./scripts/trigger-reindex.sh

# 4. Monitor logs
./scripts/logs.sh

# 5. Stop the environment
./scripts/stop.sh
```

--------------------------------

### Run Local OpenMetadata Servers

Source: https://github.com/open-metadata/openmetadata/blob/main/docker/development/distributed-test/README.md

Starts OpenMetadata servers locally. Can start all servers or specific ones by number.
```bash
# Start all 3 servers in separate terminals
./local/run-local-servers.sh

# Or specific servers
./local/run-local-servers.sh 1 2
```

--------------------------------

### Start Minikube Kubernetes Cluster

Source: https://github.com/open-metadata/openmetadata/blob/main/docker/development/helm/README.md

Starts a Minikube Kubernetes cluster with specified resources and enables the ingress addon. Verifies the cluster and nodes are ready.

```bash
# Start minikube with sufficient resources
minikube start --cpus 4 --memory 8192 --driver docker

# Enable ingress addon (optional)
minikube addons enable ingress

# Verify cluster is ready
kubectl cluster-info
kubectl get nodes
```

--------------------------------

### Start Dependencies for Local Development

Source: https://github.com/open-metadata/openmetadata/blob/main/docker/development/distributed-test/README.md

Starts only the dependencies (MySQL, OpenSearch) using Docker Compose for local development and debugging.

```bash
cd docker/development/distributed-test
docker compose -f local/docker-compose-deps.yml up -d
```

--------------------------------

### Run Local Docker with MySQL

Source: https://github.com/open-metadata/openmetadata/blob/main/docs/rdf-local-development.md

Starts the standard local Docker environment with MySQL. Does not enable RDF or start Fuseki.

```bash
cd /path/to/OpenMetadata
./docker/run_local_docker.sh -d mysql
```

--------------------------------

### Run Local Docker with PostgreSQL

Source: https://github.com/open-metadata/openmetadata/blob/main/docs/rdf-local-development.md

Starts the standard local Docker environment with PostgreSQL. Does not enable RDF or start Fuseki.

```bash
./docker/run_local_docker.sh -d postgresql
```

--------------------------------

### Start Local Dependencies with Docker Compose

Source: https://github.com/open-metadata/openmetadata/blob/main/docker/development/helm/README.md

Starts PostgreSQL and OpenSearch for local development using Docker Compose.
Waits for services to be ready and verifies their status.

```bash
cd docker/development/helm

# Start PostgreSQL and OpenSearch
docker-compose -f docker-compose-deps.yml up -d

# Wait for services to be ready (2-3 minutes)
docker-compose -f docker-compose-deps.yml logs -f

# Verify services are running
curl http://localhost:9200/_cluster/health
docker exec openmetadata_postgres_test psql -U openmetadata_user -d openmetadata_db -c "SELECT 1"
```

--------------------------------

### Install Python Ingestion Development Environment

Source: https://github.com/open-metadata/openmetadata/blob/main/AGENTS.md

Sets up the Python ingestion framework in development mode, including installing necessary dependencies.

```bash
cd ingestion
make install_dev_env  # Install in development mode
```

--------------------------------

### K8s Test Output Example

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-integration-tests/K8S_TESTS.md

Example log output when Kubernetes tests are enabled, indicating K3s container startup and client configuration.

```text
K8s tests enabled: true
Starting K3s container with image: rancher/k3s:v1.28.5-k3s1
K3s container started
K8s pipeline service client configured and ready
```

--------------------------------

### Full Local Environment Setup

Source: https://github.com/open-metadata/openmetadata/blob/main/CLAUDE.md

Commands to launch the complete local OpenMetadata environment with UI and database options.
```bash
./docker/run_local_docker.sh -m ui -d mysql  # Complete local setup with UI
```

```bash
./docker/run_local_docker.sh -m no-ui -d postgresql  # Backend only with PostgreSQL
```

```bash
./docker/run_local_docker.sh -s true  # Skip Maven build step
```

--------------------------------

### Manual SDK Initialization with Configuration Object

Source: https://github.com/open-metadata/openmetadata/blob/main/ingestion/src/metadata/sdk/README_IMPROVED.md

For advanced setup or tests, manually initialize the SDK using 'OpenMetadataConfig' and 'OpenMetadata.initialize'. This provides more control over the client configuration.

```python
from metadata.sdk import OpenMetadata, OpenMetadataConfig

config = OpenMetadataConfig(
    server_url="http://localhost:8585/api",
    jwt_token="your-jwt-token",
)
client = OpenMetadata.initialize(config)
```

--------------------------------

### Install OpenMetadata Skills Plugin

Source: https://github.com/open-metadata/openmetadata/blob/main/skills/README.md

Install the Claude Code plugin for OpenMetadata skills from the repository root. This command loads all skills, agents, hooks, and commands, automatically enabling the OpenMetadata workflow on session start.

```bash
# From the OpenMetadata repo root
claude plugin install skills/
```

--------------------------------

### Start Airflow Webserver and Scheduler

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-airflow-apis/README.md

Restart the Airflow webserver and scheduler after installing the plugin and creating the configuration directory.

```bash
airflow webserver
airflow scheduler
```

--------------------------------

### Initialize OpenMetadata UI Boot Shell and Theme

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-ui/src/main/resources/ui/index.html

Sets the base path, applies dark mode if configured in local storage, and marks the application start time for performance monitoring.
```javascript
window.BASE_PATH = '${basePath}';
try {
  if (localStorage.getItem('ui-theme') === 'dark') {
    document.documentElement.classList.add('dark-mode');
  }
} catch (e) {}
try {
  performance.mark('appStart');
} catch (e) {}
```

--------------------------------

### Handle Authentication and Fetch Test Cases

Source: https://github.com/open-metadata/openmetadata/blob/main/ingestion/tests/load/README.md

Implement the on_start hook to authenticate the user and fetch a list of test cases to be used in subsequent tasks.

```python
from _openmetadata_testutils.helpers.login_user import login_user

class TestCaseResultTasks(TaskSet):
    """Test case result resource load test"""

    [...]  # type: ignore

    def on_start(self):
        """Get a list of test cases to fetch results for"""
        self.bearer = login_user(self.client)
        resp = self.client.get(f"{TEST_CASE_RESOURCE_PATH}", params={"limit": 100}, auth=self.bearer)
        json = resp.json()
        self.test_cases = json.get("data", [])
```

--------------------------------

### Example: Get Entity as JSON-LD with curl

Source: https://github.com/open-metadata/openmetadata/blob/main/docs/rdf-local-development.md

Demonstrates fetching a table entity's RDF representation in JSON-LD format using curl. Supply a valid authentication token after `Bearer` and append the actual table ID to the end of the URL path.

```bash
# Get a table entity as JSON-LD
curl -s -H "Authorization: Bearer " \
  "http://localhost:8585/api/v1/rdf/entity/table/" | jq
```

--------------------------------

### Create, Retrieve, and List Users

Source: https://github.com/open-metadata/openmetadata/blob/main/ingestion/src/metadata/sdk/README.md

Demonstrates creating a new user, retrieving a user by name with specific fields, and listing all users in batches.
```python
from metadata.generated.schema.api.teams.createUser import CreateUserRequest
from metadata.sdk import Users

request = CreateUserRequest(
    name="john.doe",
    email="john@example.com",
    displayName="John Doe",
)
user = Users.create(request)

user = Users.retrieve_by_name("john.doe", fields=["teams", "roles"])

for user in Users.list_all(batch_size=100):
    print(user.email)
```

--------------------------------

### Python Client Wrapper for MyDash API

Source: https://github.com/open-metadata/openmetadata/blob/main/skills/standards/connection.md

The Python class `MyDashClient` wraps a hypothetical MyDash API. It handles authentication, makes GET requests, and retrieves dashboard data. Ensure the `requests` library is installed.

```python
import requests

# Note: MyDashConnection (the connection config type) and the _paginate
# helper are not defined in this snippet.
class MyDashClient:
    def __init__(self, config: MyDashConnection, verify_ssl=None):
        self.config = config
        self._session = requests.Session()
        self._base_url = config.hostPort
        self._setup_auth()
        if verify_ssl is not None:
            self._session.verify = verify_ssl

    def _setup_auth(self):
        if self.config.token:
            self._session.headers["Authorization"] = (
                f"Bearer {self.config.token.get_secret_value()}"
            )

    def _get(self, endpoint: str, **kwargs):
        response = self._session.get(f"{self._base_url}{endpoint}", **kwargs)
        response.raise_for_status()
        return response.json()

    def test_access(self):
        """Raises on failure."""
        self._get("/api/v1/health")

    def get_dashboards(self) -> list:
        return list(self._paginate("/api/v1/dashboards"))
```

--------------------------------

### Frontend Development Commands

Source: https://github.com/open-metadata/openmetadata/blob/main/CLAUDE.md

Commands for starting the development server, running unit and E2E tests, linting, and building the frontend.
```bash
cd openmetadata-ui/src/main/resources/ui
```

```bash
yarn start  # Start development server on localhost:3000
```

```bash
yarn test  # Run Jest unit tests
```

```bash
yarn test path/to/test.spec.ts  # Run a specific test file
```

```bash
yarn test:watch  # Run tests in watch mode
```

```bash
yarn playwright:run  # Run E2E tests
```

```bash
yarn lint  # ESLint check
```

```bash
yarn lint:fix  # ESLint with auto-fix
```

```bash
yarn build  # Production build
```

--------------------------------

### Workflow Definition with Task Lifecycle Node

Source: https://github.com/open-metadata/openmetadata/blob/main/adr-incident-manager-governance-workflows.md

An example of a complete workflow definition using the taskLifecycleNode. It includes trigger configuration, node definitions (start, taskLifecycleNode, automatedTask, end), and edges that define the workflow transitions based on task statuses.

```json
{
  "name": "incident-lifecycle",
  "trigger": {
    "type": "eventBasedEntity",
    "config": {
      "entityTypes": ["TestCase"],
      "events": ["Updated"],
      "filter": {
        "TestCase": { "==": [{"var": "testCaseStatus"}, "Failed"] }
      }
    }
  },
  "nodes": [
    { "type": "startEvent", "name": "start" },
    {
      "type": "taskLifecycleNode",
      "name": "incident",
      "config": {
        "template": "incident",
        "statuses": ["New", "Ack", "Assigned", "Resolved"],
        "terminal": ["Resolved"],
        "responsibles": { "source": "tableOwner" },
        "ttl": "P30D"
      }
    },
    { "type": "automatedTask", "subType": "sinkTask", "name": "notifySlack" },
    { "type": "endEvent", "name": "end" }
  ],
  "edges": [
    { "from": "start", "to": "incident" },
    { "from": "incident", "to": "incident", "condition": { "status": "Ack" } },
    { "from": "incident", "to": "notifySlack", "condition": { "status": "Assigned" } },
    { "from": "notifySlack", "to": "incident" },
    { "from": "incident", "to": "end", "condition": { "status": "Resolved" } }
  ]
}
```

--------------------------------

### Nested Describe Blocks with Setup Hooks

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-ui/src/main/resources/ui/playwright/PLAYWRIGHT_DEVELOPER_HANDBOOK.md

Demonstrates the execution order of `beforeAll` hooks within nested `describe` blocks. Place common or expensive setups in the outer block.

```typescript
describe('Outer describe', () => {
  beforeAll(async () => {
    // Executes before all the tests inside inner describe 1 & 2
    // Only common/expensive setups that are necessary for both the describe blocks should come in here.
  });

  describe('Inner describe 1', () => {
    beforeAll(async () => {
      // Executes before all tests inside inner describe 1
    });
  });

  describe('Inner describe 2', () => {
    beforeAll(async () => {
      // Executes before all tests inside inner describe 2
    });
  });
});
```

--------------------------------

### Install OpenMetadata Airflow Package

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-airflow-apis/README.md

Install the openmetadata-airflow-managed-apis package using pip.

```bash
pip install openmetadata-airflow-managed-apis
```

--------------------------------

### Get Aggregations

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-sdk/README.md

Perform an aggregation query on a specific index to get summarized data.

```java
// Aggregations
String aggregations = Search.aggregate("type:Table", "table_search_index", "database");
```

--------------------------------

### Install Ingestion Module Development Dependencies

Source: https://github.com/open-metadata/openmetadata/blob/main/CLAUDE.md

Install the ingestion module with all development dependencies. This is a prerequisite for running 'make generate'. Use 'make install_dev_env' for a full environment or 'make install_dev' for a lighter install.
```bash
source env/bin/activate

# Install ingestion module with all dev dependencies (required before make generate)
cd ingestion
make install_dev_env  # Full dev environment (edit mode + all extras)
# OR for lighter install:
make install_dev  # Just dev dependencies
cd ..
```

--------------------------------

### Install Playwright Browsers

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-ui/src/main/resources/ui/README.md

Run this command to install the necessary Playwright browsers for end-to-end testing.

```shell
npx playwright install
```

--------------------------------

### Install Specific Airflow Version

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-airflow-apis/README.md

Install a specific version of Apache Airflow along with its constraints.

```bash
pip install "apache-airflow==2.3.3" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.3.3/constraints-3.9.txt"
```

--------------------------------

### Install openmetadata-ingestion

Source: https://github.com/open-metadata/openmetadata/blob/main/examples/python-sdk/data-quality/notebooks/test_dataframe.ipynb

Install the openmetadata-ingestion library with pandas and postgres connectors. Ensure version 1.11.0.0 or above is used.

```python
!pip install "openmetadata-ingestion[pandas,postgres]>=1.11.0.0"
```

--------------------------------

### Start OpenMetadata UI in Development Mode

Source: https://github.com/open-metadata/openmetadata/blob/main/openmetadata-ui/src/main/resources/ui/README.md

Initiates the OpenMetadata UI development server. This command assumes that the necessary prerequisites, including a running OpenMetadata server (either locally or remotely configured via DEV_SERVER_TARGET), are in place.
```shell
make yarn_start_dev_ui
```

--------------------------------

### Configure SDK with Host and JWT Token

Source: https://github.com/open-metadata/openmetadata/blob/main/ingestion/src/metadata/sdk/README_IMPROVED.md

Use the 'configure' function for most scripts to set up the SDK with the server host and JWT token. This initializes the default client used by facade classes.

```python
from metadata.sdk import configure

configure(host="http://localhost:8585/api", jwt_token="your-jwt-token")
```

--------------------------------

### Install openmetadata-ingestion with Great Expectations

Source: https://github.com/open-metadata/openmetadata/blob/main/ingestion/src/metadata/great_expectations/README.md

Use this command to install the necessary subpackage for integrating OpenMetadata with Great Expectations.

```bash
pip install 'openmetadata-ingestion[great-expectations]'
```

--------------------------------

### Set Up Python Dev Environment

Source: https://github.com/open-metadata/openmetadata/blob/main/skills/connector-building/SKILL.md

Activate the Python virtual environment and install development dependencies. Always activate the environment before running commands.

```bash
python3.11 -m venv env
source env/bin/activate
make install_dev generate
```

```bash
source env/bin/activate
```
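
The advice to always activate the environment first can be checked from Python itself. The following is a minimal sketch and not part of the OpenMetadata repository: it assumes the environment was created with `venv` as shown above, and the helper name `in_virtualenv` is hypothetical. It relies only on the standard-library behavior that `sys.prefix` differs from `sys.base_prefix` inside a virtual environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment.

    Hypothetical helper for illustration; not an OpenMetadata API.
    """
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; run `source env/bin/activate` first.")
```

Running the script before `make` targets gives a quick confirmation that commands will use the project's environment rather than the system interpreter.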