### Flytekit Quickstart Example

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/index.md

A basic example demonstrating how to define tasks and a workflow in Flytekit. This code defines a 'sum' task, a 'square' task, and a 'my_workflow' workflow that uses them. It then prints the output of the workflow for specific inputs.

```python
from flytekit import task, workflow


@task
def sum(x: int, y: int) -> int:
    return x + y


@task
def square(z: int) -> int:
    return z * z


@workflow
def my_workflow(x: int, y: int) -> int:
    return sum(x=square(z=x), y=square(z=y))


print(f"my_workflow output: {my_workflow(x=1, y=2)}")
```

--------------------------------

### Flytekit Quickstart Expected Output

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/index.md

The expected output when running the Flytekit quickstart example.

```default
my_workflow output: 5
```

--------------------------------

### Setup Development Environment

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/contributing.md

Installs Flytekit dependencies and Flytekit in editable mode within a virtual environment.

```bash
virtualenv ~/.virtualenvs/flytekit
source ~/.virtualenvs/flytekit/bin/activate
make setup
```

--------------------------------

### Install Documentation Requirements

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/contributing.md

Install the necessary Python packages for building and previewing the documentation locally.

```bash
make doc-requirements.txt
```

--------------------------------

### Install Flytekit Dolt Plugin and Dolt CLI

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-dolt/README.md

Install the necessary packages for the Flytekit Dolt plugin and the Dolt command-line tool. Ensure Dolt is installed globally.
```bash
pip install flytekitplugins.dolt
sudo bash -c 'curl -L https://github.com/dolthub/dolt/releases/latest/download/install.sh | sudo bash'
```

--------------------------------

### Install Flytekit DGXC Lepton Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-dgxc-lepton/README.md

Install the plugin using pip.

```bash
pip install flytekitplugins-dgxc-lepton
```

--------------------------------

### Install Flytekit Weights and Biases Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-wandb/README.md

Install the plugin using pip. Run this command in the environment where Flytekit is installed.

```bash
pip install flytekitplugins-wandb
```

--------------------------------

### Install Flytekit Comet ML Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-comet-ml/README.md

Install the plugin using pip. Ensure you have Flytekit installed.

```bash
pip install flytekitplugins-comet-ml
```

--------------------------------

### Install Flytekit Kubeflow MPI Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-kf-mpi/README.md

Install the plugin using pip.

```bash
pip install flytekitplugins-kfmpi
```

--------------------------------

### Install Flytekit AWS Batch Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-aws-batch/README.md

Install the plugin using pip. Run this command in the environment where Flytekit is installed.

```bash
pip install flytekitplugins-awsbatch
```

--------------------------------

### Install whylogs Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-whylogs/README.md

Install the whylogs plugin for Flytekit using pip.
```bash
pip install flytekitplugins-whylogs
```

--------------------------------

### Install Flytekit Ray Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-ray/README.md

Install the plugin using pip to enable Flyte backend integration with Ray.

```bash
pip install flytekitplugins-ray
```

--------------------------------

### Install Flytekit MMCloud Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-mmcloud/README.md

Install the plugin using pip. This command is necessary to enable the MMCloud connector for Flyte.

```bash
pip install flytekitplugins-mmcloud
```

--------------------------------

### Install Flytekit Neptune Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-neptune/README.md

Install the plugin using pip. For Neptune 2.x, use the `[legacy]` extra.

```bash
pip install flytekitplugins-neptune
```

--------------------------------

### Install Flytekit ONNX TensorFlow Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-onnx-tensorflow/README.md

Use this command to install the plugin. Ensure you have pip available.

```bash
pip install flytekitplugins-onnxtensorflow
```

--------------------------------

### Run Airflow Sensors and Tasks in Flyte

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-airflow/README.md

This example demonstrates how to use an Airflow FileSensor within a Flyte workflow. Ensure the Airflow plugin is installed.

```python
from airflow.sensors.filesystem import FileSensor
from flytekit import task, workflow


@task()
def t1():
    print("flyte")


@workflow
def wf():
    sensor = FileSensor(task_id="id", filepath="/tmp/1234")
    sensor >> t1()


if __name__ == '__main__':
    wf()
```

--------------------------------

### Install Flytekit

Source: https://github.com/flyteorg/flytekit/blob/master/README.md

Install the Flytekit Python package using pip. This is the primary step to begin using Flytekit.
```bash
pip install flytekit
```

--------------------------------

### Flytekit Configuration Examples

Source: https://context7.com/flyteorg/flytekit/llms.txt

Flytekit configuration can be loaded automatically from `~/.flyte/config.yaml` or environment variables. The examples show auto-loading, the sandbox shortcut, and programmatic configuration.

```yaml
# ~/.flyte/config.yaml
admin:
  endpoint: dns:///flyte.example.com:81
  authType: Pkce
  insecure: false
console:
  endpoint: https://flyte.example.com
storage:
  type: s3
  stow:
    kind: s3
    config:
      region: us-east-1
      disable_ssl: false
```

```python
from flytekit.remote import FlyteRemote
from flytekit.configuration import Config, PlatformConfig, DataConfig

# Auto-load from file or environment variables
remote = FlyteRemote(config=Config.auto())

# Sandbox shortcut
sandbox_remote = FlyteRemote(config=Config.for_sandbox())

# Fully programmatic
from flytekit.configuration import S3Config

explicit_remote = FlyteRemote(
    config=Config(
        platform=PlatformConfig(endpoint="flyte.corp.io", insecure=False, auth_mode="client_credentials"),
        data_config=DataConfig(s3=S3Config(region="eu-west-1")),
    ),
    default_project="analytics",
    default_domain="production",
)
```

--------------------------------

### Install AWS SageMaker Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-aws-sagemaker/README.md

Install the plugin using pip. This command adds the necessary packages to your Python environment.

```bash
pip install flytekitplugins-awssagemaker
```

--------------------------------

### Install FlyteKit OpenAI Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-openai/README.md

Install the FlyteKit OpenAI plugin using pip.

```bash
pip install flytekitplugins-openai
```

--------------------------------

### Install FlyteKit Inference Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-inference/README.md

Install the flytekitplugins-inference package using pip.
```bash
pip install flytekitplugins-inference
```

--------------------------------

### Install MLflow Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-mlflow/README.md

Install the Flytekit MLflow plugin using pip.

```bash
pip install flytekitplugins-mlflow
```

--------------------------------

### Install a Specific Flytekit Plugin from a Commit

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/contributing.md

Installs a specific Flytekit plugin (e.g., 'pod') from a given commit hash. The URL is quoted so that the shell does not interpret the `&` in the fragment as a background operator.

```bash
pip install "https://github.com/flyteorg/flytekit/archive/e128f66dda48bbfc6076d240d39e4221d6af2d2b.zip#subdirectory=plugins/pod&egg=flytekitplugins-pod"
```

--------------------------------

### Install Cert Manager

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-identity-aware-proxy/README.md

Installs the cert-manager Helm chart to manage TLS certificates. Ensure CRDs are installed with `installCRDs=true`.

```bash
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true
```

--------------------------------

### Flytekit Plugin Auto-loading Entry Point

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/README.md

Configuration snippet for `setup.py` to enable auto-loading of plugin modules upon installation. This is particularly useful for data and type plugins.

```python
setup(
    entry_points={"flytekit.plugins": [f"{PLUGIN_NAME}=flytekitplugins.{PLUGIN_NAME}"]},
)
```

--------------------------------

### Install Flytekit Async fsspec Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-async-fsspec/README.md

Install the plugin using pip. This command ensures that the optimized file systems are available for your Flyte workflows.
```bash
pip install flytekitplugins-async-fsspec
```

--------------------------------

### Install Flytekit Dask Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-dask/README.md

Install the plugin using pip. This command adds the necessary components to your Python environment.

```bash
pip install flytekitplugins-dask
```

--------------------------------

### Install Memray Profiling Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-memray/README.md

Install the plugin using pip. This command is necessary to enable the memray profiling capabilities within your Flyte environment.

```bash
pip install flytekitplugins-memray
```

--------------------------------

### Advanced Optuna Configuration with Dictionary Inputs

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-optuna/README.md

Demonstrates using a dictionary to define hyperparameters for Optuna optimization, allowing for more complex and structured parameter suggestions. This example uses a pre-created Optuna study.

```python
import flytekit as fl
import optuna  # needed for optuna.create_study below
from flytekitplugins.optuna import Optimizer, suggest

image = fl.ImageSpec(packages=["flytekitplugins.optuna"])


@fl.task(container_image=image)
async def objective(params: dict[str, int | float | str]) -> float:
    ...  # Objective function implementation


@fl.eager(container_image=image)
async def train(concurrency: int, n_trials: int):
    study = optuna.create_study(direction="maximize")

    optimizer = Optimizer(objective=objective, concurrency=concurrency, n_trials=n_trials, study=study)

    params = {
        "lambda": suggest.float(1e-8, 1.0, log=True),
        "alpha": suggest.float(1e-8, 1.0, log=True),
        "subsample": suggest.float(0.2, 1.0),
        "colsample_bytree": suggest.float(0.2, 1.0),
        "max_depth": suggest.integer(3, 9, step=2),
        "objective": "binary:logistic",
        "tree_method": "exact",
        "booster": "dart",
    }

    await optimizer(params=params)
```

--------------------------------

### Install Flytekit Vaex Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-vaex/README.md

Install the plugin using pip. This command is necessary to enable Vaex DataFrame support in Flyte.

```bash
pip install flytekitplugins-vaex
```

--------------------------------

### Install Flytekit Snowflake Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-snowflake/README.md

Install the plugin using pip. Ensure you have a compatible Python environment.

```bash
pip install flytekitplugins-snowflake
```

--------------------------------

### Install Flytekit FlyteInteractive Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-flyteinteractive/README.md

Install the plugin using pip. This command adds the necessary packages to your Python environment.

```bash
pip install flytekitplugins-flyteinteractive
```

--------------------------------

### Install Flytekit Plugins in Editable Mode

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/contributing.md

Installs all plugins located in the 'plugins' directory in editable mode.

```bash
source ~/.virtualenvs/flytekit/bin/activate
cd plugins
pip install -e .
```

--------------------------------

### Initialize FlyteRemote for Sandbox

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/design/control_plane.md

Use the `for_sandbox()` method for a quick setup to connect to a local Flyte cluster. It defaults to localhost:30081 and standard Minio credentials.

```python
from flytekit import FlyteRemote

remote = FlyteRemote.for_sandbox()
```

--------------------------------

### Install Flytekit Airflow Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-airflow/README.md

Install the plugin using pip to enable Airflow integration with Flytekit.

```bash
pip install flytekitplugins-airflow
```

--------------------------------

### Install Flytekit Kubernetes Pod Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-k8s-pod/README.md

Install the plugin using pip. This plugin is no longer needed for new versions of Flyte.

```bash
pip install flytekitplugins-pod
```

--------------------------------

### Install Flytekit dbt Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-dbt/README.md

Install the plugin using pip. This command is necessary to enable dbt functionality within Flyte.

```bash
pip install flytekitplugins-dbt
```

--------------------------------

### Install Flytekit Plugins

Source: https://context7.com/flyteorg/flytekit/llms.txt

Install optional Flytekit plugin packages using pip to integrate with various distributed runtimes and services.
```bash
pip install flytekitplugins-spark        # Apache Spark
pip install flytekitplugins-kfpytorch    # Kubeflow PyTorch distributed training
pip install flytekitplugins-kftensorflow
pip install flytekitplugins-bigquery
pip install flytekitplugins-dask
pip install flytekitplugins-huggingface
pip install flytekitplugins-deck-standard
```

--------------------------------

### Install Istio Ingress Gateway

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-identity-aware-proxy/README.md

Installs the Istio ingress gateway using Helm. Ensure `istio-values.yaml` is configured correctly for NodePort service type and backend annotations.

```bash
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
kubectl create namespace istio-system
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait
helm install istio-ingress istio/gateway -n istio-system -f istio-values.yaml --wait
```

--------------------------------

### Install Xarray Zarr Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/community/flytekit-xarray-zarr/README.md

Install the plugin using pip. This command is required before using the xarray-zarr functionalities.

```bash
pip install flytekitplugins-xarray-zarr
```

--------------------------------

### Automatic Plugin Registration Example

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-dgxc-lepton/README.md

Demonstrates how the LeptonEndpointDeploymentTask is automatically registered with Flytekit, allowing it to be used directly after importing.

```python
# Automatic registration enables this usage pattern
task = LeptonEndpointDeploymentTask(config=config)
```

--------------------------------

### Install ONNX ScikitLearn Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-onnx-scikitlearn/README.md

Install the plugin using pip.
This command is necessary to enable the ONNX ScikitLearn functionality within Flytekit.

```bash
pip install flytekitplugins-onnxscikitlearn
```

--------------------------------

### Install Flytekit from a Specific Commit

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/contributing.md

Installs Flytekit from a specific commit hash, useful for testing or using a particular version.

```bash
pip install https://github.com/flyteorg/flytekit/archive/a32ab82bef4d9ff53c2b7b4e69ff11f1e93858ea.zip#egg=flytekit
```

--------------------------------

### Install Flytekit SQLAlchemy Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-sqlalchemy/README.md

Install the plugin using pip. This command is required before using the SQLAlchemy plugin in your Flyte workflows.

```bash
pip install flytekitplugins-sqlalchemy
```

--------------------------------

### Install Pre-commit Hooks

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/contributing.md

Install pre-commit hooks to automate linting and code formatting on every commit. Ensure your dev environment is set up first.

```bash
pre-commit install
```

--------------------------------

### Install Flytekit Hive Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-hive/README.md

Install the plugin using pip. This command is necessary to enable Hive integration within Flytekit.

```bash
pip install flytekitplugins-hive
```

--------------------------------

### Install Flytekit Papermill Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-papermill/README.md

Install the plugin using pip. This command is necessary to enable Papermill integration within Flytekit.

```bash
pip install flytekitplugins-papermill
```

--------------------------------

### Install Flytekit Envd Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-envd/README.md

Install the plugin using pip.
This command is necessary to enable the envd functionality within Flytekit.

```bash
pip install flytekitplugins-envd
```

--------------------------------

### Install Hugging Face Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-huggingface/README.md

Install the Flytekit Hugging Face plugin using pip. This command is necessary to enable the integration.

```bash
pip install flytekitplugins-huggingface
```

--------------------------------

### Install Flytekit Great Expectations Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-greatexpectations/README.md

Install the plugin using pip. This command adds the necessary libraries to your Python environment.

```bash
pip install flytekitplugins-great-expectations
```

--------------------------------

### Install Flytekit Polars Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-polars/README.md

Install the plugin using pip. This command is necessary to enable Polars support in your Flyte environment.

```bash
pip install flytekitplugins-polars
```

--------------------------------

### Install Flytekit BigQuery Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-bigquery/README.md

Install the BigQuery plugin using pip. This command adds the necessary libraries to your Python environment.

```bash
pip install flytekitplugins-bigquery
```

--------------------------------

### Install Flytekit ONNX PyTorch Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-onnx-pytorch/README.md

Install the plugin using pip. This command is necessary to enable the ONNX PyTorch functionality within Flytekit.

```bash
pip install flytekitplugins-onnxpytorch
```

--------------------------------

### Install Flytekit AWS Athena Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-aws-athena/README.md

Install the plugin using pip.
This command adds the necessary libraries to your Python environment.

```bash
pip install flytekitplugins-athena
```

--------------------------------

### Install K8s Data Service Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-k8sdataservice/README.md

Install the plugin using pip. This command adds the necessary packages to your Python environment.

```bash
pip install flytekitplugins-k8sdataservice
```

--------------------------------

### Install Flytekit Spark Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-spark/README.md

Install the Flytekit Spark plugin using pip. This command adds the necessary components to your Python environment to enable Spark integration.

```bash
pip install flytekitplugins-spark
```

--------------------------------

### Debug dbt profile setup

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-dbt/tests/testdata/jaffle_shop/README.md

Ensure your dbt profile is correctly configured to connect to your data warehouse before proceeding.

```bash
dbt debug
```

--------------------------------

### SageMaker Deployment Workflow Example

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-aws-sagemaker/README.md

A sample Flyte workflow demonstrating how to create a SageMaker deployment. This includes defining the model configuration, the endpoint configuration, and the deployment itself. Ensure the necessary environment variables, such as REGION and S3_OUTPUT_PATH, are set.
```python
import os

from flytekit import kwtypes, workflow
from flytekitplugins.awssagemaker_inference import create_sagemaker_deployment

REGION = os.getenv("REGION")
MODEL_NAME = "xgboost"
ENDPOINT_CONFIG_NAME = "xgboost-endpoint-config"
ENDPOINT_NAME = "xgboost-endpoint"

sagemaker_deployment_wf = create_sagemaker_deployment(
    name="sagemaker-deployment",
    model_input_types=kwtypes(model_path=str, execution_role_arn=str),
    model_config={
        "ModelName": MODEL_NAME,
        "PrimaryContainer": {
            "Image": "{images.deployment_image}",
            "ModelDataUrl": "{inputs.model_path}",
        },
        "ExecutionRoleArn": "{inputs.execution_role_arn}",
    },
    endpoint_config_input_types=kwtypes(instance_type=str),
    endpoint_config_config={
        "EndpointConfigName": ENDPOINT_CONFIG_NAME,
        "ProductionVariants": [
            {
                "VariantName": "variant-name-1",
                "ModelName": MODEL_NAME,
                "InitialInstanceCount": 1,
                "InstanceType": "{inputs.instance_type}",
            },
        ],
        "AsyncInferenceConfig": {
            "OutputConfig": {"S3OutputPath": os.getenv("S3_OUTPUT_PATH")}
        },
    },
    endpoint_config={
        "EndpointName": ENDPOINT_NAME,
        "EndpointConfigName": ENDPOINT_CONFIG_NAME,
    },
    # custom_image is not defined in this snippet; supply your own deployment image
    images={"deployment_image": custom_image},
    region=REGION,
)


@workflow
def model_deployment_workflow(
    model_path: str = os.getenv("MODEL_DATA_URL"),
    execution_role_arn: str = os.getenv("EXECUTION_ROLE_ARN"),
) -> str:
    return sagemaker_deployment_wf(
        model_path=model_path,
        execution_role_arn=execution_role_arn,
        instance_type="ml.m4.xlarge",
    )
```

--------------------------------

### Visualize Optuna Study with Flyte Decks

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-optuna/README.md

Generates a Plotly timeline visualization of the Optuna study and embeds it into a Flyte Deck for in-UI monitoring. Requires 'plotly' to be installed.
```python
import plotly

fig = optuna.visualization.plot_timeline(optimizer.study)
fl.Deck(name, plotly.io.to_html(fig))
```

--------------------------------

### Register and Execute a Workflow with Flyte Remote

Source: https://context7.com/flyteorg/flytekit/llms.txt

Register a local workflow and then execute it remotely. This example shows how to specify project, domain, and version during registration, and how to pass inputs and monitor execution.

```python
import datetime

from flytekit.configuration import Config, ImageConfig
from flytekit.remote import FlyteRemote

from my_module import my_workflow  # a @workflow function

# Assumes connection settings can be auto-loaded from file or environment variables
remote = FlyteRemote(config=Config.auto())

registered_wf = remote.register_script(
    entity=my_workflow,
    image_config=ImageConfig.auto_default_image(),
    version="v1.2.0",
    project="my-project",
    domain="development",
    source_path=".",
)

# --- Execution ---
execution = remote.execute(
    entity=registered_wf,
    inputs={"x": 10, "y": 20},
    execution_name_prefix="test-run",
    wait=False,  # non-blocking
)
print(f"Execution URL: {remote.generate_console_url(execution)}")

# Wait and retrieve outputs
completed = remote.wait(execution, timeout=datetime.timedelta(minutes=10))
print(completed.outputs)

# --- Fetch and re-execute an existing workflow ---
wf = remote.fetch_workflow(name="my_module.my_workflow", version="v1.2.0")
execution2 = remote.execute(wf, inputs={"x": 5, "y": 15}, wait=True)

# --- Monitor ---
task_exec = remote.fetch_execution(name=execution.id.name)
synced = remote.sync_execution(task_exec, sync_nodes=True)
for node in synced.node_executions.values():
    print(node.id, node.closure.phase)
```

--------------------------------

### Navigate to jaffle_shop directory

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-dbt/tests/testdata/jaffle_shop/README.md

Change into the jaffle_shop directory from the command line to begin setting up the project.
```bash
cd jaffle_shop
```

--------------------------------

### Build Documentation Locally

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/contributing.md

Build the HTML documentation locally to preview your changes before committing. This command should be run from the root of the repository.

```bash
make html
```

--------------------------------

### Flytekit Plugin setup.py Template

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/README.md

A comprehensive template for the `setup.py` file of a Flytekit plugin. Customize the TODO sections for plugin name, type, and requirements. Includes configuration for namespace packages and classifiers.

```python
from setuptools import setup

# TODO put the plugin name here
PLUGIN_NAME = ""

# TODO decide if the plugin is regular or `data`
# for regular plugins
microlib_name = f"flytekitplugins-{PLUGIN_NAME}"
# For data/persistence plugins
# microlib_name = f"flytekitplugins-data-{PLUGIN_NAME}"

# TODO add additional requirements
plugin_requires = ["flytekit>=1.1.0b0,<2.0.0"]  # append further requirement strings to this list

__version__ = "0.0.0+develop"

setup(
    name=microlib_name,
    version=__version__,
    author="flyteorg",
    author_email="admin@flyte.org",
    # TODO Edit the description
    description="My awesome plugin.....",
    # TODO alter the last part of the following URL
    url="https://github.com/flyteorg/flytekit/tree/master/plugins/flytekit-…",
    long_description=open("README.md").read(),
    long_description_content_type="text/markdown",
    namespace_packages=["flytekitplugins"],
    packages=[f"flytekitplugins.{PLUGIN_NAME}"],
    install_requires=plugin_requires,
    license="apache2",
    python_requires=">=3.9",
    classifiers=[
        "Intended Audience :: Science/Research",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: Apache Software License",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Topic :: Scientific/Engineering",
        "Topic :: Scientific/Engineering :: Artificial Intelligence",
        "Topic :: Software Development",
        "Topic :: Software Development :: Libraries",
        "Topic :: Software Development :: Libraries :: Python Modules",
    ],
    # TODO OPTIONAL
    # FOR Plugins where auto-loading on installation is desirable, please uncomment this line and ensure that the
    # __init__.py has the right modules available to be loaded, or point to the right module
    # entry_points={"flytekit.plugins": [f"{PLUGIN_NAME}=flytekitplugins.{PLUGIN_NAME}"]},
)
```

--------------------------------

### Serve dbt project documentation

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-dbt/tests/testdata/jaffle_shop/README.md

Serve the generated documentation locally, allowing you to view and explore it in a web browser.

```bash
dbt docs serve
```

--------------------------------

### Example Flyte Workflow with MMCloud Integration

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-mmcloud/README.md

This Python script demonstrates a Flyte workflow that utilizes the MMCloud plugin for task execution. It includes defining custom images, tasks with MMCloud configuration, resource specifications, and environment variables.
```python
import pandas as pd
from flytekit import ImageSpec, Resources, task, workflow
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression

from flytekitplugins.mmcloud import MMCloudConfig

image_spec = ImageSpec(packages=["scikit-learn"], registry="docker.io/memverge")


@task
def get_data() -> pd.DataFrame:
    """Get the wine dataset."""
    return load_wine(as_frame=True).frame


@task(task_config=MMCloudConfig(), container_image=image_spec)  # Task will be submitted as MMCloud job
def process_data(data: pd.DataFrame) -> pd.DataFrame:
    """Simplify the task from a 3-class to a binary classification problem."""
    return data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))


@task(
    task_config=MMCloudConfig(submit_extra="--migratePolicy [enable=true]"),
    requests=Resources(cpu="1", mem="1Gi"),
    limits=Resources(cpu="2", mem="4Gi"),
    container_image=image_spec,
    environment={"KEY": "value"},
)
def train_model(data: pd.DataFrame, hyperparameters: dict) -> LogisticRegression:
    """Train a model on the wine dataset."""
    features = data.drop("target", axis="columns")
    target = data["target"]
    return LogisticRegression(max_iter=3000, **hyperparameters).fit(features, target)


@workflow
def training_workflow(hyperparameters: dict) -> LogisticRegression:
    """Put all of the steps together into a single workflow."""
    data = get_data()
    processed_data = process_data(data=data)
    return train_model(
        data=processed_data,
        hyperparameters=hyperparameters,
    )
```

--------------------------------

### Initialize Variables for Notebook Execution

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-papermill/tests/testdata/nb-multi.ipynb

Sets up initial variables including integers, strings, and datetime objects. These are used in subsequent notebook cells.
```python
from datetime import datetime, timedelta

x = 10
y = 16
h = "hello"
n = datetime(2020, 10, 10, 10, 10, 10)
```

--------------------------------

### Configure Flyte Workflow for Optuna Optimization

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-optuna/README.md

Sets up the Flyte workflow to orchestrate parallel Optuna trials. It initializes an Optimizer with the objective function and trial configuration, then awaits the optimization process. The best value found is printed.

```python
import flytekit as fl
from flytekitplugins.optuna import Optimizer, suggest


@fl.eager(container_image=image)
async def train(concurrency: int, n_trials: int) -> float:
    optimizer = Optimizer(objective=objective, concurrency=concurrency, n_trials=n_trials)

    await optimizer(
        x=suggest.float(low=-10, high=10),
        y=suggest.integer(low=-10, high=10),
        z=suggest.category([-5, 0, 3, 6, 9]),
        power=2,
    )

    print(optimizer.study.best_value)
```

--------------------------------

### Install Flytekit Modin Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-modin/README.md

Install the plugin using pip. This command is necessary to enable Modin support in your Flytekit environment.

```bash
pip install flytekitplugins-modin
```

--------------------------------

### Common Make Commands

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/contributing.md

Provides a list of useful make commands for development tasks such as formatting, linting, and testing.

```bash
$ make
  setup         Install requirements
  fmt           Format code with ruff
  lint          Run linters
  test          Run tests
  requirements  Compile requirements
```

--------------------------------

### Dockerfile for Flytekit MMCloud Connector Image

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-mmcloud/README.md

This Dockerfile outlines the steps to build a connector image for Flytekit with MMCloud integration.
It installs the necessary plugin, copies the 'float' binary, and sets up the command to serve the connector.

```dockerfile
FROM python:3.11-slim-bookworm

WORKDIR /root
ENV PYTHONPATH /root

# flytekit will autoload the connector if package is installed.
RUN pip install flytekitplugins-mmcloud
COPY float /usr/local/bin/float

CMD pyflyte serve connector --port 8000
```

--------------------------------

### Create Launch Plan for a Fetched Workflow

Source: https://github.com/flyteorg/flytekit/blob/master/docs/source/design/control_plane.md

Creates or retrieves a launch plan for a workflow fetched from a given project and domain, so that the launch plan is associated with that specific remote workflow.

```python
from flytekit import LaunchPlan

flyte_workflow = remote.fetch_workflow(
    name="my_workflow", version="v1", project="flytesnacks", domain="development"
)
launch_plan = LaunchPlan.get_or_create(name="my_launch_plan", workflow=flyte_workflow)
```

--------------------------------

### Install Flytekit Pandera Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-pandera/README.md

Install the Flytekit Pandera plugin using pip. This command is necessary to enable Pandera integration within Flytekit.

```bash
pip install flytekitplugins-pandera
```

--------------------------------

### Install Flytekit GeoPandas Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-geopandas/README.md

Install the plugin using pip. This command adds the necessary libraries to your environment to enable GeoPandas support in Flyte.

```bash
pip install flytekitplugins-geopandas
```

--------------------------------

### Scheduling and Configuring Workflow Launches with LaunchPlan

Source: https://context7.com/flyteorg/flytekit/llms.txt

`LaunchPlan` wraps a workflow with pre-bound inputs, scheduling (cron or fixed-rate), notifications, and concurrency policies. It enables scheduled or externally triggered executions.
```python
import datetime

from flytekit import task, workflow, LaunchPlan
from flytekit.core.schedule import CronSchedule, FixedRate
from flytekit.models.concurrency import ConcurrencyPolicy


@workflow
def etl_pipeline(data_date: datetime.datetime, batch_size: int = 100) -> int:
    ...  # Placeholder for actual workflow logic
    return 0


# Default launch plan (inherits workflow defaults)
default_lp = LaunchPlan.get_or_create(etl_pipeline)

# Named launch plan with an overridden default and a cron schedule.
# data_date stays unbound so the schedule can supply it at kickoff.
scheduled_lp = LaunchPlan.create(
    name="etl_daily_schedule",
    workflow=etl_pipeline,
    default_inputs={"batch_size": 500},
    schedule=CronSchedule(
        schedule="0 6 * * *",  # every day at 06:00
        kickoff_time_input_arg="data_date",  # receives the scheduled kickoff time
    ),
)

# Fixed-rate schedule. data_date is pinned with fixed_inputs because a
# scheduled launch plan cannot leave a required input unbound.
fixed_lp = LaunchPlan.create(
    name="etl_every_10_min",
    workflow=etl_pipeline,
    fixed_inputs={"data_date": datetime.datetime(2024, 1, 1)},
    schedule=FixedRate(duration=datetime.timedelta(minutes=10)),
    concurrency=ConcurrencyPolicy(max=3, policy=ConcurrencyPolicy.PHASE_ABORT_OLDEST),
)
```

--------------------------------

### Install Flytekit DuckDB Plugin

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-duckdb/README.md

Install the Flytekit DuckDB plugin using pip to enable DuckDB functionality within Flytekit.

```bash
pip install flytekitplugins-duckdb
```

--------------------------------

### Great Expectations Type Example

Source: https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-greatexpectations/README.md

Use Great Expectations as a Flyte type to validate data directly in a task's input parameters. This example shows how to configure batch request parameters for validation.
```python
from flytekit import task, workflow
from flytekitplugins.great_expectations import (
    BatchRequestConfig,
    GreatExpectationsFlyteConfig,
    GreatExpectationsType,
)


# The type annotation on `directory` triggers validation when the task receives its input.
@task
def simple_task(
    directory: GreatExpectationsType[
        str,
        GreatExpectationsFlyteConfig(
            datasource_name="data",
            expectation_suite_name="test.demo",
            data_connector_name="my_data_connector",
            batch_request_config=BatchRequestConfig(
                data_connector_query={
                    "batch_filter_parameters": {
                        "year": "2019",
                        "month": "01",
                    },
                    "limit": 10,
                },
            ),
            context_root_dir="great_expectations",
        ),
    ]
) -> str:
    return f"Validation works for {directory}!"


@workflow
def simple_wf(directory: str = "my_assets") -> str:
    return simple_task(directory=directory)
```

--------------------------------

### Launch a registered workflow from CLI

Source: https://context7.com/flyteorg/flytekit/llms.txt

Use `pyflyte run --remote` to launch a workflow on a remote Flyte cluster. Specify the project, domain, the file that defines the workflow, the workflow name, and each workflow input as a `--name value` option.

```bash
pyflyte run --remote \
  --project my-project \
  --domain production \
  my_module.py my_workflow --x 3 --y 7
```
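When launches need to be scripted, the same invocation can be assembled in Python. The sketch below only builds the argument list shown above, using the standard library. `build_pyflyte_run_command` is a hypothetical helper, not part of Flytekit, and it assumes every workflow input is passed as a `--name value` pair.

```python
import shlex
from typing import Dict, List


def build_pyflyte_run_command(
    project: str,
    domain: str,
    module: str,
    workflow: str,
    inputs: Dict[str, object],
) -> List[str]:
    """Build the argv list for `pyflyte run --remote` (hypothetical helper)."""
    command = [
        "pyflyte", "run", "--remote",
        "--project", project,
        "--domain", domain,
        module, workflow,
    ]
    # Assumption: each workflow input maps to a `--<name> <value>` pair.
    for name, value in inputs.items():
        command += [f"--{name}", str(value)]
    return command


command = build_pyflyte_run_command(
    project="my-project",
    domain="production",
    module="my_module.py",
    workflow="my_workflow",
    inputs={"x": 3, "y": 7},
)
# Print a shell-safe rendering of the command.
print(shlex.join(command))
```

Keeping the command as a list rather than a single string means it can be handed to `subprocess.run(command, check=True)` without shell quoting concerns.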