### Setup PDM and Install Dependencies

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md

Create and activate a dedicated Conda environment for foundry-dev-tools, then install the project and its development dependencies using PDM.

```shell
mamba create -n foundry-dev-tools python=3.12 pdm openjdk=17
mamba activate foundry-dev-tools
```

```shell
pdm install
```

--------------------------------

### Install Pre-Commit Hooks

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md

Install pre-commit hooks to automatically format code and ensure consistency before each commit.

```shell
# after this, every time you commit in this repo, it will run the hooks
# and if it needs to reformat, your commit gets aborted
# and you will need to re-add the reformatted files
pre-commit install
```

--------------------------------

### Install with pip

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/README.md

Use this command to install the foundry-dev-tools package using pip.

```shell
pip install foundry-dev-tools
```

--------------------------------

### Configuration Merging Example

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Illustrates how configuration settings are merged from multiple TOML files and environment variables. Later sources take precedence over earlier ones, and environment variables override values from the files.
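
The merge order can be sketched in plain Python alongside the example files that follow. This is a minimal illustration, not the loader foundry-dev-tools actually uses: the helper names `deep_merge` and `apply_env` are hypothetical, and the environment-variable rule (strip the `FDT_` prefix, split on `__`, lowercase the parts) is an assumption inferred from the `FDT_CONFIG__KEY3` example.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win; nested tables are merged."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def apply_env(config: dict, environ: dict, prefix: str = "FDT_") -> dict:
    """Overlay variables such as FDT_CONFIG__KEY3 onto the config (assumed naming rule)."""
    result = deepcopy(config)
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        # "FDT_CONFIG__KEY3" -> tables ["config"], key "key3"
        *tables, key = name[len(prefix):].lower().split("__")
        node = result
        for table in tables:
            node = node.setdefault(table, {})
        node[key] = value
    return result


# The two TOML files and the environment variable from the example, as plain dicts.
first = {"config": {"key": 123, "key2": "foobar"}, "credentials": {"oauth": {"client_id": "topsecret"}}}
second = {"config": {"key": 987, "key3": "baz"}, "credentials": {"oauth": {"client_secret": "top_client_secret"}}}
env = {"FDT_CONFIG__KEY3": "asdf"}

merged = apply_env(deep_merge(first, second), env)
print(merged)
```

Under these assumptions the sketch yields the same resulting config as the documented example: `key` comes from the second file, `key2` from the first, `key3` from the environment, and both OAuth values are kept.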
```toml [config] key = 123 key2 = "foobar" [credentials.oauth] client_id = "topsecret" ``` ```toml [config] key = 987 key3 = "baz" [credentials.oauth] client_secret = "top_client_secret" ``` ```env FDT_CONFIG__KEY3=asdf ``` ```toml # Resulting config [config] key = 987 key2 = "foobar" key3 = "asdf" [credentials.oauth] client_id = "topsecret" client_secret = "top_client_secret" ``` -------------------------------- ### Modify setup.py for Local Package Installation Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/transforms.md This diff shows how to update the setup.py file to provide default values for package name and version using environment variables. This is necessary for installing the repository locally. ```diff - name=os.environ['PKG_NAME'], + name=os.getenv('PKG_NAME', 'your-project-name'), - version=os.environ['PKG_VERSION'], + version=os.getenv('PKG_VERSION', '1.2.3'), ``` -------------------------------- ### Build Documentation with PDM Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md Build the project's documentation into HTML format using PDM after installing documentation dependencies. ```shell cd docs pdm install pdm run build ``` -------------------------------- ### Install Foundry DevTools with Full Features Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/installation.md Install Foundry DevTools with all optional dependencies, including those for running transforms locally and the S3 compatible dataset API. ```shell pip install 'foundry-dev-tools[full]' ``` -------------------------------- ### Example API Client Class Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/testing.md This is an example of an API client class that can be tested. It includes methods for sending ping/pong signals and retrieving history. 
```python
from __future__ import annotations

from typing import TYPE_CHECKING

from foundry_dev_tools.clients.api_client import APIClient

if TYPE_CHECKING:
    import requests


class ExampleClient(APIClient):
    api_name = "example-api"

    def api_ping(self, ping: bool, **kwargs) -> requests.Response:
        """Sends a signal.

        Args:
            ping: if ping is true then it will send ping, otherwise it will send pong
            **kwargs: gets passed to the context client

        Returns:
            requests.Response:
                the api response json object looks like this:
                {"ping":int,"pong":int}
                both are counters that increment when either a ping or a pong was received.
        """
        return self.api_request("GET", "ping", params={"msg": "ping" if ping else "pong"}, **kwargs)

    def api_ping_history(self, **kwargs) -> requests.Response:
        """History of ping and pong counters.

        Args:
            **kwargs: gets passed to the context client

        Returns:
            requests.Response:
                the api response json object looks like this:
                [{"ping":int,"pong":int},{"ping":int,"pong":int},<...>]
                It is the history tracking the ping and pong counters after each increment by `api_ping` call.
        """
        return self.api_request("GET", "ping/history", **kwargs)
```

--------------------------------

### Profiled Configuration Example

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Defines configuration under a specific profile prefix, such as 'dev'.

```toml
[dev.config]
cache_dir = "/tmp/cache"
```

--------------------------------

### Install Foundry Dev Tools Optional Dependency

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/api.md

Install the 'public' extra dependency for foundry-dev-tools to enable integration with foundry-platform-sdk.

```shell
pip install 'foundry-dev-tools[public]'
```

--------------------------------

### Install with conda

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/README.md

Use this command to install the foundry-dev-tools package from the conda-forge channel using conda or mamba.
```shell conda install -c conda-forge foundry-dev-tools ``` -------------------------------- ### Use Built-in foundry-platform-sdk Client (v2) Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/api.md Access and use the pre-configured foundry_sdk.v2.FoundryClient via FoundryContext. This example reads a dataset table using the ARROW format. ```python import pyarrow as pa import polars as pl from foundry_dev_tools import FoundryContext ctx = FoundryContext() client = ctx.public_client_v2 ds = client.datasets.Dataset.read_table("", format="ARROW") df = pl.from_arrow(pa.ipc.open_stream(ds).read_all()) ``` -------------------------------- ### Display Foundry DevTools Environment Info Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/cli.md Use the 'info' command to verify your environment setup for Foundry DevTools. This output is helpful for troubleshooting and issue reporting. ```shell fdt info ``` -------------------------------- ### Example CachedTokenProvider Implementation Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/token_provider_implementation.md An example implementation of CachedTokenProvider. This class demonstrates how to override the _request_token method to provide a token and its expiration timestamp. ```python import time from foundry_dev_tools.config.token_provider import CachedTokenProvider class ExampleCachedTokenProvider(CachedTokenProvider): def _request_token(self): # expire in 100 seconds return "token", time.time() + 100 ``` -------------------------------- ### Setup AWS Credentials and Account ID Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/examples/lambda/README.md Exports AWS region and account ID as environment variables. Replace '123456789' with your actual AWS Account ID. 
```shell export AWS_REGION="eu-central-1" && \ export AWS_ACCOUNT_ID="123456789" ``` -------------------------------- ### Implement Get User Info API Method Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/api_client_implementation.md Implements the GET /user endpoint to retrieve user information. Handles UserNotFoundError and InsufficientPermissions errors. ```python def api_get_user_info(self, username: str, **kwargs) -> requests.Response: """Returns information about the user. Args: username: for which user to get information """ return self.api_request( "GET", "user", params={"username": username}, error_handling=ErrorHandlingConfig({ "Info:UserNotFoundError": UserNotFoundError, "Info:InsufficientPermissions": InsufficientPermissionsError, }), **kwargs, ) ``` -------------------------------- ### Install Foundry DevTools with pip Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/installation.md Use this command to install the latest version of Foundry DevTools. It's recommended to do this within a dedicated conda or python environment. ```shell pip install 'foundry-dev-tools' ``` -------------------------------- ### Install Foundry DevTools Transforms with pip Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/installation.md Install the foundry-dev-tools-transforms package, which includes PySpark as a dependency, necessary for running transforms and CachedFoundryClient. ```shell pip install 'foundry-dev-tools-transforms' ``` -------------------------------- ### Build Lambda Docker Image Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/examples/lambda/README.md Builds the Docker image for the Lambda function using pixi. Ensure you have Docker installed. 
```shell pixi run --environment ci docker-build-lambda ``` -------------------------------- ### Install Local Python Package Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/transforms.md This shell command installs the current Python package in editable mode from the 'transform-python/src' directory. This makes the transform functions available for local import. ```shell pip install -e . ``` -------------------------------- ### Direct API Call Method Example Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/api_client_implementation.md Example of a direct API call method (`api_get_file`) that wraps a call to `api_request`. This method is used when the client's sole purpose is to interact with an API endpoint. It accepts dataset and transaction details, a logical path, and optional headers and stream parameters. The `**kwargs` are passed to the underlying `api_request` for custom error handling. ```python def api_get_file( self, dataset_rid: DatasetRid, transaction_rid: TransactionRid, logical_path: PathInDataset, range_header: str | None = None, requests_stream: bool = True, **kwargs, ) -> requests.Response: """Returns a file from the specified dataset and transaction. Args: dataset_rid: dataset rid transaction_rid: transaction rid logical_path: path in dataset range_header: HTTP range header requests_stream: passed to :py:meth:`requests.Session.request` as `stream` **kwargs: gets passed to :py:meth:`APIClient.api_request` """ return self.api_request( "GET", f"dataproxy/datasets/{dataset_rid}/transactions/{transaction_rid}/{quote(logical_path)}", headers={"Range": range_header} if range_header else None, stream=requests_stream, **kwargs, ) ``` -------------------------------- ### Implement List Users API Method Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/api_client_implementation.md Implements the GET /users endpoint to list all users. 
Handles InsufficientPermissions errors. ```python def api_list_users(self, **kwargs) -> requests.Response: """List all users.""" return self.api_request( "GET", "users", error_handling=ErrorHandlingConfig({"Info:InsufficientPermissions": InsufficientPermissionsError}), **kwargs, ) ``` -------------------------------- ### Download Entire Dataset to Temporary Folder (FoundryContext) Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md Downloads all files of a dataset to a temporary folder using FoundryContext. The downloaded data can then be read, for example, as a Pandas DataFrame. The temporary folder is automatically cleaned up. ```python from foundry_dev_tools import FoundryContext import pandas as pd ctx = FoundryContext() dataset = ctx.get_dataset("ri.foundry.main.dataset...") with dataset.download_files_temporary() as tmp_dir: df = pd.read_parquet(tmp_dir) print(df.shape) ``` -------------------------------- ### Serverless Python Functions with OAuth Configuration Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md Configuration for serverless Python functions using OAuth authentication. This example demonstrates setting up FoundryContext with a requests session client and OAuth token provider. 
```python from functions.api import function from functions.sources import get_source import json from foundry_dev_tools import FoundryContext, OAuthTokenProvider, Config @function(sources=["SourceFoundryApi"]) def get_user_info() -> str: source = get_source("LsRestMTableauExtractSchedulerFoundryApi") client = source.get_https_connection().get_client() ctx = FoundryContext( config=Config(requests_session=client), token_provider=OAuthTokenProvider( host="your-stack.palantirfoundry.com", client_id=source.get_secret("ClientID"), client_secret=source.get_secret("ClientSecret"), grant_type="client_credentials", ), ) return json.dumps(ctx.multipass.get_user_info()) ``` -------------------------------- ### Manual Transaction Control with Dataset Context Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md When a transaction is manually started, the `transaction_context` will not create a new one but will use the existing one. Manual closing is required. ```python ds.start_transaction() with ds.transaction_context(): # will not start a new transaction print(ds.transaction) # will print the transaction started earlier # will not commit or abort the transaction print(ds.transaction) # still accessible and open # you'll need to close it manually ds.commit_transaction() ``` -------------------------------- ### Login to ECR and Push Lambda Image Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/examples/lambda/README.md Logs into Elastic Container Registry (ECR) and pushes the built Lambda Docker image. This requires prior ECR setup. ```shell pixi run --environment ci docker-login && pixi run --environment ci docker-push-lambda ``` -------------------------------- ### Get and Use Foundry Resources Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/foundry_dev_tools.md Retrieve Foundry resources (like Datasets) using their RID or path. 
Datasets offer specific methods and attributes, including branch management. All resources share common attributes like rid, path, modified, and created.

```python
res = ctx.get_resource("rid")
# depending on the resource, this will return a different class
# for example, if the RID starts with ri.foundry.main.dataset it will return a Dataset class
ds = ctx.get_resource("ri.foundry.main.dataset...")
ds = ctx.get_resource_by_path("/path/to/dataset")
# datasets have their own methods, they basically do the same thing,
# except that you can provide a different branch name
ds = ctx.get_dataset("ri.foundry.main.dataset...", branch="dev/feature1")
# and can create a dataset via the get_dataset_by_path method
ds = ctx.get_dataset_by_path("/path/to/new_dataset", branch="branch_name", create_if_not_exists=True)
# every resource has the same common attributes, like rid, path, modified or created
print(ds.rid, ds.path, ds.modified, ds.created)
# but the dataset class, for example, additionally has the branch attribute
print(ds.branch)
```

--------------------------------

### Import and Compute Transform in Jupyter Notebook

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/transforms.md

This Python code demonstrates how to import a transform function (e.g., 'apply_the_schema') after local installation and then compute its output. The result can be converted to a Pandas DataFrame.

```python
from myproject.datasets import apply_the_schema

# run the function by calling compute
output = apply_the_schema.compute()
# output in pandas format
output.toPandas()
```

--------------------------------

### Serve Documentation Locally with PDM

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md

Build and serve the project's documentation locally via HTTP, with live updates on changes, using PDM.
```shell cd docs pdm install pdm run live ``` -------------------------------- ### Initialize FoundryContext with Configuration Files Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/foundry_dev_tools.md Create a FoundryContext instance, which automatically loads configuration and credentials from your environment and config files. ```python from foundry_dev_tools import FoundryContext # this way it will take the configuration and credentials # from your configuration files and environment variables ctx = FoundryContext() ``` -------------------------------- ### Register Token Provider via Entry Point Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/extending.md Add this configuration to your `pyproject.toml` file to register a custom token provider using setuptools entry points. This allows the token provider to be discovered and used via configuration. ```toml [project.entry-points."fdt_token_provider"] my_token_provider = "your.module.name:MyTokenProvider" ``` -------------------------------- ### Initializing FoundryContext with a Profile Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md How to initialize FoundryContext using a specified profile name. ```python from foundry_dev_tools import FoundryContext ctx = FoundryContext(profile="dev") ``` -------------------------------- ### Register API Client via Entry Point Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/extending.md Configure your `pyproject.toml` to register a custom API client using setuptools entry points. This enables the API client to be accessed directly through the `FoundryContext`. 
```toml [project.entry-points."fdt_api_client"] info = "foundry_dev_tools_info.client:InfoClient" ``` -------------------------------- ### Upload Sample Object to Foundry Dataset Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/examples/lambda/README.md Uploads a sample object to a specified Foundry dataset using the `upload-sample-object` command. The target dataset RID must be provided. ```shell TARGET_DATASET_RID="ri.foundry.main.dataset.1234" \ pixi run --environment ci upload-sample-object ``` -------------------------------- ### List Users Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/api_client_implementation.md Retrieves a list of all users. This is a simple GET request to the /users endpoint. ```APIDOC ## GET /users ### Description List all users. ### Method GET ### Endpoint /users ### Response #### Success Response (200) - **field1** (type) - Description #### Response Example { "example": "response body" } ### Error Handling - **Info:InsufficientPermissions**: You don't have sufficient permissions to use this API. ``` -------------------------------- ### Get User Info Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/api_client_implementation.md Retrieves information about a specific user. Requires the username as a query parameter. ```APIDOC ## GET /user ### Description Returns information about the user. ### Method GET ### Endpoint /user ### Parameters #### Query Parameters - **username** (string) - Required - for which user to get information ### Response #### Success Response (200) - **field1** (type) - Description #### Response Example { "example": "response body" } ### Error Handling - **Info:UserNotFoundError**: The username provided does not exist. - **Info:InsufficientPermissions**: You don't have sufficient permissions to use this API. 
```

--------------------------------

### Read and Write with fsspec

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/s3.md

Demonstrates reading from and writing to Foundry's S3 API using fsspec. It shows how to open files for reading and writing with the storage options provided by the context.

```python
import fsspec

from foundry_dev_tools import FoundryContext

ctx = FoundryContext()
dataset_rid = "ri.foundry.main.dataset.2ce6cb10-59f4-4e19-a3b8-ae3deaf5985e"

# read a file directly via an s3:// URL
with fsspec.open(f"s3://{dataset_rid}/test.csv", "r", **ctx.s3.get_s3fs_storage_options()) as f:
    print(f.read())

# -------------------

# or create a filesystem object and reuse it for writing and reading
fs = fsspec.filesystem("s3", **ctx.s3.get_s3fs_storage_options())
with fs.open(f"{dataset_rid}/fsspec_test_write.txt", "w") as f:
    f.write("hihi")

with fs.open(f"{dataset_rid}/fsspec_test_write.txt", "r") as f:
    print(f.read())
```

--------------------------------

### Initialize FoundryContext with Explicit Parameters

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/foundry_dev_tools.md

Initialize FoundryContext by providing configuration and a token provider directly. This is useful when not relying on default configuration files. Ensure credentials are not hardcoded in production.

```python
from foundry_dev_tools import Config, FoundryContext, JWTTokenProvider, OAuthTokenProvider

# note: credentials shouldn't be stored directly in your code, this is just an example
# jwt:
ctx = FoundryContext(config=Config(), token_provider=JWTTokenProvider(jwt="..."))
# oauth:
ctx = FoundryContext(config=Config(), token_provider=OAuthTokenProvider(client_id="..."))
```

--------------------------------

### Initialize FoundryContext with Config Only

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Initialize FoundryContext in Python with a Config object, reading credentials from files but ignoring the 'config' table.
```python
from foundry_dev_tools import Config, FoundryContext

# If you supply only one of config or token_provider, the config files are still read,
# but they are used only for the parameter you did not supply.
# This reads your config files, but not their 'config' table, only the credentials.
ctx = FoundryContext(config=Config())
```

--------------------------------

### Build with Foundry DevTools

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/cli.md

The 'build' command initiates the build process. You can specify a transform file using the -t flag to execute checks and build for that specific file, or run it without arguments to select from transform files in your last commit.

```shell
fdt build [-t transform_file]
```

--------------------------------

### External Transforms with JWT Authentication

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Example of using external transforms with JWT authentication. This snippet demonstrates how to initialize FoundryContext with a JWT token provider.
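
The transform in this example takes the JWT from `ctx.auth_header.split(" ")[1]`. A slightly more defensive version of that extraction is sketched here; it assumes the header value has the usual `Bearer <token>` form, and the helper name `bearer_token` is hypothetical, not part of foundry-dev-tools or the transforms API.

```python
def bearer_token(auth_header: str) -> str:
    """Return the token part of a 'Bearer <token>' header value."""
    scheme, _, token = auth_header.strip().partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        raise ValueError("expected a header of the form 'Bearer <token>'")
    return token


# Placeholder value for illustration only, not a real credential.
print(bearer_token("Bearer abc.def.ghi"))
```

Failing early with a clear error can be easier to debug than an `IndexError` from `split(" ")[1]` when the header is missing or malformed.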
```python from pyspark.sql import functions as F from transforms.api import transform, Output from transforms.external.systems import use_external_systems, EgressPolicy from foundry_dev_tools import FoundryContext, JWTTokenProvider import json @use_external_systems( egress=EgressPolicy( "ri.resource-policy-manager.global.network-egress-policy.[...]" ), ) @transform( output_transform=Output( "/path/to/dataset" ), ) def compute(ctx, output_transform, egress): fdt_context = FoundryContext( token_provider=JWTTokenProvider( host="your-stack.palantirfoundry.com", jwt=ctx.auth_header.split(" ")[1], ) ) user_info = json.dumps(fdt_context.multipass.get_user_info()) output_transform.write_dataframe( ctx.spark_session.createDataFrame( data=[[user_info]], schema="user_info_json: string" ) ) ``` -------------------------------- ### Run All Pre-Commit Hooks Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md Manually run all configured pre-commit hooks on all files in the repository to check for formatting and style issues. ```shell pre-commit run --all-files ``` -------------------------------- ### Configuration Options TOML Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md Set configuration options like cache directory using TOML format. ```toml [config] cache_dir = "/home/USER/.cache/foundry-dev-tools" ``` -------------------------------- ### Create User Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/api_client_implementation.md Creates a new user with specified details. The `org` parameter is mandatory, and `isAdmin` and `enabled` have default values if not provided. ```APIDOC ## Create User ### Description Creates a new user. The `org` parameter is mandatory. `isAdmin` defaults to false and `enabled` defaults to true if omitted. 
### Method POST ### Endpoint /users ### Parameters #### Query Parameters - **username** (string) - Required - The username of the user to be created. #### Request Body - **org** (string) - Required - The organization name. - **isAdmin** (boolean) - Optional - Controls if the user is an admin. Defaults to false. - **enabled** (boolean) - Optional - Controls if the user's account is enabled. Defaults to true. ### Request Example ```json { "org": "example_org", "isAdmin": false, "enabled": true } ``` ### Response #### Success Response (200) - **User creation successful** #### Response Example ```json { "message": "User created successfully" } ``` #### Error Handling - **Info:UserAlreadyExistsError**: The user already exists. - **Info:InsufficientPermissions**: The authenticated user lacks the necessary permissions. ``` -------------------------------- ### Create Git Tag Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md Locally create a git tag for a new release version. Ensure you have the necessary permissions. ```bash git tag v9.9.9 ``` -------------------------------- ### Implement Custom Foundry API Exception Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/errors.md Example of creating a custom exception by inheriting from FoundryAPIError and overriding the message attribute. This custom exception can then be registered in an error mapping. ```python from foundry_dev_tools.errors.meta import FoundryAPIError class ExampleError(FoundryAPIError): """Raised when the example exception happens.""" message = "Example error happened :O" ``` -------------------------------- ### Configuration Options Environment Variable Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md Set configuration options like cache directory using environment variables. 
```shell
export FDT_CONFIG__CACHE_DIR="/home/USER/.cache/foundry-dev-tools"
```

--------------------------------

### Default Configuration Format

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Standard TOML configuration format without profiles.

```toml
[config]
cache_dir = "/tmp/cache"
```

--------------------------------

### Chaining Dataset Operations

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Chain multiple dataset operations that return the Dataset object itself. This allows for concise sequences of actions like starting a transaction, uploading, and committing.

```python
ds = ctx.get_dataset(...)
ds.start_transaction().put_file(...).upload_schema(...).commit_transaction()
```

--------------------------------

### Initialize FoundryContext with OAuth Provider

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Initialize FoundryContext in Python with an OAuth token provider, bypassing configuration files.

```python
from foundry_dev_tools import Config, FoundryContext, JWTTokenProvider, OAuthTokenProvider

# This way the configuration files are not read (they are ignored)
# note: credentials shouldn't be stored directly in your code, this is just an example
# oauth:
ctx = FoundryContext(config=Config(), token_provider=OAuthTokenProvider(client_id="..."))
```

--------------------------------

### Get Boto3 S3 Resource

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/s3.md

Obtains a boto3 S3 resource instance configured for Foundry. This allows for object-oriented interaction with Foundry datasets, similar to using S3 resources directly.
```python from foundry_dev_tools import FoundryContext ctx = FoundryContext() s3_resource = ctx.s3.get_boto3_resource() obj = s3_resource.Object( bucket_name="ri.foundry.main.dataset.2ce7cb50-41f3-4e22-a6b7-ae4deaf3985e", # replace with a dataset RID of yours key="some-file-in-the-dataset.txt" ) print(obj.last_modified) # returns 2023-04-13 09:57:09+00:00 ``` -------------------------------- ### Run Unit Tests with PDM Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md Execute all unit tests for the project using the PDM command. ```shell pdm run unit ``` -------------------------------- ### Run Linting with PDM Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md Execute the linting process using PDM, which runs the pre-commit hooks to check code formatting and style. ```shell pdm run lint ``` -------------------------------- ### Define InfoClient Class Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/api_client_implementation.md Initializes the InfoClient by inheriting from APIClient and setting the api_name. Includes necessary imports for type checking and error handling. ```python from __future__ import annotations from foundry_dev_tools.clients.api_client import APIClient from typing import TYPE_CHECKING from foundry_dev_tools.errors.handling import ErrorHandlingConfig from foundry_dev_tools.errors.info import ( UserNotFoundError, UserAlreadyExistsError, InsufficientPermissionsError, ) if TYPE_CHECKING: import requests class InfoClient(APIClient): api_name = "info" ``` -------------------------------- ### Get Boto3 S3 Client Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/s3.md Obtains a boto3 S3 client instance configured for Foundry. Use this to interact with Foundry datasets as S3 objects using boto3 methods like head_object. 
```python from foundry_dev_tools import FoundryContext ctx = FoundryContext() s3_client = ctx.s3.get_boto3_client() resp = s3_client.head_object( Bucket="ri.foundry.main.dataset.2ce7cb50-41f3-4e22-a6b7-ae4deaf3985e", # replace with a dataset RID of yours Key="some-file-in-the-dataset.txt" ) print(resp['LastModified']) # returns 2023-04-13 09:57:09+00:00 ``` -------------------------------- ### Get Exception Class Logic Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/errors.md This method within ErrorHandlingConfig determines the appropriate Python exception class for a given HTTP response. It checks custom mappings and then defaults to FoundryAPIError if no specific mapping is found. ```python def get_exception_class(self, response: requests.Response) -> type[FoundryAPIError] | None: # noqa: PLR0911 """Returns the python exception class for the response.""" try: response.raise_for_status() except requests.exceptions.HTTPError: if self.api_error_mapping is not None: if isinstance(self.api_error_mapping, dict): if status_exception := self.api_error_mapping.get(response.status_code): return status_exception if (en := self._get_error_name(response)) and (exc := (self.api_error_mapping.get(en))): return exc else: return self.api_error_mapping if (en := self._get_error_name(response)) and (exc := DEFAULT_ERROR_MAPPING.get(en)): return exc return FoundryAPIError else: if isinstance(self.api_error_mapping, dict) and (exc := self.api_error_mapping.get(response.status_code)): return exc return None ``` -------------------------------- ### Use Custom Token Provider Without Entry Points Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/extending.md Instantiate a custom token provider directly when creating a `FoundryContext` if you are not using package entry points. This method requires manual instantiation and passing the provider to the context. 
```python from foundry_dev_tools import FoundryContext, Host from your.module.name import MyTokenProvider ctx = FoundryContext(token_provider=MyTokenProvider(Host("your-stack.palantirfoundry.com"),...)) ``` -------------------------------- ### External Transforms with OAuth Client Secret Authentication Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md Example of using external transforms with OAuth client secret authentication. This snippet shows how to configure FoundryContext with OAuth details obtained from external system sources. ```python from transforms.api import transform, Output from transforms.external.systems import external_systems, Source, ResolvedSource from foundry_dev_tools import FoundryContext, OAuthTokenProvider import json @external_systems( source=Source("ri.magritte..source.[...]") ) @transform( output_transform=Output( "/path/to/dataset" ), ) def compute(ctx, output_transform, source: ResolvedSource): fdt_context = FoundryContext( token_provider=OAuthTokenProvider( host=source.get_https_connection().url.replace("https://", ""), client_id=source.get_secret("additionalSecretClientId"), client_secret=source.get_secret("additionalSecretClientSecret"), grant_type="client_credentials", ) ) user_info = json.dumps(fdt_context.multipass.get_user_info()) output_transform.write_dataframe( ctx.spark_session.createDataFrame( data=[[user_info]], schema="user_info_json: string" ) ) ``` -------------------------------- ### Login using OAuth with FoundryContext Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/installation.md Execute this Python one-liner to log in using OAuth and retrieve user information, after setting up the necessary credentials. 
```shell
python -c "from foundry_dev_tools import FoundryContext; ctx = FoundryContext(); print(ctx.multipass.get_user_info())"
```

--------------------------------

### Initialize FoundryContext with Custom Requests Session

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Initialize FoundryContext in Python with a custom requests.Session object.

```python
import requests

from foundry_dev_tools import Config, FoundryContext

# if you want to bring your own requests.Session
ctx = FoundryContext(config=Config(debug=True, requests_session=requests.Session()))
```

--------------------------------

### Download Dataset File to Temporary Location (FoundryContext)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Downloads a specific file from a dataset to a local directory (`/tmp/model` in this example) using FoundryContext, then opens and unpickles it. The example does not clean up the downloaded file.

```python
from foundry_dev_tools import FoundryContext
from pathlib import Path
import pickle

ctx = FoundryContext()
dataset = ctx.get_dataset_by_path("/path/to/playground/model1")
model_file = dataset.download_file(output_directory=Path("/tmp/model"), path_in_dataset="model.pickle")

with model_file.open("rb") as model:
    print(pickle.load(model))
```

--------------------------------

### Use Custom API Client Without Entry Points

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/extending.md

Instantiate a custom API client manually and associate it with a `FoundryContext` when not using package entry points. This requires direct instantiation and passing the context to the client.
```python
from foundry_dev_tools import FoundryContext
from foundry_dev_tools_info.client import InfoClient

ctx = FoundryContext()
info_client = InfoClient(ctx)
```

--------------------------------

### Initialize FoundryContext with JWT Provider

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Initialize FoundryContext in Python with a JWT token provider, bypassing configuration files.

```python
from foundry_dev_tools import Config, FoundryContext, JWTTokenProvider

# This way the configuration files are ignored and not read
# note: credentials shouldn't be stored directly in your code, this is just an example
# jwt:
ctx = FoundryContext(config=Config(), token_provider=JWTTokenProvider(jwt="..."))
```

--------------------------------

### Download Specific Files from Dataset (FoundryContext)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Downloads only a specified subset of files from a dataset to a designated output directory using FoundryContext. This is efficient when only certain files are needed.

```python
from foundry_dev_tools import FoundryContext
from pathlib import Path

ctx = FoundryContext()
rid = "ri.foundry.main.dataset.xxxxxxx-xxxx-xxx-xxx-xxxxxxxxx"
ds = ctx.get_dataset(rid)
ds.download_files(output_directory=Path("/path/to/only_few_files"), paths_in_dataset={"file1.png", "file2.png"})
```

--------------------------------

### Configure Cache Directory in Python

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Instantiate FoundryContext with a specific cache directory path using Python.
```python
from foundry_dev_tools import Config, FoundryContext
from pathlib import Path

# use a relative segment: a leading "/" would discard the home directory
cache_dir = Path.home().joinpath(".cache/foundry-dev-tools")
conf = Config(cache_dir=cache_dir)
ctx = FoundryContext(config=conf)
# or after creating the context
ctx.config.cache_dir = cache_dir
```

--------------------------------

### Initialize AWS CLI Profile

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/s3.md

Creates a custom AWS CLI profile that dynamically provides S3 credentials. This is the first step to using the AWS CLI with Foundry's S3 API.

```zsh
fdt s3 init
```

--------------------------------

### Initialize FoundryContext with Debug Enabled

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Initialize FoundryContext in Python with debug logging enabled.

```python
from foundry_dev_tools import Config, FoundryContext

# For example to enable some debug logging
ctx = FoundryContext(config=Config(debug=True))
```

--------------------------------

### Download Dataset File to Temporary Location (FoundryRestClient)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Downloads dataset files using FoundryRestClient. It retrieves the dataset RID, downloads the files to a specified directory, and then opens and processes the first downloaded file.

```python
from foundry_dev_tools import FoundryRestClient
import pickle

rest_client = FoundryRestClient()
rid = rest_client.get_dataset_rid('/path/to/playground/model1')
model_file = rest_client.download_dataset_files(dataset_rid=rid, output_directory='/tmp/model', branch='master')[0]

with open(model_file, 'rb') as file:
    print(pickle.load(file))
```

--------------------------------

### Run Integration Tests with PDM

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md

Execute integration tests using PDM. Ensure necessary environment variables and a valid config file are set.
```shell
pdm run integration
```

--------------------------------

### Download Entire Dataset to Temporary Folder (FoundryRestClient)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Downloads all files of a dataset to a temporary folder using FoundryRestClient. This is useful for processing the entire dataset, such as reading it into a Pandas DataFrame. The temporary directory is automatically removed after use.

```python
from foundry_dev_tools import FoundryRestClient
import pandas as pd

rest_client = FoundryRestClient()
rid = "ri.foundry.main.dataset.xxxxxxx-xxxx-xxx-xx-xxxxxxxxxx"

with rest_client.download_dataset_files_temporary(dataset_rid=rid, view='master') as temp_folder:
    df = pd.read_parquet(temp_folder)

print(df.shape)
```

--------------------------------

### Clone the Repository

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/contribute.md

Clone the foundry-dev-tools repository to your local machine.

```shell
git clone https://github.com/emdgroup/foundry-dev-tools
cd foundry-dev-tools
```

--------------------------------

### Parse Credentials Configuration

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/token_provider_implementation.md

Parses a credentials configuration dictionary to create and return a TokenProvider object. Handles missing configurations, domain validation, and dynamic token provider instantiation. Use this when setting up authentication for Foundry services.
```python
def parse_credentials_config(config_dict: dict | None) -> TokenProvider:
    """Parses the credentials config dictionary and returns a TokenProvider object."""
    # check if there is a credentials config present
    if config_dict is not None and (credentials_config := config_dict.get("credentials")):
        # a domain must always be provided
        if "domain" not in credentials_config:
            raise MissingFoundryHostError
        # create a host object with the domain and the optional scheme setting
        host = Host(credentials_config.pop("domain"), credentials_config.pop("scheme", None))
        # get the token provider config setting, if it does not exist use an empty dict
        try:
            tp_name, tp_config = credentials_config.popitem()
            # make it possible to do jwt = "eyJ" instead of jwt = {jwt="eyJ"}
            if tp_config is None or len(tp_config) == 0:
                tp_config = {}
            elif not isinstance(tp_config, dict):
                tp_config = {tp_name: tp_config}
        except KeyError:
            tp_name, tp_config = None, None
        if tp_name:
            if mapped_class := TOKEN_PROVIDER_MAPPING.get(tp_name):
                # check the config kwargs and pass the valid kwargs to the mapped class
                return mapped_class(**check_init(mapped_class, "credentials", {"host": host, **tp_config}))
            # if the token_provider name was set but not present in the mapping
            msg = f"The token provider implementation {tp_name} does not exist."
            raise TokenProviderConfigError(msg)
        # use flask/dash/streamlit provider when used in the app service
        if "APP_SERVICE_TS" in os.environ:
            return AppServiceTokenProvider(host=host)
        raise MISSING_TP_ERROR
    raise MissingCredentialsConfigError
```

--------------------------------

### Apache License 2.0

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/index.md

The project is licensed under the Apache License, Version 2.0. This block shows the URL to obtain a copy of the license.
```none
https://www.apache.org/licenses/LICENSE-2.0
```

--------------------------------

### Deploy Lambda with VPC Configuration

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/examples/lambda/README.md

Deploys the Lambda function with specified VPC and subnet IDs, enabling it to access resources within a private network, such as Foundry.

```shell
VPC_ID="vpc-123456" \
SUBNET_ID1="subnet-12345" \
SUBNET_ID2="subnet-12345" \
pixi run --environment ci cfn-deploy-lambda
```

--------------------------------

### Upload Folder Content to Foundry Dataset

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Uploads all files from a local folder to a specified Foundry dataset. Creates the dataset if it doesn't exist.

```python
from foundry_dev_tools import FoundryContext
from pathlib import Path

ctx = FoundryContext()
dataset = ctx.get_dataset_by_path("/path/to/test_folder_upload", create_if_not_exist=True)
dataset.upload_folder(Path("/path/to/folder-to-upload"))
```

--------------------------------

### Multiple Profiles in User/System Configuration

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Defining multiple distinct profiles with their own configurations and credentials.

```toml
[one.config]
requests_ca_bundle = '/path/to/bundle/for/one'

[one.credentials]
domain = "one.plntr-domain"
scheme = "http"
jwt="eyJ..1"

[two.config]
requests_ca_bundle = '/path/to/bundle/for/two'

[two.credentials]
domain = "two.plntr-domain"
jwt="eyJ..2"
```

--------------------------------

### Load Dataset File In-Memory (FoundryRestClient)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Downloads a specific file from a dataset into memory as bytes using FoundryRestClient. This method allows for efficient handling of files without writing them to disk.
```python
from foundry_dev_tools import FoundryRestClient
import pickle

rest_client = FoundryRestClient()
rid = rest_client.get_dataset_rid('/path/to/playground/model1')
model_file_bytes = rest_client.download_dataset_file(
    dataset_rid=rid,
    output_directory=None,
    foundry_file_path='model.pickle',
    view='master',
)
print(pickle.loads(model_file_bytes))
```

--------------------------------

### JWTTokenProvider Implementation

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/dev/architecture/token_provider_implementation.md

The simplest implementation of a token provider, directly using a provided JWT. It requires a host and a JWT at initialization.

```python
class JWTTokenProvider(TokenProvider):
    """Provides Host and Token."""

    def __init__(self, host: Host | str, jwt: Token) -> None:
        """Initialize the JWTTokenProvider.

        Args:
            host: the foundry host
            jwt: the jwt token
        """
        super().__init__(host)
        self._jwt = jwt

    @cached_property
    def token(self) -> Token:
        """Returns the token supplied when creating this Provider."""
        return self._jwt
```

--------------------------------

### List Folder Children using CompassClient

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/foundry_dev_tools.md

Use the CompassClient via FoundryContext to retrieve a list of all objects within a specified folder, identified by its RID.

```python
# list all children of the folder with the rid 'ri.compass.main.folder....'
children = list(ctx.compass.get_child_objects_of_folder("ri.compass.main.folder...."))
```

--------------------------------

### Setting a Default Profile in Project Configuration

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/configuration.md

Specifying a default profile to be used for the project.
```toml
profile = "one"
```

--------------------------------

### Download Specific Files from Dataset (FoundryRestClient)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Downloads a specific list of files from a dataset using FoundryRestClient. This method allows for targeted downloads, saving time and storage space.

```python
from foundry_dev_tools import FoundryRestClient

rest_client = FoundryRestClient()
rid = "ri.foundry.main.dataset.xxxxxxx-xxxx-xxx-xxx-xxxxxxxxx"
rest_client.download_dataset_files(dataset_rid=rid, output_directory='/path/to/only_few_files', files=['file1.png', 'file2.png'], branch='master')
```

--------------------------------

### OAuth Credentials Configuration (Basic)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/installation.md

Configure OAuth authentication with your Foundry domain, client ID, and an optional client secret. This is suitable for the authorization code grant.

```toml
[credentials]
domain = ".palantirfoundry.comp"

[credentials.oauth]
client_id = "client_id"
client_secret = "client_secret" # optional with authorization code grant
```

--------------------------------

### OAuth Credentials Configuration (Client Credentials Grant)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/getting_started/installation.md

Configure OAuth authentication for the client credentials grant, requiring the domain, client ID, client secret, and specifying the grant type.

```toml
[credentials]
domain = "palantir foundry domain"

[credentials.oauth]
client_id = "your client_id"
client_secret = "your client secret" # required with the client credentials grant
grant_type = "client_credentials"
```

--------------------------------

### Load Dataset File In-Memory (FoundryContext)

Source: https://github.com/emdgroup/foundry-dev-tools/blob/main/docs/examples/dataset.md

Loads a specific file from a dataset directly into memory as bytes using FoundryContext. This is useful for smaller files that do not require disk storage.
```python
from foundry_dev_tools import FoundryContext
import pickle

ctx = FoundryContext()
dataset = ctx.get_dataset_by_path("/path/to/playground/model1")
model_file_bytes = dataset.get_file("model.pickle")
print(pickle.loads(model_file_bytes))
```
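
The in-memory pattern above can be tried without a Foundry connection. The following is a minimal sketch, assuming the file content arrives as a plain `bytes` object like the one `dataset.get_file("model.pickle")` is shown returning above. The `model` dictionary and its contents are hypothetical stand-ins created locally; only the standard library is used.

```python
import io
import pickle

# Hypothetical stand-in for the bytes a Foundry download would return;
# built locally so the sketch runs without any connection.
model = {"name": "model1", "weights": [0.1, 0.2, 0.3]}
model_file_bytes = pickle.dumps(model)

# Option 1: deserialize directly from the bytes object.
loaded_direct = pickle.loads(model_file_bytes)

# Option 2: wrap the bytes in a file-like object for APIs that expect a stream.
with io.BytesIO(model_file_bytes) as buffer:
    loaded_stream = pickle.load(buffer)

# Both routes yield an object equal to the original.
print(loaded_direct == loaded_stream == model)
```

Wrapping the bytes in `io.BytesIO` is useful when a library only accepts file-like objects, since it avoids writing the content to disk first.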