### Install Xee (Prerelease)

Source: https://github.com/google/xee/blob/main/README.md

Use this command to install the latest v0.1.0 prerelease of Xee. For other installation options, refer to the installation documentation.

```bash
pip install --upgrade --pre xee
```

--------------------------------

### Install Pixi Environments

Source: https://github.com/google/xee/blob/main/docs/contributing.md

Install Pixi environments for tests or documentation builds. Ensure Pixi is installed following upstream instructions.

```bash
pixi install -e tests
pixi install -e docs
```

--------------------------------

### Install Xee

Source: https://github.com/google/xee/blob/main/docs/client-vs-server.ipynb

Install the Xee library if you are using Google Colab.

```python
# Install Xee if using Colab.
# !pip install -q xee
```

--------------------------------

### Install Xee and Optional Dependencies

Source: https://context7.com/google/xee/llms.txt

Install Xee using pip or conda-forge. Optional packages for plotting and NetCDF export are also listed.

```bash
pip install --upgrade --pre xee
```

```bash
pip install xee
```

```bash
conda install -c conda-forge xee
```

```bash
pip install matplotlib
```

```bash
pip install netCDF4
```

```bash
pip install h5netcdf
```

--------------------------------

### Install Xee

Source: https://github.com/google/xee/blob/main/docs/quickstart.md

Install the Xee package using pip. Matplotlib is optional for plotting.

```bash
pip install --upgrade xee
```

--------------------------------

### Minimal Xee Example

Source: https://github.com/google/xee/blob/main/README.md

This example demonstrates how to open an Earth Engine ImageCollection as an xarray.Dataset using Xee. Ensure you have authenticated with Earth Engine and replaced 'PROJECT-ID' with your actual Google Cloud project ID. The high-volume endpoint is recommended for reading stored collections.
```python
import ee
import xarray as xr
from xee import helpers

# Authenticate once (on a persistent machine):
#   earthengine authenticate

project = 'PROJECT-ID'  # Set your Earth Engine registered Google Cloud project ID

# Initialize (high-volume endpoint recommended for reading stored collections)
ee.Initialize(project=project, opt_url='https://earthengine-highvolume.googleapis.com')

# Open a dataset by matching its native grid
ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
grid = helpers.extract_grid_params(ic)
ds = xr.open_dataset(ic, engine='ee', **grid)
print(ds)
```

--------------------------------

### Open Dataset with Old API (v0.0.x)

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Example of opening a dataset using the older API with crs, scale, and geometry parameters.

```python
import ee
import xarray as xr

ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    crs='EPSG:4326',
    scale=0.25,  # pixel size in degrees
    geometry=ee.Geometry.Rectangle([-180, -90, 180, 90])
)
```

--------------------------------

### Install Xee with Conda

Source: https://github.com/google/xee/blob/main/docs/installation.md

Use this command to install Xee from the conda-forge channel.

```shell
conda install -c conda-forge xee
```

--------------------------------

### Open Dataset Fitting Geometry with Scale (New API)

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Example of opening a dataset using the new API, fitting a specified geometry with a given scale. Requires shapely for geometry definition.
```python
import ee
import xarray as xr
from xee import helpers
import shapely

# Define your area of interest using shapely
aoi = shapely.geometry.box(-180, -90, 180, 90)  # Global extent

grid_params = helpers.fit_geometry(
    geometry=aoi,
    grid_crs='EPSG:4326',
    grid_scale=(0.25, -0.25)  # (x_scale, y_scale) in degrees
)

ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    **grid_params
)
```

--------------------------------

### Set Environment Variables for Xee Dataflow

Source: https://github.com/google/xee/blob/main/examples/dataflow/README.md

Configure essential environment variables for project, region, repository, container, and service account names. These are used throughout the example for infrastructure setup.

```shell
PROJECT=$(gcloud config get-value project)
REGION=us-central1
REPO=xee-dataflow
CONTAINER=beam-runner
SA_NAME=xee-dataflow-controller
SERVICE_ACCOUNT=${SA_NAME}@${PROJECT}.iam.gserviceaccount.com
```

--------------------------------

### Install netCDF4 or h5netcdf for Xarray

Source: https://github.com/google/xee/blob/main/docs/faq.md

Install either `netCDF4` or `h5netcdf` to enable Xarray to use preferred backends for writing NetCDF files, avoiding `int64` to `int32` casting errors.

```bash
pip install netCDF4
```

```bash
# or
pip install h5netcdf
```

--------------------------------

### Open Dataset with Old API (v0.0.x) - Fixed Scale

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Example of opening a dataset using the older API with a fixed scale and a rectangular geometry.

```python
ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    crs='EPSG:4326',
    scale=1.0,
    geometry=ee.Geometry.Rectangle([-180, -90, 180, 90])
)
```

--------------------------------

### Pin Specific Pre-release Xee Version

Source: https://github.com/google/xee/blob/main/docs/installation.md

Install a specific pre-release version of Xee by pinning the version number.
```shell
pip install xee==0.1.1rc1
```

--------------------------------

### Open Dataset Fitting Geometry with Shape (New API)

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Example of opening a dataset using the new API, fitting a specified geometry with a target shape (pixels). Requires shapely for geometry definition.

```python
import shapely
import xarray as xr
from xee import helpers

aoi = shapely.geometry.box(113.33, -43.63, 153.56, -10.66)  # Australia

grid_params = helpers.fit_geometry(
    geometry=aoi,
    grid_crs='EPSG:4326',
    grid_shape=(256, 256)  # (width, height) in pixels
)

ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    **grid_params
)
```

--------------------------------

### Plotting the First Time Slice

Source: https://github.com/google/xee/blob/main/docs/quickstart.md

Plot the first time slice of the 'temperature_2m' variable from the dataset. Requires matplotlib to be installed.

```python
ds['temperature_2m'].isel(time=0).plot()
```

--------------------------------

### Authenticate Earth Engine (Persistent Environment)

Source: https://github.com/google/xee/blob/main/docs/installation.md

Authenticate the Earth Engine command-line utility for persistent environments like local machines. This is typically a one-time setup.

```shell
earthengine authenticate
```

--------------------------------

### Explicitly Use NetCDF4 or h5netcdf Engine

Source: https://github.com/google/xee/blob/main/docs/faq.md

When both `netCDF4` and `h5netcdf` are installed, you can explicitly specify the engine for `to_netcdf` to ensure compatibility with `int64` coordinates.
```python
ds.to_netcdf("out.nc", engine="netcdf4")
```

```python
# or
ds.to_netcdf("out.nc", engine="h5netcdf")
```

--------------------------------

### Open Dataset with Old API (v0.0.x) - Custom Area

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Example of opening a dataset with a custom region using the older API, requiring manual determination of the scale.

```python
# You had to manually determine the scale from the dataset
ds = xr.open_dataset(
    collection,
    engine='ee',
    crs='EPSG:4326',
    scale=0.25,  # Manually determined
    geometry=my_region
)
```

--------------------------------

### Pre-Processed ImageCollection with Custom Bands

Source: https://github.com/google/xee/blob/main/docs/guide.md

Apply server-side operations, such as calculating NDVI and adding it as a band, to an ImageCollection before opening it with xarray for efficiency. This example uses Landsat 9 data and requires `shapely` for geometry definition.

```python
# Define an AOI as a shapely object for the helper function
sf_aoi_shapely = shapely.geometry.Point(-122.4, 37.7).buffer(0.2)

# Create an ee.Geometry from the shapely object for server-side filtering
coords = list(sf_aoi_shapely.exterior.coords)
sf_aoi_ee = ee.Geometry.Polygon(coords)

# Define a function to calculate NDVI and add it as a band
def add_ndvi(image):
    # Landsat 9 SR bands: NIR = B5, Red = B4
    ndvi = image.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI')
    return image.addBands(ndvi)

# Build the pre-processed collection
processed_collection = (
    ee.ImageCollection('LANDSAT/LC09/C02/T1_L2')
    .filterDate('2024-06-01', '2024-09-01')
    .filterBounds(sf_aoi_ee)
    .map(add_ndvi)
    .select(['NDVI']))

# Define the output grid using a helper
grid_params = helpers.fit_geometry(
    geometry=sf_aoi_shapely,
    grid_crs='EPSG:32610',  # Target CRS in meters (UTM Zone 10N)
    grid_scale=(30, -30)  # Use Landsat's 30m resolution
)

# Open the fully processed collection
ds = xr.open_dataset(processed_collection, engine='ee',
    **grid_params)
```

--------------------------------

### Build Documentation Locally

Source: https://github.com/google/xee/blob/main/docs/README.md

Navigate to the docs directory and run the make html command to build the documentation. Open the generated index.html file in your browser.

```bash
cd docs
make html
open _build/html/index.html  # or xdg-open on Linux
```

--------------------------------

### Full `xr.open_dataset` Parameter Reference

Source: https://context7.com/google/xee/llms.txt

Demonstrates the extensive parameterization of `xr.open_dataset` for fine-grained control over data loading, CRS, transforms, chunking, and more.

```python
import ee
import xarray as xr
from xee import helpers

ee.Initialize(project='YOUR-PROJECT-ID', opt_url='https://earthengine-highvolume.googleapis.com')

ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
grid = helpers.extract_grid_params(ic)

# --- Full parameter reference ---
ds_full = xr.open_dataset(
    ic,  # ee.ImageCollection, ee.Image, asset ID str, or ee:// URI
    engine='ee',
    crs='EPSG:4326',  # Required: output CRS
    crs_transform=(0.1, 0, -180, 0, -0.1, 90),  # Required: 6-tuple affine transform
    shape_2d=(3600, 1800),  # Required: (width, height) in pixels
    n_images=24,  # Limit to first 24 images (default: -1 = all)
    drop_variables=('u_component_of_wind_10m',),  # Drop unwanted bands
    io_chunks={'time': 24, 'x': 256, 'y': 256},  # EE request window sizes
    chunks={'time': 12},  # Dask chunk sizes on returned Dataset
    request_byte_limit=32*1024*1024,  # Per-request limit (default: 48 MB)
    primary_dim_name='time',  # Name of the primary stacked dimension
    primary_dim_property='system:time_start',  # EE property for primary coord
    ee_mask_value=None,  # Sentinel for EE nodata (default: int32 max)
    mask_and_scale=True,  # Lazily apply scale/offset/fill values
    decode_times=True,  # Decode time coordinates to datetime64
    fast_time_slicing=False,  # ID-based slicing (faster but skips computed transforms)
    ee_init_if_necessary=False,  # Auto-init EE on distributed workers
    ee_init_kwargs={'project': 'YOUR-PROJECT-ID'},  # Forwarded to ee.Initialize
    executor_kwargs={'max_workers': 4},  # ThreadPoolExecutor settings
    getitem_kwargs={'max_retries': 8, 'initial_delay': 1000},  # Retry backoff
)

# Access and compute a variable
temp = ds_full['temperature_2m']  # lazy DataArray, shape (time, y, x)
monthly_mean = temp.mean(dim='time').compute()
monthly_mean.plot()  # Requires matplotlib
```

--------------------------------

### Build Docs with Pixi

Source: https://github.com/google/xee/blob/main/AGENTS.md

Build the project documentation using Pixi. This command is part of the development workflow for ensuring documentation is up-to-date.

```bash
pixi run -e docs docs-build
```

--------------------------------

### Run Development Commands with Pixi

Source: https://github.com/google/xee/blob/main/docs/contributing.md

Execute common development tasks such as running tests or building documentation using Pixi environments. Use the appropriate environment flag (-e).

```bash
pixi run -e tests pytest -q xee/ext_test.py
```

```bash
pixi run -e tests pytest -q xee/ext_integration_test.py
```

```bash
pixi run -e docs docs-build
```

```bash
pixi run -e docs docs-check
```

--------------------------------

### Visualize a Time Slice of a Dataset

Source: https://github.com/google/xee/blob/main/docs/guide.md

Opens a dataset and visualizes a specific time slice of a variable. Requires `matplotlib` to be installed.
```python
# First, open a dataset using one of the methods above
aoi = shapely.geometry.box(113.33, -43.63, 153.56, -10.66)  # Australia
grid_params = helpers.fit_geometry(
    geometry=aoi,
    grid_crs='EPSG:4326',
    grid_shape=(256, 256)
)
ds = xr.open_dataset('ECMWF/ERA5_LAND/MONTHLY_AGGR', engine='ee', **grid_params)

# Select the 2m air temperature for the first time step
temp_slice = ds['temperature_2m'].isel(time=0)

# Plot the slice (requires matplotlib)
temp_slice.plot()
```

--------------------------------

### Migrate Global Dataset to New API

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Before and after comparison for opening a global dataset, demonstrating the migration from old parameters to the new fit_geometry helper.

```python
ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    crs='EPSG:4326',
    scale=1.0,
    geometry=ee.Geometry.Rectangle([-180, -90, 180, 90])
)
```

```python
import shapely
from xee import helpers

global_geom = shapely.geometry.box(-180, -90, 180, 90)

grid_params = helpers.fit_geometry(
    geometry=global_geom,
    grid_crs='EPSG:4326',
    grid_scale=(1.0, -1.0)  # Note: negative y-scale for north-up orientation
)

ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    **grid_params
)
```

--------------------------------

### Open Dataset Matching Source Grid (New API)

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Recommended method for opening a dataset using the new API by extracting grid parameters from the ImageCollection.
```python
import ee
import xarray as xr
from xee import helpers

ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
grid_params = helpers.extract_grid_params(ic)
ds = xr.open_dataset(ic, engine='ee', **grid_params)
```

--------------------------------

### Use Source Resolution for Custom Area (New API)

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Demonstrates how to extract source grid parameters (CRS and scale) and apply them to a custom region using fit_geometry for consistent resolution.

```python
import ee
import xarray as xr
from xee import helpers
import shapely

# 1. Extract source grid parameters
ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
source_params = helpers.extract_grid_params(ic)

# 2. Get the source scale
source_crs = source_params['crs']
source_transform = source_params['crs_transform']
source_scale = (source_transform[0], source_transform[4])

# 3. Apply to your custom region
my_region = shapely.geometry.box(-10, 35, 5, 50)  # Western Europe

grid_params = helpers.fit_geometry(
    geometry=my_region,
    geometry_crs='EPSG:4326',
    grid_crs=source_crs,
    grid_scale=source_scale
)

ds = xr.open_dataset(ic, engine='ee', **grid_params)
```

--------------------------------

### Initialize Earth Engine (Standard Endpoint)

Source: https://github.com/google/xee/blob/main/docs/installation.md

Initialize the Earth Engine client to connect to the standard endpoint, which is efficient for computed data and iterative development due to caching. Replace 'your-project-id' with your actual Google Cloud project ID.

```python
ee.Initialize(project='your-project-id')
```

--------------------------------

### Pattern 1: Global analysis with xee v0.1.0

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Migrates a global analysis workflow to use xee's `helpers.fit_geometry` for grid parameters, removing the need for manual transpose during plotting.
```python
import ee
import xarray as xr

ee.Initialize()

ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    crs='EPSG:4326',
    scale=1.0,
    geometry=ee.Geometry.Rectangle([-180, -90, 180, 90])
)

mean_temp = ds['temperature_2m'].mean(dim='time')
mean_temp.transpose().plot()
```

```python
import ee
import xarray as xr
from xee import helpers
import shapely

ee.Initialize()

global_geom = shapely.geometry.box(-180, -90, 180, 90)

grid_params = helpers.fit_geometry(
    geometry=global_geom,
    grid_crs='EPSG:4326',
    grid_scale=(1.0, -1.0)
)

ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    **grid_params
)

mean_temp = ds['temperature_2m'].mean(dim='time')
mean_temp.plot()  # No transpose needed
```

--------------------------------

### Open Dataset with Old API (v0.0.x) - Regional

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Example of opening a regional dataset using the older API with a specific CRS, scale, and Earth Engine geometry.

```python
import ee

aoi = ee.Geometry.Rectangle([-122.5, 37.0, -121.5, 38.0])

ds = xr.open_dataset(
    'LANDSAT/LC09/C02/T1_L2',
    engine='ee',
    crs='EPSG:32610',
    scale=30,
    geometry=aoi
)
```

--------------------------------

### Create Cloud Storage Bucket for Output

Source: https://github.com/google/xee/blob/main/examples/dataflow/README.md

Create a Cloud Storage bucket to store pipeline output. The bucket name is constructed using a prefix and the Cloud Project ID to ensure global uniqueness.

```shell
gcloud storage buckets create gs://xee-out-${PROJECT} --location=$REGION
```

--------------------------------

### Run Unit Tests with Pixi

Source: https://github.com/google/xee/blob/main/AGENTS.md

Execute unit tests for Xee using Pixi environments to ensure reproducible behavior. This command runs the tests in xee/ext_test.py.
```bash
pixi run -e tests pytest -q xee/ext_test.py
```

--------------------------------

### Run Integration Tests with Pixi

Source: https://github.com/google/xee/blob/main/AGENTS.md

Execute integration tests for Xee using Pixi environments. This command focuses on tests located in xee/ext_integration_test.py.

```bash
pixi run -e tests pytest -q xee/ext_integration_test.py
```

--------------------------------

### Extract Grid Parameters from Earth Engine Object

Source: https://context7.com/google/xee/llms.txt

Use `helpers.extract_grid_params` to get native grid parameters (CRS, transform, shape) from an `ee.Image` or `ee.ImageCollection`. This is recommended for exploratory analysis and source grid alignment. The first image's first band is used for ImageCollections.

```python
import ee
import xarray as xr
from xee import helpers

ee.Initialize(project='YOUR-PROJECT-ID', opt_url='https://earthengine-highvolume.googleapis.com')

ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')

# Returns: {'crs': 'EPSG:4326', 'crs_transform': (0.1, 0, -180, 0, -0.1, 90), 'shape_2d': (3600, 1800)}
grid_params = helpers.extract_grid_params(ic)
print(grid_params)

ds = xr.open_dataset(ic, engine='ee', **grid_params)
print(ds)
# <xarray.Dataset>
# Dimensions:  (time: N, y: 1800, x: 3600)
# Coordinates:
#   * time     (time) datetime64[ns] ...
#   * y        (y) float64 ...
#   * x        (x) float64 ...
# Data variables:
#     temperature_2m  (time, y, x) float32 ...
#     ...

# Also works with a single ee.Image
img = ee.Image('ECMWF/ERA5_LAND/MONTHLY_AGGR/202501')
grid_params = helpers.extract_grid_params(img)
ds_single = xr.open_dataset(img, engine='ee', **grid_params)
```

--------------------------------

### EarthEngineStore Initialization

Source: https://github.com/google/xee/blob/main/docs/_autosummary/xee.EarthEngineStore.rst

Initializes an EarthEngineStore object. This is the primary constructor for the class.

```APIDOC
## EarthEngineStore.__init__

### Description
Initializes an EarthEngineStore object.
### Method
__init__
```

--------------------------------

### Initialize Earth Engine Client

Source: https://github.com/google/xee/blob/main/docs/quickstart.md

Initialize the Earth Engine client. The high-volume endpoint is recommended for reading stored collections. Omit `opt_url` for computed collections to use the standard endpoint with caching.

```python
import ee

ee.Initialize(
    project='YOUR-PROJECT-ID',
    opt_url='https://earthengine-highvolume.googleapis.com'
)
```

--------------------------------

### Fit Geometry with Buffer

Source: https://context7.com/google/xee/llms.txt

Use `helpers.fit_geometry` with a buffer option to expand the geometry before fitting it to the grid. This is useful for including areas around a point of interest.

```python
grid_params_buf = helpers.fit_geometry(
    geometry=shapely.geometry.Point(-122.4, 37.7).buffer(0),
    geometry_crs='EPSG:4326',
    buffer=0.5,  # Expand geometry by 0.5 degrees before fitting
    grid_crs='EPSG:4326',
    grid_scale=(0.01, -0.01)
)
```

--------------------------------

### Update cloudbuild.yaml with Environment Variables

Source: https://github.com/google/xee/blob/main/examples/dataflow/README.md

Use sed to replace placeholder variables in the cloudbuild.yaml file with actual environment variables for region, project, repository, and container names. This prepares the build configuration.

```shell
sed -i 's/REGION/'"$REGION"'/g; s/YOUR_PROJECT/'"$PROJECT"'/g; s/REPO/'"$REPO"'/g; s/CONTAINER/'"$CONTAINER"'/g' cloudbuild.yaml
```

--------------------------------

### Authenticate and Initialize Earth Engine

Source: https://github.com/google/xee/blob/main/docs/client-vs-server.ipynb

Authenticate your Earth Engine account and initialize the Earth Engine API with your Cloud project ID.
```python
ee.Authenticate()
ee.Initialize(project='project-id')  # Edit for your Cloud project ID
```

--------------------------------

### Update to Helper-Based Grid Parameters

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Replace the old `xr.open_dataset` call with helper functions to define grid parameters, resolving 'Missing required parameter' errors.

```python
# Add these imports
from xee import helpers
import shapely

# Replace your old xr.open_dataset call with helper-based approach
grid_params = helpers.fit_geometry(
    geometry=your_geometry,
    grid_crs='EPSG:4326',
    grid_scale=(your_scale, -your_scale)
)

ds = xr.open_dataset(collection, engine='ee', **grid_params)
```

--------------------------------

### Submit Docker Container Build to Cloud Build

Source: https://github.com/google/xee/blob/main/examples/dataflow/README.md

Build the Docker container using Cloud Build with the specified configuration file. This command uploads the build configuration and source code to Cloud Build for execution.

```shell
gcloud builds submit --config cloudbuild.yaml
```

--------------------------------

### Initialize Earth Engine (High-Volume Endpoint)

Source: https://github.com/google/xee/blob/main/docs/installation.md

Initialize the Earth Engine client to connect to the high-volume endpoint, suitable for requesting stored data collections. Replace 'your-project-id' with your actual Google Cloud project ID.

```python
ee.Initialize(
    project='your-project-id',
    opt_url='https://earthengine-highvolume.googleapis.com'
)
```

--------------------------------

### Test Xee v0.1.0 Migration

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Verify your migration by opening a dataset, checking dimensions, plotting, and confirming CRS and transform parameters.
```python
import ee
import xarray as xr
from xee import helpers
import shapely

# Initialize Earth Engine
ee.Initialize(project='YOUR-PROJECT')

# Test 1: Open a dataset
ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR').limit(5)
grid_params = helpers.extract_grid_params(ic)
ds = xr.open_dataset(ic, engine='ee', **grid_params)

# Test 2: Check dimensions
print("Dimensions:", ds['temperature_2m'].dims)
# Should print: ('time', 'y', 'x')

# Test 3: Plot without transpose
ds['temperature_2m'].isel(time=0).plot()

# Test 4: Verify CRS and transform
print("CRS:", grid_params['crs'])
print("Transform:", grid_params['crs_transform'])
print("Shape:", grid_params['shape_2d'])
```

--------------------------------

### Import Libraries

Source: https://github.com/google/xee/blob/main/docs/client-vs-server.ipynb

Import necessary libraries for Earth Engine, Xarray, Xee, and Shapely.

```python
import ee
import xarray
from xee import helpers
import shapely
```

--------------------------------

### Pattern 3: Export workflows with xee v0.1.0

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Adapts export workflows to use `xee.helpers.fit_geometry` for grid parameters, removing the need for manual transpose before exporting to NetCDF.
```python
import xarray as xr

ds = xr.open_dataset(
    collection,
    engine='ee',
    crs='EPSG:4326',
    scale=0.1,
    geometry=region
)

# Transpose for proper export
data = ds['variable'].transpose('time', 'y', 'x')
data.to_netcdf('output.nc')
```

```python
import xarray as xr
from xee import helpers

grid_params = helpers.fit_geometry(
    geometry=region,
    grid_crs='EPSG:4326',
    grid_scale=(0.1, -0.1)
)

ds = xr.open_dataset(collection, engine='ee', **grid_params)

# Already in correct dimension order
data = ds['variable']
data.to_netcdf('output.nc')
```

--------------------------------

### EarthEngineBackendArray Methods

Source: https://github.com/google/xee/blob/main/docs/_autosummary/xee.EarthEngineBackendArray.rst

This section details the methods available for the EarthEngineBackendArray class, including initialization, asynchronous data retrieval, and duck array compatibility.

```APIDOC
## EarthEngineBackendArray.__init__

### Description
Initializes the EarthEngineBackendArray.

### Method
__init__
```

```APIDOC
## EarthEngineBackendArray.async_get_duck_array

### Description
Asynchronously retrieves the duck array representation of the Earth Engine array.

### Method
async_get_duck_array
```

```APIDOC
## EarthEngineBackendArray.async_getitem

### Description
Asynchronously retrieves a slice or element from the Earth Engine array.

### Method
async_getitem
```

```APIDOC
## EarthEngineBackendArray.get_duck_array

### Description
Retrieves the duck array representation of the Earth Engine array.

### Method
get_duck_array
```

--------------------------------

### EarthEngineStore.open

Source: https://github.com/google/xee/blob/main/docs/_autosummary/xee.EarthEngineStore.rst

Opens an Earth Engine store, typically by providing an Earth Engine asset ID or path.

```APIDOC
## EarthEngineStore.open

### Description
Open an Earth Engine store.
### Method
open
```

--------------------------------

### Run Xee Integration Tests

Source: https://github.com/google/xee/blob/main/docs/contributing.md

Run Xee integration tests locally using either the unittest module or pytest. Ensure you are authenticated with 'earthengine authenticate' and are on an Xee branch.

```bash
pixi run -e tests python -m unittest xee/ext_integration_test.py
```

```bash
pixi run -e tests python -m pytest xee/ext_integration_test.py
```

--------------------------------

### Create Artifact Registry Repository

Source: https://github.com/google/xee/blob/main/examples/dataflow/README.md

Create a Docker repository in Artifact Registry to store custom Docker containers for Beam pipelines. This command specifies the repository name, location, and format.

```shell
gcloud artifacts repositories create $REPO \
  --location=$REGION \
  --repository-format=docker \
  --description="Repository for hosting the Docker images to test xee with Dataflow" \
  --async
```

--------------------------------

### Check Docs with Pixi

Source: https://github.com/google/xee/blob/main/AGENTS.md

Perform a strict check on the project documentation using Pixi. This is a required step before proposing changes that affect documentation.

```bash
pixi run -e docs docs-check
```

--------------------------------

### Migrate Regional Dataset with EE Geometry to New API

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Before and after comparison for opening a regional dataset using an Earth Engine geometry, showing the conversion to shapely and use of fit_geometry.
```python
import ee

aoi = ee.Geometry.Rectangle([-122.5, 37.0, -121.5, 38.0])

ds = xr.open_dataset(
    'LANDSAT/LC09/C02/T1_L2',
    engine='ee',
    crs='EPSG:32610',
    scale=30,
    geometry=aoi
)
```

```python
import ee
import shapely
from xee import helpers

# Convert EE geometry to shapely (or create directly with shapely)
aoi = shapely.geometry.box(-122.5, 37.0, -121.5, 38.0)

grid_params = helpers.fit_geometry(
    geometry=aoi,
    geometry_crs='EPSG:4326',  # Input geometry CRS
    grid_crs='EPSG:32610',  # Output grid CRS (UTM Zone 10N)
    grid_scale=(30, -30)  # 30m resolution
)

ds = xr.open_dataset(
    'LANDSAT/LC09/C02/T1_L2',
    engine='ee',
    **grid_params
)
```

--------------------------------

### Tuning EE Request Windows and Dask Chunks

Source: https://context7.com/google/xee/llms.txt

Configure `io_chunks` for Earth Engine request windows and `chunks` for Dask chunking. Adjust `request_byte_limit` and `executor_kwargs` for parallel requests. `fast_time_slicing` can improve performance for stored collections.

```python
import ee
import xarray as xr
from xee import helpers

ee.Initialize(project='YOUR-PROJECT-ID', opt_url='https://earthengine-highvolume.googleapis.com')

ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
grid = helpers.extract_grid_params(ic)

ds = xr.open_dataset(
    ic,
    engine='ee',
    **grid,
    n_images=120,  # Only load 10 years of monthly data
    drop_variables=('v_component_of_wind_10m',),  # Drop unneeded bands
    chunks={'time': 12},  # Dask: 1 year per chunk
    io_chunks={'time': 24, 'x': 256, 'y': 256},  # EE window: 2 years per request
    request_byte_limit=32*1024*1024,  # Conservative 32 MB limit
    executor_kwargs={'max_workers': 4},  # Parallel EE requests
    getitem_kwargs={'max_retries': 8, 'initial_delay': 1000},  # Robust retries
    fast_time_slicing=True,  # Faster for stored (non-computed) collections
)

# Keep lazy until needed
temp_mean = ds['temperature_2m'].isel(time=slice(0, 12)).mean('time')
result = temp_mean.compute()  # Triggers EE fetch

# Persist to avoid re-fetching
ds_persisted = ds.persist()  # Requires Dask distributed cluster
```

--------------------------------

### EarthEngineStore.load

Source: https://github.com/google/xee/blob/main/docs/_autosummary/xee.EarthEngineStore.rst

Loads data from the Earth Engine store into memory.

```APIDOC
## EarthEngineStore.load

### Description
Load data from the store.

### Method
load
```

--------------------------------

### xr.open_dataset(..., engine='ee')

Source: https://github.com/google/xee/blob/main/docs/open_dataset.md

This is the primary user-facing API for opening datasets with the Earth Engine backend. It routes calls to xee.EarthEngineBackendEntrypoint.open_dataset.

```APIDOC
## `xr.open_dataset(..., engine='ee')`

### Description
Opens a dataset using the Earth Engine backend, streaming pixels and metadata via `xee.EarthEngineStore`.

### Parameters
When `engine='ee'`, the following grid parameters are required at call time:
- `crs` (str) - Required - Output coordinate reference system.
- `crs_transform` (tuple[float, float, float, float, float, float] | Affine) - Required - Geotransform defining pixel size/origin in the selected CRS.
- `shape_2d` (tuple[int, int]) - Required - Pixel grid size in (width, height) order.

Other common parameters include:
- `filename_or_obj` (ee.ImageCollection | ee.Image | str) - Required - Input source, can be an `ee.ImageCollection`, `ee.Image`, or an asset id string/path.
- `chunks` (int | dict[Any, Any] | Literal['auto'] | None) - Optional - Dask/Xarray chunking for the returned dataset. Defaults to None.
- `n_images` (int) - Optional - Limit the number of images loaded from the collection. Defaults to -1 (all images).
- `primary_dim_name` (str | None) - Optional - Rename the primary stacked dimension. Defaults to 'time'.
- `primary_dim_property` (str | None) - Optional - EE image property used to derive primary-dimension coordinate values. Defaults to 'system:time_start'.
### Input Source (`filename_or_obj`)
Can be one of:
- An `ee.ImageCollection` object
- An `ee.Image` object (wrapped internally as an ImageCollection)
- An asset id string/path, including `ee://...` / `ee:...` style URIs

Example catalog path: `ECMWF/ERA5_LAND/MONTHLY_AGGR` (or URI form `ee://ECMWF/ERA5_LAND/MONTHLY_AGGR`).

### Parameter Name Mapping (User API vs Core Backend)
| User-facing (`xr.open_dataset`) | Core backend (`EarthEngineStore.open`) | Notes |
|---|---|---|
| `filename_or_obj` | `image_collection` | Backend always operates on an `ee.ImageCollection` |
| `io_chunks` | `chunk_store` / `chunks` | Same concept, different name at different layers |
| `ee_mask_value` | `mask_value` | Same behavior |

### Request Example
```python
import xarray as xr

# Example using an asset ID
dataset = xr.open_dataset(
    "ECMWF/ERA5_LAND/MONTHLY_AGGR",
    engine="ee",
    crs="EPSG:4326",
    crs_transform=(0.1, 0, -180, 0, -0.1, 90),
    shape_2d=(3600, 1800),
    chunks=128
)

# Example using an ee.Image object
# import ee
# ee.Initialize()
# image = ee.Image('...')
# dataset = xr.open_dataset(image, engine="ee", ...)
```

### Response
#### Success Response (xarray.Dataset)
Returns an xarray Dataset with data streamed from Earth Engine.

#### Response Example
```json
{
  "example": "xarray.Dataset object representing the Earth Engine data"
}
```
```

--------------------------------

### Open Pre-Processed Earth Engine Collection with Xarray

Source: https://context7.com/google/xee/llms.txt

Build a collection using Earth Engine operations and pass the result to `xr.open_dataset`. Use the standard endpoint for computed collections to leverage server-side caching.
```python
import ee
import xarray as xr
import shapely
from xee import helpers

ee.Initialize(project='YOUR-PROJECT-ID')  # Standard endpoint for computed collections

sf_aoi_shapely = shapely.geometry.Point(-122.4, 37.5).buffer(0.2)
coords = list(sf_aoi_shapely.exterior.coords)
sf_aoi_ee = ee.Geometry.Polygon(coords)

def add_ndvi(image):
    ndvi = image.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI')
    return image.addBands(ndvi)

# Build computed collection with filtering, mapping, and band selection
processed = (
    ee.ImageCollection('LANDSAT/LC09/C02/T1_L2')
    .filterDate('2024-06-01', '2024-09-01')
    .filterBounds(sf_aoi_ee)
    .map(add_ndvi)
    .select(['NDVI'])
)

grid = helpers.fit_geometry(
    geometry=sf_aoi_shapely,
    grid_crs='EPSG:32610',  # UTM Zone 10N (meters)
    grid_scale=(30, -30)  # Landsat native 30 m resolution
)

ds = xr.open_dataset(processed, engine='ee', **grid)
print(ds['NDVI'])
```

--------------------------------

### Server-side Linear Regression with Earth Engine

Source: https://github.com/google/xee/blob/main/docs/client-vs-server.ipynb

Computes the slope of July temperature on the server using Earth Engine's `linearFit` reducer. Only the resulting slope image is downloaded, minimizing data transfer.

```python
import ee
import shapely.geometry
import xarray
from xee import helpers

# Assumes ee.Initialize(...) has already been called.

def k_to_c(image):
    # Convert Kelvin to degrees Celsius.
    return image.select().addBands(image.subtract(273.15))

def add_year_band(image):
    year = ee.Image(image.date().get('year')).rename('year').toFloat()
    return image.addBands(year)

july_deg_c = (ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
              .filterDate('1960', '2020')
              .filter(ee.Filter.calendarRange(7, None, 'month'))
              .select('temperature_2m')
              .map(k_to_c)
              .map(add_year_band))

# linearFit expects the independent variable (year) first.
coeff = july_deg_c.select(['year', 'temperature_2m']).reduce(
    ee.Reducer.linearFit())

global_geom = shapely.geometry.box(-180, -90, 180, 90)
grid_params = helpers.fit_geometry(
    geometry=global_geom,
    grid_crs='EPSG:4326',
    grid_scale=(1.0, -1.0)
)

ds = xarray.open_dataset(
    ee.ImageCollection([coeff]),
    engine='ee',
    **grid_params
)

slope = ds['scale']
slope.plot()
```

--------------------------------

### Authenticate Earth Engine

Source: https://github.com/google/xee/blob/main/docs/quickstart.md

Authenticate your Earth Engine access. Use `earthengine authenticate` for persistent machines or `ee.Authenticate()` in ephemeral environments like Colab.

```bash
earthengine authenticate
```

```python
import ee
ee.Authenticate()
```

--------------------------------

### EarthEngineBackendEntrypoint Methods

Source: https://github.com/google/xee/blob/main/docs/_autosummary/xee.EarthEngineBackendEntrypoint.rst

This section details the methods available for the EarthEngineBackendEntrypoint class, allowing users to interact with Earth Engine datasets.

```APIDOC
## `__init__`

### Description
Initializes the EarthEngineBackendEntrypoint.

### Method
`__init__()`

### Parameters
This method does not take any explicit parameters.

## `guess_can_open`

### Description
Guesses if the backend can open a given path.

### Method
`guess_can_open(path: str)`

### Parameters
#### Path Parameters
- **path** (str) - Required - The path to check.

## `open_dataset`

### Description
Opens a dataset from Earth Engine.

### Method
`open_dataset(path: str, **kwargs)`

### Parameters
#### Path Parameters
- **path** (str) - Required - The path to the Earth Engine asset.

#### Keyword Arguments
- **kwargs** - Additional keyword arguments to pass to the underlying Earth Engine functions.

### Request Example
```python
entrypoint.open_dataset("users/your_username/your_asset")
```

## `open_datatree`

### Description
Opens a dataset as a DataTree from Earth Engine.

### Method
`open_datatree(path: str, **kwargs)`

### Parameters
#### Path Parameters
- **path** (str) - Required - The path to the Earth Engine asset.

#### Keyword Arguments
- **kwargs** - Additional keyword arguments to pass to the underlying Earth Engine functions.

### Request Example
```python
entrypoint.open_datatree("users/your_username/your_asset")
```

## `open_groups_as_dict`

### Description
Opens groups from an Earth Engine asset as a dictionary.

### Method
`open_groups_as_dict(path: str, **kwargs)`

### Parameters
#### Path Parameters
- **path** (str) - Required - The path to the Earth Engine asset.

#### Keyword Arguments
- **kwargs** - Additional keyword arguments to pass to the underlying Earth Engine functions.

### Request Example
```python
entrypoint.open_groups_as_dict("users/your_username/your_asset")
```
```

--------------------------------

### Automatic EE Initialization on Remote Dask Workers

Source: https://context7.com/google/xee/llms.txt

Enable automatic Earth Engine initialization on remote Dask workers using `ee_init_if_necessary=True` and `ee_init_kwargs`. This avoids manual initialization on the scheduler.

```python
import ee
import xarray as xr
from xee import helpers
import shapely

# Do NOT initialize EE here on the scheduler; let workers self-initialize
# ee.Initialize(...)
# omit or use only for grid parameter derivation

aoi = shapely.geometry.box(-10, 35, 5, 50)  # Western Europe
grid = helpers.fit_geometry(
    geometry=aoi,
    grid_crs='EPSG:4326',
    grid_scale=(0.1, -0.1)
)

ds = xr.open_dataset(
    'ECMWF/ERA5_LAND/MONTHLY_AGGR',
    engine='ee',
    **grid,
    chunks={'time': 6},
    ee_init_if_necessary=True,  # Workers will call ee.Initialize automatically
    ee_init_kwargs={
        'project': 'YOUR-PROJECT-ID',
        'opt_url': 'https://earthengine-highvolume.googleapis.com',
    },
)

# Submit to Dask distributed cluster
import dask.distributed as dd
client = dd.Client()
result = ds['temperature_2m'].mean('time').compute()
```

--------------------------------

### Run Dataflow Pipeline with Earth Engine Data

Source: https://github.com/google/xee/blob/main/examples/dataflow/README.md

Execute the `ee_to_zarr_dataflow.py` script to pull data from Earth Engine, transform it to Zarr, and store it. Pass command-line arguments to define execution parameters.

```shell
python ee_to_zarr_dataflow.py \
  --input NASA/GPM_L3/IMERG_V06 \
  --output gs://xee-out-${PROJECT}/output/ \
  --target_chunks='time=6' \
  --runner DataflowRunner \
  --project $PROJECT \
  --region $REGION \
  --temp_location gs://xee-out-${PROJECT}/tmp/ \
  --service_account_email $SERVICE_ACCOUNT \
  --sdk_location=container \
  --sdk_container_image=${REGION}-docker.pkg.dev/${PROJECT}/${REPO}/${CONTAINER} \
  --job_name imerg-dataflow-$(date '+%Y%m%d%H%M%S')
```

--------------------------------

### Minimal `xr.open_dataset` Call

Source: https://context7.com/google/xee/llms.txt

Load an Earth Engine ImageCollection into a lazy xarray Dataset using the minimal required parameters: ImageCollection, engine, and extracted grid parameters.

```python
import ee
import xarray as xr
from xee import helpers

ee.Initialize(project='YOUR-PROJECT-ID',
              opt_url='https://earthengine-highvolume.googleapis.com')

ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
grid = helpers.extract_grid_params(ic)

# --- Minimal call ---
ds = xr.open_dataset(ic, engine='ee', **grid)
```

--------------------------------

### Create Service Account for Dataflow

Source: https://github.com/google/xee/blob/main/examples/dataflow/README.md

Create a dedicated service account for the Xee Dataflow process. This account will be used for authorization of remote workers to interact with Google Cloud services.

```shell
gcloud iam service-accounts create ${SA_NAME} \
  --description="Controller service account for services used with Dataflow" \
  --display-name="Xee Dataflow Controller SA"
```

--------------------------------

### User Grid Helpers

Source: https://github.com/google/xee/blob/main/docs/api.md

High-level utilities for deriving or matching pixel grid parameters passed to xarray.open_dataset(..., engine='ee').

```APIDOC
## User Grid Helpers

High-level utilities for deriving or matching pixel grid parameters passed to ``xarray.open_dataset(..., engine='ee')``.

### Functions
- `fit_geometry`
- `extract_grid_params`
- `set_scale`

### Classes
- `PixelGridParams`
```

--------------------------------

### Pre-PR Checks

Source: https://github.com/google/xee/blob/main/docs/contributing.md

Perform essential checks before opening a Pull Request, including running unit tests and checking documentation. This ensures code quality and documentation integrity.

```bash
pixi run -e tests pytest -q xee/ext_test.py
pixi run -e docs docs-check
```

--------------------------------

### Earth Engine Authentication and Initialization

Source: https://context7.com/google/xee/llms.txt

Authenticate and initialize the Earth Engine client. Use the high-volume endpoint for stored ImageCollections and the standard endpoint for computed workflows.

```python
import ee

# One-time authentication on a persistent machine (CLI)
# earthengine authenticate

# Or inside ephemeral environments (Colab/notebook)
ee.Authenticate()

# Pick ONE of the two ee.Initialize calls below; running both leaves
# only the second configuration in effect.

# Initialize with high-volume endpoint (recommended for stored ImageCollections)
ee.Initialize(
    project='YOUR-PROJECT-ID',
    opt_url='https://earthengine-highvolume.googleapis.com'
)

# Standard endpoint (computed collections / iterative development)
ee.Initialize(project='YOUR-PROJECT-ID')
```

--------------------------------

### Pin to Older Xee Version

Source: https://github.com/google/xee/blob/main/docs/migration-guide-v0.1.0.md

Temporarily maintain the old API by pinning to a version older than v0.1.0 using pip.

```bash
pip install "xee<0.1.0"
```
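
When code has to run in environments that may or may not be pinned, it can help to check which Xee release is installed before choosing a call style. The sketch below is one possible approach and is not part of Xee: `uses_legacy_api` is a hypothetical helper that reads the version through the standard-library `importlib.metadata` and treats any release number below 0.1.0 as the v0.0.x API. It ignores prerelease suffixes such as `rc1`, on the assumption that v0.1.0 prereleases already ship the new grid-parameter API.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def uses_legacy_api(installed: str) -> bool:
    """Return True if a version string has a release number below 0.1.0.

    Hypothetical helper for illustration; not part of Xee.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", installed)
    if match is None:
        raise ValueError(f"Unrecognized version string: {installed!r}")
    # A missing patch component (e.g. "1.2") is treated as 0.
    release = tuple(int(part or 0) for part in match.groups())
    return release < (0, 1, 0)


try:
    installed = version("xee")
    style = ("old (crs/scale/geometry)" if uses_legacy_api(installed)
             else "new (grid parameters)")
    print(f"xee {installed}: {style} API")
except PackageNotFoundError:
    print("xee is not installed")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as `'0.0.9' > '0.0.20'`.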