### Install xarray-regrid

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/getting_started.rst

Installs the xarray-regrid package using pip. This is the first step to using the library's functionalities.

```shell
pip install xarray-regrid
```

--------------------------------

### Basic Regridding with xarray-regrid

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/getting_started.rst

Demonstrates how to import the xarray_regrid library, load input and target grid data using xarray, and then apply linear or conservative regridding. It shows the basic workflow for transforming data to a new resolution.

```python
import xarray_regrid  # Registers the Dataset.regrid accessor.
import xarray as xr

ds = xr.open_dataset("input_data.nc")
ds_grid = xr.open_dataset("target_grid.nc")

ds = ds.regrid.linear(ds_grid)

# or, for example:
ds = ds.regrid.conservative(ds_grid, latitude_coord="lat")
```

--------------------------------

### Setup Source and Target Grids for Regridding (Python)

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_xesmf.ipynb

Initializes source and target xarray Datasets for regridding operations using xarray-regrid. It defines grid resolutions and boundaries. Dependencies include dask, xarray, xesmf, and xarray-regrid.

```python
import dask.array as da
import xarray as xr
import xesmf

import xarray_regrid

bounds = dict(south=-90, north=90, west=-180, east=180)

source = xarray_regrid.Grid(
    resolution_lat=0.25,
    resolution_lon=0.25,
    **bounds,
).create_regridding_dataset()

target = xarray_regrid.Grid(
    resolution_lat=1,
    resolution_lon=1,
    **bounds,
).create_regridding_dataset()
```

--------------------------------

### Install xarray-regrid with Acceleration Extras

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/README.md

Installs xarray-regrid with optional dependencies for enhanced performance, including dask for parallelization, sparse for sparse weight matrices, and opt-einsum for optimized routines.
```console
pip install xarray-regrid[accel]
```

--------------------------------

### Create Source Data and Target Grid for Regridding

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Demonstrates how to create a sample xarray Dataset for source data and define a target grid using xarray-regrid's Grid utility. This is the initial step before performing any regridding operations.

```python
import xarray as xr
import xarray_regrid
import numpy as np

# Create source data
ds = xr.Dataset({
    "temp": (["lat", "lon"], [[20, 21], [22, 23]]),
    "lat": [50, 51],
    "lon": [5, 6]
})

# Define target grid
target = xarray_regrid.create_regridding_dataset(
    xarray_regrid.Grid(north=52, east=7, south=49, west=4, resolution_lat=1, resolution_lon=1)
)
```

--------------------------------

### Import Modules and Load Data with xarray-regrid

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_bilinear.ipynb

Imports necessary libraries including dask, xarray, xarray_regrid, and xesmf. It initializes a Dask client and loads the ERA5 monthly dewpoint temperature dataset. This sets up the environment for regridding operations.

```python
from time import time

import dask.distributed
import xarray_regrid  # Importing this will make Dataset.regrid accessible.
import xarray as xr
from xarray_regrid import Grid
import xesmf as xe

client = dask.distributed.Client()

ds = xr.open_dataset(
    "data/era5_2m_dewpoint_temperature_2000_monthly.nc",
)
```

--------------------------------

### CDO Command Line Regridding

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_bilinear.ipynb

Demonstrates the command-line usage of Climate Data Operators (CDO) for bilinear regridding. It includes steps to convert the input datasets to 64-bit floats for accuracy and then perform the remapping using `cdo remapbil`. The output is saved to a NetCDF file.
```bash
cdo -b 64 -copy era5_2m_dewpoint_temperature_2000_monthly.nc era5_2m_dewpoint_temperature_2000_monthly_64b.nc
cdo -b 64 -copy new_grid.nc new_grid_64b.nc
cdo remapbil,new_grid_64b.nc era5_2m_dewpoint_temperature_2000_monthly_64b.nc cdo_bilinear_64b.nc
```

--------------------------------

### Create Target Dataset for Regridding with xarray-regrid

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_bilinear.ipynb

Defines a new target grid using the `Grid` dataclass from `xarray-regrid`. It then creates a dataset representing this new grid, which will be used as the target for regridding operations with both xarray-regrid and xESMF.

```python
new_grid = Grid(
    north=90,
    east=180,
    south=45,
    west=90,
    resolution_lat=0.17,
    resolution_lon=0.17,
)
target_dataset = new_grid.create_regridding_dataset()
```

--------------------------------

### Perform Linear Regridding with xarray-regrid

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/README.md

Demonstrates basic usage of the xarray-regrid accessor to perform a linear regridding operation. It requires importing the library, opening input and target grid datasets, and then applying the regrid accessor.

```python
import xarray as xr
import xarray_regrid  # Registers the Dataset.regrid accessor.

ds = xr.open_dataset("input_data.nc")
ds_grid = xr.open_dataset("target_grid.nc")

ds.regrid.linear(ds_grid)
```

--------------------------------

### Instantiate xesmf Regridder (Python)

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_xesmf.ipynb

Creates an instance of the xesmf Regridder using the 'conservative' method. This process can be computationally expensive for large grids. Dependencies include xesmf, and the pre-defined source and target grids.
```python
# For larger grids, generating weights is quite expensive
xesmf_regridder = xesmf.Regridder(source, target, "conservative")
```

--------------------------------

### Import Libraries and Initialize Dask Client

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_most_common.ipynb

Imports the xarray and xarray_regrid libraries, along with dask.distributed for parallel computing. Initializes a Dask client to manage distributed computations, which is beneficial for large datasets.

```python
import xarray as xr
import xarray_regrid

import dask.distributed

client = dask.distributed.Client()
```

--------------------------------

### Regrid data using CDO command-line tool

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_conservative.ipynb

Demonstrates how to perform conservative regridding using the Climate Data Operators (CDO) command-line program. This involves converting the dataset to 64-bit floats, creating a target grid file, and then using `cdo remapcon` to perform the regridding. The output is saved to a new NetCDF file.
```python
# Notebook cell: lines starting with "!" are IPython shell escapes that run CDO.
import numpy as np

target_dataset["test_data"] = (
    ["latitude", "longitude"],
    np.zeros((target_dataset["latitude"].size, target_dataset["longitude"].size)),
)
target_dataset.to_netcdf("data/new_grid_con.nc")

!cdo -b 64 -copy data/era5_total_precipitation_2020_monthly.nc data/era5_total_precipitation_2020_monthly_64b.nc
!cdo -b 64 -copy data/new_grid_con.nc data/new_grid_64b.nc
!cdo remapcon,data/new_grid_64b.nc data/era5_total_precipitation_2020_monthly_64b.nc data/cdo_conservative_64b.nc

data_cdo = xr.open_dataset("data/cdo_conservative_64b.nc")
```

--------------------------------

### Create Target Grid and Dataset

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_nearest.ipynb

Defines a new geographical grid using the `Grid` dataclass from `xarray-regrid` and creates a corresponding target dataset with specified boundaries and resolution. This dataset serves as the target for regridding operations.

```python
new_grid = Grid(
    north=90,
    east=180,
    south=45,
    west=90,
    resolution_lat=0.17,
    resolution_lon=0.17,
)
target_dataset = new_grid.create_regridding_dataset()
target_dataset
```

--------------------------------

### Regrid Data using CDO (Command Line)

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_nearest.ipynb

Demonstrates the command-line usage of Climate Data Operators (CDO) for nearest-neighbor regridding. It includes steps for converting the dataset to 64-bit floats for accuracy and then applying the `remapnn` operation. The regridded data is then loaded back into an xarray Dataset.
```bash
# Convert to 64-bit floats
cdo -b 64 -copy era5_2m_dewpoint_temperature_2000_monthly.nc era5_2m_dewpoint_temperature_2000_monthly_64b.nc
cdo -b 64 -copy new_grid.nc new_grid_64b.nc

# Perform nearest neighbor remapping
cdo remapnn,new_grid_64b.nc era5_2m_dewpoint_temperature_2000_monthly_64b.nc cdo_nearest_64b.nc
```

```python
data_cdo = xr.open_dataset("data/cdo_nearest_64b.nc")
```

--------------------------------

### Regrid data using xESMF library

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_conservative.ipynb

Shows how to use the xESMF library for conservative regridding. It first computes the regridding weights using `xe.Regridder` and then applies these weights to regrid the dataset. The time taken for both weight generation and the actual regridding is measured.

```python
t0 = time()
regridder = xe.Regridder(ds, target_dataset, "conservative")
print(f"Elapsed time generating weights: {time() - t0:.3f} seconds")

t0 = time()
data_esmf: xr.Dataset = regridder(ds, keep_attrs=True).compute()
print(f"Elapsed time regridding: {time() - t0:.3f} seconds")
```

--------------------------------

### Perform Bilinear Regridding with xarray-regrid

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_bilinear.ipynb

Applies bilinear regridding to the dataset using the `Dataset.regrid.linear` method provided by `xarray-regrid`. The operation is timed, and the result is computed and stored. This snippet demonstrates the core regridding functionality of the xarray-regrid extension.
```python
t0 = time()
data_regrid = ds.regrid.linear(target_dataset)
data_regrid = data_regrid.compute()
print(f"Elapsed time: {time() - t0:.3f} seconds")
```

--------------------------------

### Import modules and load data with Dask

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_conservative.ipynb

Imports necessary libraries like time, dask, xarray, and xarray_regrid. It initializes a Dask client and loads ERA5 precipitation data, persisting it in memory with specified chunk sizes for Dask.

```python
from time import time

import dask.distributed
import xarray_regrid  # Importing this will make Dataset.regrid accessible.
import xarray as xr
from xarray_regrid import Grid
import xesmf as xe

import warnings
warnings.filterwarnings('ignore')

client = dask.distributed.Client()

ds = xr.open_dataset(
    "data/era5_total_precipitation_2020_monthly.nc",
    chunks={"longitude": 400, "latitude": 400},
).persist()
```

--------------------------------

### Perform Bilinear Regridding with xESMF

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_bilinear.ipynb

Regrids the dataset using the xESMF library. It first creates a `Regridder` object specifying the input and output grids and the 'bilinear' method. The regridder is then applied to the dataset, and the elapsed time is measured. This showcases the xESMF workflow for regridding.

```python
t0 = time()
regridder = xe.Regridder(ds, target_dataset, "bilinear")
data_esmf: xr.Dataset = regridder(ds, keep_attrs=True)
print(f"Elapsed time: {time() - t0:.3f} seconds")
```

--------------------------------

### Load CDO Regridded Data with xarray

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_bilinear.ipynb

Loads the NetCDF file containing the data regridded by CDO into an xarray Dataset.
This allows for programmatic comparison with the results obtained from xarray-regrid and xESMF.

```python
data_cdo = xr.open_dataset("data/cdo_bilinear_64b.nc")
```

--------------------------------

### Regrid Data using xarray-regrid

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_nearest.ipynb

Performs nearest-neighbor regridding using the `xarray-regrid` accessor (`Dataset.regrid.nearest`). The code measures the time taken for the regridding operation and computes the result using Dask.

```python
t0 = time()
data_regrid = ds.regrid.nearest(target_dataset)
data_regrid = data_regrid.compute()
print(f"Elapsed time: {time() - t0:.3f} seconds")
```

--------------------------------

### Create Conda Environment from YAML

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/README.md

This command creates a conda environment using a specified YAML file, typically used for setting up environments for projects that require specific dependencies like ESMF and CDO for benchmarking.

```sh
micromamba create -n environment_name -f environment.yml
```

--------------------------------

### Benchmark Regridding Performance with Nan Threshold

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_conservative.ipynb

Measures the runtime of the `conservative` regridding method using different `nan_threshold` and `skipna` configurations. It loads sample data, defines a target grid, and then computes the regridded dataset for each configuration, printing the elapsed time.
```python
ds_sst = xr.tutorial.open_dataset("ersstv5")[["sst"]].rename(lon="longitude", lat="latitude").persist()

new_grid = Grid(
    north=90,
    east=360,
    south=-90,
    west=0,
    resolution_lat=4,
    resolution_lon=4,
)
target_dataset = new_grid.create_regridding_dataset()

t0 = time()
data_skipna_false = ds_sst.regrid.conservative(target_dataset, skipna=False).compute()
print(f"Elapsed time skipna=False: {time() - t0:.3f} seconds")

t0 = time()
data_nt0 = ds_sst.regrid.conservative(target_dataset, skipna=True, nan_threshold=0).compute()
print(f"Elapsed time skipna=True: {time() - t0:.3f} seconds")

data_nt1 = ds_sst.regrid.conservative(target_dataset, skipna=True, nan_threshold=1).compute()
```

--------------------------------

### Perform conservative regridding with xarray-regrid accessor

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_conservative.ipynb

Applies conservative regridding to the dataset using the `Dataset.regrid.conservative` accessor provided by `xarray-regrid`. It computes the result and measures the time taken for the regridding operation.

```python
t0 = time()
data_regrid = ds.regrid.conservative(target_dataset, skipna=False, latitude_coord="latitude")
data_regrid = data_regrid.compute()
print(f"Elapsed time: {time() - t0:.3f} seconds")
```

--------------------------------

### Set up Dask Distributed Client

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_conservative_nan_threshold.ipynb

Initializes a Dask distributed client for enhanced memory management and parallel computation. This is often a prerequisite for processing large datasets with libraries like xarray and dask.
```python
from dask import distributed

c = distributed.Client()
```

--------------------------------

### Python Display Ratio of Regridding Times

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_xesmf.ipynb

This Python code snippet simply displays the calculated 'ratio' DataFrame, which holds the ratio of xesmf to xarray-regrid execution times for various regridding scenarios (values above 1 mean xarray-regrid was faster). It's the final step in visualizing the timing results.

```python
ratio
```

--------------------------------

### Create target dataset grid with xarray-regrid

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_conservative.ipynb

Defines a new target grid using the `Grid` dataclass from `xarray-regrid`. This creates a dataset with the specified geographical boundaries and resolution, which will be used as the target for regridding operations.

```python
new_grid = Grid(
    north=90,
    east=360,
    south=-90,
    west=0,
    resolution_lat=2.2,
    resolution_lon=2.2,
)
target_dataset = new_grid.create_regridding_dataset()
target_dataset
```

--------------------------------

### Grid Creation for Regridding

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Defines target grids for regridding operations using the `xarray_regrid.Grid` object. Grids can be created globally or regionally and then converted into xarray Datasets suitable for regridding functions. Coordinate names can be customized.
```python
import xarray_regrid

# Create Grid object
grid = xarray_regrid.Grid(
    north=90.0,           # Northern boundary (degrees)
    east=180.0,           # Eastern boundary (degrees)
    south=-90.0,          # Southern boundary (degrees)
    west=-180.0,          # Western boundary (degrees)
    resolution_lat=0.25,  # Latitude spacing
    resolution_lon=0.25,  # Longitude spacing
)

# Method 1: Create dataset from Grid object
target_ds = grid.create_regridding_dataset(
    lat_name="latitude",
    lon_name="longitude"
)

# Method 2: Using module-level function (here with short coordinate names)
target_ds_short = xarray_regrid.create_regridding_dataset(
    grid,
    lat_name="lat",
    lon_name="lon"
)

# Created dataset has coordinate attributes (shown for the Method 1 dataset)
print(target_ds.latitude.attrs)   # {'units': 'degrees_north'}
print(target_ds.longitude.attrs)  # {'units': 'degrees_east'}

# Regional grid example
regional = xarray_regrid.Grid(
    north=55.0, east=15.0,
    south=45.0, west=5.0,
    resolution_lat=0.1,
    resolution_lon=0.1,
)
regional_grid = regional.create_regridding_dataset()

# Output: xarray.Dataset with latitude/longitude coordinates
# No data variables - ready to use as target grid
```

--------------------------------

### Generate Synthetic Source Data for Regridding (Python)

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_xesmf.ipynb

Creates a synthetic dask array representing source data with specified dimensions (time, latitude, longitude) and chunking. It uses random data and includes date-based time coordinates. Dependencies include dask and xarray.
```python
def source_data(source, chunks, n_times=1000):
    data = da.random.random(
        size=(n_times, source.latitude.size, source.longitude.size),
        chunks=chunks,
    ).astype("float32")

    data = xr.DataArray(
        data,
        dims=["time", "latitude", "longitude"],
        coords={
            "time": xr.date_range("2000-01-01", periods=n_times, freq="D"),
            "latitude": source.latitude,
            "longitude": source.longitude,
        }
    )

    return data
```

--------------------------------

### Python Regridding Timing Functions

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_xesmf.ipynb

This Python code defines functions to time regridding operations using both xarray-regrid and xesmf. It includes helper functions for performing the regridding and measuring execution times across different chunking schemes and NaN skipping options. Dependencies include 'time', 'pandas', and assumed 'xesmf_regridder', 'source_data', 'chunk_schemes' variables.

```python
import time
import pandas as pd

pd.options.display.precision = 1


def do_regrid(data, target, skipna):
    data.regrid.conservative(target, skipna=skipna).compute()


def do_xesmf(data, target, skipna):
    xesmf_regridder(data, skipna=skipna).compute()


def timing_grid(func, repeats=2):
    times = pd.DataFrame(
        index=chunk_schemes.keys(),
        columns=["skipna=False", "skipna=True"],
    )
    for name, chunks in chunk_schemes.items():
        data = source_data(source, chunks)
        for skipna in [False, True]:
            execution_times = []
            for _ in range(repeats):
                start = time.perf_counter()
                func(data, target, skipna)
                end = time.perf_counter()
                execution_times.append(end - start)
            # Sometimes the first execution is a little slower
            times.loc[name, f"skipna={skipna}"] = min(execution_times)

    return times


regrid_times = timing_grid(do_regrid)
xesmf_times = timing_grid(do_xesmf)
ratio = xesmf_times / regrid_times
```

--------------------------------

### Initialize Dask Distributed Client

Source:
https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_variance.ipynb

Sets up a Dask distributed client for optimal memory management when dealing with large datasets. This is a prerequisite for efficient parallel processing.

```python
from dask import distributed

c = distributed.Client()
c
```

--------------------------------

### Regrid Data using xESMF

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_nearest.ipynb

Regrids data using the xESMF library. It first computes the regridding weights using `xe.Regridder` with the 'nearest_s2d' method and then applies the regridder to the dataset, measuring the elapsed time. The `keep_attrs=True` argument preserves metadata.

```python
t0 = time()
regridder = xe.Regridder(ds, target_dataset, "nearest_s2d")
data_esmf: xr.Dataset = regridder(ds, keep_attrs=True)
print(f"Elapsed time: {time() - t0:.3f} seconds")
```

--------------------------------

### Compare Regridding Results

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_nearest.ipynb

Compares the results from the different regridding methods by calculating the relative error between them. It utilizes a `plot_comparison` function from `benchmark_utils` to visualize the differences, setting specific `vmin` and `vmax` to highlight small discrepancies.

```python
from benchmark_utils import plot_comparison

plot_comparison(
    data_regrid, data_esmf, data_cdo, vmin=-0.5e-10, vmax=0.5e-10, varname="d2m"
)
```

--------------------------------

### Plot Comparison of Regridded Datasets

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_conservative.ipynb

Generates a plot to visually compare regridding results from different methods (xarray-regrid, xESMF, CDO). It takes the datasets from each method as input and visualizes the differences within a specified range.
```python
from benchmark_utils import plot_comparison

plot_comparison(data_regrid, data_esmf, data_cdo, vmin=-1e-3, vmax=1e-3, varname="tp")
```

--------------------------------

### Define Target Grid for Regridding

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_most_common.ipynb

Creates a target grid definition using the `Grid` class from `xarray_regrid`. This specifies the spatial extent (north, east, south, west) and the desired resolution (resolution_lat, resolution_lon) for the output regridded dataset.

```python
from xarray_regrid import Grid

target_dataset = Grid(
    north=90,
    east=90,
    south=0,
    west=0,
    resolution_lat=1,
    resolution_lon=1,
).create_regridding_dataset()
```

--------------------------------

### Compare Regridding Results with Plotting Utility

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_bilinear.ipynb

Utilizes a `plot_comparison` function (presumably from `benchmark_utils`) to visualize and compare the regridded datasets from xarray-regrid, xESMF, and CDO. It calculates and plots the relative error between the datasets, highlighting differences in regridding accuracy.

```python
from benchmark_utils import plot_comparison

plot_comparison(
    data_regrid, data_esmf, data_cdo, vmin=-0.5e-5, vmax=0.5e-5, varname="d2m"
)
```

--------------------------------

### Handle Error: No Matching Dimensions

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Illustrates the error handling for attempting to regrid with a target dataset that has no dimensions matching the source data. A `ValueError` is raised, indicating incompatible dimensions.
```python
# Error: No matching dimensions
incompatible_target = xr.Dataset({
    "x": [1, 2, 3],
    "y": [4, 5, 6]
})

try:
    ds.regrid.linear(incompatible_target)
except ValueError as e:
    print(e)  # "None of the target dims are in the data"
```

--------------------------------

### Dask Integration for Lazy Regridding

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Enables efficient lazy regridding of large datasets using Dask. Regridding operations on Dask-chunked xarray objects return Dask arrays, allowing computation to be deferred until explicitly called. This is crucial for out-of-core processing.

```python
import xarray as xr
import xarray_regrid

# Load large dataset with Dask
ds = xr.open_dataset("large_climate_data.nc", chunks={"time": 10})

# Or chunk existing dataset
ds_chunked = ds.chunk({
    "time": 1,
    "latitude": 100,
    "longitude": 100
})

# Create target grid
target_grid = xarray_regrid.create_regridding_dataset(
    xarray_regrid.Grid(
        north=90, east=180, south=-90, west=-180,
        resolution_lat=1.0, resolution_lon=1.0
    )
)

# Regridding is lazy - no computation yet
ds_regridded = ds_chunked.regrid.linear(target_grid)
print(type(ds_regridded.temperature.data))  # dask.array.Array
```

--------------------------------

### Define Chunking Schemes for Dask Datasets (Python)

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_xesmf.ipynb

Defines various chunking schemes for dask arrays, categorized as 'pancake' (time-chunked) and 'churro' (space-chunked) with different sizes. These schemes are used to test performance variations.
```python
chunk_schemes = {
    "pancake_small": (1, -1, -1),
    "pancake_large": (25, -1, -1),
    "churro_small": (-1, 32, 32),
    "churro_large": (-1, 160, 160),
}
```

--------------------------------

### Statistical Aggregation Regridding (mean, var, std, median, min, max, sum)

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Performs statistical aggregation of data to coarser grids using methods like mean, variance, standard deviation, median, min, max, and sum. Supports `skipna` option and `fill_value` for uncovered areas.

```python
import xarray as xr
import xarray_regrid

# Load high-resolution data
ds = xr.open_dataset("high_res_temperature.nc")

# Create coarser grid
grid = xarray_regrid.Grid(
    north=70, east=40, south=50, west=20,
    resolution_lat=1.0,
    resolution_lon=1.0,
)
target_grid = xarray_regrid.create_regridding_dataset(grid)

# Mean aggregation
ds_mean = ds.regrid.stat(target_grid, method="mean")

# Variance aggregation
ds_variance = ds.regrid.stat(target_grid, method="var", skipna=True)

# Standard deviation
ds_std = ds.regrid.stat(target_grid, method="std", skipna=True)

# Median
ds_median = ds.regrid.stat(target_grid, method="median")

# Min/Max
ds_min = ds.regrid.stat(target_grid, method="min")
ds_max = ds.regrid.stat(target_grid, method="max")

# Sum (useful for counting)
ds_sum = ds.regrid.stat(target_grid, method="sum")

# With fill value for uncovered areas
ds_mean_filled = ds.regrid.stat(
    target_grid,
    method="mean",
    fill_value=-999.0
)

# Output: Statistical aggregation over each target grid cell
# Methods: "sum", "mean", "var", "std", "median", "min", "max"
```

--------------------------------

### Load and Prepare SST Dataset for Regridding

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_conservative_nan_threshold.ipynb

Loads the 'analysed_sst' data from the MUR Zarr store, selects a specific geographical slice (latitude and longitude), and isolates the first time step.
This prepares the data for subsequent regridding operations.

```python
import xarray as xr
import xarray_regrid

sst = xr.open_zarr("https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1")["analysed_sst"]

# Reduce size of array by only selecting a slice
sst = sst.sel(lat=slice(30, 45), lon=slice(125, 150)).isel(time=0)
sst.plot()
```

--------------------------------

### Perform Linear Regridding

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Shows a successful linear regridding operation using the `.regrid.linear()` method on an xarray Dataset with a compatible target grid. This method interpolates data from the source grid to the target grid.

```python
# Valid regridding
result = ds.regrid.linear(target)  # Success
```

--------------------------------

### Rechunk Data for Efficient Regridding

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_most_common.ipynb

Rechunks the input DataArray to smaller, more manageable sizes for the regridding operation. This step is crucial for avoiding memory errors when dealing with large datasets and is performed before applying the regridder.

```python
da = da.chunk({"time": -1, "latitude": 4050, "longitude": 4050})
da.data
```

--------------------------------

### Define Target Grid and Perform Conservative Regridding with NaN Thresholds

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_conservative_nan_threshold.ipynb

Creates a target grid definition with specified bounds and resolution, then applies the conservative regridding method from xarray-regrid to the input dataset. The 'nan_threshold' parameter is demonstrated with values 0, 0.5, and 1 to show its effect on handling NaN values during regridding.
```python
grid = xarray_regrid.Grid(
    north=45,
    south=30,
    west=125,
    east=150,
    resolution_lat=1,
    resolution_lon=1,
)
target = grid.create_regridding_dataset(lat_name="lat", lon_name="lon")

ds0p0 = sst.regrid.conservative(target, nan_threshold=0)
ds0p5 = sst.regrid.conservative(target, nan_threshold=0.5)
ds1p0 = sst.regrid.conservative(target, nan_threshold=1)
```

--------------------------------

### Visualize Original Land Cover Data Subset

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_most_common.ipynb

Generates a plot of a subset of the original land cover data using matplotlib and a custom colormap derived from the data's attributes. This helps visualize the spatial distribution of land cover classes before regridding.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

colors = da.attrs["flag_colors"].split(" ")
cmap = ListedColormap(colors)

ax = da.sel(latitude=slice(51, 54), longitude=slice(3.4, 6.4)).plot(cmap=cmap, vmin=10, vmax=220)
ax = plt.gca()
ax.set_aspect('equal')
```

--------------------------------

### Error Handling and Input Validation

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

The library includes built-in mechanisms for input validation and error handling to ensure robust regridding operations. This helps in identifying and addressing potential issues with input datasets or regridding parameters before computation.
```python
import xarray as xr
import xarray_regrid
import numpy as np

# Example usage demonstrating error handling (specific errors depend on implementation)
# try:
#     # Attempt a regridding operation that might fail due to invalid inputs
#     invalid_ds = xr.Dataset({"data": ([], [1])})
#     target_grid = xarray_regrid.create_regridding_dataset(xarray_regrid.Grid(north=10, east=10, south=0, west=0))
#     regridded_data = invalid_ds.regrid.linear(target_grid)
# except ValueError as e:
#     print(f"Caught expected error: {e}")
# except Exception as e:
#     print(f"Caught unexpected error: {e}")

# Output: Validation checks and informative error messages for invalid inputs.
```

--------------------------------

### Handling Coordinate Conventions (Longitude and Names)

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

The library automatically handles variations in longitude conventions (e.g., 0-360 vs. -180 to 180) and coordinate names (e.g., 'lat'/'latitude', 'lon'/'longitude'). This ensures seamless regridding even when source and target grids use different conventions or naming.
```python
import xarray as xr
import xarray_regrid

# Dataset with 0-360 longitude convention
ds_360 = xr.open_dataset("data_0_360.nc")
print(ds_360.longitude.values)  # [0.5, 1.5, ..., 359.5]

# Target grid with -180 to 180 convention
target_180 = xarray_regrid.create_regridding_dataset(
    xarray_regrid.Grid(
        north=90, east=180, south=-90, west=-180,
        resolution_lat=1.0, resolution_lon=1.0
    ),
    lon_name="longitude"
)

# Automatic longitude wrapping - works seamlessly
ds_regridded = ds_360.regrid.linear(target_180)
print(ds_regridded.longitude.values)  # [-179.5, -178.5, ..., 179.5]

# Coordinate name variations handled automatically
# Recognizes: "lat"/"latitude", "lon"/"longitude"
ds_short_names = xr.Dataset({
    "temp": (["lat", "lon"], [[20, 21], [22, 23]]),
    "lat": [50, 51],
    "lon": [5, 6]
})

target_long_names = xarray_regrid.create_regridding_dataset(
    xarray_regrid.Grid(north=52, east=7, south=49, west=4, resolution_lat=1, resolution_lon=1),
    lat_name="latitude",
    lon_name="longitude"
)

# Works despite name differences
result = ds_short_names.regrid.linear(target_long_names)

# Output: Automatic conversion between coordinate conventions
# Handles global wraparound and pole padding automatically
```

--------------------------------

### Linear Interpolation Regridding with Xarray

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Performs linear (bilinear/trilinear) interpolation to transform data between grids. It takes a target grid definition and returns a regridded Dataset or DataArray. Works with both Datasets and DataArrays, preserving metadata.
```python
import xarray as xr
import xarray_regrid

# Load source data
ds = xr.open_dataset("era5_temperature.nc")

# Define target grid
grid = xarray_regrid.Grid(
    north=90,
    east=180,
    south=-90,
    west=-180,
    resolution_lat=1.0,
    resolution_lon=1.0,
)
target_grid = xarray_regrid.create_regridding_dataset(grid)

# Regrid using linear interpolation
ds_regridded = ds.regrid.linear(target_grid)

# Works on DataArrays too
temperature_regridded = ds["temperature"].regrid.linear(target_grid)

# Output: Dataset/DataArray with data interpolated to target grid coordinates
# Attributes and metadata are preserved
```

--------------------------------

### Load and Prepare High-Resolution Land Cover Data

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_most_common.ipynb

Loads a high-resolution land cover dataset (LCCS Map) using xarray. The data is loaded with automatic chunking for lazy computation. It then selects the 'lccs_class' variable, sorts it by latitude and longitude, and renames the coordinate axes for consistency.

```python
import xarray as xr

ds = xr.open_dataset(
    "/data/C3S-LC-L4-LCCS-Map-300m-P1Y-2020-v2.1.1.nc",
    chunks="auto",
)

da = ds["lccs_class"]  # Only take the class variable.
da = da.sortby(["lat", "lon"])
da = da.rename({"lat": "latitude", "lon": "longitude"})
da
```

--------------------------------

### Load and Prepare SST Dataset for Regridding

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_variance.ipynb

Loads the MUR SST dataset from an AWS S3 Zarr store, selects a specific geographical slice, and isolates the first time step. This prepares the data for subsequent regridding operations. It utilizes xarray for data handling and plotting.
```python
import xarray as xr
import xarray_regrid

sst = xr.open_zarr("https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1")["analysed_sst"]

# Reduce size of array by only selecting a slice
sst = sst.sel(lat=slice(30, 45), lon=slice(125, 150)).isel(time=0)
sst.plot()
```

--------------------------------

### Perform most_common Regridding on DataArray

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Illustrates the correct usage of the `most_common` regridding method on an xarray DataArray. The method requires a `values` array listing the labels to consider.

```python
import numpy as np

# Assumes `ds` and `target` are defined as in the earlier examples
# Correct: most_common on DataArray
result = ds["temp"].regrid.most_common(
    target,
    values=np.array([20, 21, 22, 23])
)
```

--------------------------------

### Plot Regridded Land Cover Data

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_most_common.ipynb

Generates a plot of the regridded land cover data. Calling `.plot()` on the regridded DataArray triggers the computation. The plot uses the same colormap and value range as the original data for comparison.

```python
da_regrid.plot(x="longitude", cmap=cmap, vmin=10, vmax=220)
```

--------------------------------

### Handle Error: most_common on Dataset

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Shows the error raised when attempting to use the `most_common` regridding method on an xarray Dataset. This method is only implemented for DataArrays, and a `ValueError` is expected.

```python
import numpy as np

# Error: most_common on Dataset (should use DataArray)
try:
    ds.regrid.most_common(target, values=np.array([1, 2, 3]))
except ValueError as e:
    print(e)
    # "The 'most common value' regridder is not implemented for xarray.Dataset"
```

--------------------------------

### Conservative Regridding with Output Chunking

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Perform conservative regridding while controlling output chunking for memory efficiency.
Computation is triggered explicitly using `.compute()` or implicitly during I/O operations like `to_netcdf()`. This is ideal for processing datasets larger than available RAM.

```python
import xarray as xr
import xarray_regrid  # registers the .regrid accessor

# Open the source lazily with Dask chunks (paths are placeholders)
data_path = "/path/to/your/data.nc"
ds_chunked = xr.open_dataset(data_path, chunks={'latitude': 45, 'longitude': 90})
target_grid = xr.open_dataset("/path/to/your/target_grid.nc")

# Conservative regridding with specified output chunking
ds_regridded = ds_chunked.regrid.conservative(
    target_grid,
    latitude_coord="latitude",
    output_chunks={"latitude": 45, "longitude": 90}
)

# Trigger computation explicitly
result = ds_regridded.compute()  # Computation happens here

# Or compute specific variables
temp_result = ds_regridded["temperature"].compute()

# Save lazily to NetCDF (computation happens during write)
ds_regridded.to_netcdf("regridded_output.nc")

# Output: Lazy Dask arrays until .compute() is called
# Memory-efficient processing of datasets larger than RAM
```

--------------------------------

### Cubic Spline Regridding for Smooth Interpolation

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Uses cubic spline interpolation for smooth data transformation, preserving continuous derivatives. This method suits continuous variables like temperature or pressure, where a smooth output helps both visualization and downstream analysis.
```python
import xarray as xr
import xarray_regrid

# Load smooth continuous data (e.g., temperature, pressure)
ds = xr.open_dataset("climate_data.nc")

# Create high-resolution target grid
target_grid = xarray_regrid.create_regridding_dataset(
    xarray_regrid.Grid(
        north=45,
        east=10,
        south=35,
        west=0,
        resolution_lat=0.05,
        resolution_lon=0.05,
    )
)

# Regrid using cubic spline interpolation
ds_smooth = ds.regrid.cubic(target_grid)

# Output: Smoothly interpolated data with cubic splines
# Better for visualizations and downstream analysis requiring derivatives
```

--------------------------------

### Handle Error: Invalid Grid Bounds

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Demonstrates how xarray-regrid handles invalid grid definitions, specifically when the North bound is less than the South bound. An `InvalidBoundsError` is raised.

```python
# Invalid Grid bounds
try:
    bad_grid = xarray_regrid.Grid(
        north=40,  # North < South
        east=10,
        south=50,
        west=0,
        resolution_lat=1,
        resolution_lon=1
    )
except xarray_regrid.utils.InvalidBoundsError as e:
    print(e)
    # "Value of north bound is greater than south bound"
```

--------------------------------

### Define Target Grid for Regridding

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_variance.ipynb

Creates a new target grid definition with specified geographical bounds and resolution using xarray_regrid. This grid will be used as the destination for the regridding operation.

```python
target = xarray_regrid.Grid(
    north=45,
    south=30,
    west=125,
    east=150,
    resolution_lat=1,
    resolution_lon=1,
).create_regridding_dataset(lat_name="lat", lon_name="lon")
```

--------------------------------

### Preserving Attributes and Metadata During Regridding

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

This feature ensures that all attributes and metadata associated with the dataset, variables, and coordinates are preserved throughout the regridding process.
This includes units, long names, standard names, and custom attributes, maintaining data integrity and context.

```python
import xarray as xr
import xarray_regrid

# Dataset with rich metadata
ds = xr.Dataset({
    "temperature": (
        ["time", "latitude", "longitude"],
        [[[15, 16], [17, 18]]],
        {
            "long_name": "2 metre temperature",
            "units": "K",
            "standard_name": "air_temperature"
        }
    ),
    # Coordinates use the (dims, data, attrs) tuple form
    "latitude": ("latitude", [50, 51], {"units": "degrees_north"}),
    "longitude": ("longitude", [5, 6], {"units": "degrees_east"}),
    "time": ("time", ["2020-01-01"], {"calendar": "standard"})
})
ds.attrs["source"] = "ERA5 Reanalysis"
ds.attrs["institution"] = "ECMWF"

# Create target grid
target = xarray_regrid.create_regridding_dataset(
    xarray_regrid.Grid(north=52, east=7, south=49, west=4,
                       resolution_lat=0.5, resolution_lon=0.5)
)

# Regrid
ds_regridded = ds.regrid.linear(target)

# All attributes preserved
assert ds_regridded["temperature"].attrs == ds["temperature"].attrs
assert ds_regridded.attrs == ds.attrs
assert ds_regridded["latitude"].attrs["units"] == "degrees_north"

# Coordinate attributes preserved
print(ds_regridded["temperature"].attrs["units"])  # "K"
print(ds_regridded["temperature"].attrs["long_name"])  # "2 metre temperature"

# Output: Complete metadata preservation throughout regridding
# Dataset, variable, and coordinate attributes all maintained
```

--------------------------------

### Nearest-Neighbor Regridding for Categorical Data

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Applies nearest-neighbor interpolation, suitable for categorical or discrete data. It finds the closest source grid point for each target grid point. Options exist to exclude or include specific dimensions like 'time' during regridding.
```python
import xarray as xr
import xarray_regrid

# Load land cover classification data
landcover = xr.open_dataset("landcover.nc")

# Create target grid
grid = xarray_regrid.Grid(
    north=60,
    east=10,
    south=50,
    west=0,
    resolution_lat=0.1,
    resolution_lon=0.1,
)
target_grid = grid.create_regridding_dataset(
    lat_name="lat",
    lon_name="lon"
)

# Regrid using nearest-neighbor
landcover_regridded = landcover.regrid.nearest(target_grid)

# Exclude time dimension from regridding
landcover_spatial = landcover.regrid.nearest(target_grid, time_dim="time")

# Include time in regridding (for numeric time coordinates)
landcover_with_time = landcover.regrid.nearest(target_grid, time_dim=None)

# Output: Nearest value assigned to each target grid point
```

--------------------------------

### Handle Error: Invalid nan_threshold

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Demonstrates error handling for providing an invalid `nan_threshold` value (outside the [0, 1] range) to the `conservative` regridding method. This results in a `ValueError`.

```python
# Error: Invalid nan_threshold
try:
    ds.regrid.conservative(target, nan_threshold=1.5)
except ValueError as e:
    print(e)
    # "nan_threshold must be between [0, 1]"
```

--------------------------------

### Perform 'Most Common' Regridding

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/demos/demo_most_common.ipynb

Applies the 'most_common' regridder to the DataArray. This method is suitable for categorical data and determines the target grid cell's value based on the most frequent source cell value within its overlap. It requires the `values` of the labels and the `time_dim` to be specified.
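The idea of taking the most frequent source value per target cell can be sketched in plain Python. This is a simplified illustration, not the library's algorithm: it assumes each target cell covers an aligned `factor` x `factor` block of equally sized source cells, and the helper name `most_common_block` is hypothetical.

```python
from collections import Counter


def most_common_block(grid, factor):
    """Coarsen a 2-D list of class labels, keeping the modal label per block."""
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            # Collect every source label that falls inside this target cell.
            block = [
                grid[i][j]
                for i in range(r, min(r + factor, n_rows))
                for j in range(c, min(c + factor, n_cols))
            ]
            row.append(Counter(block).most_common(1)[0][0])
        coarse.append(row)
    return coarse


labels = [
    [10, 10, 20, 20],
    [10, 30, 20, 20],
    [30, 30, 40, 50],
    [30, 10, 40, 40],
]
coarse = most_common_block(labels, 2)
print(coarse)  # [[10, 20], [30, 40]]
```

Unlike averaging, this never produces a label that is absent from the source, which is why it suits categorical data. The xarray-regrid call for this section follows.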
```python
da_regrid = da.regrid.most_common(
    target_dataset,
    values=da.attrs["flag_values"],
    time_dim="time"
)
```

--------------------------------

### Calculate Root Mean Squared Error (RMSE) for Regridding

Source: https://github.com/xarray-contrib/xarray-regrid/blob/main/docs/notebooks/benchmarks/benchmarking_conservative.ipynb

Computes the Root Mean Squared Error (RMSE) between regridded datasets and a reference dataset (CDO). It specifically excludes data at the edges of the dataset before calculation and outputs the RMSE in millimeters.

```python
import numpy as np

no_edges = dict(latitude=slice(-85, 85), longitude=slice(5, 355))

rmse = (
    np.sqrt(np.mean((data_regrid - data_cdo).sel(no_edges) ** 2))["tp"].to_numpy()
    * 1000
)
print(f"xarray-regrid vs. CDO - RMSE: {rmse:.5f} mm")

rmse = (
    np.sqrt(np.mean((data_esmf - data_cdo).sel(no_edges) ** 2))["tp"].to_numpy()
    * 1000
)
print(f"xESMF vs. CDO - RMSE: {rmse:.5f} mm")
```

--------------------------------

### Conservative (Area-Weighted) Regridding with Xarray

Source: https://context7.com/xarray-contrib/xarray-regrid/llms.txt

Performs area-weighted conservative regridding, crucial for preserving physical quantities like mass or energy in flux data. It handles spherical grids with latitude correction and offers options for NaN handling and custom Dask chunking.

```python
import xarray as xr
import xarray_regrid

# Load flux data (precipitation, radiation, etc.)
precipitation = xr.open_dataset("era5_precipitation.nc")

# Create coarser target grid
grid = xarray_regrid.Grid(
    north=90,
    east=360,
    south=-90,
    west=0,
    resolution_lat=2.0,
    resolution_lon=2.0,
)
target_grid = xarray_regrid.create_regridding_dataset(grid)

# Conservative regridding with spherical correction
precip_regridded = precipitation.regrid.conservative(
    target_grid,
    latitude_coord="latitude",  # Apply spherical earth correction
    skipna=True,                # Handle NaN values
    nan_threshold=0.5,          # Require 50%+ valid input data
)

# Conservative without NaN handling (faster for clean data)
precip_fast = precipitation.regrid.conservative(
    target_grid,
    latitude_coord="latitude",
    skipna=False
)

# With custom output chunking for Dask
precip_chunked = precipitation.chunk({"time": 1}).regrid.conservative(
    target_grid,
    latitude_coord="latitude",
    output_chunks={"latitude": 50, "longitude": 50}
)

# Output: Area-weighted values preserving total mass/energy
# Sum over domain is conserved (accounting for spherical correction)
```
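
To illustrate why the latitude correction matters, here is a minimal, library-independent sketch of an area-weighted mean. It is not xarray-regrid's implementation: it assumes the source cells lie entirely inside one target cell and share the same angular size, so the cosine of latitude serves as a stand-in for relative cell area. The function name `area_weighted_mean` is hypothetical.

```python
import math


def area_weighted_mean(values, lats_deg):
    """Mean of `values` weighted by cos(latitude), a proxy for cell area."""
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Two source cells merged into one target cell. The cell at 60 degrees
# covers roughly half the area of the equatorial one, so it counts half.
weighted = area_weighted_mean([1.0, 3.0], [0.0, 60.0])
unweighted = (1.0 + 3.0) / 2

print(round(weighted, 4), unweighted)  # 1.6667 2.0
```

A plain average would overstate the contribution of the high-latitude cell; weighting by area keeps the integrated quantity over the merged cell consistent with the source.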