### Run Doctest Modules (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Tests code examples embedded within the docstrings of the xskillscore modules. This ensures that documentation examples are accurate and functional.

```bash
pytest --doctest-modules xskillscore --ignore xskillscore/tests
```

--------------------------------

### Install Development Dependencies (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Installs the necessary development dependencies for xskillscore using conda. This command should be run after cloning the repository.

```bash
mamba env update -f ci/dev.yml
conda activate xskillscore-dev
```

--------------------------------

### Editable Install xskillscore (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Installs the xskillscore package in editable mode, allowing changes to be reflected immediately without reinstallation. This is a crucial step after installing dependencies.

```bash
pip install --no-deps -e .
```

--------------------------------

### Run Specific Performance Benchmark

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Command to run specific performance benchmarks filtered by a regular expression. This example targets benchmarks in the deterministic.py file.

```shell
asv continuous -f 1.1 upstream/main HEAD -b ^deterministic
```

--------------------------------

### Create Multi-Category Contingency Table with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Initializes a Contingency object for multi-category data by providing observation and forecast arrays along with multi-category bin edges. This setup is necessary for applying scores that support more than two categories.
```python
multi_category_edges = np.array([0, 0.25, 0.75, 1])
multicategory_contingency = xs.Contingency(
    obs, fct, multi_category_edges, multi_category_edges, dim=["lat", "lon"]
)
```

--------------------------------

### Process Notebooks (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Strips output from Jupyter notebooks. This is typically done before committing notebooks to ensure clean and reproducible examples in the documentation.

```bash
cd docs
nbstripout source/*.ipynb
```

--------------------------------

### Run Specific Benchmark Method

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Command to run a single, specific benchmark method within a class from a benchmark file. This example targets a specific test in deterministic.py.

```shell
asv continuous -f 1.1 upstream/main HEAD -b deterministic.Compute_small.time_xskillscore_metric_small
```

--------------------------------

### Generate Sample Data for Probabilistic Metrics

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Generates sample xarray DataArrays for observations and forecasts, including an ensemble member dimension, suitable for testing probabilistic metrics.

```python
import xarray as xr
import numpy as np

obs3 = xr.DataArray(
    np.random.rand(4, 5),
    coords=[np.arange(4), np.arange(5)],
    dims=["lat", "lon"],
    name="var",
)
fct3 = xr.DataArray(
    np.random.rand(3, 4, 5),
    coords=[np.arange(3), np.arange(4), np.arange(5)],
    dims=["member", "lat", "lon"],
    name="var",
)
```

--------------------------------

### Generate Rank Histogram

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Creates a rank histogram, which shows how often the observation falls at each rank among the sorted ensemble members. This helps in assessing the reliability (calibration and spread) of ensemble forecasts.
```python
import xskillscore as xs

rank_histogram = xs.rank_histogram(obs3, fct3)
print(rank_histogram)
```

--------------------------------

### Import Libraries and Generate Sample Data

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Initializes necessary libraries (matplotlib, numpy, xarray, xskillscore) and generates sample xarray DataArrays for observations ('obs') and forecast ('fct') data. This sets up the data structure for subsequent metric calculations.

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
import xskillscore as xs

np.random.seed(seed=42)

obs = xr.DataArray(
    np.random.rand(3, 4, 5),
    coords=[
        xr.cftime_range("2000-01-01", "2000-01-03", freq="D"),
        np.arange(4),
        np.arange(5),
    ],
    dims=["time", "lat", "lon"],
    name="var",
)
fct = obs.copy()
fct.values = np.random.rand(3, 4, 5)
```

--------------------------------

### Add Sphinx Documentation and Quick Start

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Implements Sphinx documentation for the project, providing a comprehensive API reference. Additionally, a 'quick start' notebook has been added to facilitate initial user onboarding.

```rst
Documentation
~~~~~~~~~~~~~
- Added ``sphinx`` documentation with full API and a `quick start `__ notebook. (:pr:`127`) `Riley X. Brady`_ and `Ray Bell`_
```

--------------------------------

### Clone Repository and Set Up Branch (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Steps to clone the xskillscore repository, set up the upstream remote, and create a new branch for development. Replace YOUR_GITHUB_USERNAME with your GitHub username.
```bash
git clone git@github.com:YOUR_GITHUB_USERNAME/xskillscore.git
cd xskillscore
git remote add upstream git@github.com:xarray-contrib/xskillscore.git
git checkout -b your-bugfix-feature-branch-name main
```

--------------------------------

### Build HTML Documentation

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Command to build the HTML documentation for the xskillscore project using make with parallel jobs.

```shell
make -j4 html
```

--------------------------------

### Build Documentation Locally (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Commands to build the documentation locally using conda environments and make. Assumes a conda environment named 'xskillscore-docs' is available.

```bash
conda env update -f ci/doc.yml
conda activate xskillscore-docs
cd docs
make html
```

--------------------------------

### Build and Upload Package to PyPI

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTORELEASE.rst

This sequence of commands cleans the working directory, builds the source distribution and wheel package for the project, and then uploads the built distributions to the Python Package Index (PyPI). Requires `twine` to be installed.

```bash
git clean -xfd
python setup.py sdist bdist_wheel --universal
twine upload dist/*
```

--------------------------------

### Install development version of xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/index.rst

Installs the bleeding-edge (pre-release) versions of xskillscore directly from the main branch of its GitHub repository using pip. This allows users to test the latest features and bug fixes.
```bash
pip install git+https://github.com/xarray-contrib/xskillscore@main --upgrade
```

--------------------------------

### Visualize Sign Test Results

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Visualizes the results of a sign test, typically plotting the 'walk' and 'confidence' intervals. This helps in assessing the statistical significance of differences between two forecasts based on a chosen metric.

```python
walk.plot()
confidence.plot(c="gray")
(-1 * confidence).plot(c="gray")
```

--------------------------------

### Run All Tests (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Executes all unit tests for the xskillscore package. This command should be run before submitting a pull request to ensure code stability.

```bash
pytest xskillscore
```

--------------------------------

### Create Dichotomous Contingency Table with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Instantiates a Contingency object for dichotomous (two-category) data using observations and forecasts. It defines category edges and the dimensions for the contingency table. The resulting contingency table is then printed.

```python
dichotomous_category_edges = np.array([0, 0.5, 1])  # "dichotomous" means two-category
dichotomous_contingency = xs.Contingency(
    obs, fct, dichotomous_category_edges, dichotomous_category_edges, dim=["lat", "lon"]
)
dichotomous_contingency_table = dichotomous_contingency.table
print(dichotomous_contingency_table)
```

--------------------------------

### Install xskillscore from GitHub using Pip

Source: https://github.com/xarray-contrib/xskillscore/blob/main/README.rst

This command installs the xskillscore package directly from its GitHub repository using pip. This is useful for installing the latest development version or a specific commit that may not yet be released on PyPI.
```bash
pip install git+https://github.com/xarray-contrib/xskillscore
```

--------------------------------

### Run Pre-commit Hooks (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Executes all pre-commit hooks on the entire codebase. This ensures code style and formatting consistency across multiple languages.

```bash
pre-commit run --all-files
```

--------------------------------

### Install xskillscore with pip

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/index.rst

Installs the latest release of the xskillscore package using pip. This is a common method for Python package installation.

```bash
pip install xskillscore
```

--------------------------------

### Prepare Weighted Data for Metric Calculation

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Generates a new set of sample data ('obs2', 'fct2') with dimensions suitable for gridded calculations and creates latitude-based weights using the cosine of the latitude. These weights are then broadcast to match the data dimensions, preparing them for use in weighted metric calculations.

```python
obs2 = xr.DataArray(
    np.random.rand(3, 180, 360),
    coords=[
        xr.cftime_range("2000-01-01", "2000-01-03", freq="D"),
        np.linspace(-89.5, 89.5, 180),
        np.linspace(-179.5, 179.5, 360),
    ],
    dims=["time", "lat", "lon"],
)
fct2 = obs2.copy()
fct2.values = np.random.rand(3, 180, 360)

# make weights as cosine of the latitude and broadcast
weights = np.cos(np.deg2rad(obs2.lat))
_, weights = xr.broadcast(obs2, weights)
```

--------------------------------

### Calculate Reliability

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Measures the reliability of probabilistic forecasts, assessing whether the predicted probabilities match the observed frequencies of events. It returns the data needed for a reliability diagram: the observed relative frequency of the event within each forecast-probability bin.
```python
import xskillscore as xs

rel = xs.reliability(obs3 > 0.5, (fct3 > 0.5).mean("member"))
print(rel)
```

--------------------------------

### Calculate Discrimination

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the discrimination component of forecast skill, indicating how well forecasts distinguish between events that occur and those that do not. It is calculated based on observed events and forecast probabilities.

```python
import xskillscore as xs

disc = xs.discrimination(obs3 > 0.5, (fct3 > 0.5).mean("member"))
print(disc)
```

--------------------------------

### Resampling with Replacement using xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Illustrates how to perform resampling with replacement on an xarray DataArray along a specified dimension. It shows two methods: `resample_iterations` and `resample_iterations_idx`, highlighting the performance difference. The output is then visualized by comparing the distribution of the original data and the resampled means.
```python
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
import xskillscore as xs  # needed for xs.resampling and xs.rmse below

# create large one-dimensional array
s = 1000
f = xr.DataArray(
    np.random.normal(size=s), dims="member", coords={"member": np.arange(s)}, name="var"
)

# resample with replacement in that one dimension
iterations = 100

# The following lines are for demonstration and timing
# %timeit f_r = xs.resampling.resample_iterations(f, iterations, 'member', replace=True)
# %timeit f_r = xs.resampling.resample_iterations_idx(f, iterations, 'member', replace=True)

f_r = xs.resampling.resample_iterations_idx(f, iterations, "member", replace=True)

f.plot.hist(label="distribution")
f_r.mean("iteration").plot.hist(label="resampled mean distribution")
plt.axvline(x=f.mean("member"), c="k", label="distribution mean")
plt.title("Gaussian distribution mean")
plt.legend()
plt.show()

# Calculate and plot RMSE distribution
xs.rmse(f_r, xr.zeros_like(f_r), dim="iteration").plot.hist(label="resampled RMSE distribution")
plt.axvline(x=xs.rmse(f, xr.zeros_like(f)), c="k", label="RMSE")
plt.title("RMSE between gaussian distribution and 0")
plt.legend()
plt.show()
```

--------------------------------

### Run Continuous Performance Benchmark

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Command to run the continuous performance benchmark suite using asv. It checks for performance regressions between specified branches.

```shell
asv continuous -f 1.1 upstream/main HEAD
```

--------------------------------

### Calculate Brier Score

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the Brier Score, a measure of the accuracy of probabilistic forecasts. It compares binary observations (e.g., exceeding a threshold) with the mean probability of the forecast.
```python
import xskillscore as xs

brier_score = xs.brier_score(obs3 > 0.5, (fct3 > 0.5).mean("member"))
print(brier_score)
```

--------------------------------

### Calculate CRPS Ensemble Score

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the Continuous Ranked Probability Score (CRPS) for an ensemble forecast distribution. Setting dim=[] replicates the behavior of `properscoring.crps_ensemble` by avoiding averaging over dimensions.

```python
import xskillscore as xs

crps_ensemble = xs.crps_ensemble(obs3, fct3, dim=[])
print(crps_ensemble)
```

--------------------------------

### Calculate Hit Rate with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the hit rate (also known as recall or sensitivity) from a dichotomous contingency object. The hit rate indicates the proportion of actual positive events that were correctly predicted as positive. The result is printed.

```python
print(dichotomous_contingency.hit_rate())
```

--------------------------------

### Calculate Ranked Probability Score (RPS)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the Ranked Probability Score (RPS), which is suitable for probabilistic forecasts of ordered categories. The `category_edges` parameter defines the boundaries between categories.

```python
import xskillscore as xs
import numpy as np

rps = xs.rps(obs3 > 0.5, fct3 > 0.5, category_edges=np.array([0.5]))
print(rps)
```

--------------------------------

### Install xskillscore with conda

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/index.rst

Installs the latest release of the xskillscore package using conda from the conda-forge channel. This is useful for managing Python environments and dependencies.
```bash
conda install -c conda-forge xskillscore
```

--------------------------------

### Perform ROC Analysis with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Generates data for a Receiver Operating Characteristic (ROC) curve for deterministic forecasts. It takes observations, forecasts, and bin edges as input and can return various results, including the area under the curve. The output can be visualized to assess forecast discrimination.

```python
roc = xs.roc(obs, fct, np.linspace(0, 1, 11), return_results="all_as_metric_dim")

plt.figure(figsize=(4, 4))
plt.plot([0, 1], [0, 1], "k:")
roc.to_dataset(dim="metric").plot.scatter(y="true positive rate", x="false positive rate")
roc.sel(metric="area under curve").values[0]
```

--------------------------------

### Perform Sign Test for Forecast Comparison

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Conducts a sign test to determine if one forecast is significantly better than another. It requires two sets of forecasts and corresponding observations, specifying the time dimension and the metric for comparison (e.g., 'mae'). The orientation parameter ('positive' or 'negative') indicates the expected direction of improvement.

```python
length = 100
obs_1d = xr.DataArray(
    np.random.rand(length),
    coords=[
        np.arange(length),
    ],
    dims=["time"],
    name="var",
)
fct_1d = obs_1d.copy()
fct_1d.values = np.random.rand(length)

significantly_different, walk, confidence = xs.sign_test(
    fct_1d, fct_1d + 0.2, obs_1d, time_dim="time", metric="mae", orientation="negative"
)
```

--------------------------------

### Calculate Threshold Brier Score

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the Threshold Brier Score, which evaluates forecast accuracy for events exceeding a specific threshold. This score is equivalent to the Ranked Probability Score for two categories.
```python
import xskillscore as xs

threshold_brier_score = xs.threshold_brier_score(obs3, fct3, 0.5, dim=None)
print(threshold_brier_score)
```

--------------------------------

### Commit and Push Changes (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Stages all changes, commits them with a descriptive message, and pushes the branch to the remote repository. This is a standard Git workflow for submitting changes.

```bash
git commit -a -m ""
git push -u
```

--------------------------------

### Lint reStructuredText Files (Bash)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTOCONTRIBUTE.rst

Command to lint reStructuredText files in the project's home directory using doc8. This helps ensure documentation follows reStructuredText conventions.

```bash
cd ..
doc8 *.rst
```

--------------------------------

### Add CONTRIBUTING.md for Contribution Guide

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Adds a `CONTRIBUTING.md` file to the repository. This file serves to link to GitHub's built-in contribution guide, making it easier for new developers to understand how to contribute to the project. Addresses pull request #181.

```markdown
# Contributing to xskillscore

We welcome contributions from the community! Please refer to GitHub's [guides for contributing](https://docs.github.com/en/get-started/exploring-github/contributing-to-your-open-source-project) for general guidelines on pull requests, issue reporting, and code of conduct.

For project-specific guidelines, please see our detailed [CONTRIBUTING.md](CONTRIBUTING.md) file (if applicable).
```

--------------------------------

### Add Community Contribution Documents

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Includes community support documents such as 'HOWTOCONTRIBUTE.rst', an issue template, and a pull request template.
These resources guide external contributors on how to participate in the project.

```rst
Internal Changes
~~~~~~~~~~~~~~~~
- Add community support documents: ``HOWTOCONTRIBUTE.rst``, issue template and pull request template. `Aaron Spring`_ and `Ray Bell`_
```

--------------------------------

### Perform Half-Width Confidence Interval Test

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Executes a half-width confidence interval test to compare two forecasts. This test determines if the difference in a chosen metric (e.g., 'rmse') between two forecasts is statistically significant at a given alpha level by comparing the metric difference to the half-width of the confidence interval.

```python
# create a worse forecast with high but different to perfect correlation and choose RMSE as the distance metric
fct_1d_worse = fct_1d.copy()
step = 3
fct_1d_worse[::step] = fct_1d[::step].values + 0.1
metric = "rmse"

# half-width of the confidence interval at level alpha is larger than the RMSE differences,
# therefore not significant
alpha = 0.05
significantly_different, diff, hwci = xs.halfwidth_ci_test(
    fct_1d, fct_1d_worse, obs_1d, metric, time_dim="time", dim=[], alpha=alpha
)
print(diff)
print(hwci)
print(f"RMSEs significantly different at level {alpha} : {bool(significantly_different)}")
```

--------------------------------

### Calculate Gerrity Score with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the Gerrity Score from the multi-category contingency object. The Gerrity Score is an equitable skill score for multi-category forecasts, computed from the contingency table of forecast versus observed categories.
```python
print(multicategory_contingency.gerrity_score())
```

--------------------------------

### Calculate CRPS Quadrature Score

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the Continuous Ranked Probability Score (CRPS) using numerical integration (quadrature). Requires a callable distribution function, such as `norm` from `scipy.stats`. Setting dim=[] avoids averaging over dimensions.

```python
from scipy.stats import norm
import xskillscore as xs

crps_quadrature = xs.crps_quadrature(obs3, norm, dim=[])
print(crps_quadrature)
```

--------------------------------

### Calculate Accuracy with Multi-Category xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the accuracy score for a multi-category contingency object. Accuracy measures the overall proportion of correct forecasts (both correct positive and correct negative predictions). The computed accuracy is then printed.

```python
print(multicategory_contingency.accuracy())
```

--------------------------------

### Calculate Unweighted Pearson Correlation

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the Pearson correlation coefficient between observed and forecasted data without applying any weights. This serves as a baseline for comparison with weighted calculations. The 'dim' parameter specifies the dimensions over which to compute the correlation.

```python
r_unweighted = xs.pearson_r(obs2, fct2, dim=["lat", "lon"], weights=None)
print(r_unweighted)
```

--------------------------------

### Calculate Bias Score with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the bias score from a pre-instantiated dichotomous contingency object. The bias score measures whether the forecasts tend to over-predict or under-predict the observations.
The result is printed to the console.

```python
print(dichotomous_contingency.bias_score())
```

--------------------------------

### Calculate Pearson Correlation over Multiple Dimensions

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the Pearson correlation coefficient between observed ('obs') and forecast ('fct') data by reducing both the 'lat' and 'lon' dimensions. This demonstrates applying the metric over multiple axes, resulting in a DataArray indexed by 'time'.

```python
r = xs.pearson_r(obs, fct, dim=["lat", "lon"])
print(r)
```

--------------------------------

### Calculate Pearson R p-value for Spatial Data

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the p-value for the Pearson correlation coefficient between forecast and observed data, typically for spatially gridded data. This function helps in identifying regions where the correlation is statistically significant.

```python
p = xs.pearson_r_p_value(fct, obs, "time")
print(p)
```

--------------------------------

### Calculate Weighted Pearson Correlation

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the Pearson correlation coefficient between observed and forecasted data, considering weights. The 'weights' parameter allows for differential weighting of data points. The 'dim' parameter specifies the dimensions over which to compute the correlation.
```python
# Remove the time dimension from weights
weights = weights.isel(time=0)

r_weighted = xs.pearson_r(obs2, fct2, dim=["lat", "lon"], weights=weights)
print(r_weighted)
```

--------------------------------

### Calculate Pearson Correlation Coefficient

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the Pearson correlation coefficient between observed ('obs') and forecast ('fct') data along the 'time' dimension using xskillscore's pearson_r function. The result is an xarray DataArray with dimensions 'lat' and 'lon'.

```python
r = xs.pearson_r(obs, fct, dim="time")
print(r)
```

--------------------------------

### Calculate Peirce Score with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the Peirce Score (also known as the True Skill Statistic) for contingency data. This metric quantifies the skill of a binary forecast system by measuring the agreement between forecasts and observations, independent of the base rates.

```python
print(multicategory_contingency.peirce_score())
```

--------------------------------

### Calculate Pearson Correlation p-value

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the p-value for the Pearson correlation coefficient between observed ('obs') and forecast ('fct') data along the 'time' dimension using xskillscore's pearson_r_p_value function. The output is an xarray DataArray with dimensions 'lat' and 'lon'.

```python
p = xs.pearson_r_p_value(obs, fct, dim="time")
print(p)
```

--------------------------------

### Replace pandas with cftime

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Replaces the 'pandas' dependency with 'cftime' in examples and tests. This change may affect how time information is handled, potentially offering more flexibility or better compatibility with specific climate data standards.
```rst
Internal Changes
~~~~~~~~~~~~~~~~
- Replace ``pandas`` with ``cftime`` in examples and tests. `Aaron Spring`_ and `Ray Bell`_
```

--------------------------------

### Calculate False Alarm Rate with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the false alarm rate (also known as the false positive rate or fall-out) from a dichotomous contingency object. This metric represents the proportion of actual negative events that were incorrectly predicted as positive. The result is printed.

```python
print(dichotomous_contingency.false_alarm_rate())
```

--------------------------------

### Import Libraries and Generate Data for xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/tabular-data.ipynb

Imports necessary libraries (numpy, pandas, xskillscore, sklearn) and sets a random seed for reproducible data generation. This setup is crucial for using xskillscore with pandas DataFrames.

```python
import numpy as np
import pandas as pd
import xskillscore as xs
from sklearn.datasets import fetch_california_housing
from sklearn.metrics import mean_squared_error

np.random.seed(seed=42)
```

--------------------------------

### Calculate Odds Ratio Skill Score with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the Odds Ratio Skill Score (ORSS) using a dichotomous contingency object. ORSS measures the performance of a forecasting system relative to a random forecast, considering the odds ratio. The calculated score is printed.
```python
print(dichotomous_contingency.odds_ratio_skill_score())
```

--------------------------------

### Calculate Pearson Correlation using xarray Accessor

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Demonstrates how to use the xskillscore accessor (`xs`) within an xarray Dataset to compute the Pearson correlation coefficient between two variables. It handles cases where both observation and forecast variables are in the same Dataset, or when the forecast variable is provided separately. The calculation is performed along a specified dimension, typically 'time'.

```python
import xarray as xr
import numpy as np
import xskillscore as xs  # importing xskillscore registers the `.xs` accessor

# Example DataArrays; they need a "time" dimension because the
# correlation below is computed along dim="time"
coords = {"time": np.arange(3), "lat": np.arange(4), "lon": np.arange(5)}
obs = xr.DataArray(np.random.rand(3, 4, 5), dims=("time", "lat", "lon"), coords=coords)
fct = xr.DataArray(np.random.rand(3, 4, 5), dims=("time", "lat", "lon"), coords=coords)

ds = xr.Dataset()
ds["obs_var"] = obs
ds["fct_var"] = fct

# Calculate Pearson R when both are in the same Dataset
print(ds.xs.pearson_r("obs_var", "fct_var", dim="time"))

# Calculate Pearson R when the forecast is a separate DataArray
ds = ds.drop_vars("fct_var")
print(ds.xs.pearson_r("obs_var", fct, dim="time"))
```

--------------------------------

### Calculate Mean Absolute Error with Skipna

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the Mean Absolute Error (MAE) between observed and forecasted data, with NaN values being ignored during the calculation when 'skipna' is set to True. This is useful when dealing with incomplete datasets. The 'dim' parameter specifies the dimensions over which to compute the MAE.
```python
obs_with_nans = obs.where(obs.lat > 1)
fct_with_nans = fct.where(fct.lat > 1)

mae_with_skipna = xs.mae(obs_with_nans, fct_with_nans, dim=["lat", "lon"], skipna=True)
print(mae_with_skipna)
```

--------------------------------

### Calculate ROC for Probabilistic Forecasts with Xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

This snippet demonstrates how to calculate the Receiver Operating Characteristic (ROC) curve for probabilistic forecasts using the xskillscore library. It assumes continuous bin edges and returns all results as a metric dimension. It utilizes xarray for data handling and matplotlib for plotting. Dependencies include xskillscore and matplotlib. Inputs are observed and forecasted data, with outputs being the ROC curve data and the area under the curve.

```python
# ROC for probabilistic forecasts and bin_edges='continuous' default
roc = xs.roc(obs3 > 0.5, (fct3 > 0.5).mean("member"), return_results="all_as_metric_dim")

plt.figure(figsize=(4, 4))
plt.plot([0, 1], [0, 1], "k:")
roc.to_dataset(dim="metric").plot.scatter(y="true positive rate", x="false positive rate")
roc.sel(metric="area under curve").values[0]
```

--------------------------------

### Tag and Push Git Release

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTORELEASE.rst

These commands clone the repository, create an annotated Git tag for the release version, and push the tag to the main branch on GitHub. This is crucial for version control and deployment.

```bash
git clone git@github.com:xarray-contrib/xskillscore.git
cd xskillscore
git tag -a v0.0.xx -m "Version 0.0.xx"
git push origin main --tags
```

--------------------------------

### Calculate Heidke Score with xskillscore

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the Heidke Score for contingency data, a measure of skill in categorical forecasts.
This score is typically calculated from observation and forecast data structured as xarray DataArrays.

```python
print(multicategory_contingency.heidke_score())
```

--------------------------------

### Create and Checkout Git Release Branch

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTORELEASE.rst

Creates and checks out a new Git branch for the release, following the 'release-vX.X.XX' naming convention.

```bash
git checkout -b release-v0.0.xx
```

--------------------------------

### Clone and Edit Conda-Forge Recipe

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTORELEASE.rst

Clones the xskillscore-feedstock repository, changes into the recipe directory, and edits the 'meta.yaml' file. This is part of the process to update the conda package for xskillscore.

```bash
git clone git@github.com:username/xskillscore-feedstock.git
cd xskillscore-feedstock
cd recipe
# edit meta.yaml
```

--------------------------------

### Use setuptools-scm for Versioning

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Uses `setuptools-scm` to determine the package version number automatically. This change simplifies version management and was introduced in v0.0.27.

```python
# In setup.py
from setuptools import setup
from setuptools_scm import get_version

setup(
    # ... other setup arguments
    version=get_version(),
    # ...
)
```

--------------------------------

### Use pyproject.toml and Ruff for Linting

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Migration to `pyproject.toml` for project configuration and adoption of `ruff` as the primary linter. This streamlines the development workflow and improves code quality checks.
```toml
[tool.ruff]
line-length = 79
select = ["E", "F", "W", "I"]
```

--------------------------------

### Calculate CRPS Gaussian Score

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Calculates the Continuous Ranked Probability Score (CRPS) assuming a Gaussian distribution for the forecast. It uses the ensemble mean and standard deviation as parameters for the Gaussian distribution. Setting dim=[] avoids averaging over dimensions.

```python
import xskillscore as xs

crps_gaussian = xs.crps_gaussian(obs3, fct3.mean("member"), fct3.std("member"), dim=[])
print(crps_gaussian)
```

--------------------------------

### Add Formatting Tools (black, flake8, etc.)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Incorporates code formatting and linting tools like 'black', 'flake8', 'isort', 'doc8', and 'pre-commit'. This ensures consistent code style across the project, similar to the 'climpred' project.

```text
Internal Changes
~~~~~~~~~~~~~~~~
- Add ``black``, ``flake8``, ``isort``, ``doc8`` and ``pre-commit`` for formatting similar to ``climpred``.
  `Aaron Spring`_ and `Ray Bell`_
```

--------------------------------

### Convert Contingency Table to Pandas DataFrame

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Converts the xarray contingency table DataArray into a pandas DataFrame. It then uses pivot_table to reshape the data for better readability, grouping by forecast and observation categories and their bounds, and rounds the results to two decimal places.
```python
print(
    dichotomous_contingency_table.to_dataframe()
    .pivot_table(
        index=["forecasts_category", "forecasts_category_bounds"],
        columns=["observations_category", "observations_category_bounds"],
    )
    .round(2)
)
```

--------------------------------

### Update and Push Stable Git Branch

Source: https://github.com/xarray-contrib/xskillscore/blob/main/HOWTORELEASE.rst

Updates the 'stable' branch by rebasing it onto 'main', force-pushes the result to the remote origin, and then switches back to 'main'. This ensures the 'stable' branch reflects the latest 'main' after a release.

```bash
git checkout stable
git rebase main
git push -f origin stable
git checkout main
```

--------------------------------

### Create Contingency Table (Contingency)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/api.rst

Initializes a Contingency object for forecast verification. The object is used to compute various contingency table-based metrics. It takes the observations first, then the forecasts, followed by the category bin edges for each and the dimensions to aggregate over. Two categories (yes/no) are enough for the dichotomous methods.

```python
import numpy as np
import xskillscore as xs

# Assuming 'observation' and 'forecast' are xarray DataArrays with dims "lat" and "lon"
category_edges = np.array([0, 0.5, 1])  # two categories, split at 0.5
contingency_table = xs.Contingency(
    observation, forecast, category_edges, category_edges, dim=["lat", "lon"]
)
```

--------------------------------

### Comparative Metrics

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/api.rst

Statistical tests to compare the performance of two forecasts.

```APIDOC
## Comparative Metrics

### Description
Functions that perform statistical tests to determine if one forecast is significantly better than another.
### Methods
- `halfwidth_ci_test`
- `sign_test`
```

--------------------------------

### Calculate Mean Absolute Error without Skipna

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Computes the Mean Absolute Error (MAE) between observed and forecasted data with `skipna=False`, so any NaN in the input yields a NaN in the output. This makes missing data visible instead of silently dropping it. The `dim` parameter specifies the dimensions over which the MAE is computed.

```python
obs_with_nans = obs.where(obs.lat > 1)
fct_with_nans = fct.where(fct.lat > 1)
mae_without_skipna = xs.mae(obs_with_nans, fct_with_nans, dim=["lat", "lon"], skipna=False)
print(mae_without_skipna)
```

--------------------------------

### Implement Faster Resampling Iterations

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Adds two functions for resampling: `~xskillscore.resampling.resample_iterations` and `~xskillscore.resampling.resample_iterations_idx`. The latter is optimized for speed. Both support resampling with and without replacement, which is useful for quantifying uncertainty in forecast scores. Addresses issue #215 and pull request #225.

```python
import numpy as np
import xarray as xr
import xskillscore as xs

# Example data with a named dimension to resample along
data = xr.DataArray(np.random.rand(10, 5), dims=["time", "member"])

# Resampling with replacement
resampled_with = xs.resample_iterations(data, iterations=100, dim="time", replace=True)

# Faster index-based variant, here without replacement
resampled_without = xs.resample_iterations_idx(data, iterations=100, dim="time", replace=False)
```

--------------------------------

### Consolidate Pytest Fixtures

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Creates a 'conftest.py' file to aggregate all 'pytest' fixtures used in the project. This centralizes testing configurations and improves test management.

```text
Internal Changes
~~~~~~~~~~~~~~~~
- Added ``conftest.py`` to gather all ``pytest.fixtures``.
(:issue:`126`, :pr:`159`). `Aaron Spring`_ and `Ray Bell`_
```

--------------------------------

### Create Fake Data for Performance Comparison

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/tabular-data.ipynb

Generates a synthetic dataset with 'DATE', 'STORE', 'SKU', 'y' (actual values), and 'yhat' (predicted values). This data is used to benchmark the performance of different evaluation methods.

```python
import numpy as np
import pandas as pd

stores = np.arange(100)
skus = np.arange(100)
dates = pd.date_range("1/1/2020", "1/10/2020", freq="D")

# One row per (date, store, sku) with a random integer target in 1..9
rows = []
for date in dates:
    for store in stores:
        for sku in skus:
            rows.append(
                {
                    "DATE": date,
                    "STORE": store,
                    "SKU": sku,
                    "y": np.random.randint(9) + 1,
                }
            )
df = pd.DataFrame(rows)

# Prediction = target plus multiplicative noise, clipped at the smallest target
noise = np.random.uniform(-1, 1, size=len(df["y"]))
df["yhat"] = (df["y"] + (df["y"] * noise)).clip(lower=df["y"].min())
df
```

--------------------------------

### Apply Multiple Testing Correction (FDR)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/quick-start.ipynb

Applies multiple testing correction methods, such as False Discovery Rate (FDR), to a set of p-values, commonly obtained from spatial significance testing. This helps control the rate of false positives when performing many statistical tests simultaneously. The function can return corrected p-values or other relevant statistics.

```python
p_corrected = xs.multipletests(p, alpha=0.5, method="fdr_bh", return_results="pvals_corrected")
print(p_corrected)
```

--------------------------------

### Probabilistic Metrics

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/api.rst

Metrics for evaluating probabilistic forecasts, many ported from properscoring.

```APIDOC
## Probabilistic Metrics

### Description
Metrics for evaluating probabilistic forecasts, including Brier score, Continuous Ranked Probability Score (CRPS), and related scores, often adapted from the `properscoring` library.
### Methods
- `brier_score`
- `crps_ensemble`
- `crps_gaussian`
- `crps_quadrature`
- `discrimination`
- `rank_histogram`
- `reliability`
- `roc`
- `rps`
- `threshold_brier_score`
```

--------------------------------

### Generic Comparative Metric (formerly mae_test)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Renamed `mae_test` to `halfwidth_ci_test` to reflect its generic nature. This function now accepts various distance metrics (except `mape`) and includes a `metric` argument to specify the desired distance calculation.

```python
import xarray as xr
import xskillscore as xs

# Example forecasts and observations (xskillscore metrics expect xarray objects)
forecasts = xr.DataArray([1.0, 2.0, 3.0], dims="time")
observations = xr.DataArray([1.1, 2.2, 2.9], dims="time")

# halfwidth_ci_test takes the distance metric through its `metric` argument.
# The call is left commented out because the exact signature is not shown here;
# check the API reference before using it:
# result = xs.halfwidth_ci_test(forecasts, other_forecasts, observations, metric="mse")

# A distance metric computed directly, for illustration
mae = xs.mae(forecasts, observations, dim="time")
print(f"Example metric (MAE): {mae.values}")
```

--------------------------------

### Compare Time Series to Gridded Product

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Enables comparison of time series data to gridded products that span the same time period. This functionality is useful for evaluating model performance on spatial data against observational time series.

```text
time series can be compared to a gridded product spanning that same time span.
(:issue:`165`, :issue:`71`, :issue:`156`, :pr:`166`) `Aaron Spring`_
```

--------------------------------

### Update CI with Python 3.7 and 3.8

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Adds support for Python 3.7 and Python 3.8 to the Continuous Integration (CI) process. The project now recommends using the latest Python 3 versions for development. This aligns the project with modern Python standards and ensures compatibility. Addresses issue #21 and pull request #189.

```python
# No direct code snippet, this is a CI/environment configuration change.
# Typically managed in files like .travis.yml, .github/workflows/ci.yml,
# or setup.py/pyproject.toml specifying python_requires='>=3.7'
```

--------------------------------

### Configure Numba Pinning for CI Fix

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Pins the 'numba' dependency to version '>=0.52' to resolve issues in the Continuous Integration (CI) environment. This ensures that a compatible version of Numba is used, preventing potential build or runtime errors. This fix addresses issue #233 and pull request #234.

```python
# No direct code snippet, this is a dependency management change.
# It would typically be reflected in a requirements.txt or setup.py file:
# numba>=0.52
```

--------------------------------

### Add Sign Test for Deterministic Forecasts

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Introduces the sign test, a method for comparing deterministic forecasts described by DelSole and Tippett (2016). Implemented as `~xskillscore.sign_test`, this feature addresses issue #133 and pull request #176, providing a new tool for forecast verification.
```python
import xskillscore as xs

# Assuming 'forecast' and 'reference' are deterministic forecasts
sign_test_result = xs.sign_test(forecast, reference)
```

--------------------------------

### Enhance Brier Score and RPS with 'fair' keyword

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Updates `~xskillscore.brier_score` and `~xskillscore.rps` to include a 'fair' keyword argument. This enables an ensemble-size adjustment in the score and defaults to False. Additionally, `~xskillscore.brier_score` now supports binary or boolean forecasts when a 'member_dim' is present, addressing issue #162 and pull request #211.

```python
import numpy as np
import xarray as xr
import xskillscore as xs

# Observations and an ensemble forecast with a "member" dimension
obs = xr.DataArray(np.random.rand(10), dims="time")
fct = xr.DataArray(np.random.rand(5, 10), dims=["member", "time"])

# Boolean forecasts with a member dimension; fair=True applies the ensemble-size adjustment
brier_score_fair = xs.brier_score(obs > 0.5, fct > 0.5, member_dim="member", fair=True)

# rps takes category edges in addition to observations and forecasts
category_edges = np.array([0, 0.33, 0.66, 1.0])
rps_fair = xs.rps(obs, fct, category_edges, dim="time", fair=True)
```

--------------------------------

### Add Contingency Table and Associated Metrics

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Adds the `~xskillscore.Contingency` class for creating contingency tables and associated metrics. This feature, addressed in pull requests #119 and #153, enhances the library's capability for categorical forecast verification.
```python
import numpy as np
import xskillscore as xs

# Assuming 'observations' and 'forecasts' are xarray DataArrays with a 'time' dimension
category_edges = np.array([0, 0.5, 1])  # two categories, split at 0.5
contingency = xs.Contingency(
    observations, forecasts, category_edges, category_edges, dim="time"
)
# Associated metrics are methods on the object, e.g. contingency.heidke_score()
```

--------------------------------

### Add Rank Histogram and Discrimination Metrics

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Introduces `~xskillscore.rank_histogram` and `~xskillscore.discrimination` functions for probabilistic forecast evaluation. These metrics, addressed in pull request #136, provide insights into the calibration and discriminatory power of probabilistic forecasts.

```python
import xskillscore as xs

# Observations come first, forecasts second.
# rank_histogram expects an ensemble forecast with a 'member' dimension
rank_hist = xs.rank_histogram(observations, ensemble_forecasts, dim="time")

# discrimination expects event observations and forecast probabilities
discrimination = xs.discrimination(event_observations, forecast_probabilities, dim="time")
```

--------------------------------

### Implement Typing for Xskillscore Functions

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Adds type hints (`typing`) across all `xskillscore` functions. This improves code readability and maintainability, and allows for static analysis.

```python
from typing import Optional, Union

import xarray as xr


def my_function(
    data: xr.DataArray,
    threshold: Union[float, int],
    optional_arg: Optional[str] = None,
) -> xr.DataArray:
    """A function with type hints."""
    # Function logic here
    return data
```

--------------------------------

### Test Gridded Product Time Series Equivalence

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Includes tests to verify that results from grid cells within a gridded product match the results obtained when their corresponding time series are input directly into the functions. This ensures consistency in calculations.
```text
Testing
~~~~~~~
- Test that results from grid cells in a gridded product match the same value if their time series were input directly into functions.
  `Riley X. Brady`_
```

--------------------------------

### Add Coveralls for Test Coverage

Source: https://github.com/xarray-contrib/xskillscore/blob/main/CHANGELOG.rst

Integrates Coveralls to track and report test coverage for the project. This helps in monitoring the thoroughness of the test suite and identifying areas that need more testing.

```text
Internal Changes
~~~~~~~~~~~~~~~~
- Add coveralls for tests coverage.
  `Aaron Spring`_ and `Ray Bell`_
```

--------------------------------

### Load and Preprocess California Housing Dataset (Python)

Source: https://github.com/xarray-contrib/xskillscore/blob/main/docs/source/tabular-data.ipynb

Loads the California housing dataset, rounds the 'AveRooms' column, and renames 'MedHouseVal' to 'y'. This prepares the data for subsequent analysis.

```python
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing(as_frame=True)
df = housing.frame
df["AveRooms"] = df["AveRooms"].round()
df = df.rename(columns={"MedHouseVal": "y"})
df
```
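
The two preprocessing steps described above (rounding 'AveRooms' and renaming 'MedHouseVal' to 'y') can be tried without downloading the dataset. The following is a minimal sketch on a hand-made frame; the three rows of values are made up for illustration and are not taken from the real dataset, and only the column names match the snippet above.

```python
import pandas as pd

# Hypothetical stand-in for the California housing frame: same column names,
# made-up values, so the steps can be run without fetching the dataset.
df = pd.DataFrame(
    {
        "AveRooms": [6.98, 5.21, 8.29],
        "MedHouseVal": [4.526, 3.585, 3.521],
    }
)

# Round the average room count to whole numbers
df["AveRooms"] = df["AveRooms"].round()

# Rename the target column to "y"
df = df.rename(columns={"MedHouseVal": "y"})

print(df)
```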