HierarchicalForecast
https://github.com/nixtla/hierarchicalforecast
# HierarchicalForecast

HierarchicalForecast is a Python library for probabilistic hierarchical forecasting with statistical and econometric reconciliation methods. It provides tools to ensure coherent forecasts across hierarchical structures (geographical groupings, categories, temporal aggregations) while offering both cross-sectional and temporal reconciliation capabilities. The library integrates with forecasting tools like StatsForecast and NeuralForecast.

The library implements classic reconciliation methods (BottomUp, TopDown, MiddleOut), optimal combination methods (MinTrace, ERM), and probabilistic coherent prediction methods (Normality, Bootstrap, PERMBU, Conformal). It supports both dense and sparse matrix operations for scalability with large hierarchies, and provides evaluation tools to assess forecast accuracy across different hierarchical levels.

## Installation

```bash
pip install hierarchicalforecast
```

## Aggregate Time Series into Hierarchical Structure

The `aggregate` function transforms bottom-level time series into a hierarchical structure defined by specification levels, creating the summing matrix `S_df` and the hierarchical tags needed for reconciliation.
```python
import pandas as pd
import numpy as np
from hierarchicalforecast.utils import aggregate

# Create sample data with hierarchical attributes
df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=12, freq='M').tolist() * 4,
    'Country': ['Australia'] * 24 + ['NewZealand'] * 24,
    'State': ['NSW'] * 12 + ['VIC'] * 12 + ['Auckland'] * 12 + ['Wellington'] * 12,
    'y': np.random.rand(48) * 100
})

# Define hierarchical specification (from top to bottom)
spec = [
    ['Country'],           # Level 1: Total by country
    ['Country', 'State'],  # Level 2: Country-State combinations (bottom)
]

# Aggregate data into hierarchical structure
Y_df, S_df, tags = aggregate(
    df=df,
    spec=spec,
    id_col='unique_id',
    time_col='ds',
    target_cols=['y']
)

# Y_df contains all hierarchical series
# S_df is the summing matrix mapping bottom to all levels
# tags maps level names to their series indices
print(f"Total series: {len(S_df)}")
print(f"Hierarchy levels: {list(tags.keys())}")
# Output: Total series: 6, Hierarchy levels: ['Country', 'Country/State']
```

## Aggregate Time Series with Exogenous Variables

The `aggregate` function supports aggregating exogenous variables alongside target columns using various aggregation functions.
```python
from hierarchicalforecast.utils import aggregate

# Data with exogenous variables
df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=12, freq='M').tolist() * 4,
    'Country': ['US'] * 24 + ['Canada'] * 24,
    'Region': ['East'] * 12 + ['West'] * 12 + ['Ontario'] * 12 + ['Quebec'] * 12,
    'y': np.random.rand(48) * 100,
    'temperature': np.random.rand(48) * 30,
    'promotion': np.random.randint(0, 2, 48)
})

spec = [['Country'], ['Country', 'Region']]

# Aggregate with different functions for exogenous variables
Y_df, S_df, tags = aggregate(
    df=df,
    spec=spec,
    exog_vars={
        'temperature': 'mean',        # Average temperature
        'promotion': ['sum', 'mean']  # Sum and mean of promotions
    },
    id_col='unique_id',
    time_col='ds',
    target_cols=['y']
)

# Y_df now contains: unique_id, ds, y, temperature_mean, promotion_sum, promotion_mean
print(Y_df.columns.tolist())
```

## Aggregate with Sparse Summing Matrix

For large hierarchies with many series, use sparse matrices to reduce memory usage.

```python
from hierarchicalforecast.utils import aggregate

# Enable sparse summing matrix
Y_df, S_matrix, tags = aggregate(
    df=df,
    spec=spec,
    sparse_s=True,  # Returns SMatrix wrapper instead of DataFrame
    id_col='unique_id',
    time_col='ds'
)

# SMatrix provides efficient sparse operations
print(f"Shape: {S_matrix.shape}")  # (n_series, n_bottom)
print(f"Non-zero elements: {S_matrix.to_sparse().nnz}")

# Convert to dense or DataFrame when needed
S_dense = S_matrix.to_dense()  # NumPy array
S_df = S_matrix.to_frame()     # Pandas DataFrame
S_csr = S_matrix.to_csr()      # CSR sparse matrix
```

## Temporal Aggregation

The `aggregate_temporal` function creates temporal hierarchies by aggregating time series at different frequencies.
```python
from hierarchicalforecast.utils import aggregate_temporal

# Sample time series data
df = pd.DataFrame({
    'unique_id': ['series_1'] * 12 + ['series_2'] * 12,
    'ds': pd.date_range('2020-01-01', periods=12, freq='M').tolist() * 2,
    'y': np.random.rand(24) * 100
})

# Define temporal aggregation levels
temporal_spec = {
    'Annual': 12,    # 12 months = 1 year
    'Quarterly': 3,  # 3 months = 1 quarter
    'Monthly': 1     # Bottom level (required)
}

Y_df, S_df, tags = aggregate_temporal(
    df=df,
    spec=temporal_spec,
    id_col='unique_id',
    time_col='ds',
    id_time_col='temporal_id',
    aggregation_type='local'  # 'local' or 'global'
)

print(f"Temporal levels: {list(tags.keys())}")
# Output: Temporal levels: ['Annual', 'Quarterly', 'Monthly']
```

## BottomUp Reconciliation

BottomUp reconciliation aggregates bottom-level forecasts upward through the hierarchy. It is the simplest method and preserves bottom-level forecasts exactly.

```python
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp

# Assume Y_hat_df contains base forecasts with columns: unique_id, ds, model_name
# S_df is the summing matrix, tags are hierarchy indices
reconcilers = [BottomUp()]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,  # Base forecasts
    S_df=S_df,          # Summing matrix
    tags=tags,          # Hierarchy tags
    Y_df=None           # Not needed for BottomUp
)

# Y_rec_df contains reconciled forecasts with columns like 'model_name/BottomUp'
print(Y_rec_df.columns.tolist())
```

## TopDown Reconciliation

TopDown reconciliation distributes aggregate forecasts down the hierarchy using proportions. It requires strictly hierarchical structures (tree-like, no grouped hierarchies).

```python
from hierarchicalforecast.methods import TopDown

# Three proportion methods available:
# 1. forecast_proportions: uses forecast values as proportions
# 2. average_proportions: uses historical average proportions
# 3. proportion_averages: uses proportions of historical averages
reconcilers = [
    TopDown(method='forecast_proportions'),
    TopDown(method='average_proportions'),
    TopDown(method='proportion_averages')
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df  # Required for average_proportions and proportion_averages
)
```

## MiddleOut Reconciliation

MiddleOut anchors forecasts at a middle level, applying BottomUp above it and TopDown below it. It requires strictly hierarchical structures.

```python
from hierarchicalforecast.methods import MiddleOut

# Specify the middle level from your tags
reconcilers = [
    MiddleOut(
        middle_level='Country/State',           # Middle level name from tags
        top_down_method='forecast_proportions'  # Method for levels below middle
    )
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df
)
```

## MinTrace Reconciliation

MinTrace minimizes the total forecast variance while ensuring coherence. It supports multiple covariance estimation methods.

```python
from hierarchicalforecast.methods import MinTrace

reconcilers = [
    MinTrace(method='ols'),          # Ordinary least squares (no covariance)
    MinTrace(method='wls_struct'),   # Weighted by hierarchical structure
    MinTrace(method='wls_var'),      # Weighted by forecast variance
    MinTrace(method='mint_shrink'),  # Shrinkage covariance estimator
    MinTrace(method='mint_cov'),     # Full empirical covariance
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

# Methods requiring insample residuals need Y_df with fitted values
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df  # Required for wls_var, mint_shrink, mint_cov
)

# Access execution times
print(hrec.execution_times)
```

## MinTrace with Non-negative Constraints

Force reconciled forecasts to be non-negative using quadratic programming.
```python
from hierarchicalforecast.methods import MinTrace

# Non-negative reconciliation
reconcilers = [
    MinTrace(
        method='mint_shrink',
        nonnegative=True,  # Enforce non-negative forecasts
        num_threads=4      # Parallel QP solving
    )
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df
)
```

## ERM Reconciliation (Empirical Risk Minimization)

ERM optimizes the reconciliation matrix using training data, relaxing unbiasedness assumptions.

```python
from hierarchicalforecast.methods import ERM

reconcilers = [
    ERM(method='closed'),                  # Closed-form solution
    ERM(method='reg', lambda_reg=0.01),    # L1 regularized
    ERM(method='reg_bu', lambda_reg=0.01)  # L1 regularized toward BottomUp
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

# ERM requires insample values and predictions
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df  # Must include fitted values as columns
)
```

## Sparse Matrix Reconciliation

For large hierarchies, use sparse variants for memory efficiency.

```python
from hierarchicalforecast.methods import BottomUpSparse, TopDownSparse, MinTraceSparse

reconcilers = [
    BottomUpSparse(),
    TopDownSparse(method='forecast_proportions'),
    MinTraceSparse(method='ols', nonnegative=False)
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

# Pass sparse S_df (from aggregate with sparse_s=True)
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_matrix,  # SMatrix object
    tags=tags
)
```

## Probabilistic Reconciliation with Normality Method

Generate coherent prediction intervals assuming normally distributed forecasts.

```python
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import MinTrace

reconcilers = [MinTrace(method='mint_shrink')]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

# Y_hat_df must include prediction interval columns: model-lo-90, model-hi-90, etc.
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,  # Must have -lo-{level} and -hi-{level} columns
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df,
    level=[80, 90, 95],           # Confidence levels
    intervals_method='normality'  # Gaussian assumption
)

# Output includes reconciled intervals: model/MinTrace-lo-90, model/MinTrace-hi-90
```

## Probabilistic Reconciliation with Bootstrap Method

Generate prediction intervals without distributional assumptions using bootstrap sampling.

```python
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df,  # Required for residual bootstrap
    level=[80, 90, 95],
    intervals_method='bootstrap',
    seed=42  # Reproducibility
)
```

## Probabilistic Reconciliation with PERMBU Method

PERMBU uses empirical copulas to capture bottom-level dependencies. It requires strictly hierarchical structures.

```python
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df,
    level=[80, 90, 95],
    intervals_method='permbu',
    seed=42
)
```

## Probabilistic Reconciliation with Conformal Prediction

Distribution-free prediction intervals with finite-sample coverage guarantees.

```python
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df,  # Calibration data
    level=[80, 90, 95],
    intervals_method='conformal',
    seed=42
)
```

## Generate Probabilistic Samples

Generate coherent samples from the reconciled forecast distribution.

```python
# Enable sample generation
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df,
    level=[90],
    intervals_method='normality',
    num_samples=100,  # Number of coherent samples
    seed=42
)

# Samples stored in columns: model/Method-sample-0, model/Method-sample-1, ...
sample_cols = [c for c in Y_rec_df.columns if '-sample-' in c]
print(f"Generated {len(sample_cols)} sample columns")
```

## Temporal Reconciliation

Apply reconciliation to temporal hierarchies (different aggregation frequencies).
```python
from hierarchicalforecast.utils import aggregate_temporal

# First create temporal hierarchy
Y_df, S_df, tags = aggregate_temporal(
    df=df,
    spec={'Annual': 12, 'Quarterly': 3, 'Monthly': 1},
    id_col='unique_id',
    time_col='ds',
    id_time_col='temporal_id'
)

# Reconcile temporally
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    temporal=True,             # Enable temporal reconciliation
    id_time_col='temporal_id'  # Temporal identifier column
)
```

## Reconciliation Diagnostics

Enable diagnostics to monitor coherence and the adjustments made during reconciliation.

```python
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_df,
    diagnostics=True,       # Enable diagnostics
    diagnostics_atol=1e-6   # Tolerance for coherence check
)

# Access diagnostics DataFrame
diagnostics_df = hrec.diagnostics
print(diagnostics_df)
# Metrics include:
# - coherence_residual_mae_before/after: coherence violation before/after
# - adjustment_mae/rmse/max/mean: statistics on forecast adjustments
# - negative_count_before/after: count of negative values
# - is_coherent: whether forecasts satisfy aggregation constraints
```

## Evaluate Hierarchical Forecasts

Evaluate forecast accuracy across different hierarchical levels.

```python
from hierarchicalforecast.evaluation import evaluate
from utilsforecast.losses import mse, mae, mape

# Merge reconciled forecasts with actual values
df_eval = Y_rec_df.merge(Y_test_df, on=['unique_id', 'ds'])

# Evaluate across hierarchy levels
evaluation = evaluate(
    df=df_eval,
    metrics=[mse, mae, mape],
    tags=tags,
    train_df=Y_train_df,  # For metrics like MASE
    id_col='unique_id',
    time_col='ds',
    target_col='y',
    agg_fn='mean',     # Aggregate scores by mean
    benchmark='Naive'  # Optional: scale by benchmark model
)
print(evaluation)
# Returns DataFrame with metrics by hierarchical level
```

## Visualization with HierarchicalPlot

Visualize hierarchical time series and forecasts.
```python
from hierarchicalforecast.utils import HierarchicalPlot

# Initialize plotter
hplot = HierarchicalPlot(S=S_df, tags=tags)

# Plot summing matrix structure
fig = hplot.plot_summing_matrix()

# Plot single series with forecasts and prediction intervals
fig = hplot.plot_series(
    series='Australia/NSW',
    Y_df=Y_rec_df,
    models=['AutoARIMA/MinTrace_method-mint_shrink'],
    level=[80, 90]
)

# Plot hierarchically linked series
fig = hplot.plot_hierarchically_linked_series(
    bottom_series='Australia/NSW',
    Y_df=Y_rec_df,
    models=['AutoARIMA/MinTrace_method-mint_shrink'],
    level=[90]
)

# Plot prediction gaps across hierarchy levels
fig = hplot.plot_hierarchical_predictions_gap(
    Y_df=Y_rec_df,
    models=['AutoARIMA/BottomUp', 'AutoARIMA/MinTrace_method-mint_shrink']
)
```

## Complete Workflow Example

Full example combining data preparation, forecasting, reconciliation, and evaluation.

```python
import pandas as pd
import numpy as np
from datasetsforecast.hierarchical import HierarchicalData
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA, Naive
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp, TopDown, MinTrace
from hierarchicalforecast.evaluation import evaluate
from utilsforecast.losses import mse, mae

# 1. Load and prepare hierarchical data
Y_df, S_df, tags = HierarchicalData.load('./data', 'TourismSmall')
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
S_df = S_df.reset_index(names='unique_id')

# 2. Train/test split
Y_test_df = Y_df.groupby('unique_id').tail(4)
Y_train_df = Y_df.drop(Y_test_df.index)

# 3. Generate base forecasts
sf = StatsForecast(
    models=[AutoARIMA(season_length=4), Naive()],
    freq='QE',
    n_jobs=-1
)
Y_hat_df = sf.forecast(df=Y_train_df, h=4, fitted=True)
Y_fitted_df = sf.forecast_fitted_values()

# 4. Prepare training data with fitted values for reconciliation
Y_train_with_fitted = Y_train_df.merge(
    Y_fitted_df[['unique_id', 'ds', 'AutoARIMA', 'Naive']],
    on=['unique_id', 'ds']
)

# 5. Set up reconcilers
reconcilers = [
    BottomUp(),
    TopDown(method='forecast_proportions'),
    MinTrace(method='mint_shrink')
]

# 6. Reconcile forecasts
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    S_df=S_df,
    tags=tags,
    Y_df=Y_train_with_fitted
)

# 7. Evaluate
df_eval = Y_rec_df.merge(Y_test_df, on=['unique_id', 'ds'])
evaluation = evaluate(
    df=df_eval,
    metrics=[mse, mae],
    tags=tags,
    benchmark='Naive'
)
print(evaluation)
```

## Summary

HierarchicalForecast provides a comprehensive toolkit for ensuring coherent forecasts across hierarchical time series structures. The main use cases include: (1) retail demand forecasting, where product hierarchies require consistent predictions across SKUs, categories, and regions; (2) supply chain optimization needing aligned forecasts across distribution networks; (3) energy load forecasting with temporal and spatial aggregations; and (4) financial planning requiring consistent forecasts across organizational structures.

Integration patterns typically involve generating base forecasts with StatsForecast or NeuralForecast, then applying HierarchicalForecast reconciliation methods. For large-scale applications, leverage sparse matrix operations (`sparse_s=True` and the Sparse reconciler variants) and parallel processing (the `num_threads` parameter). For probabilistic forecasting, choose `intervals_method` based on your needs: 'normality' for Gaussian assumptions, 'bootstrap' for non-parametric intervals, 'conformal' for distribution-free guarantees, or 'permbu' for copula-based dependencies. The library integrates seamlessly with pandas and polars DataFrames through the narwhals abstraction layer.
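All of the point reconcilers above fit a single algebraic pattern: reconciled forecasts are ỹ = S P ŷ, where S is the summing matrix and P is a method-specific projection onto bottom-level space. The following standalone NumPy sketch — independent of the library, with a toy three-series hierarchy chosen for illustration — shows BottomUp and MinTrace(ols) as explicit choices of P and verifies that both produce coherent results.

```python
import numpy as np

# Toy hierarchy: Total = AU + NZ; the bottom series are AU and NZ.
# The summing matrix S maps the 2 bottom series to all 3 series.
S = np.array([
    [1.0, 1.0],  # Total
    [1.0, 0.0],  # AU
    [0.0, 1.0],  # NZ
])

# Incoherent base forecasts for one horizon step: Total != AU + NZ.
y_hat = np.array([100.0, 40.0, 50.0])

# BottomUp: P selects the bottom rows, so bottom forecasts are kept
# exactly and simply re-aggregated upward.
P_bu = np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
y_bu = S @ P_bu @ y_hat  # -> [90., 40., 50.]

# MinTrace with OLS weights: P = (S'S)^{-1} S', the orthogonal
# projection that spreads the adjustment across all levels.
P_ols = np.linalg.solve(S.T @ S, S.T)
y_ols = S @ P_ols @ y_hat

# Both results are coherent: the aggregate equals the sum of its children.
for y in (y_bu, y_ols):
    assert np.isclose(y[0], y[1] + y[2])
```

Unlike BottomUp, the OLS projection also moves the bottom-level values, splitting the 10-unit incoherence between the total and its children rather than forcing the whole correction onto the aggregate.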