### Build pmdarima from source (install)

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/setup.rst

Installs pmdarima into the Python site-packages from the source code. This is a standard installation method when building from source.

```bash
$ python setup.py install
```

```bash
$ make install
```

--------------------------------

### Install pmdarima using pip

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/setup.rst

Installs the pmdarima package from the Python Package Index (PyPI) using pip. This is the standard method for installing Python packages.

```bash
$ pip install pmdarima
```

--------------------------------

### Inspect model summary and system versions

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/quickstart.rst

Provides methods to view the statistical summary of a fitted model and display environment dependency versions.

```python
# Examine model fit results
stepwise_fit.summary()

# Show system and dependency versions
import pmdarima as pm
pm.show_versions()
```

--------------------------------

### Clone pmdarima from GitHub

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/setup.rst

Clones the pmdarima repository from GitHub to build from source. This is useful for development or installing the latest bleeding-edge version.

```bash
$ git clone https://github.com/alkaline-ml/pmdarima.git
$ cd pmdarima
```

--------------------------------

### Perform basic time series operations with pmdarima

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/quickstart.rst

Demonstrates how to use pmdarima's R-like interface for creating arrays, computing auto-correlation, and plotting results.

```python
import pmdarima as pm

# Create an array like you would in R
x = pm.c(1, 2, 3, 4, 5, 6, 7)

# Compute an auto-correlation like you would in R:
pm.acf(x)

# Plot an auto-correlation:
pm.plot_acf(x)
```

--------------------------------

### Verify pmdarima installation

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/setup.rst

Verifies a successful pmdarima installation by importing the `auto_arima` function. An ImportError may indicate an outdated numpy version.

```python
from pmdarima.arima import auto_arima
```

--------------------------------

### Fit an auto-ARIMA model

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/quickstart.rst

Shows how to load a dataset and fit a stepwise auto-ARIMA model using pmdarima's auto_arima function.

```python
import numpy as np
import pmdarima as pm
from pmdarima.datasets import load_wineind

# this is a dataset from R
wineind = load_wineind().astype(np.float64)

# fit stepwise auto-ARIMA
stepwise_fit = pm.auto_arima(wineind, start_p=1, start_q=1,
                             max_p=3, max_q=3, m=12,
                             start_P=0, seasonal=True,
                             d=1, D=1, trace=True,
                             error_action='ignore',
                             suppress_warnings=True,
                             stepwise=True)
```

--------------------------------

### Install pmdarima using Conda

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/setup.rst

Installs the pmdarima package from the conda-forge channel using conda. This method is recommended for users within the Anaconda ecosystem.

```bash
$ conda config --add channels conda-forge
$ conda config --set channel_priority strict
$ conda install pmdarima
```

--------------------------------

### Build pmdarima from source (develop mode)

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/setup.rst

Installs pmdarima in development mode from the source code. This allows for running unit tests and making code changes without re-installing.

```bash
$ python setup.py develop
```

```bash
$ make develop
```

--------------------------------

### Setup: Load Data for Benchmarking

Source: https://github.com/alkaline-ml/pmdarima/blob/master/benchmarks/Benchmarking Seasonality.ipynb

Loads the 'item_sales_daily.csv.gz' dataset using pandas and extracts the 'sales' column as a NumPy array. This setup is common across different versions of pmdarima for benchmarking purposes.

```python
import pandas as pd

X = pd.read_csv('item_sales_daily.csv.gz')
y = X['sales'].values
X.head()
```

--------------------------------

### Initialize pmdarima Environment

Source: https://github.com/alkaline-ml/pmdarima/blob/master/benchmarks/bench_autoarima.ipynb

Imports the necessary pmdarima modules and loads the wineind dataset for benchmarking. This serves as the initial setup step for the analysis.

```python
import pmdarima
from pmdarima.datasets import load_wineind

wineind = load_wineind()
print("pmdarima version: %s" % pmdarima.__version__)
```

--------------------------------

### Install legacy pmdarima package

Source: https://github.com/alkaline-ml/pmdarima/blob/master/README.md

Instructions for installing the deprecated pyramid-arima package. This is not recommended for new projects and is provided for legacy support only.

```bash
# Legacy warning:
$ pip install pyramid-arima
# python -c 'import pyramid;'
```

--------------------------------

### Install pmdarima using pip

Source: https://github.com/alkaline-ml/pmdarima/blob/master/README.md

Installs the pmdarima library using pip, the standard Python package installer. This is the most common method for installing Python packages.

```bash
pip install pmdarima
```

--------------------------------

### Initialize pmdarima and Load Data

Source: https://github.com/alkaline-ml/pmdarima/blob/master/examples/stock_market_example.ipynb

Imports necessary libraries including numpy, pandas, matplotlib, and pmdarima.
It also prints the installed pmdarima version and loads the MSFT stock dataset using `pmdarima.datasets.stocks.load_msft`.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# IPython/Jupyter magic; omit when running as a plain script
%matplotlib inline

import pmdarima as pm
print(f"Using pmdarima {pm.__version__}")

from pmdarima.datasets.stocks import load_msft

df = load_msft()
df.head()
```

--------------------------------

### pmdarima.datasets: Toy timeseries datasets

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/modules/classes.rst

This submodule provides several univariate time-series datasets for use in examples and tests.

```APIDOC
## pmdarima.datasets

### Description
Provides several different univariate time-series datasets used in various examples and tests across the package. This is a good place to find easy-to-access data for prototyping models.

### Functions
* **datasets.load_airpassengers**: Loads the Air Passengers dataset.
* **datasets.load_ausbeer**: Loads the Australian Beer Production dataset.
* **datasets.load_austres**: Loads the quarterly Australian Residents dataset.
* **datasets.load_gasoline**: Loads the Gasoline Consumption dataset.
* **datasets.load_heartrate**: Loads the Heart Rate dataset.
* **datasets.load_lynx**: Loads the Canadian Lynx dataset.
* **datasets.load_msft**: Loads the Microsoft Stock Prices dataset.
* **datasets.load_sunspots**: Loads the Sunspots dataset.
* **datasets.load_taylor**: Loads the Taylor half-hourly electricity demand dataset.
* **datasets.load_wineind**: Loads the Wine Industry Sales dataset.
* **datasets.load_woolyrnq**: Loads the Woollen Yarn Production dataset.
```

--------------------------------

### Setup and Git Workflow for pmdarima

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/contributing.rst

Commands to clone the repository, create a feature branch, and push changes to a remote fork. This workflow ensures that development is isolated from the main branch.

```bash
git clone https://github.com/alkaline-ml/pmdarima.git
cd pmdarima
git checkout -b my-feature
git add modified_files
git commit
git push -u origin my-feature
```

--------------------------------

### Install pmdarima using conda

Source: https://github.com/alkaline-ml/pmdarima/blob/master/README.md

Installs the pmdarima library using conda, a package and environment manager. This method is recommended for users who prefer conda and ensures compatibility with other conda-installed packages.

```bash
conda config --add channels conda-forge
conda config --set channel_priority strict
conda install pmdarima
```

--------------------------------

### Load Built-in Time Series Datasets (Python)

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Shows how to load various classic time series datasets included with the pmdarima library. These datasets are useful for testing, examples, and benchmarking.

```python
import pmdarima as pm

# Wine sales (monthly, seasonal)
wineind = pm.datasets.load_wineind()
print(f"Wineind: {len(wineind)} monthly observations")

# Air passengers (monthly, seasonal with trend)
airpassengers = pm.datasets.load_airpassengers()
print(f"Air passengers: {len(airpassengers)} monthly observations")

# Sunspots (monthly, long series)
sunspots = pm.datasets.load_sunspots()
print(f"Sunspots: {len(sunspots)} observations")

# Australian beer production (quarterly)
ausbeer = pm.datasets.load_ausbeer()
print(f"Ausbeer: {len(ausbeer)} quarterly observations")

# Heart rate data
heartrate = pm.datasets.load_heartrate()
print(f"Heart rate: {len(heartrate)} observations")

# Lynx trappings (annual)
lynx = pm.datasets.load_lynx()
print(f"Lynx: {len(lynx)} annual observations")

# Australian residents
austres = pm.datasets.load_austres()
print(f"Austres: {len(austres)} quarterly observations")

# Taylor (half-hourly electricity demand)
taylor = pm.datasets.load_taylor()
print(f"Taylor: {len(taylor)} observations")

# Wool yarn quarterly production
woolyrnq = pm.datasets.load_woolyrnq()
print(f"Woolyrnq: {len(woolyrnq)} quarterly observations")
```

--------------------------------

### Check pmdarima version

Source: https://github.com/alkaline-ml/pmdarima/blob/master/examples/issue12/issue-12.ipynb

Retrieves and prints the currently installed version of the pmdarima library.

```python
import pmdarima as pm
print("pmdarima version: %r" % pm.__version__)
```

--------------------------------

### Import and Version Check in Python

Source: https://github.com/alkaline-ml/pmdarima/blob/master/examples/quick_start_example.ipynb

Imports the numpy and pmdarima libraries and prints their respective versions. This is a basic setup step for using pmdarima.

```python
import numpy as np
import pmdarima as pm

print('numpy version: %r' % np.__version__)
print('pmdarima version: %r' % pm.__version__)
```

--------------------------------

### Load Toy Time-Series Datasets in Python

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/modules/datasets.rst

Demonstrates how to load various built-in datasets using the pmdarima interface. The `as_series` parameter allows users to retrieve data as a Pandas Series object.

```python
from pmdarima.datasets import (
    load_airpassengers, load_austres, load_heartrate, load_lynx,
    load_taylor, load_wineind, load_woolyrnq, load_msft
)

# Load endogenous datasets as series
load_airpassengers(as_series=True).head()
load_austres(as_series=True).head()
load_heartrate(as_series=True).head()
load_lynx(as_series=True).head()
load_taylor(as_series=True).head()
load_wineind(as_series=True).head()
load_woolyrnq(as_series=True).head()

# Load exogenous datasets
load_msft().head()
```

--------------------------------

### Serialize Entire Pipeline with Pickle

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Demonstrates how to serialize an entire scikit-learn compatible pipeline, including preprocessing steps and the ARIMA model, using Python's pickle module. This allows for saving and loading the complete modeling process.

```python
import pmdarima as pm
from pmdarima.pipeline import Pipeline
from pmdarima.preprocessing import BoxCoxEndogTransformer
import pickle

# Example data (an assumption for illustration); substitute your own series
y = pm.datasets.load_wineind()

# Define and fit the pipeline
pipeline = Pipeline([
    ('boxcox', BoxCoxEndogTransformer()),
    ('arima', pm.AutoARIMA(seasonal=True, m=12, suppress_warnings=True))
])
pipeline.fit(y)

# Serialize the pipeline
with open('pipeline.pkl', 'wb') as f:
    pickle.dump(pipeline, f)

# Deserialize the pipeline
with open('pipeline.pkl', 'rb') as f:
    loaded_pipeline = pickle.load(f)

# Use the loaded pipeline for predictions
pipeline_forecasts = loaded_pipeline.predict(n_periods=6)
```

--------------------------------

### Serialize and Deserialize Pmdarima Models using Pickle and Joblib (Python)

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Demonstrates how to save a fitted pmdarima model to disk using Python's `pickle` and `joblib` libraries, and how to load it back for later use, such as making predictions.

```python
import pmdarima as pm
import pickle
import joblib

# Load data and fit a model
y = pm.datasets.load_wineind()
model = pm.auto_arima(y, seasonal=True, m=12, suppress_warnings=True)

# Serialize with pickle
with open('arima_model_pickle.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load and predict using pickle
with open('arima_model_pickle.pkl', 'rb') as f:
    loaded_model_pickle = pickle.load(f)
forecasts_pickle = loaded_model_pickle.predict(n_periods=12)
print(f"Pickle loaded model forecasts (first 3): {forecasts_pickle[:3]}")

# Serialize with joblib
joblib.dump(model, 'arima_model_joblib.pkl')

# Load and predict using joblib
loaded_model_joblib = joblib.load('arima_model_joblib.pkl')
forecasts_joblib = loaded_model_joblib.predict(n_periods=12)
print(f"Joblib loaded model forecasts (first 3): {forecasts_joblib[:3]}")
```

--------------------------------

### Analyze Autocorrelation with Lag Plots (Python)

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/usecases/stocks.rst

Generates lag plots for the 'Open' price of the stock data to visually assess autocorrelation. This helps in understanding the linear structure of the time series, guiding the selection of ARIMA model orders (p, d, q).

```python
import matplotlib.pyplot as plt
from pandas.plotting import lag_plot

# `df` is the MSFT DataFrame loaded with load_msft() in the setup snippet
fig, axes = plt.subplots(3, 2, figsize=(8, 12))
# Figure-level title; plt.title would only label the last subplot
fig.suptitle('MSFT Autocorrelation plot')

# The axis coordinates for the plots
ax_idcs = [
    (0, 0),
    (0, 1),
    (1, 0),
    (1, 1),
    (2, 0),
    (2, 1)
]

for lag, ax_coords in enumerate(ax_idcs, 1):
    ax_row, ax_col = ax_coords
    axis = axes[ax_row][ax_col]
    lag_plot(df['Open'], lag=lag, ax=axis)
    axis.set_title(f"Lag={lag}")

plt.show()
```

--------------------------------

### Testing pmdarima

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/contributing.rst

Commands to build the package and run the test suite to ensure code integrity. These commands should be executed from the top-level source folder.

```bash
python setup.py develop
pytest
make test
```

--------------------------------

### Fit and Serialize Pipeline with pmdarima

Source: https://github.com/alkaline-ml/pmdarima/blob/master/README.md

Illustrates creating, fitting, serializing, and deserializing a pmdarima pipeline that includes a Box-Cox transformation and an AutoARIMA model. This demonstrates how to save and load trained models for later use.

```python
import pmdarima as pm
from pmdarima.model_selection import train_test_split
from pmdarima.pipeline import Pipeline
from pmdarima.preprocessing import BoxCoxEndogTransformer
import pickle

# Load/split your data
y = pm.datasets.load_sunspots()
train, test = train_test_split(y, train_size=2700)

# Define and fit your pipeline
pipeline = Pipeline([
    ('boxcox', BoxCoxEndogTransformer(lmbda2=1e-6)),  # lmbda2 avoids negative values
    ('arima', pm.AutoARIMA(seasonal=True, m=12,
                           suppress_warnings=True,
                           trace=True))
])

pipeline.fit(train)

# Serialize your model just like you would in scikit:
with open('model.pkl', 'wb') as pkl:
    pickle.dump(pipeline, pkl)

# Load it and make predictions seamlessly:
with open('model.pkl', 'rb') as pkl:
    mod = pickle.load(pkl)
    print(mod.predict(15))
```

--------------------------------

### Fit Stepwise ARIMA Model with pmdarima

Source: https://github.com/alkaline-ml/pmdarima/blob/master/examples/quick_start_example.ipynb

This code snippet utilizes the `auto_arima` function from the `pmdarima` library to perform a stepwise search for the optimal ARIMA model parameters. It takes time series data and several optional parameters to guide the search, such as `start_p`, `start_q`, `max_p`, `max_q`, `m` (for seasonality), `start_P`, `seasonal`, `d`, `D`, `trace`, `error_action`, `suppress_warnings`, and `stepwise`. The function returns a fitted ARIMA model object.

```python
import pmdarima as pm

# Example data (the wineind dataset); substitute your own time series
wineind = pm.datasets.load_wineind()

stepwise_fit = pm.auto_arima(wineind, start_p=1, start_q=1,
                             max_p=3, max_q=3, m=12,
                             start_P=0, seasonal=True,
                             d=1, D=1, trace=True,
                             error_action='ignore',  # don't want to know if an order does not work
                             suppress_warnings=True,  # don't want convergence warnings
                             stepwise=True)  # set to stepwise

print(stepwise_fit.summary())
```

--------------------------------

### Test Stationarity with ADF in Python

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/tips_and_tricks.rst

Demonstrates the use of the Augmented Dickey-Fuller test to determine if a time series requires differencing to achieve stationarity.

```python
from pmdarima.arima.stationarity import ADFTest

# `y` is your time series (array-like), defined elsewhere
adf_test = ADFTest(alpha=0.05)
p_val, should_diff = adf_test.should_diff(y)
```

--------------------------------

### pmdarima.pipeline: Pipelining transformers & ARIMAs

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/modules/classes.rst

The ``pipeline.Pipeline`` class allows for chaining transformers together and a final ARIMA stage.

```APIDOC
## pmdarima.pipeline

### Description
With the ``pipeline.Pipeline`` class, we can pipeline transformers together and into a final ARIMA stage.

### Classes
* **pipeline.Pipeline**: Pipelining transformers and an ARIMA estimator.
```

--------------------------------

### Load and split time series data

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/usecases/sun-spots.rst

Imports necessary libraries, loads the sunspots dataset, and splits the data into training and testing sets based on a specified length.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pmdarima as pm
from pmdarima.datasets import load_sunspots
from pmdarima.model_selection import train_test_split

y = load_sunspots(True)
train_len = 2750
y_train, y_test = train_test_split(y, train_size=train_len)

y_train.head()
```

--------------------------------

### Implement Data Transformation Pipeline with AutoARIMA

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/tips_and_tricks.rst

Demonstrates how to chain preprocessing steps, such as Box-Cox transformations, with an AutoARIMA model using the pmdarima Pipeline class. This ensures consistent data processing before fitting and prediction.

```python
from pmdarima.pipeline import Pipeline
from pmdarima.preprocessing import BoxCoxEndogTransformer
import pmdarima as pm

wineind = pm.datasets.load_wineind()
train, test = wineind[:150], wineind[150:]

pipeline = Pipeline([
    ("boxcox", BoxCoxEndogTransformer()),
    ("model", pm.AutoARIMA(seasonal=True, suppress_warnings=True))
])

pipeline.fit(train)
pipeline.predict(5)
```

--------------------------------

### Full Cross-Validation with Timing using pmdarima

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Demonstrates how to perform full cross-validation on a time series model using SlidingWindowForecastCV and record test scores and fit times. It requires the pmdarima library and a time series dataset.

```python
import pmdarima as pm
from pmdarima.model_selection import cross_validate, SlidingWindowForecastCV

y = pm.datasets.load_wineind()
model = pm.ARIMA(order=(2, 1, 2), seasonal_order=(1, 0, 1, 12))
cv = SlidingWindowForecastCV(h=12, step=12, window_size=60)

results = cross_validate(
    model,
    y,
    cv=cv,
    scoring='mean_absolute_error',
    verbose=1
)

print(f"Test scores: {results['test_score']}")
print(f"Fit times: {results['fit_time']}")
```

--------------------------------

### Configure Stepwise Search Constraints

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/tips_and_tricks.rst

Shows how to use the StepwiseContext manager to impose time limits on the auto_arima stepwise search process. This is useful for balancing model accuracy against computational time constraints.

```python
import pmdarima as pm
from pmdarima.arima import StepwiseContext

data = pm.datasets.load_wineind()
train, test = data[:150], data[150:]

with StepwiseContext(max_dur=15):
    model = pm.auto_arima(train, stepwise=True, error_action='ignore',
                          seasonal=True, m=12)
```

--------------------------------

### Visualize Time Series Data and Model Components (Python)

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Demonstrates how to use pmdarima's plotting utilities to visualize time series data, including ACF, PACF, autocorrelation plots, comprehensive time series displays, and decomposition plots.

```python
import pmdarima as pm
from pmdarima import plot_acf, plot_pacf, autocorr_plot, tsdisplay
from pmdarima.arima import decompose
from pmdarima.utils import decomposed_plot

# Load data
y = pm.datasets.load_sunspots()

# ACF plot
fig_acf = plot_acf(y, lags=40, alpha=0.05, title='ACF', show=False)

# PACF plot
fig_pacf = plot_pacf(y, lags=40, alpha=0.05, title='PACF', show=False)

# Autocorrelation plot (pandas style)
ax_autocorr = autocorr_plot(y, show=False)

# Comprehensive time series display (series + ACF + histogram)
fig_tsdisplay = tsdisplay(
    y,
    lag_max=50,
    figsize=(10, 6),
    title='Sunspots Time Series',
    bins=30,
    show=False
)

# Decomposition plot
result = decompose(pm.datasets.load_airpassengers(), type_='multiplicative', m=12)
axes_decomposed = decomposed_plot(
    result,
    figure_kwargs={'figsize': (10, 8)},
    show=False
)

# ACF/PACF computation (numerical values)
acf_values = pm.acf(y, nlags=20)
pacf_values = pm.pacf(y, nlags=20)
print(f"First 5 ACF values: {acf_values[:5]}")
print(f"First 5 PACF values: {pacf_values[:5]}")
```

--------------------------------

### Fit Auto-ARIMA Model with pmdarima

Source: https://github.com/alkaline-ml/pmdarima/blob/master/README.md

Demonstrates fitting a seasonal auto-ARIMA model using pmdarima on the wineind dataset. It includes data loading, splitting, model fitting, forecasting, and visualization of results.

```python
import pmdarima as pm
from pmdarima.model_selection import train_test_split
import numpy as np
import matplotlib.pyplot as plt

# Load/split your data
y = pm.datasets.load_wineind()
train, test = train_test_split(y, train_size=150)

# Fit your model
model = pm.auto_arima(train, seasonal=True, m=12)

# make your forecasts
forecasts = model.predict(test.shape[0])  # predict N steps into the future

# Visualize the forecasts (blue=train, green=forecasts)
x = np.arange(y.shape[0])
plt.plot(x[:150], train, c='blue')
plt.plot(x[150:], forecasts, c='green')
plt.show()
```

--------------------------------

### Visualize Benchmark Results

Source: https://github.com/alkaline-ml/pmdarima/blob/master/benchmarks/bench_autoarima.ipynb

Uses matplotlib to plot the execution times of stepwise versus grid search methods. This provides a visual comparison of the computational efficiency of the two approaches.

```python
import matplotlib.pyplot as plt

def plot_benchmarks():
    # stepwise_results and grid_results are the timing lists
    # filled by the benchmark runs
    plt.figure('auto_arima benchmark results')
    # One x position per benchmark run, so x and y have the same length
    xx = list(range(len(stepwise_results)))
    plt.scatter(x=xx, y=stepwise_results, label='stepwise')
    plt.scatter(x=xx, y=grid_results, label='grid')
    plt.legend(loc='upper left')
    plt.ylabel('Time (s)')
    plt.show()
```

--------------------------------

### Chaining Transformers and ARIMA with Pipeline

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

The Pipeline class allows for the sequential execution of data transformers followed by an ARIMA estimator. It supports automatic inverse transformation during prediction and can be serialized using pickle for deployment.

```python
import pmdarima as pm
from pmdarima.pipeline import Pipeline
from pmdarima.preprocessing import BoxCoxEndogTransformer, FourierFeaturizer
import pickle

y = pm.datasets.load_sunspots()

pipeline = Pipeline([
    ('boxcox', BoxCoxEndogTransformer(lmbda2=1e-6)),
    ('fourier', FourierFeaturizer(m=12, k=4)),
    ('arima', pm.AutoARIMA(
        seasonal=False,
        stepwise=True,
        suppress_warnings=True
    ))
])

pipeline.fit(y)
forecasts = pipeline.predict(n_periods=24, inverse_transform=True)

with open('model_pipeline.pkl', 'wb') as f:
    pickle.dump(pipeline, f)
```

--------------------------------

### Conduct Stationarity Tests

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Evaluates time series stationarity using KPSS, ADF, and PP tests. These help determine if differencing is required before modeling.

```python
import pmdarima as pm
from pmdarima.arima.stationarity import KPSSTest, ADFTest, PPTest
import numpy as np

y = pm.datasets.load_airpassengers()

# KPSS Test
kpss = KPSSTest(alpha=0.05, null='level')
pval, should_diff = kpss.should_diff(y)

# ADF Test
adf = ADFTest(alpha=0.05)
pval, should_diff = adf.should_diff(y)

# PP Test
pp = PPTest(alpha=0.05)
pval, should_diff = pp.should_diff(y)

# Test on stationary data
stationary_y = np.diff(y)
kpss_stationary = KPSSTest(alpha=0.05)
pval, should_diff = kpss_stationary.should_diff(stationary_y)
```

--------------------------------

### Manual ARIMA and SARIMA Model Fitting with pmdarima

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Allows manual specification of ARIMA (p, d, q) and seasonal ARIMA (P, D, Q, m) orders for fitting time series models. It provides methods for forecasting future values, generating in-sample predictions, and accessing model metrics like AIC, BIC, and parameters.

```python
import pmdarima as pm
import numpy as np

# Load data
y = pm.datasets.load_airpassengers()

# Fit a manual ARIMA(1,1,1) model
model = pm.ARIMA(
    order=(1, 1, 1),
    suppress_warnings=True
)
model.fit(y)

# Fit a seasonal ARIMA (SARIMA) model
seasonal_model = pm.ARIMA(
    order=(1, 1, 1),
    seasonal_order=(1, 1, 1, 12),  # (P, D, Q, m)
    suppress_warnings=True
)
seasonal_model.fit(y)

# Forecast future values (24 months ahead; print only the first 6)
forecasts = seasonal_model.predict(n_periods=24)
print(f"First 6 of 24 monthly forecasts: {forecasts[:6]}")

# In-sample predictions
in_sample_preds = seasonal_model.predict_in_sample()
print(f"Fitted values shape: {in_sample_preds.shape}")

# Access model metrics
print(f"AIC: {seasonal_model.aic()}")
print(f"BIC: {seasonal_model.bic()}")
print(f"Model parameters: {seasonal_model.params()}")
```

--------------------------------

### Control Stepwise Search in auto_arima with StepwiseContext (Python)

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Illustrates how to use the `StepwiseContext` manager to control the stepwise search process in `pm.auto_arima`. This allows setting limits on the maximum number of steps or the maximum duration of the search.

```python
import pmdarima as pm
from pmdarima.arima import StepwiseContext

y = pm.datasets.load_wineind()

# Limit maximum number of steps
with StepwiseContext(max_steps=10):
    model_steps = pm.auto_arima(
        y,
        seasonal=True,
        m=12,
        stepwise=True,
        suppress_warnings=True
    )
print(f"Model order (max_steps=10): {model_steps.order}, Seasonal: {model_steps.seasonal_order}")

# Limit maximum duration (soft limit in seconds)
with StepwiseContext(max_dur=30):
    model_dur = pm.auto_arima(
        y,
        seasonal=True,
        m=12,
        stepwise=True,
        suppress_warnings=True
    )
print(f"Model order (max_dur=30): {model_dur.order}, Seasonal: {model_dur.seasonal_order}")

# Combine limits (whichever occurs first)
with StepwiseContext(max_steps=20, max_dur=60):
    model_combined = pm.auto_arima(
        y,
        seasonal=True,
        m=12,
        stepwise=True,
        suppress_warnings=True
    )
print(f"Model order (max_steps=20, max_dur=60): {model_combined.order}, Seasonal: {model_combined.seasonal_order}")

# Nested contexts
with StepwiseContext(max_steps=15):
    with StepwiseContext(max_dur=45):
        model_nested = pm.auto_arima(y, seasonal=True, m=12, stepwise=True, suppress_warnings=True)
print(f"Model order (nested contexts): {model_nested.order}, Seasonal: {model_nested.seasonal_order}")
```

--------------------------------

### Serialize and deserialize ARIMA models using pickle and joblib

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/serialization.rst

This snippet demonstrates fitting an ARIMA model and saving it to disk using both pickle and joblib. It confirms the integrity of the serialized models by performing predictions and comparing the results.

```python
from pmdarima.arima import auto_arima
from pmdarima.datasets import load_lynx
import numpy as np

# For serialization:
import joblib
import pickle

# Load data and fit a model
y = load_lynx()
arima = auto_arima(y, seasonal=True)

# Serialize with Pickle
with open('arima.pkl', 'wb') as pkl:
    pickle.dump(arima, pkl)

# You can still make predictions from the model at this point
arima.predict(n_periods=5)

# Now read it back and make a prediction
with open('arima.pkl', 'rb') as pkl:
    pickle_preds = pickle.load(pkl).predict(n_periods=5)

# Or maybe joblib tickles your fancy
joblib.dump(arima, 'arima.pkl')
joblib_preds = joblib.load('arima.pkl').predict(n_periods=5)

# show they're the same
np.allclose(pickle_preds, joblib_preds)
```

--------------------------------

### pmdarima.metrics: Time-series metrics

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/modules/classes.rst

This submodule implements time-series metrics not available in scikit-learn.

```APIDOC
## pmdarima.metrics

### Description
Implements time-series metrics that are not implemented in scikit-learn.

### Functions
* **metrics.smape**: Computes the Symmetric Mean Absolute Percentage Error.
```

--------------------------------

### Specifying ARIMA Model Order

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/tips_and_tricks.rst

Demonstrates how to specify the order (p, d, q) for ARIMA models using tuples in the pmdarima library. This is crucial for defining the structure of the time series model.

```python
order = (1, 0, 12)  # p=1, d=0, q=12
order = (1, 1, 3)  # p=1, d=1, q=3
```

--------------------------------

### Sliding Window Cross-Validation

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

SlidingWindowForecastCV maintains a fixed-size training window that shifts forward in time. This approach is useful for models that may be sensitive to outdated historical data.

```python
import pmdarima as pm
from pmdarima.model_selection import SlidingWindowForecastCV

y = pm.datasets.load_wineind()

cv = SlidingWindowForecastCV(
    h=6,
    step=4,
    window_size=48
)

cv_generator = cv.split(y)
train_idx, test_idx = next(cv_generator)
```

--------------------------------

### Display pmdarima Environment Information

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/contributing.rst

A Python utility to display version information for pmdarima and its dependencies, which is required when filing bug reports.

```python
import pmdarima as pm
pm.show_versions()
```

--------------------------------

### Visualize Forecasts and Confidence Intervals

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/usecases/stocks.rst

Uses Matplotlib to plot the training data, actual test values, predicted values, and confidence intervals for the time series model.

```python
fig, axes = plt.subplots(2, 1, figsize=(12, 12))

axes[0].plot(y_train, color='blue', label='Training Data')
axes[0].plot(test_data.index, forecasts, color='green', marker='o', label='Predicted Price')
axes[0].plot(test_data.index, y_test, color='red', label='Actual Price')
axes[0].legend()

axes[1].plot(y_train, color='blue', label='Training Data')
axes[1].plot(test_data.index, forecasts, color='green', label='Predicted Price')
conf_int = np.asarray(confidence_intervals)
axes[1].fill_between(test_data.index, conf_int[:, 0], conf_int[:, 1])
```

--------------------------------

### Performing Time Series Differencing

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/tips_and_tricks.rst

Illustrates how to use the `diff` function from `pmdarima.utils` to compute differences in time series data. This is essential for handling non-stationary data by making it stationary.

```python
from pmdarima.utils import c, diff

x = c(10, 4, 2, 9, 34)

# lag 1, diff 1
diff(x, lag=1, differences=1)
# Returns: array([ -6., -2., 7., 25.], dtype=float32)

diff(x, lag=1, differences=2)
# Returns: array([ 4., 9., 18.], dtype=float32)

diff(x, lag=2, differences=1)
# Returns: array([-8., 5., 32.], dtype=float32)
```

--------------------------------

### Define Benchmarking Logic

Source: https://github.com/alkaline-ml/pmdarima/blob/master/benchmarks/bench_autoarima.ipynb

Contains the core benchmark function that measures execution time of auto_arima and a wrapper to iterate through different search strategies. It utilizes garbage collection to ensure consistent measurement conditions.

```python
import numpy as np
import gc
from datetime import datetime

mu_second = 0.0 + 10 ** 6

def benchmark(x, results, **kwargs):
    from pmdarima.arima import auto_arima
    # Collect garbage first so each timed run starts from similar conditions
    gc.collect()
    tstart = datetime.now()
    auto_arima(x, **kwargs)
    delta = datetime.now() - tstart
    results.append(delta.seconds + delta.microseconds / mu_second)

def fit_benchmark(**kwargs):
    # stepwise is set per strategy below, so drop any caller-supplied value
    kwargs.pop('stepwise', None)
    kwargs.pop('n_jobs', None)
    # stepwise_results and grid_results must be lists defined before calling
    for stepwise, results in ((True, stepwise_results), (False, grid_results)):
        benchmark(wineind, results, stepwise=stepwise, **kwargs)
```

--------------------------------

### Import Libraries and Load Data for Stock Market Prediction (Python)

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/usecases/stocks.rst

Imports necessary libraries (Numpy, Pandas, Matplotlib, pmdarima) and loads the Microsoft stock dataset using pmdarima's built-in utility. Requires pmdarima version 1.5.2+ and Python 3.6+.
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# %matplotlib inline

import pmdarima as pm
print(f"Using pmdarima {pm.__version__}")
# Using pmdarima 1.5.2
```

```python
from pmdarima.datasets.stocks import load_msft

df = load_msft()
df.head()
```

--------------------------------

### pmdarima.preprocessing: Preprocessing transformers

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/modules/classes.rst

This submodule provides transformer classes for pre-processing time series or exogenous arrays.

```APIDOC
## pmdarima.preprocessing

### Description
Provides a number of transformer classes for pre-processing time series or exogenous arrays.

### Classes
* **preprocessing.BoxCoxEndogTransformer**: Applies the Box-Cox transformation to endogenous data.
* **preprocessing.DateFeaturizer**: Creates exogenous features from a date column.
* **preprocessing.FourierFeaturizer**: Creates exogenous Fourier terms for modeling seasonality.
* **preprocessing.LogEndogTransformer**: Applies the log transformation to endogenous data.
```

--------------------------------

### Create Fourier Features with FourierFeaturizer

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Generates exogenous Fourier terms to model seasonality in time series data. This is useful for long seasonal periods and can be integrated into pmdarima pipelines.
```python
import pmdarima as pm
from pmdarima.preprocessing import FourierFeaturizer
from pmdarima.pipeline import Pipeline

# Load monthly data
y = pm.datasets.load_wineind()

# Create Fourier features (m=12 for monthly, k=4 sine/cosine pairs)
fourier = FourierFeaturizer(m=12, k=4)
fourier.fit(y)

# Transform to get Fourier terms
y_unchanged, X_fourier = fourier.transform(y)
print(f"Fourier features shape: {X_fourier.shape}")

# Generate Fourier features for forecasting
_, X_future = fourier.transform(y, n_periods=24)

# Use in a pipeline with non-seasonal ARIMA
pipeline = Pipeline([
    ('fourier', FourierFeaturizer(m=12, k=3)),
    ('arima', pm.AutoARIMA(seasonal=False, suppress_warnings=True))
])
pipeline.fit(y)
forecasts = pipeline.predict(n_periods=12)
```

--------------------------------

### Split Time Series Data for Training and Testing (Python)

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/usecases/stocks.rst

Splits the loaded stock market data into training and testing sets. This is crucial for evaluating model performance on unseen data, respecting the temporal nature of time series.

```python
from pmdarima.model_selection import train_test_split

train_len = int(df.shape[0] * 0.8)
train_data, test_data = train_test_split(df, train_size=train_len)

y_train = train_data['Open'].values
y_test = test_data['Open'].values

print(f"{train_len} train samples")
print(f"{df.shape[0] - train_len} test samples")
# 6386 train samples
# 1597 test samples
```

--------------------------------

### Update ARIMA Model with New Observations and Exogenous Variables (Python)

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

Demonstrates how to update an existing ARIMA model with a batch of new observations and optionally with exogenous variables. This is useful for online learning or adapting models to new data.
```python
import pmdarima as pm
import numpy as np

# Example data (replace with your own series):
# the first 150 observations are used for training, the rest for testing
y = pm.datasets.load_wineind()
y_train, y_test = y[:150], y[150:]

model = pm.auto_arima(y_train, seasonal=True, m=12, suppress_warnings=True)

# Batch update with multiple observations
model.update(y_test[0:5])
print(f"Updated AIC: {model.aic():.2f}")

# Update with exogenous variables (random placeholders for illustration)
X_train = np.random.randn(len(y_train), 2)
X_test = np.random.randn(len(y_test), 2)

# Re-fit model with exogenous variables
model_exog = pm.auto_arima(y_train, X=X_train, seasonal=True, m=12, suppress_warnings=True)
model_exog.update(y_test[:5], X=X_test[:5])
print(f"Updated AIC with exogenous variables: {model_exog.aic():.2f}")
```

--------------------------------

### Fit ARIMA Model and Predict in R using forecast

Source: https://github.com/alkaline-ml/pmdarima/blob/master/examples/issue12/issue-12.ipynb

This R code snippet loads data from a CSV file, converts the occupancy column to a time series object with a frequency of 3, and fits an ARIMA model using the auto.arima function from the forecast package.

```R
library(forecast)

df = read.csv('dummy_data.csv')
head(df)

y = ts(df$occupancy, frequency=3)
auto.arima(y)
```

--------------------------------

### Display Model Summary after Pipeline Fit (Python)

Source: https://github.com/alkaline-ml/pmdarima/blob/master/doc/usecases/sun-spots.rst

Displays a summary of the fitted ARIMA model, including coefficients, statistics, and information criteria, after it has been fitted through the pmdarima pipeline.

```python
fit2.summary()
```

--------------------------------

### Visualize time-series data

Source: https://github.com/alkaline-ml/pmdarima/blob/master/examples/issue12/issue-12.ipynb

Plots the occupancy data over time using matplotlib to visually inspect seasonality.
```python
import numpy as np
from matplotlib import pyplot as plt
# %matplotlib inline  (notebook magic; only valid inside Jupyter/IPython)

# `data` is the DataFrame loaded earlier in the notebook
# extract the data we're interested in
n_samples = data.shape[0]
xlab, y = data.time, data.occupancy

plt.plot(np.arange(n_samples), y)
plt.axis([0, n_samples, y.min(), y.max()])
plt.show()
```

--------------------------------

### Performing Sequential Time Series Data Splitting

Source: https://context7.com/alkaline-ml/pmdarima/llms.txt

`train_test_split` partitions time series data into training and testing sets without shuffling, so the temporal order is preserved. The split size can be given as a proportion or as an absolute number of samples, and exogenous arrays can be split alongside the series.

```python
import pmdarima as pm
from pmdarima.model_selection import train_test_split
import numpy as np

y = pm.datasets.load_wineind()

# Split by proportion
y_train, y_test = train_test_split(y, test_size=0.2)

# Split with exogenous variables
X = np.random.randn(len(y), 3)
y_train, y_test, X_train, X_test = train_test_split(y, X, train_size=150)
```
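To make the "no shuffling" behaviour concrete, here is a minimal pure-Python sketch of a sequential split. `sequential_split` is a hypothetical helper written only for illustration; it is not part of pmdarima and is not meant to mirror how `train_test_split` is implemented. In particular, rounding a fractional test size up is an assumption of this sketch.

```python
import math


def sequential_split(series, test_size=0.2):
    """Hypothetical helper (not a pmdarima function): split without shuffling.

    test_size is either a float in (0, 1), read as a proportion,
    or an int, read as an absolute number of test samples.
    """
    n = len(series)
    if isinstance(test_size, float):
        if not 0.0 < test_size < 1.0:
            raise ValueError("a float test_size must lie in (0, 1)")
        # Assumption of this sketch: round the test share up
        n_test = math.ceil(n * test_size)
    else:
        n_test = int(test_size)
    if not 0 < n_test < n:
        raise ValueError("test_size must leave at least one sample on each side")
    cut = n - n_test
    # The earliest observations train the model; the latest ones test it
    return series[:cut], series[cut:]


y = list(range(10))

train, test = sequential_split(y, test_size=0.2)
print(train)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(test)   # [8, 9]

train_abs, test_abs = sequential_split(y, test_size=3)
print(test_abs)  # [7, 8, 9]
```

Every test observation comes after every training observation, so the model is never evaluated on data that precedes what it was trained on.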