# Advanced Time Series Analysis with ARMA and ARIMA

## Understand and implement ARMA and ARIMA models in Python for time series forecasting

In previous articles, we introduced moving average processes MA(q) and autoregressive processes AR(p) as two ways to model time series. Now, we will combine both methods and explore how ARMA(p,q) and ARIMA(p,d,q) models can help us model and forecast more complex time series.

• ARMA models

By the end of this article, you should be comfortable with implementing ARMA and ARIMA models in Python and you will have a checklist of steps to take when modelling time series.

The notebook and dataset are here.

Let’s get started!

Recall that an autoregressive process of order p is defined as:
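In one standard notation, with $c$ a constant, $\phi_1, \dots, \phi_p$ the autoregressive coefficients, and $\epsilon_t$ white noise (the symbol names are assumed here):

$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \epsilon_t$$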

Where:

• p is the order

Recall also that a moving average process of order q is defined as:
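In one standard notation, with $c$ a constant, $\theta_1, \dots, \theta_q$ the moving average coefficients, and $\epsilon_t$ white noise (the symbol names are assumed here):

$$y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}$$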

Where:

• q is the order

Then, an ARMA(p,q) is simply the combination of both models into a single equation:
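Written out in one standard notation (symbol names assumed here), with autoregressive coefficients $\phi_i$, moving average coefficients $\theta_j$, a constant $c$, and white noise $\epsilon_t$:

$$y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}$$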

Hence, this model can explain the relationship of a time series with both random noise (moving average part) and itself at a previous step (autoregressive part).
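To make the two parts concrete, here is a minimal sketch that applies the ARMA recursion by hand in plain Python. The function name `simulate_arma`, the choice of a zero constant, and the coefficient values are illustrative assumptions, not taken from the notebook.

```python
import random


def simulate_arma(phi, theta, noise):
    """Build y_t from past values of y (AR part) and past noise (MA part).

    phi[i] multiplies y_{t-1-i}; theta[j] multiplies noise_{t-1-j}.
    Terms that would reach before the start of the series are skipped.
    """
    y = []
    for t in range(len(noise)):
        ar_part = sum(p * y[t - 1 - i] for i, p in enumerate(phi) if t - 1 - i >= 0)
        ma_part = sum(th * noise[t - 1 - j] for j, th in enumerate(theta) if t - 1 - j >= 0)
        y.append(ar_part + noise[t] + ma_part)
    return y


# Illustrative coefficients only (assumed values for this sketch).
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
series = simulate_arma(phi=[0.5], theta=[0.4], noise=noise)
print(series[:5])
```

Feeding in a single unit shock, for example `simulate_arma([0.5], [0.4], [1.0, 0.0, 0.0, 0.0])`, shows how the shock is echoed once by the moving average term and then decays through the autoregressive term.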

Let’s see how an ARMA(p,q) process behaves with a few simulations.

## Simulate an ARMA(1,1) process

Let’s start with a simple example of an ARMA process of order 1 in both its moving average and autoregressive parts.

First, let’s import all libraries that will be required throughout this tutorial:

```python
from statsmodels.graphics.tsaplots import plot_pacf
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.stattools import pacf
from statsmodels.tsa.stattools import acf
from tqdm import tqdm_notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import warnings
warnings.filterwarnings('ignore')

# Jupyter magic: render plots inline in the notebook
%matplotlib inline
```

Then, we will simulate the following ARMA process:

In code:

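```python
# ArmaProcess expects lag-polynomial coefficients: the leading 1 is lag 0,
# and AR coefficients carry the opposite sign to the equation form.
ar1 = np.array([1, 0.33])
ma1 = np.array([1, 0.9])

simulated_ARMA_data = ArmaProcess(ar1, ma1).generate_sample(nsample=10000)
```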
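One detail worth checking here: `ArmaProcess` takes the coefficients of the AR and MA lag polynomials, including the leading 1 for lag 0, so the AR values enter with the opposite sign to the one they have in the equation form. The sketch below converts between the two forms in plain Python; the helper names `to_lag_polynomials` and `from_lag_polynomials` are hypothetical and exist only for this illustration.

```python
def to_lag_polynomials(phi, theta):
    """Equation coefficients -> lag-polynomial lists (leading 1, AR signs flipped)."""
    ar = [1.0] + [-p for p in phi]
    ma = [1.0] + list(theta)
    return ar, ma


def from_lag_polynomials(ar, ma):
    """Lag-polynomial lists -> equation coefficients (drop lag 0, flip AR signs)."""
    phi = [-a for a in ar[1:]]
    theta = list(ma[1:])
    return phi, theta


# The arrays passed to ArmaProcess above, read back as equation coefficients.
phi, theta = from_lag_polynomials([1, 0.33], [1, 0.9])
print(phi, theta)  # [-0.33] [0.9]
```

Read this way, `[1, 0.33]` describes an autoregressive coefficient of -0.33. If a positive 0.33 is intended in the equation, the array would need to be `[1, -0.33]`. statsmodels also offers `ArmaProcess.from_coeffs`, which accepts the equation-form coefficients directly.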

We can now plot the first 200 points to visualize our generated time series:

```python
plt.figure(figsize=[15, 7.5])  # Set dimensions for figure
plt.plot(simulated_ARMA_data)
plt.title("Simulated ARMA(1,1) Process")
plt.xlim([0, 200])
plt.show()
```

And you should get something similar to:

Then, we can take a look at the ACF and PACF plots:

```python
plot_pacf(simulated_ARMA_data);
plot_acf(simulated_ARMA_data);
```

As you can see, we cannot infer the order of the ARMA process by looking at these plots. In fact, looking closely, we can see a sinusoidal shape in both the ACF and the PACF. This suggests that both processes are in play.
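For intuition about what the ACF plot is showing, the sample autocorrelation at each lag can be computed by hand. This is a minimal sketch in plain Python of the usual estimator (it should be close to what `plot_acf` draws, though the library adds confidence bands and other options); the function name `sample_acf` is an assumption for this example.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag, normalised by the lag-0 sum."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(v * v for v in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# A strictly alternating series is strongly negatively correlated at lag 1
# and positively correlated at lag 2.
alternating = [1.0, -1.0] * 4
print(sample_acf(alternating, 2))  # [1.0, -0.875, 0.75]
```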

## Simulate an ARMA(2,2) process

Similarly, we can simulate an ARMA(2,2) process. In this example, we will simulate the following equation:

In code:

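```python
# Lag-polynomial coefficients for the AR and MA parts (lag 0 first)
ar2 = np.array([1, 0.33, 0.5])
ma2 = np.array([1, 0.9, 0.3])

# Use the order-2 arrays here, not ar1 and ma1 from the previous example
simulated_ARMA2_data = ArmaProcess(ar2, ma2).generate_sample(nsample=10000)
```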

Then, we can visualize the simulated data:

```python
plt.figure(figsize=[15, 7.5])  # Set dimensions for figure
plt.plot(simulated_ARMA2_data)
plt.title("Simulated ARMA(2,2) Process")
plt.xlim([0, 200])
plt.show()
```

Looking at the ACF and PACF plots:

```python
plot_pacf(simulated_ARMA2_data);
plot_acf(simulated_ARMA2_data);
```

As you can see, both plots exhibit the same sinusoidal pattern, which further supports the idea that both an AR(p) process and an MA(q) process are in play.

ARIMA stands for AutoRegressive Integrated Moving Average.

This model is the combination of autoregression, a moving average model and differencing. In this context, integration is the opposite of differencing.

Differencing is useful to remove the trend in a time series and make it stationary.

It simply involves subtracting the value at time t-1 from the value at time t. As a result, you lose the first data point of the series each time you apply differencing.
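A minimal sketch of differencing and its inverse in plain Python, using a small made-up series (the values and the helper names `difference` and `integrate` are assumptions for illustration):

```python
from itertools import accumulate


def difference(series):
    """First-order differencing: value at t minus value at t-1."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def integrate(first_value, diffs):
    """Undo differencing by cumulatively summing from the first original value."""
    return list(accumulate(diffs, initial=first_value))


trend = [1, 3, 6, 10, 15]             # made-up series with an accelerating trend
once = difference(trend)              # [2, 3, 4, 5]  (one point shorter)
twice = difference(once)              # [1, 1, 1]     (trend removed)
restored = integrate(trend[0], once)  # [1, 3, 6, 10, 15]
print(once, twice, restored)
```

With pandas, `Series.diff()` performs the same subtraction and leaves a `NaN` in the first position instead of shortening the series.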

Mathematically, the ARIMA(p,d,q) now requires three parameters:

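• p: the order of the autoregressive process

• d: the degree of differencing (the number of times the series is differenced)

• q: the order of the moving average process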

and the equation is expressed as:
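In one standard notation (symbol names assumed here), with $y'_t$ denoting the series after it has been differenced $d$ times, $c$ a constant, $\phi_i$ the autoregressive coefficients, $\theta_j$ the moving average coefficients, and $\epsilon_t$ white noise:

$$y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}$$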

Just like with ARMA models, the ACF and PACF cannot be used to identify reliable values for p and q.

However, in the presence of an ARIMA(p,d,0) process:

• the ACF is exponentially decaying or sinusoidal

• the PACF has a significant spike at lag p, but none after

Similarly, in the presence of an ARIMA(0,d,q) process:

• the PACF is exponentially decaying or sinusoidal

• the ACF has a significant spike at lag q, but none after

Let’s walk through an example of modelling with ARIMA to get some hands-on experience and better understand some modelling concepts.

Let’s revisit a dataset that we analyzed previously. This dataset was used to show how the Yule-Walker equations can help us estimate the coefficients of an AR(p) process.

Now, we will use the same dataset, but model the time series with an ARIMA(p,d,q) model.

```python
data = pd.read_csv('jj.csv')
data.head()
```