
Advanced Time Series Analysis with ARMA and ARIMA

Understand and implement ARMA and ARIMA models in Python for time series forecasting

Marco Peixeiro

In previous articles, we introduced moving average processes MA(q), and autoregressive processes AR(p) as two ways to model time series. Now, we will combine both methods and explore how ARMA(p,q) and ARIMA(p,d,q) models can help us to model and forecast more complex time series.

This article will cover the following topics:

  • ARMA models
  • ARIMA models
  • Selecting the model order with Akaike’s Information Criterion (AIC)

By the end of this article, you should be comfortable with implementing ARMA and ARIMA models in Python and you will have a checklist of steps to take when modelling time series.

The notebook and dataset are here.

Let’s get started!

For hands-on video tutorials on machine learning, deep learning, and artificial intelligence, check out my YouTube channel.

Recall that an autoregressive process of order p is defined as:

X_t = c + φ₁X_{t-1} + φ₂X_{t-2} + … + φ_pX_{t-p} + ε_t

Autoregressive process of order p

where:

  • p is the order
  • c is a constant
  • φ₁, …, φ_p are the coefficients of the process
  • ε_t is white noise

Recall also that a moving average process of order q is defined as:

X_t = μ + ε_t + θ₁ε_{t-1} + θ₂ε_{t-2} + … + θ_qε_{t-q}

Moving average process of order q

where:

  • q is the order
  • μ is the mean of the series
  • θ₁, …, θ_q are the coefficients of the process
  • ε_t, ε_{t-1}, … are white noise error terms

Then, an ARMA(p,q) process is simply the combination of both models into a single equation:

X_t = c + ε_t + φ₁X_{t-1} + … + φ_pX_{t-p} + θ₁ε_{t-1} + … + θ_qε_{t-q}

ARMA process of order (p,q)

Hence, this model can explain the relationship of a time series with both past random shocks (the moving average part) and its own past values (the autoregressive part).

Let’s see how an ARMA(p,q) process behaves with a few simulations.

Simulate an ARMA(1,1) process

Let’s start with a simple example of an ARMA process of order 1 in both its moving average and autoregressive part.

First, let’s import all libraries that will be required throughout this tutorial:

from statsmodels.graphics.tsaplots import plot_pacf
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.stattools import pacf
from statsmodels.tsa.stattools import acf
from tqdm import tqdm_notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import warnings
%matplotlib inline

Then, we will simulate the following ARMA(1,1) process:

X_t = -0.33X_{t-1} + ε_t + 0.9ε_{t-1}

ARMA(1,1) process

In code:

# ArmaProcess expects the full lag polynomials, including the zero-lag
# coefficient; note that the AR coefficients carry the opposite sign
ar1 = np.array([1, 0.33])
ma1 = np.array([1, 0.9])
simulated_ARMA_data = ArmaProcess(ar1, ma1).generate_sample(nsample=10000)

We can now plot the first 200 points to visualize our generated time series:

plt.figure(figsize=[15, 7.5]); # Set dimensions for figure
plt.plot(simulated_ARMA_data)
plt.title("Simulated ARMA(1,1) Process")
plt.xlim([0, 200])
And you should get something similar to:

Simulated ARMA(1,1) process

Then, we can take a look at the ACF and PACF plots:

PACF and ACF plots for the simulated ARMA(1,1) process

As you can see, we cannot infer the order of the ARMA process by looking at these plots. In fact, looking closely, we can see some sinusoidal shape in both ACF and PACF functions. This suggests that both processes are in play.

Simulate an ARMA(2,2) process

Similarly, we can simulate an ARMA(2,2) process. In this example, we will simulate the following equation:

X_t = -0.33X_{t-1} - 0.5X_{t-2} + ε_t + 0.9ε_{t-1} + 0.3ε_{t-2}

ARMA(2,2) process

In code:

ar2 = np.array([1, 0.33, 0.5])
ma2 = np.array([1, 0.9, 0.3])
simulated_ARMA2_data = ArmaProcess(ar2, ma2).generate_sample(nsample=10000)

Then, we can visualize the simulated data:

plt.figure(figsize=[15, 7.5]); # Set dimensions for figure
plt.plot(simulated_ARMA2_data)
plt.title("Simulated ARMA(2,2) Process")
plt.xlim([0, 200])

Simulated ARMA(2,2) process

Looking at the ACF and PACF plots:

PACF and ACF for ARMA(2,2) process

As you can see, both plots exhibit the same sinusoidal trend, which further supports the fact that both an AR(p) process and a MA(q) process are in play.

ARIMA stands for AutoRegressive Integrated Moving Average.

This model is the combination of autoregression, a moving average model and differencing. In this context, integration is the opposite of differencing.

Differencing is useful to remove the trend in a time series and make it stationary.

It simply involves subtracting the value at time t-1 from the value at time t. Realize that you will, therefore, lose the first data point in a time series if you apply differencing once.
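As a quick illustration (on a tiny, made-up series), differencing once with pandas removes a linear trend and costs us the first observation:

```python
import pandas as pd

# Hypothetical series with a steady linear upward trend
s = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0])

# Subtract the value at t-1 from the value at t
diffed = s.diff().dropna()

print(diffed.tolist())       # [2.0, 2.0, 2.0, 2.0] -- constant, trend removed
print(len(s) - len(diffed))  # 1 -- the first data point is lost
```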

Mathematically, the ARIMA(p,d,q) model now requires three parameters:

  • p: the order of the autoregressive process
  • d: the degree of differencing (the number of times the series is differenced)
  • q: the order of the moving average process

and the equation is expressed as:

(1 - φ₁B - … - φ_pB^p)(1 - B)^d X_t = c + (1 + θ₁B + … + θ_qB^q)ε_t

General ARIMA(p,d,q) process

where B is the backshift operator, such that B X_t = X_{t-1}.

Just like with ARMA models, the ACF and PACF cannot be used to identify reliable values for p and q.

However, in the presence of an ARIMA(p,d,0) process:

  • the ACF is exponentially decaying or sinusoidal
  • the PACF has a significant spike at lag p, but none afterwards

Similarly, in the presence of an ARIMA(0,d,q) process:

  • the PACF is exponentially decaying or sinusoidal
  • the ACF has a significant spike at lag q, but none afterwards

Let’s walk through an example of modelling with ARIMA to get some hands-on experience and better understand some modelling concepts.

Let’s revisit a dataset that we analyzed previously. This dataset was used to show how the Yule-Walker equations can help us estimate the coefficients of an AR(p) process.

Now, we will use the same dataset, but model the time series with an ARIMA(p,d,q) model.

You can grab the notebook or download the dataset to follow along.

First, we import the dataset and display the first five rows:

data = pd.read_csv('jj.csv')
print(data.head())

Then, let’s plot the entire dataset:

plt.figure(figsize=[15, 7.5]); # Set dimensions for figure
plt.scatter(data['date'], data['data'])
plt.title('Quarterly EPS for Johnson & Johnson')
plt.ylabel('EPS per share ($)')

Quarterly EPS for Johnson & Johnson

As you can see, there is both a trend and a change in variance in this time series.

Let’s plot the ACF and PACF functions:


As you can see, there is no way of determining the right order for the AR(p) process or MA(q) process.

The plots above are also a clear indication of non-stationarity. To further prove this point, let’s use the augmented Dickey-Fuller test:

# Augmented Dickey-Fuller test
ad_fuller_result = adfuller(data['data'])
print(f'ADF Statistic: {ad_fuller_result[0]}')
print(f'p-value: {ad_fuller_result[1]}')

Here, the p-value is larger than 0.05, meaning that we cannot reject the null hypothesis stating that the time series is non-stationary.

Therefore, we must apply some transformation and some differencing to remove the trend and remove the change in variance.

We will hence take the log difference of the time series. This is equivalent to taking the logarithm of the EPS and then applying differencing once. Note that because we are differencing once, we will lose the first data point.

# Take the log difference to make data stationary
data['data'] = np.log(data['data'])
data['data'] = data['data'].diff()
data = data.drop(data.index[0])

Now, let’s plot the new transformed data:

plt.figure(figsize=[15, 7.5]); # Set dimensions for figure
plt.plot(data['data'])
plt.title("Log Difference of Quarterly EPS for Johnson & Johnson")

Log difference of the quarterly EPS for Johnson & Johnson

It seems that the trend and the change in variance were removed, but we want to make sure that this is the case. Therefore, we apply the augmented Dickey-Fuller test again to test for stationarity.

# Augmented Dickey-Fuller test
ad_fuller_result = adfuller(data['data'])
print(f'ADF Statistic: {ad_fuller_result[0]}')
print(f'p-value: {ad_fuller_result[1]}')

This time, the p-value is less than 0.05, so we reject the null hypothesis and assume that the time series is stationary.

Now, let’s look at the PACF and ACF to see if we can estimate the order of one of the processes in play.


Examining the PACF above, it seems that there is an AR process of order 3 or 4 in play. However, the ACF is not informative and we see some sinusoidal shape.

Therefore, how can we make sure that we choose the right order for both the AR(p) and MA(q) processes?

We will need to try different combinations of orders, fit an ARIMA model with each, and use a criterion for order selection.

This brings us to the topic of Akaike’s Information Criterion or AIC.