dkent
dkent

Reputation: 331

Is there a Problem with SARIMAX Predictions Using Statsmodels?

I'm building a time series model in Python using the Statsmodels library.

I seem to be getting erroneous results when I use the predict method on a SARIMAX model with exogenous regressors. (The predict method seems to work fine in the no external regressors case.)

Here is a full reproduction of the problem:

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX, SARIMAXResults

history = np.array([95818, 126537, 102856, 161188, 150539, 165138, 146603, 154334,
           150875, 134049, 137822, 126369, 124641, 111735, 126453])

history_exog = np.array([11.9835, 12.1981, 11.7108, 10.8174, 9.48247, 8.49162, 8.15208, 7.81663, 
                8.22422, 9.9492, 10.8724, 10.9911, 9.55874, 6.67079, 3.13028])

future_exog = np.array([0.279386, -1.72252, -2.87699, -2.64897, -1.51616])

model = SARIMAX(history,
                order=(1,0,0),
                seasonal_order=(0,0,0,52),
                exog = history_exog,
                enforce_stationarity=False,
                enforce_invertibility=False)

model_fit = model.fit(method = 'cg')

yhat = model_fit.predict(start = len(history),
                         end = len(history) + 5 - 1,
                         exog = future_exog)       

model_fit.summary()

enter image description here

The five steps ahead prediction I get (yhat) are as follows:

You can see the problem in the very first prediction. Given the model parameters, I would have thought that the prediction would be:

yhat(t+1) = (126453 x 0.9898) + (-4579.3944 x 0.279386) = 123883.76

The prediction I get is 138068.

Where am I going wrong here? Or could this be a bug?

Upvotes: 0

Views: 1003

Answers (1)

cfulton
cfulton

Reputation: 3195

The SARIMAX model is of the form "regression with SARIMA errors". For example, for an ARX(1) model, this is:

y(t) = beta' x(t) + e(t)
e(t) = phi e(t-1) + z(t)

while your forecasting equation would be accurate for a model of the form:

y(t) = beta' x(t) + phi y(t-1) + z(t)

If you are interested in an AR(p) model (i.e. does not include any MA terms), then you can use sm.tsa.AutoReg, which is of that form.

Upvotes: 1

Related Questions