BhavinNagda

Reputation: 1

Time series models with covariates should have higher accuracy than models without covariates

I have built a time series model on crop science data with monthly observations (2012 to 2018), which has yearly seasonality. Farmers' purchases of crop protection products also depend on the timing of rainfall. The same product also peaks during November, December, and January.

I have built one SARIMA model (without covariates) and one SARIMAX model (with covariates). The covariates are average rainfall, cumulative rainfall, average temperature, minimum temperature, maximum temperature, average humidity, etc.

My main question: should my SARIMAX model give better accuracy than the SARIMA model, since the covariates help the model predict better? Is my assumption correct?

Currently, SARIMA is giving me better accuracy.

Upvotes: 0

Views: 40

Answers (1)

9mat

Reputation: 1234

I suppose you are using MLE to estimate both SARIMA and SARIMAX. I am not sure which accuracy metric you are using, so I will guess RMSE.

MLE maximizes the likelihood, so mathematically SARIMAX is guaranteed to give a likelihood at least as high as SARIMA (with the same orders) on the same sample: SARIMA is SARIMAX with the covariate coefficients constrained to zero, and unconstrained optimization can never do worse than constrained optimization.

However, this does not guarantee a better RMSE. RMSE is the square root of the mean squared residuals, and in SARIMA (because of the MA part) the residuals are not perfectly tied to the log-likelihood the way they are in linear regression.
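To make the contrast concrete, here is a short sketch of the two Gaussian log-likelihoods. The symbols ($n$ observations, one-step-ahead prediction errors $v_t$ with variances $F_t$) are notation I am introducing here, and this is the standard textbook form, not necessarily what your software reports. In linear regression, the maximized log-likelihood is a decreasing function of RMSE alone:

$$\ell_{\text{OLS}} = -\frac{n}{2}\left[\log\!\left(2\pi\,\mathrm{RMSE}^2\right) + 1\right], \qquad \mathrm{RMSE}^2 = \frac{1}{n}\sum_{t=1}^{n} \hat{e}_t^{\,2}$$

For a model estimated in state-space form, as SARIMA(X) typically is, the exact log-likelihood comes from the prediction-error decomposition:

$$\ell_{\text{SARIMA}} = -\frac{1}{2}\sum_{t=1}^{n}\left[\log\!\left(2\pi F_t\right) + \frac{v_t^{2}}{F_t}\right]$$

Each squared error is weighted by its own $F_t$, so a higher $\ell$ does not have to come with a smaller unweighted sum of squared residuals.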

So it is perfectly normal for SARIMAX to have a higher log-likelihood but also a higher RMSE (and a lower R-squared). In that case, your SARIMAX is overfitted, and X does not seem to be very informative for predicting y.

Some other things you may also want to check:

  • Do both estimations use the same sample? If there are missing values in X, the estimation may drop more observations for SARIMAX than for SARIMA.
  • Information criteria: AIC and BIC adjust for the extra degrees of freedom between the models and give a better indication of whether the model is really overfitted, or whether something is wrong with the estimation and/or the data.
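
The two checks above could be sketched roughly as follows, in plain Python. The helper names (`rmse`, `aic`, `bic`, `complete_rows`) are my own, and the log-likelihoods and parameter counts are made-up placeholders, not results from your models. Only `n_obs = 84` follows from your description (7 years of monthly data). Plug in the values your fitting library reports.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def aic(loglik, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_obs) - 2 * loglik

def is_missing(value):
    return value is None or (isinstance(value, float) and math.isnan(value))

def complete_rows(y, X):
    """Indices of rows where y and every covariate are present.

    Fitting both models on exactly these rows keeps the samples identical.
    """
    return [
        i for i, (yi, xi) in enumerate(zip(y, X))
        if not is_missing(yi) and not any(is_missing(v) for v in xi)
    ]

# Hypothetical placeholders: replace with the values from your own fits.
n_obs = 84  # monthly data, 2012 to 2018
models = {
    "SARIMA": {"loglik": -310.0, "n_params": 5},
    "SARIMAX": {"loglik": -307.5, "n_params": 11},
}

for name, m in models.items():
    print(
        name,
        "loglik:", m["loglik"],
        "AIC:", aic(m["loglik"], m["n_params"]),
        "BIC:", round(bic(m["loglik"], m["n_params"], n_obs), 1),
    )
```

With these placeholder numbers, SARIMAX has the higher log-likelihood but the worse (higher) AIC and BIC, which is the pattern you would expect if the covariates add parameters without adding much information. For the accuracy comparison itself, compute `rmse` for both models on the same held-out months rather than on the fitting sample.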

Upvotes: 0
