Reputation: 1
I have built a time series model on monthly crop science data (2012 to 2018) with yearly seasonality. Farmers' purchases of crop protection products also depend on the timing of rainfall. The same product also peaks during Nov, Dec, and Jan.
I have built one SARIMA model (without covariates) and one SARIMAX model (with covariates). The covariates are avg rainfall, cumulative rainfall, avg temperature, min temperature, max temperature, avg humidity, etc.
My main question: should the SARIMAX model give me better accuracy than the SARIMA model, since the covariates help the model predict better? Is my assumption correct?
Currently, SARIMA is giving me a better accuracy here.
Upvotes: 0
Views: 40
Reputation: 1234
I suppose you use MLE to estimate both SARIMA and SARIMAX. I am not sure which accuracy measure you are using, so my guess is RMSE.
MLE maximizes the likelihood, so with the same ARIMA and seasonal orders, SARIMAX will mathematically give you an in-sample likelihood at least as high as SARIMA on the same sample. SARIMA is just SARIMAX with the covariate coefficients constrained to zero, and unconstrained optimization can never do worse than constrained optimization.
However, that does not guarantee a better RMSE. RMSE is the square root of the mean of the squared residuals, and in the case of SARIMA (due to the MA part) it is not tied one-to-one to the log-likelihood the way it is in linear regression.
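To spell out the comparison with linear regression (a sketch in my own notation, not anything from your model output): with Gaussian errors, residuals $e_t$, and $n$ observations, the maximized log-likelihood of a linear regression depends on the data only through the in-sample RMSE:

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e_t^2},
\qquad
\hat{\ell}_{\text{reg}} = -\frac{n}{2}\left[\log\left(2\pi\,\mathrm{RMSE}^2\right) + 1\right]
$$

so there a higher likelihood always means a lower in-sample RMSE. For an ARMA-type model, the exact Gaussian likelihood is usually written in prediction-error form, with one-step-ahead errors $v_t$ and their variances $F_t$:

$$
\ell_{\text{ARMA}} = -\frac{1}{2}\sum_{t=1}^{n}\left[\log\left(2\pi F_t\right) + \frac{v_t^2}{F_t}\right]
$$

Because $F_t$ changes with $t$ and with the parameters, this is no longer a function of a single RMSE number, and the residuals you use to compute your accuracy measure need not be the same $v_t$.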
So it is perfectly normal for SARIMAX to have a higher log-likelihood but also a higher RMSE (and a lower R-squared). In that case you have an overfitted SARIMAX, and the covariates X do not seem to be very informative for predicting y.
Some other things you may also want to check:
Upvotes: 0