Bogdanovist

Reputation: 1546

ARIMA out of sample prediction in statsmodels?

I have a time series forecasting problem that I am using the statsmodels Python package to address. Evaluated by the AIC criterion, the optimal model turns out to be quite complex, something like ARIMA(27,1,8) [I haven't done an exhaustive search of the parameter space, but it seems to be at a minimum around there]. I am having real trouble validating and forecasting with this model, though, because a single model instance takes a very long time (hours) to train, so doing repeated tests is very difficult.

In any case, what I really need as a minimum in order to use statsmodels in operations (assuming I can get the model validated somehow first) is a mechanism for incorporating new data as it arrives in order to make the next set of forecasts. I would like to be able to fit a model on the available data, pickle it, and then unpickle it later when the next data point is available and incorporate that into an updated set of forecasts. At the moment I have to re-fit the model each time new data becomes available, which, as I said, takes a very long time.

I had a look at this question, which addresses essentially the problem I have, but for ARMA models. For the ARIMA case, however, there is the added complexity of the data being differenced. I need to be able to produce new forecasts of the original time series (cf. the typ='levels' keyword in the ARIMAResultsWrapper.predict method). It's my understanding that statsmodels cannot do this at present, but what components of the existing functionality would I need to use in order to write something to do this myself?
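If you do end up rolling your own, note that inverting first-order differencing is just a cumulative sum anchored at the last observed level, which is essentially what typ='levels' does internally. A small NumPy illustration for the d=1 case (the numbers are hypothetical):

```python
import numpy as np

# Last observed value of the original series, and hypothetical
# forecasts of the *differenced* series from an ARMA-style model
last_level = 103.2
diff_forecasts = np.array([0.5, -0.2, 0.1])

# Undo the differencing: cumulatively sum the predicted differences
# and add them to the last known level of the original series
level_forecasts = last_level + np.cumsum(diff_forecasts)
print(level_forecasts)  # [103.7 103.5 103.6]
```

Higher orders of differencing invert the same way, applied d times, each anchored at the appropriate final observed values.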

Edit: I am also using transparams=True, so the prediction process needs to be able to transform the predictions back into the original time series, which is an additional difficulty in a homebrew approach.

Upvotes: 3

Views: 2316

Answers (1)

Nathan Gould

Reputation: 8225

An ARIMA(27,1,8) model is extremely complex, in the scheme of things. For most time series, you can do reasonable prediction with five or so parameters. Of course it depends on the data and domain, but I'm very skeptical that 27 + 8 = 35 parameters are necessary.

The AIC is known to be too permissive with the number of parameters at times. I'd try comparing the results with the BIC, which penalizes extra parameters more heavily.

I'd also look into whether your data has seasonality of some kind. E.g., maybe all 27 of those AR terms don't matter, and you really just need lag=1, and lag=24 (for instance). That might be the case for hourly data that has daily seasonality.

Upvotes: 0
