kenshin9

Reputation: 2365

Amazon Forecast - Updating Datasets and Retraining

I'm looking to update our target time series dataset in Amazon Forecast since it's been a couple weeks now and I wanted to see how incremental updates work. I'm reading over the documentation here: https://docs.aws.amazon.com/forecast/latest/dg/updating-data.html

I've read it over and over again, and it seems contradictory to me, though I may well just be misunderstanding the terminology.

As you collect new data, you might want to use it to generate new forecasts. Forecast does not automatically retrain a predictor when you import an updated dataset, but you can manually retrain a predictor to generate a new forecast with the updated data. For instance, if you collect daily sales data and want to include new data points in your forecast, you could import the updated data and use it to generate a forecast without training a new predictor. For newly imported data to have an impact on your forecasts, you must retrain the predictor.

The last two sentences of that quote are the parts I'm confused about. In both cases, it sounds like I'm importing new data to append. The first of them says that the new data points could be included and used for a forecast without training a new predictor. The next one says that new data will only have an impact if the predictor is retrained. Either the two are contradictory or there's a subtle difference that I'm not getting.

Has anyone worked with Forecast and has insight into this?

Upvotes: 0

Views: 243

Answers (1)

dingus

Reputation: 1001

My understanding is that, similar to the SageMaker DeepAR algorithm (which exposes a bit more low-level detail but of course isn't guaranteed to work exactly the same way under the hood), the underlying models analyze a historical lead-in or "context" period to make predictions autoregressively.

As a result, there's a separation between what we might call the "weights" and the "states" of the model: first you train a model (weights) on historical data, then at inference time you show it recent observations (states) and have it predict based on those plus what it learned in training.
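To make the weights/states split concrete, here's a toy sketch. It is not how Forecast or DeepAR work internally; it's just a one-parameter autoregressive model, and the names `fit_ar1` and `forecast` are made up for illustration. The coefficient `phi` plays the role of the weights, and the recent observations passed in as `context` play the role of the states.

```python
def fit_ar1(history):
    """'Training': least-squares fit of y[t] = phi * y[t-1] (no intercept)."""
    pairs = list(zip(history[:-1], history[1:]))
    num = sum(prev * cur for prev, cur in pairs)
    den = sum(prev * prev for prev, _ in pairs)
    return num / den


def forecast(phi, context, horizon):
    """'Inference': roll the fitted model forward from the latest observation."""
    last = context[-1]
    predictions = []
    for _ in range(horizon):
        last = phi * last
        predictions.append(last)
    return predictions


history = [1, 2, 4, 8]
phi = fit_ar1(history)            # weights learned once from history

old = forecast(phi, history, 2)   # forecast from the original context
new_obs = history + [10]          # a new actual arrives
updated = forecast(phi, new_obs, 2)  # same weights, fresh context

phi_retrained = fit_ar1(new_obs)  # retraining changes the weights as well
retrained = forecast(phi_retrained, new_obs, 2)
```

Here `updated` differs from `old` even though `phi` was never refit, because the forecast starts from the newer observation. Only `retrained` reflects what the new data point says about the pattern itself.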

For this reason you can import new data and generate updated forecasts without re-training your predictor... but the accuracy will probably be lower, because you haven't re-trained the model itself. You're relying on the previously learned patterns plus the recent actuals to give enough information to predict well.
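In API terms, the two paths might look roughly like the sketch below. Treat it as an assumption-laden outline rather than a tested recipe: the helper names are hypothetical, `client` is expected to be something like `boto3.client("forecast")`, all ARNs, names, and paths are placeholders you'd supply, and the retrain helper assumes your predictor is one that `create_auto_predictor` accepts as a `ReferencePredictorArn`. Each call only starts an asynchronous job, so you'd need to wait for the resource to become active before the next step.

```python
def import_new_data(client, dataset_arn, s3_path, role_arn, job_name):
    """Start an import job that loads newly collected data into the dataset."""
    resp = client.create_dataset_import_job(
        DatasetImportJobName=job_name,
        DatasetArn=dataset_arn,
        DataSource={"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
    )
    return resp["DatasetImportJobArn"]


def forecast_without_retraining(client, predictor_arn, forecast_name):
    """Reuse the existing predictor (same weights) on the updated dataset."""
    resp = client.create_forecast(
        ForecastName=forecast_name,
        PredictorArn=predictor_arn,
    )
    return resp["ForecastArn"]


def retrain_predictor(client, old_predictor_arn, new_predictor_name):
    """Train a new predictor based on the old one, so the new data affects the model."""
    resp = client.create_auto_predictor(
        PredictorName=new_predictor_name,
        ReferencePredictorArn=old_predictor_arn,
    )
    return resp["PredictorArn"]
```

After the import finishes, you'd call either `forecast_without_retraining` with the existing predictor, or `retrain_predictor` followed by `create_forecast` against the new predictor's ARN once it has finished training.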

From what I've seen, for most use cases the Forecast pricing means that training cost is small compared to the inference cost of generating forecasts, so if accuracy is important I'd probably just re-train every time. I appreciate this might not be true for everyone, though (e.g. if you're only forecasting a small number of variables/products/etc).

Upvotes: 0
