Reputation: 11
I do not understand how model.predict(...)
works on a a time series forecasting problem. I usually use it with a CNN and it is pretty straight forward but for time series I don't understand what it returns.
For example I am currently doing an exercise where I have to forecast the power consumption based on data using LTSM, I succeeded to train my model but when I want to know what the power cusumption will be tomorrow (so no data except past ones) I don't know what input to use.
Upvotes: 1
Views: 244
Reputation: 2701
Traditional ML algorithms, which you might be more used to, generally expect the data in a 2D structure like this:
For sequential data, such as a stream of timed events associated with each user, it’s also possible to create a lagged 2D dataset, where the history of different features for different IDs is aligned into single rows, with this structure:
This can be a good way to work because once your data is in the correct shape you can use it with fast to set up and train models. However, models using features engineered using this approach generally don’t have any capacity to “learn” anything about the natural sequence of the data. To something like a tree-based ensemble model receiving this format, feature 1 at time t and time t-1 in the example above are treated completely independently and this can severely limit the model’s predictive power.
There are types of deep learning architecture specifically designed for modelling sequence data called recurrent neural nets (RNN). Two of the most popular cells to use in these are long short term memory (LSTM) and gated recurrent units (GRU). There’s a good post on how to understand how LSTM cells work here, but the TL;DR is they have a structure that allows them to learn from sequences of data.
Cells like LSTM expect a 3D tensor of input data. We arrange it so that one axis has the data features along it, the second axis has the sequence steps (like time ticks) and the third axis has each of the different examples we want to predict a single "y" value for stacked along it. Using the same type of dataset as the lagged example above, it would look something like this:
The ability to learn patterns in sequences of data like this is particularly beneficial for both time series and text data, which are naturally ordered.
To return to your original question, when you want to predict something in your test set you'll need to pass it sequences represented just like the ones it was trained in (this is a reasonably good rule of supervised learning in general). For example, if the data is trained like the last example above, you'll need to pass it a 2D example for each ID you want to make a prediction for.
You should explore the way the original training data is represented and make sure you understand it well, as you'll need to create the same shape of data to make predictions. X_train.shape
is a great place to start, if you have your training data in a pandas dataframe or numpy arrays, to see what the dimensionality is, and then you can inspect entries along each axis until you get a good feel for the data it contains.
Upvotes: 2