BoltzmannBrain

Reputation: 5412

LSTM predictions increase with each training batch

I'm building an LSTM with Keras for time-series prediction, and I want the model to train on mini-batches (windows) and make predictions online, as described here, because the data is streamed in one record at a time. For example, with a window size of 500, at timestep 500 the model will have trained on steps 1-500 and will then try to predict step 501, then 502, 503, and so on. The model won't train again until timestep 1000.
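
For concreteness, the loop looks roughly like the sketch below. This is my own illustration, not the exact code: make_supervised, seq_len, window_size, and the synthetic univariate stream are placeholder assumptions, and model is the compiled network shown further down.

import numpy as np

window_size = 500   # train at t = 500, 1000, ...
seq_len = 10        # assumed subsequence length fed to the LSTM

def make_supervised(series, seq_len):
    # Slice a 1-D series into (samples, seq_len, 1) inputs and next-step targets.
    X = np.array([series[i:i + seq_len] for i in range(len(series) - seq_len)])
    y = np.array(series[seq_len:]).reshape(-1, 1)
    return X[..., np.newaxis], y

stream = np.sin(np.linspace(0, 50, 2000))  # stand-in for the real data stream
history = []

for t, record in enumerate(stream, start=1):
    if t > window_size:
        # Online prediction: one step ahead from the latest subsequence.
        x = np.array(history[-seq_len:]).reshape(1, seq_len, 1)
        y_hat = model.predict(x)
    history.append(record)
    if t % window_size == 0:
        # Train only at window boundaries, on the most recent window.
        X, y = make_supervised(history[-window_size:], seq_len)
        model.train_on_batch(X, y)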

But the results are odd, where the predicted values increase with each training window, as shown in this plot. Any ideas as to what is wrong here?

I have a small architecture:

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

layers = {'input': inputDims, 'hidden1': 35, 'hidden2': 35, 'output': 1}

model = Sequential()
# First recurrent layer returns the full sequence to feed the second one.
model.add(LSTM(
    input_length=sequenceLength,
    input_dim=layers['input'],
    output_dim=layers['hidden1'],
    return_sequences=True))
model.add(Dropout(0.2))
# Second recurrent layer returns only its final output.
model.add(LSTM(
    layers['hidden2'],
    return_sequences=False))
model.add(Dropout(0.2))
# Single linear unit for one-step-ahead regression.
model.add(Dense(output_dim=layers['output']))
model.add(Activation('linear'))
model.compile(loss='mse', optimizer='rmsprop')

Upvotes: 0

Views: 237

Answers (1)

Christian Hirsch

Reputation: 2056

This might not be a problem of your specific implementation but an instance of a conceptual issue with LSTMs on long time series. A good starting point is the paper Learning to Forget: Continual Prediction with LSTM. In particular, the authors observe that

[...] even LSTM fails to learn to correctly process certain very long or continual time series that are not a priori segmented into appropriate training subsequences

and that

[...] cell states s_c often tend to grow linearly during the presentation of a time series [...]
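
In practice, the a-priori segmentation the authors mention corresponds to limiting how long the internal state is allowed to accumulate. One way to experiment with this in Keras is a stateful LSTM whose state is reset explicitly at each window boundary. The sketch below is my own illustration, not the asker's code: the architecture, window_size, and the synthetic stream are assumptions.

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Stateful variant: batch size 1, one timestep per call; the cell state
# is carried across calls until it is reset explicitly.
model = Sequential()
model.add(LSTM(35, batch_input_shape=(1, 1, 1), stateful=True))
model.add(Dense(1, activation='linear'))
model.compile(loss='mse', optimizer='rmsprop')

stream = np.sin(np.linspace(0, 50, 2000))  # stand-in for the real stream
window_size = 500
prev = None

for t, record in enumerate(stream, start=1):
    if prev is not None:
        x = np.array(prev).reshape(1, 1, 1)
        y = np.array(record).reshape(1, 1)
        y_hat = model.predict(x)    # online one-step-ahead prediction
        model.train_on_batch(x, y)  # online update on the same step
    prev = record
    if t % window_size == 0:
        model.reset_states()        # discard the accumulated cell state

Whether this helps here depends on whether the drift really comes from accumulating state; it at least makes it easy to compare runs with and without the reset.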

Upvotes: 0
