Reputation: 11
I have a script that builds an LSTM model, fits it to training data, and predicts on some test data. (Just for fun I also plot predictions on the training data, since they should be close to the training targets; it's a quick sanity check that the model is constructed properly.)
1) The first problem is that the predictions on the test and training data are totally different, depending on whether I predict on the training or the test data first.
2) The second problem might be related to the first one: every time I run my script, the predictions on the test data are totally different. I know neural networks involve some randomness, but as you can see in my resulting plots, the difference is far beyond that:
Edit 1: I tried setting 'stateful=False' as suggested in the comments, without success.
Edit 2: I've updated the script and the plots and provided some basic sine-wave sample data in the new code. The problems still exist even in that simple example.
Resulting plots of predictions with stateful=False
I have an input signal X as a sine wave with 100 time steps and random amplitude and frequency. My target y correlates with X (in every time step) and is, in this case, also a sine wave. The shapes of my data are:
X_train.shape = (100, 1, 1)
y_train.shape = (100,)
X_test.shape = (100, 1, 1)
y_test.shape = (100,)
I'm using an LSTM network to fit a complete sine wave, so the training batch size is 100, and I predict every single point of the test signal, so the batch size for prediction is 1. I also manually reset the state of the LSTM after every epoch, as described here: https://machinelearningmastery.com/use-different-batch-sizes-training-predicting-python-keras/
For building my network I followed the "Keras rules" mentioned here: Delayed echo of sin - cannot reproduce Tensorflow result in Keras
I know the basic approaches to solving such problems, like the ones suggested here: Wrong predictions with LSTM Neural Network, but nothing has worked for me.
I'm grateful for any kind of help with this, and also for advice on asking better questions in case I did something wrong, because this is my first post here on Stack Overflow.
Thank you all! Here is my code example:
import numpy as np
import matplotlib.pyplot as plt
from keras import models, layers, optimizers
from keras.callbacks import Callback
# create training sample data
Fs = 100 # sample rate
z = np.arange(100)
f = 1 # frequency in Hz
X_train = np.sin(2 * np.pi * f * z / Fs)
y_train = 0.1 * np.sin(2 * np.pi * f * z / Fs)
# create test sample data
f = 1 # frequency in Hz
X_test = np.sin(2 * np.pi * f * z / Fs) * 2
y_test = 0.2 * np.sin(2 * np.pi * f * z / Fs)
# convert data into LSTM compatible format
y_train = np.array(y_train)
y_test = np.array(y_test)
X_train = X_train.reshape(X_train.shape[0], 1, 1)
X_test = X_test.reshape(X_test.shape[0], 1, 1)
# build and compile model
model = models.Sequential()
model.add(layers.LSTM(1, batch_input_shape=(len(X_train), X_train.shape[1], X_train.shape[2]),
return_sequences=False, stateful=False))
model.add(layers.Dense(X_train.shape[1], input_shape=(1,), activation='linear'))
model.compile(optimizer=optimizers.Adam(lr=0.01, decay=0.008, amsgrad=True), loss='mean_squared_error', metrics=['mae'])
# Keras callback to make sure the LSTM cell state is reset after each epoch
class ResetStatesAfterEachEpoch(Callback):
    def on_epoch_end(self, epoch, logs=None):
        self.model.reset_states()
reset_state = ResetStatesAfterEachEpoch()
callbacks = [reset_state]
# fit model to training data
history = model.fit(X_train, y_train, epochs=20000, batch_size=len(X_train),
shuffle=False, callbacks=callbacks)
# re-define the LSTM model with the weights of the fitted model to predict single points, so the batch size is also re-defined to 1
new_batch_size = 1
new_model = models.Sequential()
new_model.add(layers.LSTM(1, batch_input_shape=(new_batch_size, X_test.shape[1], X_test.shape[2]), return_sequences=False,
stateful=False))
new_model.add(layers.Dense(X_test.shape[1], input_shape=(1,), activation='linear'))
# copy weights to new model
old_weights = model.get_weights()
new_model.set_weights(old_weights)
# single point prediction on train data
y_pred_train = new_model.predict(X_train, batch_size=new_batch_size)
# single point prediction on test data
y_pred_test = new_model.predict(X_test, batch_size=new_batch_size)
# plot predictions
plt.figure()
plt.plot(y_test, 'r', label='ground truth test',
linestyle='dashed', linewidth=0.8)
plt.plot(y_train, 'b', label='ground truth train',
linestyle='dashed', linewidth=0.8)
plt.plot(y_pred_test, 'g',
label='y pred test', linestyle='dotted',
linewidth=0.8)
plt.plot(y_pred_train, 'k',
label='y pred train', linestyle='-.',
linewidth=0.8)
plt.title('pred order: test, train')
plt.xlabel('time steps')
plt.ylabel('y')
plt.legend(prop={'size': 8})
plt.show()
Upvotes: 1
Views: 1431
Reputation: 11
So I found a solution. I don't know why it works (I'd appreciate a comment if someone does), but it works.
I added the derivative of X_train (here a cosine), so I have a multi-input LSTM with 2 features. The final X_train is built as in this code:
x = np.sin(2 * np.pi * f * z / Fs)
dx_dt = np.cos(2 * np.pi * f * z / Fs)
X_train = np.column_stack((x, dx_dt))
Even a time-shifted target like y_train = 5 * np.sin(2 * np.pi * f * (z + 51) / Fs)
was predicted quite well after training for 3000 epochs, with one LSTM layer of 3 neurons.
This is the resulting plot.
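For reference, here is a minimal sketch of how that 2-feature input could be reshaped and trained. The layer size (3 neurons), epoch count (3000) and the sample data are taken from the description above; everything else (optimizer settings, output layer) is my assumption, not the exact script I used:
# minimal sketch, assuming the same sine-wave sample data as in the question
import numpy as np
from keras import models, layers, optimizers

Fs = 100
z = np.arange(100)
f = 1
x = np.sin(2 * np.pi * f * z / Fs)
dx_dt = np.cos(2 * np.pi * f * z / Fs)
X_train = np.column_stack((x, dx_dt))               # shape (100, 2): signal + derivative
X_train = X_train.reshape(X_train.shape[0], 1, 2)   # (samples, time steps, features)
y_train = 5 * np.sin(2 * np.pi * f * (z + 51) / Fs) # time-shifted, scaled target

model = models.Sequential()
model.add(layers.LSTM(3, input_shape=(1, 2)))       # 1 LSTM layer, 3 neurons, 2 input features
model.add(layers.Dense(1, activation='linear'))
model.compile(optimizer=optimizers.Adam(lr=0.01), loss='mean_squared_error')
model.fit(X_train, y_train, epochs=3000, batch_size=len(X_train), shuffle=False)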
Upvotes: 0
Reputation: 56367
The problem is here:
model.add(layers.LSTM(1, batch_input_shape=(len(X_train), X_train.shape[1], X_train.shape[2]),
return_sequences=False, stateful=True))
You set stateful=True in the LSTM layer, which means that the hidden state is not reset after each prediction, which explains the effect you are seeing. If you do not want this behavior, you should set it to its default value of stateful=False, and it will work as a standard non-stateful LSTM.
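To illustrate, a minimal sketch (not the asker's exact script): with stateful=False the two predict calls below give the same results regardless of their order, because the state is reset automatically after each batch; with stateful=True you would have to call model.reset_states() between them to get the same effect.
# sketch: order-independent predictions with a non-stateful LSTM
from keras import models, layers

model = models.Sequential()
model.add(layers.LSTM(1, input_shape=(1, 1), stateful=False))  # default: state is reset per batch
model.add(layers.Dense(1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')

# ... after fitting, these calls are independent of each other:
# y_pred_test = model.predict(X_test, batch_size=1)
# y_pred_train = model.predict(X_train, batch_size=1)
# with stateful=True you would need model.reset_states() between the two calls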
Upvotes: 1