Reputation: 132
Since I am fairly new to this field, I tried following the official TensorFlow tutorial on time series forecasting: https://www.tensorflow.org/tutorials/structured_data/time_series
The following problem occurs: when training the multivariate model, the kernel dies and restarts after 2 or 3 epochs.
However, this doesn't happen with the simpler univariate model, which has only one LSTM layer (I am not sure whether that makes a difference).
Also, this problem only started today. Yesterday the multivariate model trained without any errors.
As in the tutorial linked above, the model looks like this:
multi_step_model = tf.keras.models.Sequential()
multi_step_model.add(tf.keras.layers.LSTM(32, return_sequences=True, input_shape=x_train_multi.shape[-2:]))
multi_step_model.add(tf.keras.layers.LSTM(16, activation='relu'))
multi_step_model.add(tf.keras.layers.Dense(72))
multi_step_model.compile(optimizer=tf.keras.optimizers.RMSprop(clipvalue=1.0), loss='mae')
The kernel dies while executing the following cell (usually after 2 or 3 epochs):
multi_step_history = multi_step_model.fit(train_data_multi, epochs=10,
                                          steps_per_epoch=300,
                                          validation_data=val_data_multi,
                                          validation_steps=50)
I have uninstalled and reinstalled TensorFlow and restarted my laptop, but nothing helps.
Any ideas?
OS: Windows 10 (Surface Book 1)
Upvotes: 0
Views: 1527
Reputation: 132
The problem was that the batch size was too large. Reducing it from 1024 to 256 stopped the crashes.
Solution taken from rbwendt's comment on this thread on GitHub.
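For reference, here is a minimal sketch of where the batch size would be changed, assuming the dataset is built with tf.data as in the tutorial. The synthetic arrays, their shapes, and the BUFFER_SIZE value are placeholders I made up for illustration; only the 1024 to 256 change comes from the fix above.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-ins for x_train_multi / y_train_multi.
# The shapes (120 time steps, 3 features, 72 targets) are assumptions.
x_train_multi = np.random.rand(2000, 120, 3).astype("float32")
y_train_multi = np.random.rand(2000, 72).astype("float32")

BATCH_SIZE = 256    # reduced from 1024, which crashed the kernel
BUFFER_SIZE = 1000  # shuffle buffer size; arbitrary for this sketch

# Build the training pipeline; the batch size is applied in .batch().
train_data_multi = (
    tf.data.Dataset.from_tensor_slices((x_train_multi, y_train_multi))
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE)
    .repeat()
)

# Pull one batch to confirm the leading dimension matches BATCH_SIZE.
x_batch, y_batch = next(iter(train_data_multi))
print(x_batch.shape, y_batch.shape)
```

If the crash comes from memory pressure (my guess, not something confirmed in the thread), a smaller batch lowers the amount of data and activations held at once, at the cost of each step covering fewer samples.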
Upvotes: 0