Reputation: 12908
I am trying to train an LSTM in Keras with the TensorFlow backend on toy data, and I am getting this error:
ValueError: Error when checking target: expected dense_39 to have 2 dimensions, but got array with shape (996, 1, 1)
The error occurs immediately upon calling model.fit; nothing seems to run. It seems to me that Keras is checking dimensions but ignoring the fact that it should be taking a batch of my target along with each batch of my input. The error shows the full dimensions of my target array, which implies to me that it is never split into batches by Keras, at least while the dimensions are being checked. For the life of me, I can't figure out why this would be, or anything else that might help.
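For concreteness, this is the mismatch as I understand it (plain NumPy, just to illustrate the shapes; not Keras internals):
import numpy as np

target_rolling = np.zeros((996, 1, 1))  # my full target array: 3-D
# dense_39 outputs (8, 1) per batch, i.e. 2-D, so Keras wants a 2-D target
# of shape (n_samples, 1); the check runs on the whole array, before batching.
print(target_rolling.ndim)  # 3, but the model's output has 2 dimensions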
My network definition with expected layer output shapes in comments:
import numpy as np
from keras.layers import Input, Dense, LSTM, TimeDistributed
from keras.models import Model
from keras.optimizers import Nadam

batch_shape = (8, 5, 1)
x_in = Input(batch_shape=batch_shape, name='input')            # (8, 5, 1)
seq1 = LSTM(8, return_sequences=True, stateful=True)(x_in)     # (8, 5, 8)
dense1 = TimeDistributed(Dense(8))(seq1)                       # (8, 5, 8)
seq2 = LSTM(8, return_sequences=False, stateful=True)(dense1)  # (8, 8)
dense2 = Dense(8)(seq2)                                        # (8, 8)
out = Dense(1)(dense2)                                         # (8, 1)
model = Model(inputs=x_in, outputs=out)
optimizer = Nadam()
model.compile(optimizer=optimizer, loss='mean_squared_error')
model.summary()
The model summary, shapes as expected:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           (8, 5, 1)                 0
_________________________________________________________________
lstm_28 (LSTM)               (8, 5, 8)                 320
_________________________________________________________________
time_distributed_18 (TimeDis (8, 5, 8)                 72
_________________________________________________________________
lstm_29 (LSTM)               (8, 8)                    544
_________________________________________________________________
dense_38 (Dense)             (8, 8)                    72
_________________________________________________________________
dense_39 (Dense)             (8, 1)                    9
=================================================================
Total params: 1,017
Trainable params: 1,017
Non-trainable params: 0
_________________________________________________________________
My toy data: the target is just a line decreasing from 100 to 0, and the input is just an array of zeros. I want to do one-step-ahead prediction, so I create rolling windows of my input and target using the rolling_window() method defined below:
target = np.linspace(100, 0, num=1000)
target_rolling = rolling_window(target[4:], 1)[:, :, None]
target_rolling.shape # (996, 1, 1) <-- this seems to be the array that's causing the error
x_train = np.zeros((1000,))
x_train_rolling = rolling_window(x_train, 5)[:, :, None]
x_train_rolling.shape # (996, 5, 1)
The rolling_window() method:
def rolling_window(arr, window):
    # Build overlapping windows over the last axis as a strided view (no copy).
    shape = arr.shape[:-1] + (arr.shape[-1] - window + 1, window)
    strides = arr.strides + (arr.strides[-1],)
    return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
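For example, a quick sanity check of what rolling_window() returns on a small array:
a = np.arange(6)      # array([0, 1, 2, 3, 4, 5])
rolling_window(a, 3)
# array([[0, 1, 2],
#        [1, 2, 3],
#        [2, 3, 4],
#        [3, 4, 5]])   # shape (4, 3): one row per window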
And my training loop:
from keras.callbacks import LambdaCallback

reset_state = LambdaCallback(on_epoch_end=lambda epoch, logs: model.reset_states())
callbacks = [reset_state]
history = model.fit(x_train_rolling, target_rolling,
                    batch_size=8,
                    epochs=100,
                    validation_split=0.,
                    callbacks=callbacks)
I have tried:
- return_sequences=True in the second LSTM with a Flatten layer after it. Same error.
- return_sequences=True without a Flatten layer. This gives a different error, because Keras then expects a target with the same shape as the output, which at that point is (batch_size, 5, 1) rather than (batch_size, 1, 1) (see the sketch after this list).
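For reference, here is roughly the sequence target I believe the return_sequences=True variant would have needed (my assumption, reusing the same rolling_window() helper; the exact alignment with the input windows may need adjusting):
target_seq_rolling = rolling_window(target, 5)[:, :, None]
target_seq_rolling.shape  # (996, 5, 1) -- matches the (batch_size, 5, 1) output shape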
Note that none of the related questions I found seem to directly answer mine, although I was really hopeful about a couple of them.
Upvotes: 1
Views: 363
Reputation: 320
Posting the solution I wrote in the comments: since there is an extra dimension, the -1 makes that dimension adjust itself to whatever size it needs to be to fit the other dimensions. Since only two dimensions are given, (-1, 1) turns the shape into (996, 1). Add
target_rolling = target_rolling.reshape(-1, 1)
right after the line where target_rolling.shape is (996, 1, 1).
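A minimal sketch of the fix in context, using the variable names from the question:
target_rolling = rolling_window(target[4:], 1)[:, :, None]  # (996, 1, 1)
target_rolling = target_rolling.reshape(-1, 1)              # (996, 1): 2-D, as dense_39 expects
history = model.fit(x_train_rolling, target_rolling,
                    batch_size=8,
                    epochs=100,
                    callbacks=callbacks)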
Upvotes: 1