Reputation: 125
I am attempting to implement image sequence prediction in Python 3.5.2 | Anaconda 4.2.0 (64-bit) on my Windows 10 machine. I have the latest versions of Keras and TensorFlow.
Each image is 160x128. My training set is 1008 images, size 1008x160x128x1. I want to do a simple network with one convolutional layer and one LSTM layer, for now, where each image is convolved to extract features and then fed into the LSTM to learn the time-dependencies. The output should be k (in the case below k=1) predicted images, size 160x128. The code is below as well as the model.summary().
The output of my convolution layer is 4 dimensional (None, 79, 63, 32). So I reshape the output so it is (None, 32, 79*63) and is the right number of dimensions for the LSTM layer (though I thought this is taken care of behind the scenes...). The model then compiles without error (if I did not do reshape then a dimension error is thrown).
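As a sanity check on these shapes, the arithmetic can be reproduced in plain Python. This is only a sketch: `conv_valid` and `pool` are hypothetical helper names (not Keras functions), and the LSTM parameter formula assumes the standard four-gate LSTM with a bias term.

```python
def conv_valid(size, kernel, stride=1):
    """Output length along one axis for a convolution with padding='valid'."""
    return (size - kernel) // stride + 1


def pool(size, pool_size):
    """Output length along one axis for pooling with stride == pool_size."""
    return size // pool_size


n, m, filters = 160, 128, 32

# Conv2D with a 3x3 kernel, then MaxPooling2D with a 2x2 window
h = pool(conv_valid(n, 3), 2)
w = pool(conv_valid(m, 3), 2)

# Reshape((32, -1)) turns (h, w, 32) into 32 "time steps" of h * w features
features = h * w

# LSTM weights: 4 gates, each with input, recurrent and bias parameters
lstm_units = 20
lstm_params = 4 * lstm_units * (features + lstm_units + 1)

print((h, w, filters), features, lstm_params)
```

These numbers agree with the (None, 79, 63, 32), (None, 32, 4977) and 399840 entries in the model summary further down.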
Because each element of my training data is only 1 time point per sample, I do not use TimeDistributed on the convolution layer (after much research, it seems this is the solution). However, I believe for the output layer, all samples come together, so there are as many time points as there are samples and TimeDistributed is to be used. If I do this, then I get the following error:
Traceback (most recent call last):
  File "C:\seqn_pred\read_images_dataset.py", line 104, in <module>
    model.fit(train_x, train_y, epochs = 10, batch_size = 1, verbose = 1)
  File "c:\users\l\anaconda3\lib\site-packages\keras\engine\training.py", line 950, in fit
    batch_size=batch_size)
  File "c:\users\l\anaconda3\lib\site-packages\keras\engine\training.py", line 787, in _standardize_user_data
    exception_prefix='target')
  File "c:\users\l\anaconda3\lib\site-packages\keras\engine\training_utils.py", line 137, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected time_distributed_62 to have shape (32, 1) but got array with shape (128, 160)
I have searched the relevant posts on Stack Overflow and tried the suggested "solutions" with no success. When I set units = 160*128, there is again a shape mismatch: (32, 160*128) versus (128, 160). I also tried reshaping the target data to 1008x(160*128)x1 (flattening each target, since TimeDistributed requires 3-d data), which produces yet another error:
ValueError: Error when checking target: expected time_distributed_64 to have shape (32, 20480) but got array with shape (20480, 1)
I have also attempted to run the last layer without TimeDistributed, and I still receive an error about the target shape:
ValueError: Error when checking target: expected dense_1 to have shape (32, 1) but got array with shape (160, 128)
The primary issue is with shape/dimension both between the convolution and LSTM layer as well as for the final dense layer. Any help would be much appreciated.
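# Imports assumed by the code below (krs is taken to be an alias for keras)
import numpy as np
import keras as krs
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Reshape, LSTM, TimeDistributed, Dense
train_x, test_x = [D2[i] for i in rand_indx], [D2[i] for i in range(N-1) if i not in rand_indx]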
train_y, test_y = [D2[i+1] for i in rand_indx], [D2[i+1] for i in range(N-1) if i not in rand_indx]
train_x = np.array(train_x)
train_x = train_x.reshape(len(train_x), n, m,1)
train_y = np.array(train_y)
train_y = train_y.reshape(train_y.shape[0], train_y.shape[1]*train_y.shape[2], 1)
model = Sequential()
#model.add(TimeDistributed(Conv2D(filters = 32, kernel_size = (3,3), strides = (1,1), activation = 'relu', padding = 'valid', input_shape = (1, n, m, 1))))
#model.add(TimeDistributed(MaxPooling2D(pool_size = (3,3))))
#model.add(TimeDistributed(Dropout(0.30)))
#model.add(TimeDistributed(Flatten()))
model.add(Conv2D(filters = 32, kernel_size = (3,3), strides = (1,1), activation = 'relu', padding = 'valid', input_shape = (n, m, 1)))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Dropout(0.30))
model.add(Reshape((32,-1)))
model.add(LSTM(units = 20, activation = 'relu', return_sequences = True))
model.add(Dropout(0.1))
model.add(TimeDistributed(Dense(1, activation = 'relu')))
optim = krs.optimizers.Adam(lr = 0.375)
model.compile(loss = 'mse', optimizer = optim)
model.fit(train_x, train_y, epochs = 10, batch_size = 1, verbose = 1)
model.summary()
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_73 (Conv2D)           (None, 158, 126, 32)      320
_________________________________________________________________
max_pooling2d_53 (MaxPooling (None, 79, 63, 32)        0
_________________________________________________________________
dropout_93 (Dropout)         (None, 79, 63, 32)        0
_________________________________________________________________
reshape_13 (Reshape)         (None, 32, 4977)          0
_________________________________________________________________
lstm_57 (LSTM)               (None, 32, 20)            399840
_________________________________________________________________
dropout_94 (Dropout)         (None, 32, 20)            0
_________________________________________________________________
dense_44 (Dense)             (None, 32, 1)             21
=================================================================
Total params: 400,181
Trainable params: 400,181
Non-trainable params: 0
_________________________________________________________________
Upvotes: 0
Views: 539
Reputation: 11333
I'm a bit perplexed on what you're trying to achieve here. Here's my 2 cents.
Input : (1008, 160, 128, 1)
Output: (1008, 160*128)
If you have a single output target, you should not use return_sequences=True in the LSTM layer, and there is no need for a TimeDistributed layer. The last part of the model needs to change as follows.
model.add(Reshape((32,-1)))
model.add(LSTM(units = 20, activation = 'relu'))
model.add(Dropout(0.1))
model.add(Dense(160*128, activation = 'relu'))
If you make the above changes, you can train the model with data having the above shapes for inputs and outputs.
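For the target side, a minimal NumPy sketch of the flattening this implies, assuming the targets start out as a (samples, 160, 128) array. The array here is random stand-in data, and 8 samples are used instead of 1008 only to keep the demo light.

```python
import numpy as np

num_samples, n, m = 8, 160, 128  # 8 stands in for the 1008 real samples

# Stand-in targets: one 160x128 image per sample
train_y = np.random.rand(num_samples, n, m).astype("float32")

# Flatten each image so the targets match a Dense(160*128) output layer
train_y_flat = train_y.reshape(num_samples, n * m)
print(train_y_flat.shape)

# Flat predictions can be turned back into images the same way;
# here a copy of the targets plays the role of the model's predictions
pred_flat = train_y_flat.copy()
pred_images = pred_flat.reshape(-1, n, m)
print(pred_images.shape)
```

Note there is no trailing axis of size 1 on the flattened targets: the shape is (samples, 160*128), matching the output shape listed above.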
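But there's a red flag you might want to consider. The Conv2D output is channels-last, (None, 79, 63, 32), and Reshape((32,-1)) does not move the channel axis to the front. It only re-chunks the flattened values, so each of the 32 rows ends up mixing pixels from all channels. If you want each row to hold one feature map, permute the axes before reshaping: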
model.add(Permute([3,1,2]))
model.add(Dropout(0.30))
model.add(Reshape((32,-1)))
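To see why the Permute matters, here is a NumPy-only sketch (not Keras code) that uses `np.transpose` as a stand-in for the Permute layer. The toy tensor stores the channel index as the value of every element, so it is easy to check which channels land in each row after reshaping.

```python
import numpy as np

h, w, channels = 79, 63, 32

# Channels-last tensor where every element holds its own channel index
x = np.broadcast_to(np.arange(channels), (h, w, channels)).copy()

# Reshape alone: rows are just consecutive chunks of the flattened data
plain = x.reshape(channels, -1)
print(np.unique(plain[0]).size)   # number of distinct channels found in row 0

# Move channels to the front first, then reshape: one feature map per row
permuted = np.transpose(x, (2, 0, 1)).reshape(channels, -1)
print(np.unique(permuted[0]).size)
```

Without the transpose, row 0 contains values from every channel; with it, each row contains exactly one channel's 79*63 values.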
Upvotes: 1