Reputation: 447
I have created a video dataset in which each video has dimensions 5 (frames) x 32 (width) x 32 (height) x 4 (channels). I'm trying to classify these videos (binary classification) using a CNN-LSTM network, but I'm confused about the input shape and how I should reshape my dataset to train the network.
model = Sequential()
model.add(TimeDistributed(Conv2D(64, 5, activation='relu', padding='same', name='conv1', input_shape=??)))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same', name='pool1')))
model.add(TimeDistributed(Conv2D(64, 5, activation='relu', padding='same', name='conv2')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same', name='pool2')))
model.add(TimeDistributed(Conv2D(64, 5, activation='relu', padding='same', name='conv3')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same', name='pool3')))
model.add(TimeDistributed(Conv2D(64, 5, activation='relu', padding='same', name='conv4')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same', name='pool4')))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(256, return_sequences=False, dropout=0.5))
model.add(Dense(1, activation='sigmoid'))
Am I missing anything in the model?
Upvotes: 3
Views: 1894
Reputation: 11333
Your input should have the shape (batch_size, time steps, height, width, channels), so it should be a 5-dimensional tensor.
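As a rough illustration of that layout, here is a minimal NumPy sketch. The sample count of 100, the random data, and the idea that the frames might have been stored flat (one row per frame, in video order) are all assumptions made for the example, not details from the question.

```python
import numpy as np

# Hypothetical dataset: 100 videos (assumed count), each with
# 5 frames of 32x32 pixels and 4 channels.
num_videos = 100
videos = np.random.rand(num_videos, 5, 32, 32, 4).astype("float32")
labels = np.random.randint(0, 2, size=(num_videos,))  # one binary label per video

# Assumption: if the frames were stored flat, one row per frame in video order,
# reshape groups every 5 consecutive frames back into one video.
flat_frames = videos.reshape(-1, 32, 32, 4)        # (500, 32, 32, 4)
restored = flat_frames.reshape(-1, 5, 32, 32, 4)   # (100, 5, 32, 32, 4)

print(videos.shape)    # (100, 5, 32, 32, 4)
print(restored.shape)  # (100, 5, 32, 32, 4)
```

An array shaped like `videos` can be passed to `model.fit` together with `labels`; the first axis is the batch dimension that Keras slices into mini-batches.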
Also, your input_shape argument should be passed to the TimeDistributed layer, not the Conv2D layer, because TimeDistributed is the first layer. Note that input_shape leaves out the batch dimension. Here is what the first layer would look like for your data:
model.add(TimeDistributed(Conv2D(64, 5, activation='relu', padding='same', name='conv1'), input_shape=(5, 32, 32, 4)))
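To check what the LSTM ends up receiving, the shapes can be traced by hand. This is only a sketch in plain Python, with no Keras involved. The helper `same_pool` is a made-up name for illustration, and the sketch assumes the layers from the question: stride-1 convolutions with padding='same', which keep height and width, and 2x2 pooling with stride 2 and padding='same', which gives ceil(size / 2).

```python
import math

def same_pool(size, stride=2):
    """Output size of a pooling layer with padding='same' (hypothetical helper)."""
    return math.ceil(size / stride)

frames, height, width, channels = 5, 32, 32, 4
filters = 64  # every Conv2D in the question uses 64 filters

for _ in range(4):
    # Conv2D with padding='same' and stride 1 leaves height and width unchanged,
    # so only the pooling layers shrink the frame: 32 -> 16 -> 8 -> 4 -> 2.
    height, width = same_pool(height), same_pool(width)

# TimeDistributed(Flatten()) flattens each frame separately.
features_per_frame = height * width * filters

print((height, width))               # (2, 2)
print((frames, features_per_frame))  # (5, 256)
```

So, per video, the LSTM sees a sequence of 5 steps with 256 features each. With return_sequences=False it returns one 256-dimensional vector per video, which the final Dense layer maps to a single sigmoid output.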
Upvotes: 1