HereItIs

Reputation: 154

Why does my layer output not have the same dimensions as shown in my model summary?

I have managed to create a successful RNN that can predict the next letter in a sequence of letters. However, I can't work out why the fix for a problem I was having actually works.

My training data has dimensions (39000, 7, 7).

My model is as follows:

    from keras.models import Sequential
    from keras.layers import SimpleRNN, Flatten, Dense, Activation
    from keras import optimizers

    def build_model():  # function name assumed; the post only shows the body
        model = Sequential()
        model.add(SimpleRNN(7, input_shape=(7, 7), return_sequences=True))
        model.add(Flatten())
        model.add(Dense(7))
        model.add(Activation('softmax'))
        adam = optimizers.Adam(lr=0.001)
        model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
        model.summary()
        return model


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
simple_rnn_49 (SimpleRNN)    (None, 7, 7)              105       
_________________________________________________________________
flatten_14 (Flatten)         (None, 49)                0         
_________________________________________________________________
dense_49 (Dense)             (None, 7)                 350       
_________________________________________________________________
activation_40 (Activation)   (None, 7)                 0         
=================================================================
Total params: 455
Trainable params: 455
Non-trainable params: 0
_________________________________________________________________

This works perfectly. My question is: why do I need the Flatten layer? When I don't include it, I get this model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
simple_rnn_50 (SimpleRNN)    (None, 7, 7)              105       
_________________________________________________________________
dense_50 (Dense)             (None, 7, 7)              56        
_________________________________________________________________
activation_41 (Activation)   (None, 7, 7)              0         
=================================================================
Total params: 161
Trainable params: 161
Non-trainable params: 0
_________________________________________________________________

followed by this error:

ValueError: Error when checking target: expected activation_41 to have 3 dimensions, but got array with shape (39000, 7)

My question is: the model summary says the output of the Dense layer should be (None, 7, 7) in the second example, and the error message says the activation layer expects exactly such a 3D input, so why does the error report an array of shape (39000, 7)? I realise the Flatten() layer solves this by putting everything in 2D, but I'm confused as to why it doesn't work without it.

Upvotes: 0

Views: 800

Answers (1)

Anshul Rai

Reputation: 782

In your error message you can see that the error is raised when checking the target dimensions. Without the Flatten layer, your model's output has shape (None, 7, 7), which is exactly what your model summary shows. The issue is that your labels have shape (None, 7), so Keras raises a ValueError while validating the target shapes before training even starts: it expected labels of shape (None, 7, 7) to match the dimensions of your activation layer, but received (None, 7) instead.
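To make the mismatch concrete, here is a minimal sketch that builds the model without Flatten and triggers the same target check. The arrays X_dummy and y_dummy are placeholder stand-ins for your real training data, not taken from your post:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import SimpleRNN, Dense, Activation

    model = Sequential()
    model.add(SimpleRNN(7, input_shape=(7, 7), return_sequences=True))
    model.add(Dense(7))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')

    print(model.output_shape)   # (None, 7, 7): one 7-way output per timestep

    X_dummy = np.zeros((39000, 7, 7))  # stand-in for the (39000, 7, 7) inputs
    y_dummy = np.zeros((39000, 7))     # stand-in labels: one vector per sample, no time axis

    # Keras compares y_dummy's shape against the (None, 7, 7) model output here and
    # raises: "Error when checking target: expected activation_... to have 3
    # dimensions, but got array with shape (39000, 7)"
    model.fit(X_dummy, y_dummy, epochs=1, batch_size=128)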

That is why adding model.add(Flatten()) before the Dense layer works: it collapses the (None, 7, 7) recurrent output to (None, 49), so after the Dense layer both the target and the output have shape (None, 7).
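Here is the corresponding sketch with Flatten restored, again using placeholder arrays, showing that the output shape now lines up with the labels:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import SimpleRNN, Flatten, Dense, Activation

    model = Sequential()
    model.add(SimpleRNN(7, input_shape=(7, 7), return_sequences=True))
    model.add(Flatten())                      # (None, 7, 7) -> (None, 49)
    model.add(Dense(7))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')

    print(model.output_shape)                 # (None, 7): matches the (39000, 7) labels

    # Same placeholder arrays as above; the target check now passes and training runs.
    model.fit(np.zeros((39000, 7, 7)), np.zeros((39000, 7)),
              epochs=1, batch_size=128, verbose=0)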

Upvotes: 2
