Reputation: 2197
I was wondering how an LSTM works in Keras.
Let's take an example. I have a maximum sentence length of 3 words, for example: 'how are you'. I vectorize each word into a vector of length 4, so my input has shape (3, 4). Now I want to use an LSTM to do translation (just an example).
from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(1, input_shape=(3, 4), return_sequences=True))
model.summary()
According to Keras, I get an output shape of (3, 1):
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_16 (LSTM)               (None, 3, 1)              24
=================================================================
Total params: 24
Trainable params: 24
Non-trainable params: 0
_________________________________________________________________
And this is what I don't understand. With return_sequences=True (to get the output at every timestep), I expected the LSTM to give me an output of shape (timesteps, x), where timesteps is 3 in this case and x is the size of my word vectors (4 here).
So why do I get an output shape of (3, 1)? I searched everywhere, but I can't figure it out.
Upvotes: 0
Views: 161
Reputation: 11895
Your interpretation of what the LSTM should return is not correct. The output dimensionality doesn't need to match the input dimensionality. Concretely, the first argument of keras.layers.LSTM (units) is the dimensionality of the output space, and you're setting it to 1.
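As a sanity check, you can verify this from the parameter count in your summary. An LSTM layer has 4 gates, each with an input kernel of shape (input_dim, units), a recurrent kernel of shape (units, units), and a bias of shape (units,). With input_dim = 4 and units = 1 that gives 4 × (4·1 + 1·1 + 1) = 24 parameters, exactly the Param # Keras reports.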
More generally, setting:
model.add(LSTM(k, input_shape=(3,4), return_sequences=True))
will result in an output shape of (None, 3, k).
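Here's a minimal sketch you can run to check the shapes yourself (assuming a standard Keras install; tensorflow.keras behaves the same way):

from keras.models import Sequential
from keras.layers import LSTM

for k in (1, 4, 8):
    model = Sequential()
    model.add(LSTM(k, input_shape=(3, 4), return_sequences=True))
    # Output shape is (None, 3, k): (batch size, timesteps, units)
    print(k, model.output_shape)

If you drop return_sequences=True, the layer returns only the last timestep and the output shape becomes (None, k).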
Upvotes: 1