Reputation: 2271
My understanding is that in an encoder-decoder LSTM, the decoder's initial state is the same as the encoder's final state (both the hidden state and the cell state). But I don't see that written explicitly anywhere in the code below (taken from several Keras tutorials).
from tensorflow import keras
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense
model = keras.models.Sequential()
# Encoder: return_sequences defaults to False, so only the last hidden state is emitted
model.add(LSTM(units, input_shape=(n_input, n_features), dropout=rdo, activation=keras.layers.LeakyReLU(alpha=0.2)))
model.add(RepeatVector(1))  # repeat that single vector once per decoder time step
# Decoder: return_sequences=True emits one output per time step
model.add(LSTM(units, activation=keras.layers.LeakyReLU(alpha=0.2), return_sequences=True, dropout=rdo))
model.add(TimeDistributed(Dense(100, activation=keras.layers.LeakyReLU(alpha=0.2))))
model.add(TimeDistributed(Dense(n_features)))
Is this passing of states done automatically, and if so, at which stage?
Update: I think my assumption is probably not correct, since this is a Keras Sequential model, so only a single output tensor is passed from one layer to the next. However, I am still wondering how the model works reasonably well without transferring the hidden state and cell state from the encoder to the decoder (by "work" I mean produce a reasonable prediction). For reference, the explicit handoff I had in mind is sketched below.
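Here is what an explicit state handoff would look like with the Keras functional API (a minimal sketch; I am reusing units, n_input, and n_features from above, and the variable names are my own):
from tensorflow import keras
from tensorflow.keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense

encoder_inputs = Input(shape=(n_input, n_features))
# return_state=True also returns the encoder's final hidden and cell states
encoder_outputs, state_h, state_c = LSTM(units, return_state=True)(encoder_inputs)

# Seed the decoder with the encoder's final states via initial_state
decoder_inputs = RepeatVector(1)(encoder_outputs)
decoder_outputs = LSTM(units, return_sequences=True)(decoder_inputs, initial_state=[state_h, state_c])
outputs = TimeDistributed(Dense(n_features))(decoder_outputs)
model = keras.Model(encoder_inputs, outputs)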
Upvotes: 0
Views: 95
Reputation:
The default value for return_sequences in the LSTM layer below is False. Hence, this layer outputs only the hidden state of the last time step and discards the outputs of all the earlier time steps. That single vector is then fed into the decoder as its input (via RepeatVector); the hidden and cell states themselves are not transferred.
model.add(LSTM(units, input_shape=(n_input, n_features), dropout=rdo, activation=keras.layers.LeakyReLU(alpha=0.2)))
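To make the shape difference concrete, here is a minimal sketch (assuming TensorFlow 2.x; the sizes are illustrative, not from the question):
import numpy as np
from tensorflow.keras.layers import LSTM

x = np.random.rand(2, 5, 3).astype("float32")  # (batch, time steps, features)
# return_sequences=False (default): only the last hidden state, shape (2, 4)
print(LSTM(4)(x).shape)
# return_sequences=True: one output per time step, shape (2, 5, 4)
print(LSTM(4, return_sequences=True)(x).shape)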
Please refer to this article for a detailed explanation. Thank you!
Upvotes: 0