Travelling Salesman

Reputation: 2271

How does Keras initialize decoder first state in Encoder Decoder LSTM?

My understanding is that in an encoder-decoder LSTM, the decoder's first state is the same as the encoder's final state (both hidden and cell states). But I don't see that written explicitly in the code below (taken from many Keras tutorials).

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

model = Sequential()
model.add(LSTM(units, input_shape=(n_input, n_features), dropout=rdo, activation=keras.layers.LeakyReLU(alpha=0.2)))
model.add(RepeatVector(1))
model.add(LSTM(units, activation=keras.layers.LeakyReLU(alpha=0.2), return_sequences=True, dropout=rdo))
model.add(TimeDistributed(Dense(100, activation=keras.layers.LeakyReLU(alpha=0.2))))
model.add(TimeDistributed(Dense(n_features)))

Is this passing of states done automatically, and if so, at which stage?

Update: I think my assumption is probably not correct, since this is a Sequential architecture, so only a single output vector is passed to the decoder layer. However, I am still wondering how not transferring the hidden and cell states from the encoder to the decoder can still work (by "work" I mean produce a reasonable prediction).
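For comparison, the explicit state hand-off the question assumes is usually written with the Keras functional API, using return_state on the encoder and initial_state on the decoder. A minimal sketch with untrained weights (the dimensions here are toy values of my own, not from the question):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_input, n_out, n_features, units = 6, 1, 3, 8

encoder_inputs = keras.Input(shape=(n_input, n_features))
# return_state=True exposes the encoder's final hidden and cell states.
_, state_h, state_c = layers.LSTM(units, return_state=True)(encoder_inputs)

decoder_inputs = keras.Input(shape=(n_out, n_features))
# initial_state seeds the decoder with the encoder's final states.
decoder_out = layers.LSTM(units, return_sequences=True)(
    decoder_inputs, initial_state=[state_h, state_c])
outputs = layers.TimeDistributed(layers.Dense(n_features))(decoder_out)

model = keras.Model([encoder_inputs, decoder_inputs], outputs)
pred = model.predict([np.zeros((2, n_input, n_features)),
                      np.zeros((2, n_out, n_features))], verbose=0)
print(pred.shape)  # (2, 1, 3)
```

In the Sequential model above there is no such hand-off: the decoder LSTM starts from zero states and receives the encoder's last hidden state only as its input, via RepeatVector.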

Upvotes: 0

Views: 95

Answers (1)

user11530462

Reputation:

The default value of return_sequences in the LSTM layer below is False. With that setting, the layer outputs only the hidden state of the last time step and discards the outputs of all earlier steps. That single hidden-state vector is what gets fed into the decoder.

model.add(LSTM(units, input_shape=(n_input, n_features),dropout=rdo, activation = keras.layers.LeakyReLU(alpha=0.2)))
 
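To make the shapes concrete: with return_sequences=False the encoder emits only its last hidden state, and RepeatVector(1) turns that vector back into a one-step input sequence for the decoder (whose own initial states default to zeros). A toy NumPy sketch of just the shapes, with random values standing in for the real recurrent computation:

```python
import numpy as np

# Stand-in for an LSTM's per-timestep hidden states: (timesteps, units).
units, n_input, n_out = 4, 6, 1
hidden_states = np.random.rand(n_input, units)

# return_sequences=False: only the last time step's hidden state is emitted.
encoder_output = hidden_states[-1]                   # shape: (units,)

# RepeatVector(n_out) tiles that vector into a length-n_out input sequence.
decoder_input = np.tile(encoder_output, (n_out, 1))  # shape: (n_out, units)

print(encoder_output.shape, decoder_input.shape)  # (4,) (1, 4)
```

So the decoder never receives the encoder's cell state; it only sees the final hidden state as input, which is often enough to produce reasonable predictions.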

Please refer to this article for a detailed explanation. Thank you!

Upvotes: 0
