gust

Reputation: 945

Setting initial state of LSTM layer

I have the following code:

units = 1024
lstm_layer = tf.keras.layers.LSTM(units)

dim = tf.zeros([64,1024])

output, hidden = lstm_layer(embedded_data, initial_state = dim)

I get the following error message:

ValueError: An `initial_state` was passed that is not compatible with
`cell.state_size`. Received `state_spec`=ListWrapper([InputSpec(shape=(64, 1024), ndim=2)]);
however `cell.state_size` is [1024, 1024]

When I do it with a GRU cell instead of an LSTM cell, it works fine. But for an LSTM cell, this code does not work. I realize that an LSTM carries two state tensors, hence the error asking for a state size of [1024, 1024], but I don't know how to set the initial state. I tried

initial_state = [dim, dim] 

and that doesn't work either, as it gives me

ValueError: too many values to unpack (expected 2).

I referenced LSTM Initial state from Dense layer, but it doesn't seem to resolve my issue...

Upvotes: 5

Views: 3228

Answers (1)

gust

Reputation: 945

For future reference, if anybody runs into this problem:

The problem is the unpacking, not the initial state itself. An LSTM keeps two state tensors (the hidden state h and the cell state c), so initial_state must be a list of two tensors, and tf.zeros([64, 1024]) works fine for each. Since the "too many values to unpack (expected 2)" error shows the layer is returning its states (return_state=True), the call returns three tensors, so you need to unpack three outputs:

output, hidden_h, hidden_c = lstm_layer(embedded_data, initial_state=[dim, dim])
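A complete, runnable sketch of the fix, assuming the layer is built with return_state=True (I use smaller shapes than the question's 64x1024 so it runs quickly; the tensor names are illustrative):

```python
import tensorflow as tf

batch, timesteps, features, units = 4, 5, 8, 16

# return_state=True makes the layer return (output, state_h, state_c).
lstm_layer = tf.keras.layers.LSTM(units, return_state=True)

# An LSTM cell carries two state tensors: hidden state h and cell state c,
# so initial_state must be a list of two (batch, units) tensors.
init_h = tf.zeros([batch, units])
init_c = tf.zeros([batch, units])

inputs = tf.random.normal([batch, timesteps, features])

# Unpack three values: the output plus both final states.
output, state_h, state_c = lstm_layer(inputs, initial_state=[init_h, init_c])

print(output.shape)   # (4, 16)
print(state_h.shape)  # (4, 16)
print(state_c.shape)  # (4, 16)
```

With a GRU, which has only one state tensor, initial_state=init_h alone would suffice, which is why the original code worked there.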

Upvotes: 2
