Sathiyakugan

Reputation: 684

Meaning of the hidden state in Keras LSTM

Since I am new to deep learning, this question may seem funny to you, but I couldn't visualize it in my mind. That's why I am asking.

I am giving a sentence to the LSTM as vectors. Suppose the sentence contains 10 words; I convert those words to vectors and feed them to the LSTM.

The length of the LSTM (the number of cells it unrolls over) should be 10. But in most of the tutorials I have seen, an LSTM with 128 hidden units is added. I couldn't understand or visualize that. What does an LSTM layer with a "128-dimensional hidden state" mean?

for example:

X = LSTM(128, return_sequences=True)(embeddings)

The summary of this looks like:

lstm_1 (LSTM)                (None, 10, 128)           91648    

Here it looks like 10 LSTM cells are added, but why are there 128 hidden states? I hope you understand what I am asking.

Upvotes: 3

Views: 1713

Answers (1)

Daniel GL

Reputation: 1249

Short answer: If you are more familiar with convolutional networks, you can think of the size of the LSTM layer (128) as the equivalent of the number of filters in a convolutional layer. The 10 only means that the size of your input (the length of your sequence) is 10.

Longer answer: You can check this article for a more detailed explanation of RNNs.

In the left image, an LSTM layer is represented with xt as the input and ht as the output. The feedback arrow shows that there is some kind of memory inside the cell.

In practice in Keras (right image), this model is "unrolled": the same cell (with the same weights) is applied to each element of the input sequence x0, ..., xt in turn, passing the hidden state forward from one step to the next.
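To make this unrolling concrete, here is a minimal NumPy sketch of one LSTM layer processing a length-10 sequence (this is an illustration, not from the original answer; the embedding size of 50 and the random weights are assumptions):

```python
import numpy as np

# Assumed dimensions: sequence length 10, embedding size 50, hidden size 128.
T, d, h = 10, 50, 128

rng = np.random.default_rng(0)
x = rng.standard_normal((T, d))          # one input vector per timestep

# One shared set of weights, reused at every timestep (the 4 gates stacked).
W = rng.standard_normal((d, 4 * h)) * 0.01   # input weights
U = rng.standard_normal((h, 4 * h)) * 0.01   # recurrent weights
b = np.zeros(4 * h)                          # biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

h_t = np.zeros(h)   # hidden state (the 128-dim output ht)
c_t = np.zeros(h)   # cell state (the internal "memory")

outputs = []
for t in range(T):                 # the "unrolled" loop over 10 timesteps
    z = x[t] @ W + h_t @ U + b
    i, f, o, g = np.split(z, 4)    # input, forget, output gates + candidate
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c_t = f * c_t + i * g
    h_t = o * np.tanh(c_t)
    outputs.append(h_t)

outputs = np.stack(outputs)
print(outputs.shape)   # (10, 128): one 128-dim ht per timestep
```

With return_sequences=True, Keras returns all 10 of these 128-dim vectors, which is exactly the (None, 10, 128) shape in the summary; with return_sequences=False it would return only the last one.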

So when your summary says lstm_1 (LSTM) (None, 10, 128) 91648, it means that your input sequence has length 10 (x0, x1, x2, ..., x9), and that the size of your LSTM is 128 (128 is the dimension of each output ht).
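The 91648 parameter count can also be checked by hand: an LSTM has 4 gates, each with an input weight matrix, a recurrent weight matrix, and a bias. A quick sketch (the embedding dimension of 50 is an assumption that happens to reproduce the number in the question's summary):

```python
h = 128  # hidden size, from LSTM(128)
d = 50   # embedding dimension -- assumed; it is what makes the count 91648

# 4 gates, each with: input weights (d x h), recurrent weights (h x h),
# and a bias vector (h).
params = 4 * (d * h + h * h + h)
print(params)  # 91648
```

Note that the sequence length 10 does not appear in this formula: the same weights are reused at every timestep, which is why the parameter count depends only on the input and hidden dimensions.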

[Image: rolled (left) vs. unrolled (right) LSTM diagram]

Upvotes: 5
