In a Keras example on LSTM for modeling IMDB sequence data (https://github.com/fchollet/keras/blob/master/examples/imdb_lstm.py), there is an Embedding layer before the input goes into an LSTM layer:
model.add(Embedding(max_features, 128))  # max_features = 20000
model.add(LSTM(128))
What does the embedding layer really do? In this case, does that mean the length of the input sequence into the LSTM layer is 128? If so, can I write the LSTM layer as:
model.add(LSTM(128, input_shape=(128, 1)))
But it is also noted that the input X_train has been subjected to pad_sequences processing:
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)  # maxlen = 80
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)  # maxlen = 80
So is the input sequence length actually 80?
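For reference, a quick shape check (a sketch only, not part of the example script; in older Keras versions the vocabulary argument is nb_words rather than num_words):

from keras.datasets import imdb
from keras.preprocessing import sequence

# Load the integer-encoded reviews and pad them as in imdb_lstm.py
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=20000)
X_train = sequence.pad_sequences(X_train, maxlen=80)

print(X_train.shape)  # (25000, 80): every review is now a sequence of 80 word indexes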
Turns positive integers (indexes) into dense vectors of fixed size. eg. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]
Basically this transforms the indexes (which represent the words your IMDB review contains) into dense vectors of the given size (in your case 128).
If you don't know what embeddings are in general, here is the wikipedia definition:
Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers in a low-dimensional space relative to the vocabulary size ("continuous space").
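To make that concrete, here is a minimal sketch (adapted from the Embedding example in the Keras docs; the weights are untrained here, so the vector values are random) showing the shape transformation:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

# Map integer word indexes (0..19999) to 128-dimensional dense vectors
model = Sequential()
model.add(Embedding(20000, 128, input_length=80))
model.compile('rmsprop', 'mse')  # compiled only so predict() can be called

batch = np.random.randint(20000, size=(32, 80))  # 32 padded "reviews", 80 word indexes each
vectors = model.predict(batch)
print(vectors.shape)  # (32, 80, 128): one 128-dim vector per word index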
Coming back to the other question you've asked:
In this case, does that mean the length of the input sequence into the LSTM layer is 128?
Not quite. For recurrent nets you have a time dimension and a feature dimension. 128 is your feature dimension, i.e. how many dimensions each embedding vector has. The time dimension in your example is what is stored in maxlen, which is used to generate the training sequences (80 timesteps after padding).
The 128 you pass to the LSTM layer is the number of output units of the LSTM, not the length of the input sequence.
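To see both dimensions at once, here is a sketch of the example model with the shapes annotated (the final Dense layer follows the sentiment-classification head used in the script; input_length is optional but makes summary() show concrete shapes):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

max_features = 20000  # vocabulary size (number of distinct word indexes)
maxlen = 80           # time dimension: each padded review is 80 timesteps long

model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
# Embedding output: (batch, 80, 128) -- 80 timesteps, each a 128-dim feature vector
model.add(LSTM(128))
# LSTM output: (batch, 128) -- the final hidden state, 128 output units
model.add(Dense(1, activation='sigmoid'))
# Dense output: (batch, 1) -- sentiment probability, as in the example script
model.summary()       # prints the output shape of each layer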