Reputation: 1473
I know that an LSTM cell has a number of ANNs inside.
But when defining the hidden layers for the same problem, I have seen some people use only 1 LSTM cell while others use 2 or 3 stacked LSTM cells, like this:
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

model = Sequential()
# return_sequences=True passes the full output sequence on to the next LSTM layer
model.add(LSTM(256, input_shape=(n_prev, 1), return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(128, return_sequences=True))  # input_shape is only needed on the first layer
model.add(Dropout(0.3))
model.add(LSTM(64, return_sequences=False))  # last LSTM returns only the final time step
model.add(Dropout(0.3))
model.add(Dense(1))
model.add(Activation('linear'))
Upvotes: 11
Views: 18188
Reputation: 19776
There are no "rules", but there are guidelines; in practice, you'd experiment with depth vs. width, each of which works differently:
In general, width (more units per layer) extracts more features, whereas depth (more stacked layers) extracts richer, more abstract features. If there aren't many features to extract from the given data, width should be reduced; and the "simpler" the data/problem, the fewer layers are suitable. Ultimately, however, it may be best to spare extensive analysis and simply try different combinations of each; see this SO answer for more info.
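As a minimal sketch of what "experimenting with depth vs. width" can look like (the unit counts and the window length n_prev are hypothetical, chosen only to mirror the question's input shape):

from keras.models import Sequential
from keras.layers import LSTM, Dense

n_prev = 30  # hypothetical window length, matching the question's input_shape=(n_prev, 1)

# Wider and shallower: many units in a single layer extract more features per step
wide_model = Sequential()
wide_model.add(LSTM(256, input_shape=(n_prev, 1)))
wide_model.add(Dense(1, activation='linear'))

# Deeper and narrower: stacked layers build progressively richer features
deep_model = Sequential()
deep_model.add(LSTM(64, input_shape=(n_prev, 1), return_sequences=True))
deep_model.add(LSTM(64, return_sequences=False))
deep_model.add(Dense(1, activation='linear'))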
Lastly, avoid Dropout and use LSTM(recurrent_dropout=...) instead (see the linked SO).
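A minimal sketch of that suggestion, replacing the Dropout layers from the question with recurrent_dropout (the rates shown are hypothetical):

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# recurrent_dropout applies dropout to the recurrent state transitions inside the LSTM,
# rather than dropping activations between layers as a separate Dropout layer would
model.add(LSTM(256, input_shape=(n_prev, 1), return_sequences=True, recurrent_dropout=0.3))
model.add(LSTM(128, return_sequences=False, recurrent_dropout=0.3))
model.add(Dense(1, activation='linear'))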
Upvotes: 18