Reputation: 1886
I am building an LSTM model to predict a word given a sequence of characters. For now, my dataset has only ~3k alphanumeric words. For some reason I'm hitting a maximum accuracy of 0.84 and I can't seem to get past that. I've tried adding additional LSTM layers and changing the learning rate and batch size, but I can't get past the 0.84 accuracy limit.
I'm looking for guidance on how I should go about investigating this. I was planning on using Hyperas to tweak the model, but I'm not sure tweaking will help, since I hit that 0.84 limit everywhere from a 3-layer LSTM with 12 cells per layer all the way up to 3 layers with 24 cells.
Here is my definition of the model:
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, AveragePooling1D, Flatten
from keras.optimizers import RMSprop

model = Sequential()
# Three stacked LSTM layers over the character sequence
model.add(LSTM(24, input_shape=(data.getMaxLen(), data.uniqueChars), return_sequences=True))
model.add(LSTM(24, return_sequences=True))
model.add(LSTM(24, return_sequences=True))
# Per-timestep projection, then pooling across timesteps and a softmax over the vocabulary
model.add(TimeDistributed(Dense(12)))
model.add(AveragePooling1D())
model.add(Flatten())
model.add(Dense(data.uniqueTokensCount, activation='softmax'))

optimizer = RMSprop(lr=0.0005)
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])

history = model.fit(data.X, data.Y,
                    batch_size=data.uniqueTokensCount,
                    epochs=10000,
                    callbacks=[print_callback])
These are the configurations I've tried (each one plateaus around the same 0.84 training accuracy):
Model: 3x LSTM (24 cells), 1x Dense (12 cells), max length 15
Model: 3x LSTM (24 cells), 1x Dense (12 cells), max length 8
Model: 3x LSTM (12 cells), 1x Dense (6 cells), max length 8
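For reference, this is roughly the Hyperas search I was planning to run. It is only a sketch: the load_data() helper is a hypothetical stand-in for my data loading code, and the candidate layer sizes and learning rates are placeholder choices.

from hyperopt import Trials, STATUS_OK, tpe
from hyperas import optim
from hyperas.distributions import choice
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, AveragePooling1D, Flatten
from keras.optimizers import RMSprop

def data():
    # Hypothetical helper: load character sequences and one-hot word targets,
    # split into train/test.
    from my_dataset import load_data
    X_train, Y_train, X_test, Y_test = load_data()
    return X_train, Y_train, X_test, Y_test

def create_model(X_train, Y_train, X_test, Y_test):
    cells = {{choice([12, 24, 48])}}  # LSTM width to search over
    model = Sequential()
    model.add(LSTM(cells, input_shape=(X_train.shape[1], X_train.shape[2]),
                   return_sequences=True))
    model.add(LSTM(cells, return_sequences=True))
    model.add(LSTM(cells, return_sequences=True))
    model.add(TimeDistributed(Dense({{choice([6, 12])}})))
    model.add(AveragePooling1D())
    model.add(Flatten())
    model.add(Dense(Y_train.shape[1], activation='softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer=RMSprop(lr={{choice([0.0005, 0.001])}}),
                  metrics=['accuracy'])
    model.fit(X_train, Y_train, batch_size=64, epochs=50, verbose=0)
    _, acc = model.evaluate(X_test, Y_test, verbose=0)
    # Hyperas minimises the returned loss, so return negative accuracy.
    return {'loss': -acc, 'status': STATUS_OK, 'model': model}

best_run, best_model = optim.minimize(model=create_model, data=data,
                                      algo=tpe.suggest, max_evals=20,
                                      trials=Trials())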
Upvotes: 3
Views: 608
Reputation: 29
Check what your validation loss is doing. It is possible that you are overfitting, and that is why you are not getting past 0.84. Since your training accuracy is fluctuating, you could try adding dropout or regularizers to prevent overfitting.
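For example, here is a minimal sketch based on your model above; the dropout rates and the 20% validation split are just starting points, and it assumes the same data.X / data.Y arrays and print_callback from your code.

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, AveragePooling1D, Flatten
from keras.optimizers import RMSprop

model = Sequential()
# Dropout on the inputs and the recurrent connections of each LSTM layer
model.add(LSTM(24, input_shape=(data.getMaxLen(), data.uniqueChars),
               return_sequences=True, dropout=0.2, recurrent_dropout=0.2))
model.add(LSTM(24, return_sequences=True, dropout=0.2, recurrent_dropout=0.2))
model.add(LSTM(24, return_sequences=True, dropout=0.2, recurrent_dropout=0.2))
model.add(TimeDistributed(Dense(12)))
model.add(AveragePooling1D())
model.add(Flatten())
model.add(Dense(data.uniqueTokensCount, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=RMSprop(lr=0.0005),
              metrics=['accuracy'])

# Hold out part of the data so validation loss/accuracy are reported each epoch.
history = model.fit(data.X, data.Y,
                    validation_split=0.2,
                    batch_size=64,
                    epochs=10000,
                    callbacks=[print_callback])

If the training accuracy keeps climbing while the validation loss rises, that is a clear sign of overfitting, and more regularization (or more data) is the direction to look in rather than more LSTM cells.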
Upvotes: 0