PeakyBlinder

Reputation: 1117

Deep LSTM accuracy not crossing 50%

I am working on a classification problem with the SemEval 2017 Task 4A dataset (can be found here), and I am using a deep LSTM network for it. In pre-processing I do lower casing -> tokenization -> lemmatization -> removing stop words -> removing punctuation. For word embeddings I use a Word2Vec model. There are 18,000 samples in my training set and 2,000 samples in my test set.
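
The pre-processing looks roughly like this (a simplified NLTK-based sketch; the exact tokenizer, lemmatizer and stop-word list I use may differ):

import string
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))

def preprocess(tweet):
    tokens = word_tokenize(tweet.lower())                         # lower casing -> tokenization
    tokens = [lemmatizer.lemmatize(t) for t in tokens]            # lemmatization
    tokens = [t for t in tokens if t not in stop_words]           # removing stop words
    tokens = [t for t in tokens if t not in string.punctuation]   # removing punctuation
    return ' '.join(tokens)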

The code for my model is

from keras.models import Sequential
from keras.layers import (Embedding, BatchNormalization, Activation, Dropout,
                          Bidirectional, LSTM, Dense)
from keras_self_attention import SeqSelfAttention

model = Sequential()
model.add(Embedding(max_words, 30, input_length=max_len))  # 30-dimensional embeddings
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.3))
model.add(Bidirectional(LSTM(32, use_bias=True, return_sequences=True)))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))
# input_shape on a non-first layer is ignored by Keras
model.add(Bidirectional(LSTM(32, use_bias=True, return_sequences=True), input_shape=(128, 1, 64)))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(SeqSelfAttention(attention_activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
model.summary()

The value of max_words is 2000 and max_len is 300.
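
To show where these values are used (a sketch; train_texts is a placeholder for my list of pre-processed tweets):

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

max_words = 2000   # vocabulary size, the Embedding layer's input_dim
max_len = 300      # padded sequence length, the Embedding layer's input_length

tokenizer = Tokenizer(num_words=max_words)
tokenizer.fit_on_texts(train_texts)
X_train = pad_sequences(tokenizer.texts_to_sequences(train_texts), maxlen=max_len)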

But even after this, my testing accuracy does not cross 50%, and I can't figure out the problem. PS: I am using a validation set as well. The loss function is binary cross-entropy and the optimizer is Adam.
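
The compile/fit step looks roughly like this (batch size and epoch count here are just placeholders):

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, y_train,
                    batch_size=64,
                    epochs=20,
                    validation_split=0.1)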

Upvotes: 2

Views: 486

Answers (1)

JaeWoo So

Reputation: 576

Training "LSTM" is very different with other common deep learning model. I recommend a higher dropout rate like 0.7,0.8. and Adam optimizer is particularly unstable in LSTM with real world data. So, i recommend SGD scheduled for a momentum of 0.9 and ReduceLROnPlateau. You have to do very long training, and if spark loss is observed, the training is going very well. (Spark Loss is a word used by NVIDIA researchers. It refers to a phenomenon in which the value of Loss that appears to converge increases significantly.)

Upvotes: 0
