Reputation: 73
I'm going crazy with this project. It is multi-label text classification with an LSTM in Keras. My model is this:
from keras.models import Sequential
from keras.layers import Embedding, Dropout, LSTM, Dense, Activation
import keras

model = Sequential()
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len,
                    mask_zero=True, weights=[embedding_weights]))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid',
               recurrent_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid',
               recurrent_activation='hard_sigmoid', return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
adam = keras.optimizers.Adam(lr=0.04)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
The problem is that the accuracy is far too low. With binary_crossentropy I get good accuracy, but the predictions are wrong; switching to categorical_crossentropy, the accuracy becomes very low. Do you have any suggestions?
Here is my code: GitHubProject - Multi-Label-Text-Classification
Upvotes: 5
Views: 2297
Reputation: 2680
In the last layer, the activation function you are using is sigmoid, so binary_crossentropy should be used. If you want to use categorical_crossentropy, then use softmax as the activation function in the last layer.
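As a minimal sketch (reusing num_classes from the question), the two consistent pairings are:

# Option A: multi-label (several labels may be active at once)
model.add(Dense(num_classes, activation='sigmoid'))  # independent per-class probabilities
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Option B: single-label multi-class (exactly one label per sample)
model.add(Dense(num_classes, activation='softmax'))  # probabilities sum to 1
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])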
Now, coming to the other part of your model: since you are working with text, I would suggest tanh as the activation function in the LSTM layers.
You can also try the LSTM's built-in dropouts, dropout and recurrent_dropout:
LSTM(units, dropout=0.2, recurrent_dropout=0.2, activation='tanh')
You can set units to 64 or 128; start from a small number and, after testing, work your way up to 1024. See the sketch after this paragraph.
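A possible shape for this part of the model (a sketch only; the layer sizes here are illustrative, not tuned):

model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2,
               activation='tanh', return_sequences=True))
model.add(LSTM(64, dropout=0.2, recurrent_dropout=0.2,
               activation='tanh'))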
You can also try adding a convolution layer for extracting features, or use a Bidirectional LSTM, though Bidirectional-based models take longer to train; a sketch of both follows.
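A minimal sketch of both options (note that, at least in Keras versions of that era, Conv1D does not support the mask produced by mask_zero=True, so that flag would need to be dropped from the Embedding layer first):

from keras.layers import Conv1D, MaxPooling1D, Bidirectional

# Convolution in front of the LSTM to extract local n-gram features
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(64, dropout=0.2, recurrent_dropout=0.2, activation='tanh'))

# ... or a bidirectional LSTM instead (roughly doubles training time)
# model.add(Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2,
#                              activation='tanh')))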
Moreover, since you are working with text, preprocessing of the text and the size of the training data always play a much bigger role than expected.
Edit:
Add class weights via the class_weight parameter of fit:
from sklearn.utils import class_weight
import numpy as np

# 'le' is the LabelEncoder already fitted on the labels
class_weights = class_weight.compute_class_weight('balanced',
                                                  np.unique(labels),
                                                  labels)
class_weights_dict = dict(zip(le.transform(list(le.classes_)),
                              class_weights))
model.fit(x_train, y_train, validation_split=0.1,  # e.g. hold out 10% for validation
          class_weight=class_weights_dict)
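Note that recent scikit-learn versions require keyword arguments here: class_weight.compute_class_weight('balanced', classes=np.unique(labels), y=labels).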
Upvotes: 8
Reputation: 8527
change:
model.add(Activation('sigmoid'))
to:
model.add(Activation('softmax'))
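Applied to the model from the question, the output block would then read:

model.add(Dense(num_classes))
model.add(Activation('softmax'))  # softmax pairs with categorical_crossentropy
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])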
Upvotes: 4