Reputation: 33
I am working on a document classification problem: multi-label classification with 20 different labels, 1,920 documents in training and 480 in validation. The model is a CNN with FastText embeddings, and as a baseline I use a logistic regression model with n-gram features. The problem is that the baseline gives an F1-score of 0.36 while the CNN only reaches 0.30.
The architecture I use is from here:
https://www.kaggle.com/vsmolyakov/keras-cnn-with-fasttext-embeddings
I have been doing some parameter tuning, and the current best parameters are: dropout 0.25, learning rate 0.001, trainable embeddings false, 128 filters, prediction threshold 0.15, and kernel size 9.
Do you have ideas about parameters to pay special attention to, suggestions for changing the architecture, or anything else that might improve the F1-score?
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, MaxPooling1D, GlobalMaxPooling1D, Dropout, Dense
from keras import optimizers, regularizers

# Parameters
BATCH_SIZE = 16
DROP_OUT = 0.25
N_EPOCHS = 20
N_FILTERS = 128
TRAINABLE = False
LEARNING_RATE = 0.001
N_DIM = 32
KERNEL_SIZE = 9
# Create model (NB_WORDS, EMBED_DIM, embedding_matrix, MAX_SEQ_LEN and N_LABELS
# come from the preprocessing step, not shown here)
model = Sequential()
model.add(Embedding(NB_WORDS, EMBED_DIM, weights=[embedding_matrix],
input_length=MAX_SEQ_LEN, trainable=TRAINABLE))
model.add(Conv1D(N_FILTERS, KERNEL_SIZE, activation='relu', padding='same'))
model.add(MaxPooling1D(2))
model.add(Conv1D(N_FILTERS, KERNEL_SIZE, activation='relu', padding='same'))
model.add(GlobalMaxPooling1D())
model.add(Dropout(DROP_OUT))
model.add(Dense(N_DIM, activation='relu', kernel_regularizer=regularizers.l2(1e-4)))
model.add(Dense(N_LABELS, activation='sigmoid')) #multi-label (k-hot encoding)
# Adam with the tuned learning rate; binary cross-entropy matches the sigmoid multi-label output
adam = optimizers.Adam(lr=LEARNING_RATE, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
model.summary()
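For context, a minimal sketch of how the prediction threshold and F1-score mentioned above could be computed; X_val and y_val are placeholder names for the padded validation sequences and their k-hot labels, and the micro-averaging choice is an assumption:

from sklearn.metrics import f1_score

# Sigmoid output per label, thresholded at the tuned value to get k-hot predictions
probs = model.predict(X_val)             # shape (n_samples, N_LABELS)
preds = (probs > 0.15).astype(int)

# Micro-averaged F1 across all 20 labels (averaging choice is an assumption)
print(f1_score(y_val, preds, average='micro'))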
Edit
I think I got some wrong hyperparameters by fixing the number of epochs to 20 during tuning. I am now using a stopping criterion instead; the model usually converges around 30-35 epochs. It seems a dropout of 0.5 works better, and I am currently tuning the batch size. If somebody has experience with the relationship between the number of epochs and other hyperparameters, feel free to share.
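A minimal sketch of such a stopping criterion with the standard Keras EarlyStopping callback (X_train/y_train are placeholder names; restore_best_weights needs Keras >= 2.2.3):

from keras.callbacks import EarlyStopping

# Stop once validation loss stops improving instead of fixing N_EPOCHS
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
model.fit(X_train, y_train,
          batch_size=BATCH_SIZE,
          epochs=100,                    # upper bound; early stopping decides
          validation_data=(X_val, y_val),
          callbacks=[early_stop])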
Upvotes: 0
Views: 1776
Reputation: 2134
A thing you should consider in general is whether the data is imbalanced and how your model performs for each class (using, for example, sklearn.metrics.confusion_matrix).
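For a multi-label target, a small sketch of such a per-class check (assuming X_val/y_val hold the validation data; multilabel_confusion_matrix is the multi-label counterpart of confusion_matrix and needs scikit-learn >= 0.21):

from sklearn.metrics import multilabel_confusion_matrix, classification_report

# One 2x2 confusion matrix per label exposes which classes are rarely predicted
preds = (model.predict(X_val) > 0.15).astype(int)   # threshold from the question
print(multilabel_confusion_matrix(y_val, preds))

# Per-label precision/recall/F1 and support make class imbalance visible
print(classification_report(y_val, preds))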
I think the dataset (about 2,000 documents over 20 labels) might not be big enough for deep learning to work from scratch. You could consider augmenting your dataset, or you could start by fine-tuning a pretrained language model for your task; see https://github.com/huggingface/pytorch-openai-transformer-lm. That could help you overcome the dataset-size issue in general.
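The linked repo is the early OpenAI transformer port; the same idea with the later transformers library (my substitution, not from the answer) looks roughly like this for a 20-label multi-label head:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Pretrained encoder with a fresh 20-way sigmoid head; problem_type selects
# a BCE-with-logits loss suited to multi-label targets
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=20,
    problem_type="multi_label_classification",
)

enc = tokenizer(["example document text"], truncation=True, padding=True, return_tensors="pt")
labels = torch.zeros((1, 20))            # k-hot float targets, one row per document
loss = model(**enc, labels=labels).loss  # backpropagate this in a training loop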
Upvotes: 2