Reputation: 161
I'm implementing a CNN for speech recognition. The input is MEL frequencies with shape (85314, 99, 1) and the labels are one-hot encoded with 35 output classes (shape: (85314, 35)). When I run the model the training accuracy (image 2) starts high and stays the same over the number of epochs, while the loss on validation (image 1) increases. Hence, it is probably overfitting but I cannot find the origin of the issue. I already decreased the learning rate and played with batch sizes but the results stays the same. Also the amount of training data should be sufficient. Is there another issue with my hyper-parameter settings somewhere?
My model and hyper-parameters are defined as follows:
#hyperparameters
input_dimension = 85314
learning_rate = 0.0000025
momentum = 0.85
hidden_initializer = random_uniform(seed=1)
dropout_rate = 0.2
# create model
model = Sequential()
model.add(Convolution1D(nb_filter=32, filter_length=3, input_shape=(99, 1), activation='relu'))
model.add(Convolution1D(nb_filter=16, filter_length=1, activation='relu'))
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dense(35, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['acc'])
history = model.fit(frequencies_train, labels_hot, validation_split=0.2, epochs=10, batch_size=50)
Upvotes: 1
Views: 766
Reputation: 1694
You are using "binary_crossentropy" for a problem of multiple classes. Change it to "categorical_crossentrop".
The accuracy computed with Keras using the binary_crossentropy with a model of more than 2 labels is just wrong.
Upvotes: 1