SupposeXYZ

Reputation: 374

Possible reasons for overfitting the dataset

The dataset I used contains 33k images: 27k in the training set and 6k in the validation set.
I used the following CNN code for the model:

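from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Dropout, Flatten, Dense, Activation
from keras.optimizers import Adam

# row, col, ch: input image height, width and channel count (defined elsewhere)
model = Sequential()

# Block 1: two 3x3 convolutions with 32 filters, 2x2 max pooling, dropout
model.add(Convolution2D(32, 3, 3, activation='relu', border_mode="same", input_shape=(row, col, ch)))
model.add(Convolution2D(32, 3, 3, activation='relu', border_mode="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
# Block 2: 3x3 convolutions with 64 and 128 filters, 2x2 max pooling, dropout
model.add(Convolution2D(64, 3, 3, activation='relu', border_mode="same"))
model.add(Convolution2D(128, 3, 3, activation='relu', border_mode="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
# Fully connected head ending in a single regression output
model.add(Flatten())
model.add(Activation('relu'))
model.add(Dense(1024))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Dense(1))
# Adam with a small learning rate; MSE loss, MAE reported as a metric
adam = Adam(lr=0.0001)
model.compile(optimizer=adam, loss="mse", metrics=["mae"])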

The output shows a decreasing training loss but an increasing validation loss, which suggests overfitting. However, I have included dropout layers, which should have helped to prevent overfitting. The following is a snapshot of the output for the first 3 of 10 epochs:

Epoch 1/10
27008/27040 [============================>.] - ETA: 5s - loss: 0.0629 - mean_absolute_error: 0.1428 Epoch 00000: val_loss improved from inf to 0.07595, saving model to dataset/-00-val_loss_with_mymodel-0.08.hdf5
27040/27040 [==============================] - 4666s - loss: 0.0629 - mean_absolute_error: 0.1428 - val_loss: 0.0759 - val_mean_absolute_error: 0.1925
Epoch 2/10
27008/27040 [============================>.] - ETA: 5s - loss: 0.0495 - mean_absolute_error: 0.1287 Epoch 00001: val_loss did not improve
27040/27040 [==============================] - 4605s - loss: 0.0494 - mean_absolute_error: 0.1287 - val_loss: 0.0946 - val_mean_absolute_error: 0.2289
Epoch 3/10
27008/27040 [============================>.] - ETA: 5s - loss: 0.0382 - mean_absolute_error: 0.1119 Epoch 00002: val_loss did not improve
27040/27040 [==============================] - 4610s - loss: 0.0382 - mean_absolute_error: 0.1119 - val_loss: 0.1081 - val_mean_absolute_error: 0.2463

So, what is wrong? Are there any other methods to prevent overfitting?
Does shuffling the data help?

Upvotes: 0

Views: 543

Answers (1)

Thomas Pinetz

Reputation: 7148

I would try adding weight decay of 1E-4. This can be done layer-wise, like this:

model.add(Convolution2D(32, 3, 3, activation='relu', border_mode="same", input_shape=(row, col, ch), W_regularizer=l2(1E-4), b_regularizer=l2(1E-4)))

l2 can be imported from keras.regularizers (https://keras.io/regularizers/#example). Weight regularization is usually very effective at combating overfitting.
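To make concrete what the 1E-4 weight decay does, here is a minimal plain-Python sketch of the quantity being minimized once an L2 regularizer is attached: the data loss plus the coefficient times the sum of squared weights. The helper names (mse, l2_penalty, regularized_loss) and the toy numbers are made up for illustration and are not part of Keras.

```python
# Illustration only: how an L2 penalty with coefficient 1e-4 changes the loss.
# All helper names here are hypothetical; none of them is a Keras function.

def mse(y_true, y_pred):
    """Mean squared error, the data term of the loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def l2_penalty(weights, coeff=1e-4):
    """Weight-decay term: coeff times the sum of squared weights."""
    return coeff * sum(w * w for w in weights)

def regularized_loss(y_true, y_pred, weights, coeff=1e-4):
    """Data loss plus the L2 penalty; the optimizer minimizes the sum."""
    return mse(y_true, y_pred) + l2_penalty(weights, coeff)

# Same predictions, different weight magnitudes (toy values).
small = regularized_loss([0.0, 1.0], [0.1, 0.9], [0.5, -0.5])
large = regularized_loss([0.0, 1.0], [0.1, 0.9], [50.0, -50.0])
print(small, large)
```

With identical predictions, the configuration with large weights gets a higher loss, so the optimizer is pushed toward smaller weights, which tends to give smoother functions that generalize better.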

However, overfitting might not only be a result of your model, but also of your data. If the validation data is somehow "harder" than your training data, it might just be that you cannot fit it as well.
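One way to check whether the validation set differs from the training set is to compare label statistics for the existing split against a shuffled split. The sketch below uses only the standard library; the data (image path and target pairs whose targets drift with recording order), the 20% validation fraction, and the helper names are assumptions for illustration, not details from the question.

```python
import random
import statistics

def shuffled_split(samples, val_fraction=0.2, seed=0):
    """Shuffle a copy of the samples, then split into (train, val)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

def label_summary(samples):
    """Mean and population standard deviation of the targets."""
    labels = [label for _, label in samples]
    return statistics.mean(labels), statistics.pstdev(labels)

# Hypothetical data: (image_path, target) pairs with targets that drift over time.
data = [("img_%05d.png" % i, i / 1000.0) for i in range(1000)]

# Unshuffled split: validation is simply the tail of the recording.
ordered_train, ordered_val = data[:800], data[800:]
# Shuffled split: both sets are drawn from the whole recording.
train, val = shuffled_split(data, val_fraction=0.2)

print("ordered :", label_summary(ordered_train), label_summary(ordered_val))
print("shuffled:", label_summary(train), label_summary(val))
```

If the ordered split shows clearly different label statistics between the two sets while the shuffled split does not, the rising validation loss may partly reflect a distribution mismatch rather than overfitting alone. Shuffling before splitting only makes sense when neighbouring samples are not near-duplicates; otherwise it can leak training data into validation.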

Upvotes: 1
