Reputation: 1608
Following some tutorials, I'm adding batch normalization to my model in order to improve training time. This is my model:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Dense, Dropout

model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu', input_shape=(64,64,3)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
# NB: adding more parameters increases the probability of overfitting! Try cutting neurons rather than adding them!
model.add(Dense(units=512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=20, activation='softmax'))
Without batch normalization, I get around 50% accuracy on my data. Adding batch normalization destroys the performance, with validation accuracy reduced to 10%.
Why is this happening?
Upvotes: 6
Views: 5732
Reputation: 127
Try using fewer batch normalization layers; it is also common practice to place one only after the last convolutional layer. Start with just one and add more only if it improves the validation accuracy.
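As a minimal sketch of that suggestion (reusing the layers and imports from the question, and keeping everything else unchanged), the model would keep a single BatchNormalization after the last convolution:
model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu', input_shape=(64, 64, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())  # the single BN layer, after the last convolution
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=20, activation='softmax'))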
Upvotes: 1
Reputation: 614
I'm not sure if this is what you are asking, but batch normalization is still active during validation; the difference is that its statistics (the moving mean and variance) are accumulated during training and are not updated during validation.
As for why batch normalization is not good for your model or problem in general: like any hyperparameter, it works well in some scenarios and poorly in others. Do you know whether this is the best placement for BN within your network? Beyond that, one would need to know more about your data and problem to offer further guesses.
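To illustrate the first point, here is a small sketch (assuming tf.keras; the input values are made up) showing that a BatchNormalization layer behaves differently depending on the training flag: with training=True it normalizes with the current batch statistics and updates its moving averages, while with training=False (the mode used by predict/evaluate) it normalizes with the stored moving averages, which start at mean 0 and variance 1.
import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
# A batch whose statistics are far from the layer's initial moving averages
x = np.random.normal(loc=5.0, scale=2.0, size=(32, 4)).astype('float32')

y_train = bn(x, training=True)   # normalizes with the batch's own mean/variance
y_valid = bn(x, training=False)  # normalizes with the stored moving averages

# Early in training the two modes can differ a lot, which is one way a
# model can look fine on training batches yet collapse on validation
print(float(tf.reduce_mean(y_train)))  # close to 0
print(float(tf.reduce_mean(y_valid)))  # close to the raw batch mean (~5)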
Upvotes: 2