Reputation: 771
I wonder whether it is a problem to use BatchNormalization when there are only 2 convolutional layers in a CNN. Can this have adverse effects on classification performance? I don't mean the training time, but the accuracy itself. Is my network overloaded with unnecessary layers? I want to train the network on a small data set.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     MaxPooling2D, Dropout, Flatten, Dense)

model = Sequential()
# First convolutional block
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=(28, 28, 1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
# Second convolutional block
model.add(Conv2D(64, kernel_size=(3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
# Classifier head
model.add(Flatten())
model.add(Dense(128))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
Many thanks.
Upvotes: 1
Views: 1510
Reputation: 167
Don’t Use With Dropout
Batch normalization offers some regularization effect, reducing generalization error, perhaps no longer requiring the use of dropout for regularization.
Removing Dropout from Modified BN-Inception speeds up training, without increasing overfitting.
— Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015.
Further, it may not be a good idea to use batch normalization and dropout in the same network.
The reason is that the batch statistics used to normalize the activations of the prior layer can become noisy when nodes are randomly dropped during the dropout procedure.
Batch normalization also sometimes reduces generalization error and allows dropout to be omitted, due to the noise in the estimate of the statistics used to normalize each variable.
— Page 425, Deep Learning, 2016.
Source - machinelearningmastery.com - batch normalization
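As a minimal sketch of what that advice could look like applied to your model (my own assumption, not something prescribed by the quoted source): keep the BatchNormalization layers and remove the Dropout layers, letting batch normalization provide the regularization. Whether you keep the Dropout(0.5) before the output layer is a judgment call you would want to validate on your small data set.
# Hypothetical variant of the question's model: BatchNormalization kept,
# Dropout removed, as the quotes above suggest trying.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     MaxPooling2D, Flatten, Dense)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=(28, 28, 1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])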
Upvotes: 1