Reputation: 1056
I am currently doing transfer learning with the MobileNetV2 architecture. I have added several Dense layers on top before my classification layer. Should I add BatchNormalization between these layers?
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# Pretrained backbone without its classifier
base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(200, 200, 3))
x = base_model.output
x = GlobalAveragePooling2D(name="Class_pool")(x)

# New dense head on top of the pooled features
x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Dense(1024, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dense(20, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=x)
I have previously trained this network without any of these batch normalization layers and have struggled to get good accuracy. I was only semi-successful after trying many combinations of learning rate and frozen layers, so I am hoping this will help.
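Roughly, the freezing and learning-rate setup I have been varying looks like the sketch below; the optimizer and rate shown are just one of the combinations I tried, not fixed values:

from tensorflow.keras.optimizers import Adam

# Freeze the pretrained MobileNetV2 layers so only the new head trains
# (how many layers to freeze and which learning rate to use is exactly
# what I have been experimenting with)
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer=Adam(learning_rate=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])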
Can too many BatchNormalization layers be bad for a network?
Upvotes: 1
Views: 992
Reputation: 1729
Batch Normalization helps with covariate shift, and since you are training on new data batch by batch, it should be a good thing for the network. There is no such thing as too much BatchNormalization; just put it after every layer that has activations.
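As a minimal sketch of that placement (the input dimension and layer sizes here are only placeholders, not values from your model), each activated Dense layer is followed by a BatchNormalization layer:

from tensorflow.keras.layers import Dense, BatchNormalization, Input
from tensorflow.keras.models import Model

inputs = Input(shape=(1280,))               # e.g. pooled backbone features
x = Dense(512, activation='relu')(inputs)   # layer with activations...
x = BatchNormalization()(x)                 # ...followed by BatchNormalization
x = Dense(256, activation='relu')(x)
x = BatchNormalization()(x)
outputs = Dense(20, activation='softmax')(x)
model = Model(inputs, outputs)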
Upvotes: 1