Reputation: 1056
I am currently doing transfer learning with the MobileNetV2 architecture. I have added several Dense layers on top before my classification layer. Should I add BatchNormalization between these layers?
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# Pretrained backbone without its classifier
base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(200, 200, 3))
x = base_model.output
x = GlobalAveragePooling2D(name="Class_pool")(x)

# New dense head on top of the pooled features
x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Dense(1024, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dense(20, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=x)
I have previously trained this network without any of these batch normalization layers and have struggled to get good accuracy. I was only semi-successful after trying many combinations of learning rate and frozen layers, so I am hoping this will help.
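Roughly, the freezing and learning-rate setup I have been varying looks like the sketch below; the optimizer and rate shown are just one of the combinations I tried, not fixed values:

from tensorflow.keras.optimizers import Adam

# Freeze the pretrained MobileNetV2 layers so only the new head trains
# (how many layers to freeze and which learning rate to use is exactly
# what I have been experimenting with)
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer=Adam(learning_rate=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])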
Can too many BatchNormalization layers be bad for a network?
Upvotes: 1
Views: 992
Reputation: 1729
Batch Normalization helps with covariate shift, and since you are training on new data batch by batch, it should be a good thing for the network. There is no such thing as too much BatchNormalization; just put it after every layer that has activations.
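As a minimal sketch of that placement (the input dimension and layer sizes here are only placeholders, not values from your model), each activated Dense layer is followed by a BatchNormalization layer:

from tensorflow.keras.layers import Dense, BatchNormalization, Input
from tensorflow.keras.models import Model

inputs = Input(shape=(1280,))               # e.g. pooled backbone features
x = Dense(512, activation='relu')(inputs)   # layer with activations...
x = BatchNormalization()(x)                 # ...followed by BatchNormalization
x = Dense(256, activation='relu')(x)
x = BatchNormalization()(x)
outputs = Dense(20, activation='softmax')(x)
model = Model(inputs, outputs)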
Upvotes: 1