Reputation: 942
I am trying to train my model by finetuning a pretrained model(vggface). My model has 12 classes with 1774 training images and 313 validation images, each class having around 150 images. So far I have been able to achieve a max accuracy around 80% with the following script in keras:
img_width, img_height = 224, 224
vggface = VGGFace(model='resnet50', include_top=False, input_shape=(img_width, img_height, 3))
last_layer = vggface.get_layer('avg_pool').output
x = Flatten(name='flatten')(last_layer)
out = Dense(12, activation='softmax', name='classifier')(x)
custom_vgg_model = Model(vggface.input, out)
# Create the model
model = models.Sequential()
# Add the convolutional base model
model.add(custom_vgg_model)
train_datagen = ImageDataGenerator(
rescale=1./255,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
fill_mode='nearest')
validation_datagen = ImageDataGenerator(rescale=1./255)
# Change the batchsize according to your system RAM
train_batchsize = 16
val_batchsize = 16
train_generator = train_datagen.flow_from_directory(
train_data_path,
target_size=(img_width, img_height),
batch_size=train_batchsize,
class_mode='categorical')
validation_generator = validation_datagen.flow_from_directory(
validation_data_path,
target_size=(img_width, img_height),
batch_size=val_batchsize,
class_mode='categorical',
shuffle=True)
# Compile the model
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.SGD(lr=1e-3),
metrics=['acc'])
# Train the model
history = model.fit_generator(
train_generator,
steps_per_epoch=train_generator.samples/train_generator.batch_size ,
epochs=50,
validation_data=validation_generator,
validation_steps=validation_generator.samples/validation_generator.batch_size,
verbose=1)
So far I have tried:
Here's my change for that:
vggface = VGGFace(model='resnet50', include_top=False, input_shape=(img_width, img_height, 3))
#vgg_model = VGGFace(include_top=False, input_shape=(224, 224, 3))
last_layer = vggface.get_layer('avg_pool').output
x = Flatten(name='flatten')(last_layer)
xx = Dense(256, activation = 'relu')(x)
x1 = BatchNormalization()(xx)
x2 = Dropout(0.3)(xx)
y = Dense(256, activation = 'relu')(x2)
yy = BatchNormalization()(y)
y1 = Dropout(0.3)(y)
z = Dense(256, activation = 'relu')(y1)
zz = BatchNormalization()(z)
z1 = Dropout(0.6)(zz)
x3 = Dense(12, activation='softmax', name='classifier')(z1)
custom_vgg_model = Model(vggface.input, x3)
I have made my activation as softmax now as suggested by SymbolixAU here. The val acc is now 81% still, while the training acc gets close to 99%
What am I doing wrong?
Upvotes: 2
Views: 3674
Reputation: 1249
Be careful with your connections. In the first two blocks the BatchNormalization is not connected to the dropout. Change the input of the two first dropout.
xx = Dense(256, activation = 'relu')(x)
x1 = BatchNormalization()(xx)
x2 = Dropout(0.3)(x1)
y = Dense(256, activation = 'relu')(x2)
yy = BatchNormalization()(y)
y1 = Dropout(0.3)(yy)
The values you give implies that your network is overfitting. The batch normalization or adding more dropout might help. But, given the small number of images you should really try image augmentation to increase the number of training images.
Upvotes: 1