Reputation: 361
I am doing transfer learning on a pre-trained model with my own dataset. In short, I am using a pretrained ResNet50 model with a 224x224 input shape. I set up the data generators and build the model like this:
train_datagen = ImageDataGenerator(validation_split=0.1,
                                   rescale=1./255,
                                   preprocessing_function=preprocess_input)  # set validation split

img_size = 224
batch_size = 32

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_size, img_size),
    batch_size=batch_size,
    color_mode='rgb',
    subset='training')  # set as training data

validation_generator = train_datagen.flow_from_directory(
    train_data_dir,  # same directory as training data
    target_size=(img_size, img_size),
    batch_size=batch_size,
    color_mode='rgb',
    subset='validation')  # set as validation data
model = ResNet50(include_top=False, weights=None, input_shape=(224,224,3))
model.load_weights("a trained model weights on 224x224")
model.layers.pop()  # drop the last layer

for layer in model.layers:
    layer.trainable = False  # freeze the pretrained layers

x = model.layers[-1].output
x = Flatten(name='flatten')(x)
x = Dropout(0.2)(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(101, activation='softmax', name='pred_age')(x)

top_model = Model(inputs=model.input, outputs=predictions)
top_model.compile(loss='categorical_crossentropy',
                  optimizer=adam,
                  metrics=[accuracy])
EPOCHS = 100
BATCH_SIZE = 32
STEPS_PER_EPOCH = 4424 // BATCH_SIZE
VALIDATION_STEPS = 466 // BATCH_SIZE
callbacks = [LearningRateScheduler(schedule=Schedule(EPOCHS, initial_lr=lr_rate)),
             ModelCheckpoint(str(output_dir) + "/weights.{epoch:03d}-{val_loss:.3f}-{val_age_mae:.3f}.hdf5",
                             monitor="val_age_mae",
                             verbose=1,
                             save_best_only=False,
                             mode="min")
             ]
hist = top_model.fit_generator(generator=train_generator,
                               epochs=EPOCHS,
                               steps_per_epoch=STEPS_PER_EPOCH,
                               validation_data=validation_generator,
                               validation_steps=VALIDATION_STEPS,
                               verbose=1,
                               callbacks=callbacks)
Total params: 75,020,261
Trainable params: 51,432,549
Non-trainable params: 23,587,712
Epoch 1/100
140/140 [==============================] - 1033s 7s/step - loss: 14.5776 - age_mae: 12.2994 - val_loss: 15.6144 - val_age_mae: 24.8527
Epoch 00001: val_age_mae improved from inf to 24.85268, saving model
Epoch 2/100
140/140 [==============================] - 969s 7s/step - loss: 14.7104 - age_mae: 11.2545 - val_loss: 15.6462 - val_age_mae: 25.1104
Epoch 00002: val_age_mae did not improve from 24.85268
Epoch 3/100
140/140 [==============================] - 769s 5s/step - loss: 14.6159 - age_mae: 13.5181 - val_loss: 15.7551 - val_age_mae: 29.4640
Epoch 00003: val_age_mae did not improve from 24.85268
Epoch 4/100
140/140 [==============================] - 815s 6s/step - loss: 14.6509 - age_mae: 13.0087 - val_loss: 15.9366 - val_age_mae: 18.3581
Epoch 00004: val_age_mae improved from 24.85268 to 18.35811
Epoch 5/100
140/140 [==============================] - 1059s 8s/step - loss: 14.3882 - age_mae: 11.8039 - val_loss: 15.6825 - val_age_mae: 24.6937
Epoch 00005: val_age_mae did not improve from 18.35811
Epoch 6/100
140/140 [==============================] - 1052s 8s/step - loss: 14.4496 - age_mae: 13.6652 - val_loss: 15.4278 - val_age_mae: 24.5045
Epoch 00006: val_age_mae did not improve from 18.35811
I have already run this a couple of times, and after epoch 4 it does not improve anymore. The dataset contains around 5000 images: 4511 images belong to the training set and 476 to the validation set.

I get the following loss graph:
Upvotes: 0
Views: 988
Reputation: 377
This problem occurs in pre-trained networks that contain BatchNormalization() layers, and it is very frustrating. Trust me! The reason is that when such a model is fine-tuned the wrong way, the BatchNormalization() layers run in training mode, their internal statistics get updated on the new data, and the pretrained weights are effectively destroyed during back propagation. I recommend you try loading the model this way:
from tensorflow import keras
from tensorflow.keras.applications import ResNet50

model = ResNet50(include_top=False, weights=None, input_shape=(224,224,3))
model.trainable = False  # freeze the entire base model

inputs = keras.Input(shape=(224,224,3))
x = model(inputs, training=False)  # run the base in inference mode so BatchNormalization uses its stored statistics
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dropout(0.5)(x)  # if your model requires one
outputs = keras.layers.Dense(num_classes, activation='softmax')(x)  # num_classes = 101 in your case
and then you can add the fully connected layers of your choice and collect the output of the entire model as follows:

model = keras.Model(inputs, outputs)
If you are looking for further explanation, I recommend going through this link.

You can then continue training the model by freezing or unfreezing however many layers you want, according to your dataset. Hope this helps.
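For completeness, here is a rough sketch of that two-stage workflow. The optimizer, learning rates, epoch counts, and the train_generator/validation_generator names are assumptions taken from the question's setup or made up for illustration, so adapt them to your own code:

# Stage 1: train only the new classification head; the base stays frozen.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_generator, validation_data=validation_generator, epochs=20)

# Stage 2 (optional fine-tuning): unfreeze the base model and retrain the
# whole network with a much lower learning rate. You must recompile after
# changing `trainable`, otherwise the change is not picked up.
base_model = model.layers[1]  # the wrapped ResNet50 sub-model from above
base_model.trainable = True
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_generator, validation_data=validation_generator, epochs=10)

The key point is that the base was called with training=False, so its BatchNormalization layers stay in inference mode even after base_model.trainable = True; the low learning rate in stage 2 then adjusts the pretrained weights gently instead of wrecking them.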
Upvotes: 1