AaronDT

Reputation: 4060

Fine-tuning ResNet50 with Keras - val_loss keeps increasing

I am trying to customize ResNet50 using Keras with a TensorFlow backend. However, during training my val_loss keeps increasing. Trying different learning rates and batch sizes does not resolve the problem.

Using different preprocessing methods, such as rescaling or the preprocess_input function for ResNet50 inside the ImageDataGenerator, did not solve the problem either.

This is the code I am using:

Importing and preprocessing data:

import os
from os import listdir

from keras.preprocessing.image import ImageDataGenerator
from keras.applications.resnet50 import preprocess_input, decode_predictions

IMAGE_SIZE = 224
BATCH_SIZE = 32

num_classes = 27

main_path = "C:/Users/aaron/Desktop/DATEN/data"

gesamt_path = os.path.join(main_path, "ML_DATA")
labels = listdir(gesamt_path)

data_generator = ImageDataGenerator(#rescale=1./255,
                                    validation_split=0.20,
                                    preprocessing_function=preprocess_input)

train_generator = data_generator.flow_from_directory(gesamt_path, target_size=(IMAGE_SIZE, IMAGE_SIZE), shuffle=True, seed=13,
                                                     class_mode='categorical', batch_size=BATCH_SIZE, subset="training")

validation_generator = data_generator.flow_from_directory(gesamt_path, target_size=(IMAGE_SIZE, IMAGE_SIZE), shuffle=False, seed=13,
                                                     class_mode='categorical', batch_size=BATCH_SIZE, subset="validation")

Defining and training the model:

import keras
from keras.layers import Dense
from keras.models import Model
from keras.optimizers import Adam

img_width = 224
img_height = 224

model = keras.applications.resnet50.ResNet50()

classes = list(iter(train_generator.class_indices))
model.layers.pop()                      # drop the 1000-way ImageNet classifier
for layer in model.layers:
    layer.trainable = False             # freeze all pretrained weights
last = model.layers[-1].output
x = Dense(len(classes), activation="softmax")(last)
finetuned_model = Model(model.input, x)
finetuned_model.compile(optimizer=Adam(lr=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
for c in train_generator.class_indices:
    classes[train_generator.class_indices[c]] = c
finetuned_model.classes = classes



earlystopCallback = keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=8, verbose=1, mode='auto')
tbCallBack = keras.callbacks.TensorBoard(log_dir='./Graph', histogram_freq=0, write_graph=True, write_images=True)

history = finetuned_model.fit_generator(train_generator,
                    validation_data=validation_generator, 
                    epochs=85, verbose=1,callbacks=[tbCallBack,earlystopCallback])

Upvotes: 2

Views: 2535

Answers (3)

sob3kx

Reputation: 13

There is a known "problem" (a strange design decision) regarding how BatchNormalization layers behave in Keras when their base model is frozen, and your bad results may be related to this issue.
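A minimal sketch of one commonly discussed workaround (assumption: Keras 2.x with the TensorFlow backend): build the frozen base with the learning phase fixed to inference, so its BatchNormalization layers keep using their pretrained moving mean/variance instead of the current mini-batch statistics during fine-tuning.

import keras
import keras.backend as K
from keras.layers import Dense
from keras.models import Model

K.set_learning_phase(0)   # inference mode while the pretrained base is built
base = keras.applications.resnet50.ResNet50(include_top=False, pooling='avg')
for layer in base.layers:
    layer.trainable = False   # freeze all pretrained weights

K.set_learning_phase(1)   # back to training mode for the new classifier head
out = Dense(27, activation='softmax')(base.output)  # 27 classes, as in the question
workaround_model = Model(base.input, out)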

Upvotes: 0

Ioannis Nasios

Reputation: 8527

In your training you are using a pretrained model (ResNet50), changing only the last layer, because you want to predict only a few classes and not the 1000 classes the pretrained model was trained on (that is the point of transfer learning).

You are freezing all the weights, so you are not letting your model train. Try:

import keras
from keras.layers import Dense, Dropout, BatchNormalization

model = keras.applications.resnet50.ResNet50(include_top=False, pooling='avg')

for layer in model.layers:
    layer.trainable=False
last = model.output
x = Dense(512, activation='relu')(last)
x = Dropout(0.5)(x)
#x = BatchNormalization()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
#x = BatchNormalization()(x)
x = Dense(len(classes), activation="softmax")(x)

You can modify the code above: change the number of neurons (512), add or remove the dropout/batch-normalization layers, use as many dense layers as you want, and so on.
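To make the snippet complete, the new head still needs to be wrapped into a model and compiled; here is a minimal sketch, mirroring the compile and fit calls from the question:

from keras.models import Model
from keras.optimizers import Adam

finetuned_model = Model(model.input, x)   # base input -> new softmax head
finetuned_model.compile(optimizer=Adam(lr=0.001),
                        loss='categorical_crossentropy',
                        metrics=['accuracy'])

history = finetuned_model.fit_generator(train_generator,
                                        validation_data=validation_generator,
                                        epochs=85, verbose=1)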

Upvotes: 0

pitfall

Reputation: 2621

  1. You need to match the preprocessing used for the pretrained network instead of coming up with your own. Double-check the network input tensor, i.e. whether the channel-wise average of your input matches that of the data used for the pretrained network (see the sanity check after this list).

  2. It could be that your new data is very different from the data used for the pretrained network. In that case, all BN layers will migrate their pretrained mean/variance to new values, so an increasing loss is also possible (though eventually the loss should decrease).
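For point 1, a quick sanity check, assuming the generators from the question: keras.applications.resnet50.preprocess_input converts images to BGR and subtracts the ImageNet channel means, so the channel-wise mean of a correctly preprocessed batch should sit roughly around zero rather than around the raw pixel average.

import numpy as np

# Pull one preprocessed batch from the training generator and inspect it.
batch_x, batch_y = next(train_generator)
print(batch_x.shape)                 # e.g. (32, 224, 224, 3)
print(batch_x.mean(axis=(0, 1, 2)))  # per-channel means; expect values near 0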

Upvotes: 1
