Rex

Reputation: 43

Keras transfer learning with ResNet50 and fit_generator: high acc but low val_acc

I'm using a ResNet50 model for transfer learning on 100,000 images covering 20 scene categories (MIT Places365 dataset). Because of memory restrictions, I train only the last 160 layers. Training accuracy is fairly high, but validation accuracy is extremely low. I think this might be overfitting, but I don't know how to fix it. I would really appreciate any advice on how to solve the low val_acc problem. Thank you very much. My code is as follows:

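import numpy as np
import keras
from keras.models import Model
from keras.layers import Flatten, Dense, Activation, Dropout
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping, ModelCheckpoint

# Validation images (V1) and their one-hot labels (V2)
V1 = np.load("C:/Users/Desktop/numpydataKeras_20_val/imgonehot_val_500.npy")
V2 = np.load("C:/Users/Desktop/numpydataKeras_20_val/labelonehot_val_500.npy")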


net = keras.applications.resnet50.ResNet50(include_top=False, weights='imagenet', input_tensor=None, input_shape=(224, 224, 3))

x = net.output
x = Flatten()(x)
x = Dense(128)(x)
x = Activation('relu')(x)
x = Dropout(0.5)(x)
output_layer = Dense(20, activation='softmax', name='softmax')(x)
net_final = Model(inputs=net.input, outputs=output_layer)

for layer in net_final.layers[:-160]:
    layer.trainable = False
for layer in net_final.layers[-160:]:
    layer.trainable = True

net_final.compile(Adam(lr=.00002122), loss='categorical_crossentropy', metrics=['accuracy'])

def data_generator():
    # 100,000 images stored as 1,000 files, one batch per file
    arr = np.arange(1000)
    np.random.shuffle(arr)  # visit the files in a random order
    while True:
        for i in arr:
            X_batch = np.load("C:/Users/Desktop/numpydataKeras/imgonehot_" + str((i + 1) * 100) + ".npy")
            y_batch = np.load("C:/Users/Desktop/numpydataKeras/labelonehot_" + str((i + 1) * 100) + ".npy")

            # Shuffle images and labels with the same permutation so pairs stay aligned
            perm = np.random.permutation(len(X_batch))
            yield X_batch[perm], y_batch[perm]

weights_file = 'C:/Users/Desktop/Transfer_learning_resnet50_fit_generator_02s.h5'
early_stopping = EarlyStopping(monitor='val_acc', patience=5, mode='auto', verbose=2)
model_checkpoint = ModelCheckpoint(weights_file, monitor='val_acc', save_best_only=True, verbose=2)
callbacks = [early_stopping, model_checkpoint]

model_fit = net_final.fit_generator(
    data_generator(),
    steps_per_epoch=1000,
    epochs=5,
    validation_data=(V1, V2),
    callbacks=callbacks,
    verbose=1,
    use_multiprocessing=False)  # Keras 2 name for the older pickle_safe argument

This is the training output:

Epoch 1/5
1000/1000 [==============================] - 3481s 3s/step - loss: 1.7917 - acc: 0.4757 - val_loss: 3.5872 - val_acc: 0.0560

Epoch 00001: val_acc improved from -inf to 0.05600, saving model to C:/Users/Desktop/Transfer_learning_resnet50_fit_generator_02s.h5
Epoch 2/5
1000/1000 [==============================] - 4884s 5s/step - loss: 1.1287 - acc: 0.6595 - val_loss: 4.2113 - val_acc: 0.0520

Epoch 00002: val_acc did not improve from 0.05600
Epoch 3/5
1000/1000 [==============================] - 4964s 5s/step - loss: 0.8033 - acc: 0.7464 - val_loss: 4.9595 - val_acc: 0.0520

Epoch 00003: val_acc did not improve from 0.05600
Epoch 4/5
1000/1000 [==============================] - 4961s 5s/step - loss: 0.5677 - acc: 0.8143 - val_loss: 4.5484 - val_acc: 0.0520

Epoch 00004: val_acc did not improve from 0.05600
Epoch 5/5
1000/1000 [==============================] - 4928s 5s/step - loss: 0.3999 - acc: 0.8672 - val_loss: 4.6155 - val_acc: 0.0400

Epoch 00005: val_acc did not improve from 0.05600
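
A note on reading this log: with 20 classes, random guessing gives 1/20 = 0.05 accuracy, which is almost exactly the reported val_acc. That may point less to ordinary overfitting and more to the network behaving differently at inference time, for example by assigning nearly every validation image to the same class. Below is a minimal sketch for checking this, assuming `probs` would be the output of `net_final.predict(V1)` and `labels` would be `V2`. The helper name `summarize_predictions` and the toy data are made up for illustration.

```python
import numpy as np

def summarize_predictions(probs, labels):
    """Compare accuracy with chance level and count predictions per class.

    probs:  (n_samples, n_classes) softmax outputs, e.g. net_final.predict(V1)
    labels: (n_samples, n_classes) one-hot ground truth, e.g. V2
    """
    n_classes = probs.shape[1]
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(labels, axis=1)
    accuracy = float(np.mean(predicted == actual))
    chance = 1.0 / n_classes
    # How many samples were assigned to each class
    counts = np.bincount(predicted, minlength=n_classes)
    return accuracy, chance, counts

# Toy stand-in data (NOT the real outputs): a model that always predicts class 3
rng = np.random.RandomState(0)
toy_labels = np.eye(20)[rng.randint(0, 20, size=500)]
toy_probs = np.zeros((500, 20))
toy_probs[:, 3] = 1.0

accuracy, chance, counts = summarize_predictions(toy_probs, toy_labels)
print("accuracy:", accuracy, "chance:", chance)
print("predictions per class:", counts)
```

If the per-class counts on the real validation set are concentrated in one or two classes while accuracy sits at chance level, the problem is more likely in how the model runs at inference than in the classifier head having memorized the training set.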

Upvotes: 3

Views: 1650

Answers (1)

Metal3d

Reputation: 2941

According to https://github.com/keras-team/keras/issues/9214#issuecomment-397916155, it seems that the batch normalization layers should be left trainable.

The following code can replace the loop where you set/unset trainable layers:

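from keras import backend as K

# `model` is your Keras model (net_final in the question)
for layer in model.layers:
    if hasattr(layer, 'moving_mean') and hasattr(layer, 'moving_variance'):
        # Batch normalization layer: keep it trainable and reset its moving statistics
        layer.trainable = True
        K.eval(K.update(layer.moving_mean, K.zeros_like(layer.moving_mean)))
        K.eval(K.update(layer.moving_variance, K.zeros_like(layer.moving_variance)))
    else:
        # Every other layer is frozen
        layer.trainable = False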

On my own data, I had to reduce the batch size to avoid running out of memory, and I now get:

Epoch 1/10
470/470 [==============================] - 90s 192ms/step - loss: 0.3513 - acc: 0.8660 - val_loss: 0.1299 - val_acc: 0.9590
Epoch 2/10
470/470 [==============================] - 83s 177ms/step - loss: 0.2204 - acc: 0.9163 - val_loss: 0.1276 - val_acc: 0.9471
Epoch 3/10
470/470 [==============================] - 83s 177ms/step - loss: 0.2219 - acc: 0.9184 - val_loss: 0.1048 - val_acc: 0.9589
Epoch 4/10
470/470 [==============================] - 83s 177ms/step - loss: 0.1813 - acc: 0.9327 - val_loss: 0.1857 - val_acc: 0.9303

Warning: this may affect accuracy, and you must freeze your model afterwards to avoid strange behavior at inference. Still, it is the only approach that worked for me.
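
To illustrate why the moving statistics matter, here is a simplified NumPy sketch of what a batch normalization layer computes in its two modes. It is not Keras's implementation: there is no learned scale or offset, the numbers are made up, and `batch_norm` is a hypothetical helper. It only shows that the training-mode output depends on the current batch, while the inference-mode output depends on the stored moving mean and variance, so statistics that do not match the data can go together with good training accuracy and poor validation accuracy.

```python
import numpy as np

def batch_norm(x, moving_mean, moving_var, training, eps=1e-3):
    """Simplified batch norm without gamma/beta (illustration only)."""
    if training:
        # Training mode: normalize with the statistics of the current batch
        mean, var = x.mean(axis=0), x.var(axis=0)
    else:
        # Inference mode: normalize with the stored moving statistics
        mean, var = moving_mean, moving_var
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.RandomState(0)
# Activations on the new dataset: mean around 5, std around 2 (made-up values)
x = rng.normal(loc=5.0, scale=2.0, size=(1000, 4))

# Moving statistics that do not match the new data (made-up values)
stale_mean = np.zeros(4)
stale_var = np.ones(4)

train_out = batch_norm(x, stale_mean, stale_var, training=True)
infer_out = batch_norm(x, stale_mean, stale_var, training=False)

print("training mode mean:", train_out.mean(axis=0).round(2))
print("inference mode mean:", infer_out.mean(axis=0).round(2))
```

In training mode the output is centered near zero. In inference mode it stays shifted, because the stored statistics describe different data, and the layers that follow then see inputs they were never trained on.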

Another comment, https://github.com/keras-team/keras/issues/9214#issuecomment-422490253, only checks the layer names and sets a layer trainable if it is a batch normalization layer, but that didn't change anything for me. Maybe it will help with your dataset.
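
For reference, a name-based variant could look roughly like the sketch below. The exact check used in the linked comment is not reproduced here. This version assumes batch normalization layers can be recognized by "bn" in their name, and the function name `set_trainable_by_name` is made up. It relies only on each layer having `name` and `trainable` attributes, so it is shown on stand-in objects rather than on a real model.

```python
from types import SimpleNamespace

def set_trainable_by_name(layers, marker="bn"):
    """Mark layers whose name contains `marker` as trainable, freeze the rest.

    Returns the names of the layers left trainable.
    """
    trainable_names = []
    for layer in layers:
        layer.trainable = marker in layer.name
        if layer.trainable:
            trainable_names.append(layer.name)
    return trainable_names

# Stand-in layers with hypothetical names; with Keras you would pass model.layers
layers = [
    SimpleNamespace(name="conv1", trainable=True),
    SimpleNamespace(name="bn_conv1", trainable=False),
    SimpleNamespace(name="res2a_branch2a", trainable=True),
    SimpleNamespace(name="bn2a_branch2a", trainable=False),
]
trainable = set_trainable_by_name(layers)
print(trainable)
```

With a real model, compile it again after changing `trainable` so that the change takes effect.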

Upvotes: 1
