Reputation: 55
I am using Keras with the TensorFlow backend to train a modified ResNet-50 that classifies objects into 15 categories. I am using the Adam optimizer and have tried learning rates of 0.001 and 0.01, but got similar results.
The problem I am facing is that the loss and accuracy show similar behavior on both the training and validation datasets: they go up and down at roughly the same times, whereas I expected accuracy to rise as the loss went down. What could be causing this behavior?
Here are some TensorBoard curves from my last run:
Edit: The code for the model is the following:
# Model creation:
from keras.applications.resnet50 import ResNet50
from keras.models import Model
from keras.layers import Dense
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, TensorBoard

def create_model(possible_labels):
    # Build a ResNet-50 without pretrained weights and replace its final
    # layer with a Dense layer sized to the number of classes.
    rn50 = ResNet50(include_top=True, weights=None)
    layer_name = rn50.layers[-2].name
    model = Model(rn50.input,
                  Dense(len(possible_labels))(rn50.get_layer(layer_name).output))
    adam = Adam(lr=0.0001)
    model.compile(loss='categorical_crossentropy',
                  optimizer=adam, metrics=['accuracy'])
    # Save the best model seen so far and log curves to TensorBoard.
    checkpointer = ModelCheckpoint(filepath='the_best_you_ever_had',
                                   verbose=1, save_best_only=True)
    tensorboard = TensorBoard()
    return model, [checkpointer, tensorboard]

model, checkpointers = create_model(labels)
# Dataset generation:
from keras.preprocessing.image import ImageDataGenerator

# Augment the training set; the validation set is left unmodified.
train_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    vertical_flip=True,
    channel_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2
)
val_datagen = ImageDataGenerator()

train_generator = train_datagen.flow_from_directory(
    'data\\train',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)
val_generator = val_datagen.flow_from_directory(
    'data\\validation',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)
# Model training:
model.fit_generator(
    train_generator,
    steps_per_epoch=5000,
    epochs=50,
    validation_data=val_generator,
    callbacks=checkpointers
)
Upvotes: 2
Views: 426
Reputation: 55
I found the error in my code: I was using the default (linear) activation in the final Dense layer I added. Since this is a classification problem and not a regression problem, I switched it to a softmax activation:
model = Model(rn50.input,
              Dense(len(possible_labels), activation='softmax')
              (rn50.get_layer(layer_name).output))
After that change, the curves started behaving as expected and I reached 96% accuracy.
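As a side note, a minimal sketch of an equivalent fix (assuming the tf.keras API is available) is to keep the linear output layer and tell the loss that it receives logits, so the softmax is applied inside the loss:

# Alternative sketch (assumption: tf.keras is available): keep the linear
# (logit) output and let the loss apply the softmax internally.
from tensorflow.keras.losses import CategoricalCrossentropy

model.compile(loss=CategoricalCrossentropy(from_logits=True),
              optimizer=adam,
              metrics=['accuracy'])

With this variant the model's raw predictions are logits rather than probabilities, so you would still need to apply a softmax yourself at inference time if you want probability scores.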
Upvotes: 1