Yoan B. M.Sc
Yoan B. M.Sc

Reputation: 1505

Tensorflow: loss and accuracy stay flat training CNN on image classification

I copied / pasted this Tensorflow tutorial into a Jupyter notebook. (As of this writting they changed the tutorial to the flower data set instead of the dog one, but the question still applies). https://www.tensorflow.org/tutorials/images/classification

The first part (without augmentation) runs fine and I get similar results.

But with data augmentation, my Loss and Accuracy stay flat across all epoch. I've checked this posts already on SO : Keras accuracy does not change How to fix flatlined accuracy and NaN loss in tensorflow image classification Tensorflow: loss decreasing, but accuracy stable

None of this applied, since the dataset is a standard one, I don't have the problem of corrupted data, plus I printed a couple of images augmented and it works fine (see below).

I've tried adding more fully connected layers to increase the model capacity, dropout to limit over fitting,... nothing change here are the curve :

Any ideas as to why? Have I missed something in the code? I know training a DL model is a lot of trial and error, but I'm sure there must be some logic or intuition beyond randomly turning the knobs until something happens.

Thanks !

enter image description here

Source Data : _URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'

path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)

PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

Params :

batch_size = 128
epochs = 15
IMG_HEIGHT = 150
IMG_WIDTH = 150

Preprocessing stage :

image_gen = ImageDataGenerator(rescale=1./255,
    rotation_range=20,
    width_shift_range=0.15,
    height_shift_range=0.15,
    horizontal_flip=True,
    zoom_range=0.2)

train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_HEIGHT, IMG_WIDTH))

augmented_images = [train_data_gen[0][0][i] for i in range(5)]
plotImages(augmented_images)

image_gen_val = ImageDataGenerator(rescale=1./255)

val_data_gen = image_gen_val.flow_from_directory(batch_size=batch_size,
                                                 directory=validation_dir,
                                                 target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                 class_mode='binary')

enter image description here

Model :

model_new = Sequential([
    Conv2D(16, 2, padding='same', activation='relu', 
           input_shape=(IMG_HEIGHT, IMG_WIDTH ,3)),
    MaxPooling2D(),
    Conv2D(32, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Dropout(0.2),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1)
])

model_new.compile(optimizer='adam',
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=['accuracy'])

model_new.summary()

history = model_new.fit(
    train_data_gen,
    steps_per_epoch= total_train // batch_size,
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps= total_val // batch_size
)

Upvotes: 0

Views: 1039

Answers (1)

Yoan B. M.Sc
Yoan B. M.Sc

Reputation: 1505

As suggested by @today, class_method= 'binary' was missing from the training data generator Now the model is able to train properly.

train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_HEIGHT, IMG_WIDTH),
                                               class_method = 'binary')

Upvotes: 0

Related Questions