Reputation: 45
I built a simple convolutional neural network on top of the pre-trained VGG16.
I am using the Pokemon generation-one dataset, which contains 10,000 images belonging to 149 different classes. I manually split the dataset into separate directories, 0.7 for training and 0.3 for validation.
The problem is that I am getting high training accuracy, but the validation accuracy stays much lower.
The code below shows the best configuration I have found, using the Adam optimizer with a learning rate of 0.0001.
Can someone suggest how I can improve the performance and avoid overfitting?
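For reference, a short script along these lines reproduces the split (the source folder name is illustrative; the train/test paths match the ones used in the code below):
import os, random, shutil
random.seed(0)
src_root = 'datasets/generation/all'  # illustrative: one sub-folder per class
for cls in os.listdir(src_root):
    images = os.listdir(os.path.join(src_root, cls))
    random.shuffle(images)
    n_train = int(0.7 * len(images))  # 0.7 train / 0.3 validation
    for split, subset in (('train', images[:n_train]), ('test', images[n_train:])):
        dst = os.path.join('datasets/generation', split, cls)
        os.makedirs(dst, exist_ok=True)
        for img in subset:
            shutil.copy(os.path.join(src_root, cls, img), dst)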
Code:
import tensorflow as tf
import numpy as np

# Pre-trained VGG16 convolutional base, without its fully connected head
vgg_model = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
vgg_model.trainable = False  # freeze the convolutional base

# New classifier head on top of the frozen base
model = tf.keras.models.Sequential()
model.add(vgg_model)
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(256, activation='relu'))
model.add(tf.keras.layers.Dense(149, activation='softmax'))

model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.0001, decay=0.0001/100),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Augment the training images; only rescale the validation images
train = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
test = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)

training_set = train.flow_from_directory('datasets/generation/train', target_size=(224, 224), class_mode='categorical')
# validation images come from the non-augmenting generator
val_set = test.flow_from_directory('datasets/generation/test', target_size=(224, 224), class_mode='categorical')

history = model.fit_generator(training_set, steps_per_epoch=64, epochs=100, validation_data=val_set, validation_steps=64)
Here is the output, sampled every 10 epochs:
Epoch 1/100
64/64 [====================] - 57s 891ms/step - loss: 4.8707 - acc: 0.0654 - val_loss: 4.7281 - val_acc: 0.0718
Epoch 10/100
64/64 [====================] - 53s 821ms/step - loss: 2.9540 - acc: 0.4141 - val_loss: 3.2206 - val_acc: 0.3447
Epoch 20/100
64/64 [====================] - 56s 869ms/step - loss: 1.9040 - acc: 0.6279 - val_loss: 2.6155 - val_acc: 0.4577
Epoch 30/100
64/64 [====================] - 50s 781ms/step - loss: 1.2899 - acc: 0.7658 - val_loss: 2.3345 - val_acc: 0.4897
Epoch 40/100
64/64 [====================] - 53s 832ms/step - loss: 1.0192 - acc: 0.8096 - val_loss: 2.1765 - val_acc: 0.5149
Epoch 50/100
64/64 [====================] - 55s 854ms/step - loss: 0.7948 - acc: 0.8672 - val_loss: 2.1082 - val_acc: 0.5359
Epoch 60/100
64/64 [====================] - 52s 816ms/step - loss: 0.5774 - acc: 0.9106 - val_loss: 2.0673 - val_acc: 0.5435
Epoch 70/100
64/64 [====================] - 52s 811ms/step - loss: 0.4383 - acc: 0.9385 - val_loss: 2.0499 - val_acc: 0.5454
Epoch 80/100
64/64 [====================] - 56s 881ms/step - loss: 0.3638 - acc: 0.9473 - val_loss: 1.9849 - val_acc: 0.5501
Epoch 90/100
64/64 [====================] - 55s 860ms/step - loss: 0.2860 - acc: 0.9609 - val_loss: 1.9564 - val_acc: 0.5531
Epoch 100/100
64/64 [====================] - 52s 815ms/step - loss: 0.2328 - acc: 0.9697 - val_loss: 2.0334 - val_acc: 0.5615
Upvotes: 2
Views: 2212
Reputation: 8527
As I can see from your output above, you are not overfitting yet, but there is a huge gap between the training and validation scores. There are plenty of things you can try to improve your validation score.
Add a dropout layer like this:
model.add(tf.keras.layers.Dense(256, activation='relu'))
model.add(tf.keras.layers.Dropout(0.5))  # randomly zeroes 50% of the activations during training
model.add(tf.keras.layers.Dense(149, activation='softmax'))
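For completeness, here is a sketch of how that head plugs into your model, together with an EarlyStopping callback as one more thing to try (my addition here; the 0.5 dropout rate and patience=10 are illustrative choices):
import tensorflow as tf

# Same frozen VGG16 base as in your question
vgg_model = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
vgg_model.trainable = False

model = tf.keras.models.Sequential([
    vgg_model,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.5),  # the regularization suggested above
    tf.keras.layers.Dense(149, activation='softmax'),
])
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.0001),
              loss='categorical_crossentropy', metrics=['accuracy'])

# Stop once val_loss stops improving instead of always running 100 epochs
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
Pass it with callbacks=[early_stop] in your fit_generator call; your log shows val_loss flattening out around epoch 60, so training would end there with the best weights restored.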
Upvotes: 2