Cael
Cael

Reputation: 41

CNN validation accuracy high, but bad at predictions?

I am trying to build a CNN to distinguish between 3 classes which are genuine faces, printed faces, and replayed faces. I prepared the data as so:

classes = ['Genuine', 'Printed', 'Replay']

base_dir = '/Dataset'

import os
import numpy as np
import glob
import shutil

for cl in classes:
  img_path = os.path.join(base_dir, cl)
  images = glob.glob(img_path + '/*.jpg')
  print("{}: {} Images".format(cl, len(images)))
  num_train = int(round(len(images)*0.8))
  train, val = images[:num_train], images[num_train:]

  for t in train:
    if not os.path.exists(os.path.join(base_dir, 'train', cl)):
      os.makedirs(os.path.join(base_dir, 'train', cl))
    shutil.move(t, os.path.join(base_dir, 'train', cl))

  for v in val:
    if not os.path.exists(os.path.join(base_dir, 'val', cl)):
      os.makedirs(os.path.join(base_dir, 'val', cl))
    shutil.move(v, os.path.join(base_dir, 'val', cl))

from tensorflow.keras.preprocessing.image import ImageDataGenerator

image_gen_train = ImageDataGenerator(
                    rescale=1./255,
                    rotation_range=45,
                    width_shift_range=.15,
                    height_shift_range=.15,
                    horizontal_flip=True,
                    zoom_range=0.5
                    )

batch_size = 32
IMG_SHAPE = 96 
train_data_gen = image_gen_train.flow_from_directory(
                                                batch_size=batch_size,
                                                directory=train_dir,
                                                shuffle=True,
                                                target_size=(IMG_SHAPE,IMG_SHAPE),
                                                class_mode='sparse'
                                                )

I built a simple model like the following:

    ## Model
import tensorflow as tf
from keras import regularizers
from keras.layers.normalization import BatchNormalization

IMG_SHAPE = (96, 96, 3)
batch_size = 32

## Trainable classification head


aConv_layer = tf.keras.layers.Conv2D(576, (3, 3), padding="same", 
                                     activation="relu", input_shape= IMG_SHAPE)
aConv_layer = tf.keras.layers.Conv2D(144, (3, 3), padding="same", 
                                     activation="relu", input_shape= IMG_SHAPE)

gmaxPool_layer = tf.keras.layers.GlobalMaxPooling2D() #reduces input from 4D to 2D
maxPool_layer = tf.keras.layers.MaxPool2D(pool_size=(1, 1), strides=None, 
                                          padding='valid', data_format=None,
                                          )

batNor_layer = tf.keras.layers.BatchNormalization(axis=-1, momentum=0.99, 
                                                  epsilon=0.001, 
                                center=True, scale=True, 
                                beta_initializer='zeros', 
                                gamma_initializer='ones', 
                                moving_mean_initializer='zeros', 
                                moving_variance_initializer='ones', 
                                beta_regularizer=None, gamma_regularizer=None, 
                                beta_constraint=None, gamma_constraint=None)

flat_layer = tf.keras.layers.Flatten()

dense_layer = tf.keras.layers.Dense(9, activation='softmax', 
                                    kernel_regularizer=regularizers.l2(0.01))

prediction_layer = tf.keras.layers.Dense(3, activation='softmax')

model = tf.keras.Sequential([
     #base_model,
     tf.keras.layers.Conv2D(576, (3, 3), padding="same", activation="relu", input_shape= IMG_SHAPE),
     tf.keras.layers.Dense(288, activation='softmax', kernel_regularizer=regularizers.l2(0.01)),
     tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=None, padding='valid', data_format=None),
     tf.keras.layers.Conv2D(144, (3, 3), padding="same", activation="relu"),
     tf.keras.layers.Dense(72, activation='softmax', kernel_regularizer=regularizers.l2(0.01)),
     tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=None, padding='valid', data_format=None),
     #
     batNor_layer,
     gmaxPool_layer,
     tf.keras.layers.Flatten(),
     #tf.keras.layers.Dropout(0.5),
     prediction_layer                        
])

learning_rate = 0.001

## Compiles the model
model.compile(optimizer=tf.keras.optimizers.Adam(lr=learning_rate),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy']
)

I trained the model and got the following, which I would assume to be great results:

CNN result

However, whenever I tried to predict an image with the following code, it would almost always get it wrong:

import numpy as np
from google.colab import files
from keras.preprocessing import image

uploaded = files.upload()
for fn in uploaded.keys():

# predicting images
  path = fn
  img = image.load_img(path, target_size=(96, 96))
  x = image.img_to_array(img)
  x = np.expand_dims(x, axis=0)
  images = np.vstack([x])
  classes = model.predict(images, batch_size=10)
print(fn)
print('Genuine | Printout | Replay')
print(np.argmax(classes))

How can the predictions be wrong when the validation accuracy be so high? Here is the Codelab, if it helps.

Upvotes: 0

Views: 1479

Answers (2)

Anis Cherid
Anis Cherid

Reputation: 11

Somehow, the image generator of Keras works well when combined with fit() or fit_generator() function, but fails miserably when combined with predict_generator() or the predict() function. The Keras predict() function generally fails when working with batch prediction.

When using Plaid-ML Keras back-end for AMD processor, I would rather loop through all test images one-by-one and get the prediction for each image in each iteration.

import os
from PIL import Image
import keras
import numpy

# code for creating dan training model is not included

print("Prediction result:")
dir = "/path/to/test/images"
files = os.listdir(dir)
correct = 0
total = 0
#dictionary to label all animal category class.
classes = {
    0:'This is Cat',
    1:'This is Dog',
}
for file_name in files:
    total += 1
    image = Image.open(dir + "/" + file_name).convert('RGB')
    image = image.resize((100,100))
    image = numpy.expand_dims(image, axis=0)
    image = numpy.array(image)
    image = image/255
    pred = model.predict_classes([image])[0]
    animals_category = classes[pred]
    if ("cat" in file_name) and ("cat" in sign):
        print(correct,". ", file_name, animals_category)
        correct+=1
    elif ("dog" in file_name) and ("dog" in animals_category):
        print(correct,". ", file_name, animals_category)
        correct+=1
print("accuracy: ", (correct/total))

Upvotes: 0

jkr
jkr

Reputation: 19260

Process the images for prediction in the same way that you processed your images for training. Specifically, rescale your images like you did with ImageDataGenerator.

import numpy as np
from google.colab import files
from keras.preprocessing import image

uploaded = files.upload()
for fn in uploaded.keys():

# predicting images
  path = fn
  img = image.load_img(path, target_size=(96, 96))
  x = image.img_to_array(img)
  # Rescale image.
  x = x / 255.
  x = np.expand_dims(x, axis=0)
  images = np.vstack([x])
  classes = model.predict(images, batch_size=10)
print(fn)
print('Genuine | Printout | Replay')
print(np.argmax(classes))

Upvotes: 4

Related Questions