ImageDataGenerator Predict Class - Why are the predictions not correctly converting from probabilities to predicted class?

Question

I have a directory set up like this:

images
-- val
    --class1
    --class2
-- test
    --all_classes
-- train
    --class1
    --class2

Each directory contains a set of images. I want to predict whether each image in test belongs to class 1 or class 2.

I wrote this to read in the training and validation data:

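# Imports assumed from the names used below (tf.keras API)
import os
import numpy as np
from tensorflow.keras import callbacks
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input, MaxPooling2D
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import plot_model

train_path = "/content/drive/train/"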
valid_path = "/content/drive/val/"

train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator=train_datagen.flow_from_directory(
  directory=train_path,
  batch_size=32,
  class_mode='binary',
  target_size=(150,150)
)

validation_generator=test_datagen.flow_from_directory(
  directory=valid_path,
  batch_size=32,
  class_mode='binary',
  target_size=(150,150)
)

I then created a network:

def create_network(): 
  model = Sequential()
  model.add(Input(shape=(150,150,3)))

  model.add(Conv2D(32, kernel_size=3,strides=(1, 1),activation='relu', padding='valid', dilation_rate=1))
  model.add(MaxPooling2D(pool_size=(2, 2)))

  model.add(Conv2D(64, kernel_size=3, strides=(1, 1), activation='relu',padding='valid', dilation_rate=1))
  model.add(MaxPooling2D(pool_size=(2, 2)))

  model.add(Flatten())
  model.add(Dense(512, activation='relu'))

  model.add(Dense(1, activation='sigmoid'))
  plot_model(model, to_file='/content/drive/question1_model.png', show_shapes=True, show_layer_names=True)

  model.compile(optimizer = 'adam',
                   loss = 'binary_crossentropy', 
                   metrics = ['accuracy'])
  return model

And fit the model:

def fit_model(train_generator=train_generator, validation_generator=validation_generator,network=create_network()):
  checkpoint_path = "/content/drive/question1_checkpoint.h5"
  checkpoint_dir = os.path.dirname(checkpoint_path)

  callbacks_list = [
      callbacks.EarlyStopping(
          monitor = 'accuracy',
          patience = 5,
      ),

      callbacks.ModelCheckpoint(
          filepath=checkpoint_path,
          monitor = 'val_loss',
          #save_weights_only=True,
          save_best_only=True,
      ),

  ]

  model = network
  history = model.fit(train_generator,
                      epochs=200,
                      validation_data=validation_generator,
                      # no batch_size here: the generators already yield batches of 32
                      callbacks = callbacks_list,
                      verbose=1
                      )
  return history,model

history,model = fit_model(train_generator,validation_generator)

The training and validation accuracy of the model are both above 80%. I then loaded the saved checkpoint back in to predict:

model = load_model('/content/drive/question1_checkpoint.h5')

Then I wanted to predict the classes of the images in the test directory:

test_datagen = ImageDataGenerator(rescale=1./255)
test_path = "/content/drive/test/"

test_generator = test_datagen.flow_from_directory(
  directory=test_path,
  batch_size=16,
  class_mode='binary',
  target_size=(150,150),
  shuffle = False
)
test_generator.reset()
filenames = test_generator.filenames
nb_samples = len(filenames)
batch_size=16
# round up to a whole number of batches so the last partial batch is included
predict = model.predict(test_generator,steps=int(np.ceil(test_generator.n/batch_size)))

When I print the start of predict, I can see:

[[6.09035552e-01]
 [2.47541070e-02]
 [7.37663209e-02]
 [5.22839129e-02]
 [2.94408262e-01]
 [1.39171720e-01]
 [6.15863085e-01]

I think this gives me the probability for class 1, right? But when I print the class of each prediction with:

predicted_class_indices=np.argmax(predict,axis=-1)
print(predicted_class_indices)

The output is:

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0]

Does this mean that my predicted probabilities are not being correctly translated to classes? For example, 2.47541070e-02 is about 0.02, whereas 6.09035552e-01 is about 0.61, so shouldn't these have been predicted to be in different classes? Could someone show me where I'm going wrong?
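
For reference, here is a minimal sketch of what seems to be happening, using only the seven probabilities printed above. With a single sigmoid output, predict has shape (N, 1), so np.argmax(predict, axis=-1) compares each row's one value with itself and always returns index 0. Comparing each probability against a cutoff gives a class index instead. The 0.5 cutoff is an assumption (a common default, not something the code above sets), and the label mapping is a stand-in that mimics the alphanumeric ordering flow_from_directory uses for subdirectory names; on the real generator it should be available as train_generator.class_indices.

```python
import numpy as np

# First seven probabilities printed in the question, shape (7, 1).
predict = np.array([
    [6.09035552e-01],
    [2.47541070e-02],
    [7.37663209e-02],
    [5.22839129e-02],
    [2.94408262e-01],
    [1.39171720e-01],
    [6.15863085e-01],
])

# argmax over the last axis compares the values within each row.
# Each row holds a single value, so the winning index is always 0.
argmax_classes = np.argmax(predict, axis=-1)

# Threshold the sigmoid output instead. The 0.5 cutoff is an assumption.
threshold = 0.5
predicted_class_indices = (predict[:, 0] > threshold).astype(int)

# Stand-in for train_generator.class_indices (hypothetical here):
# subdirectory names sorted alphanumerically, then numbered from 0.
class_indices = {name: i for i, name in enumerate(sorted(["class1", "class2"]))}
index_to_label = {i: name for name, i in class_indices.items()}
predicted_labels = [index_to_label[int(i)] for i in predicted_class_indices]

print(argmax_classes)
print(predicted_class_indices)
print(predicted_labels)
```

Under these assumptions the sigmoid value is the probability of the class with index 1, which would be the class2 directory, so 0.61 maps to class2 and 0.02 maps to class1.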
