Reputation: 55
I am trying to train a machine learning model to classify images, but I run into an error when I use the categorical_crossentropy loss function.
Here is the code I am using to build my model.
import numpy as np
import os
import PIL
import PIL.Image
import tensorflow as tf
import pathlib
import glob
import matplotlib.pyplot as plt
from tensorflow.keras import layers
from tensorflow.keras import callbacks
from tensorflow import keras
from datetime import datetime
import tensorboard
if __name__ == "__main__":
    # This first section mostly follows the tutorial at https://www.tensorflow.org/tutorials/images/classification
    data_dir = "img_directories"
    image_count = len(list(glob.glob(f'{data_dir}/*/*.png')))
    print(image_count)
    batch_size = 128
    img_height = 100
    img_width = 100

    # Set up training data
    val_split = 0.2
    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        data_dir,
        validation_split=val_split,
        subset="training",
        seed=123,
        image_size=(img_height, img_width),
        batch_size=batch_size)

    # Set up testing data
    val_ds = tf.keras.preprocessing.image_dataset_from_directory(
        data_dir,
        validation_split=val_split,
        subset="validation",
        seed=123,
        image_size=(img_height, img_width),
        batch_size=batch_size,
        color_mode='rgb')

    class_names = train_ds.class_names
    print(class_names)
    num_classes = len(train_ds.class_names)

    # Normalize data
    normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
    normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
    image_batch, labels_batch = next(iter(normalized_ds))

    # Set up model
    model = tf.keras.Sequential()
    # model.add(layers.experimental.preprocessing.Rescaling((1./255),input_shape=(100, 100, 3)))
    model.add(layers.Conv2D(64, (3,3), activation='relu', input_shape=(img_height, img_width, 3)))
    model.add(layers.MaxPooling2D(pool_size=(2,2)))
    model.add(layers.Dropout(0.2))
    model.add(layers.Conv2D(64, (5,5)))
    model.add(layers.MaxPooling2D(pool_size=(3,3)))
    model.add(layers.Dense(64))
    model.add(layers.Flatten())
    model.add(layers.Dense(num_classes, activation='softmax'))

    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    model.summary()

    earlystopping = callbacks.EarlyStopping(monitor="val_loss",
                                            mode="min", patience=7,
                                            restore_best_weights=True)
    history = model.fit(
        normalized_ds,
        validation_data=val_ds,
        epochs=100,
        callbacks=[earlystopping]
    )
Setting the loss function to categorical_crossentropy gives me the following error:
ValueError: Shapes (None, 1) and (None, 32) are incompatible
Here 32 is the number of classes in my dataset, so the problem seems to involve my output layer. However, the model runs without issues when I use sparse_categorical_crossentropy instead.
How do I make it work with categorical_crossentropy, given that I have so few classes?
Edit:
I have tried something like this, but I am still getting errors similar to the original.
val_imgs, val_labels = next(iter(val_ds))
val_labels_one_hot = tf.one_hot(labels_batch, num_classes)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
earlystopping = callbacks.EarlyStopping(monitor="val_loss",
                                        mode="min", patience=7,
                                        restore_best_weights=True)
history = model.fit(
    train_ds,
    validation_data=[val_imgs, val_labels_one_hot],
    epochs=100,
    callbacks=[earlystopping]
)
Upvotes: 0
Views: 511
Reputation: 10985
sparse_categorical_crossentropy (documentation) expects integer class labels, whereas categorical_crossentropy (documentation) expects one-hot encoded vectors. You can use either, but sparse_categorical_crossentropy works here because you're providing each label as a single integer, which gives targets of shape (None, 1).
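To make the difference between the two label formats concrete, here is a small NumPy sketch (not Keras itself, and the probabilities are made-up numbers) showing that both losses compute the same value and differ only in how the target is encoded:

```python
import numpy as np

# Predicted class probabilities for 3 samples over 4 classes (illustrative values).
probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
])

# Integer labels: the format sparse_categorical_crossentropy expects.
int_labels = np.array([0, 1, 3])

# One-hot labels: the format categorical_crossentropy expects.
one_hot_labels = np.eye(4)[int_labels]

# Sparse form: look up the probability of the true class by index.
sparse_loss = -np.log(probs[np.arange(len(int_labels)), int_labels])

# Categorical form: weight the log-probabilities by the one-hot vector.
categorical_loss = -np.sum(one_hot_labels * np.log(probs), axis=1)

print(int_labels.shape, one_hot_labels.shape)       # (3,) (3, 4)
print(np.allclose(sparse_loss, categorical_loss))   # True
```

The shape mismatch in your error message is the same mismatch you see here between the integer labels and the one-hot labels: the loss receives one integer per sample but expects one value per class.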
In summary, if you want to use categorical_crossentropy, you'll need to convert your current target tensor to one-hot encodings (which the loss will then compare against the output of the final softmax layer).
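A minimal sketch of that conversion, assuming your datasets yield batches of (images, integer labels) as in your code. The random arrays and the names int_label_ds and to_one_hot are stand-ins I made up for illustration; in your case you would map over train_ds and val_ds instead:

```python
import numpy as np
import tensorflow as tf

num_classes = 32  # class count taken from the question

# Stand-in for a dataset built by image_dataset_from_directory:
# batches of (images, integer labels). The data here is random and illustrative.
images = np.random.rand(8, 100, 100, 3).astype("float32")
labels = np.random.randint(0, num_classes, size=(8,)).astype("int32")
int_label_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)


def to_one_hot(image_batch, label_batch):
    """Leave the images untouched and one-hot encode the integer labels."""
    return image_batch, tf.one_hot(label_batch, depth=num_classes)


one_hot_ds = int_label_ds.map(to_one_hot)

image_batch, label_batch = next(iter(one_hot_ds))
print(label_batch.shape)  # (4, 32)
```

The same map has to be applied to both the training and the validation dataset, otherwise the validation step will hit the same shape error. Alternatively, image_dataset_from_directory accepts label_mode='categorical', which should make it return one-hot labels directly so that no extra map is needed.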
Upvotes: 1