Ivy Jackson

Reputation: 55

"ValueError: Shapes (None, 1) and (None, 32) are incompatible" when training image classification network in TensorFlow using categorical_crossentropy

I am trying to train a machine learning model to classify images, but I run into an error when I use the categorical_crossentropy loss function.

Here is the code that I am using to generate my model.

import numpy as np
import os
import PIL
import PIL.Image
import tensorflow as tf
import pathlib
import glob
import matplotlib.pyplot as plt
from tensorflow.keras import layers
from tensorflow.keras import callbacks 
from tensorflow import keras
from datetime import datetime
import tensorboard

if __name__ == "__main__":
    #This first section mostly follows the tutorial at https://www.tensorflow.org/tutorials/images/classification
    data_dir = "img_directories"
    image_count = len(list(glob.glob(f'{data_dir}/*/*.png')))
    print(image_count)

    batch_size = 128
    img_height = 100 
    img_width = 100

    #Set up training data
    val_split = 0.2

    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=val_split,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

    #Set up validation data
    val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=val_split,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size, 
    color_mode='rgb')

    class_names = train_ds.class_names
    print(class_names)

    num_classes = len(train_ds.class_names)

    #   Normalize data
    normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
    normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
    image_batch, labels_batch = next(iter(normalized_ds))

    #Set up model

    model = tf.keras.Sequential()
    # model.add(layers.experimental.preprocessing.Rescaling((1./255),input_shape=(100, 100, 3)))
    model.add(layers.Conv2D(64, (3,3), activation='relu',input_shape=(img_height, img_width, 3)))
    model.add(layers.MaxPooling2D(pool_size=(2,2)))
    model.add(layers.Dropout(0.2))
    model.add(layers.Conv2D(64, (5,5)))
    model.add(layers.MaxPooling2D(pool_size=(3,3)))
    model.add(layers.Dense(64))
    model.add(layers.Flatten())
    model.add(layers.Dense(num_classes, activation='softmax'))
    
    model.compile(optimizer='adam',
                loss='categorical_crossentropy',
                metrics=['accuracy'])


    model.summary()

    earlystopping = callbacks.EarlyStopping(monitor ="val_loss",  
                                        mode ="min", patience = 7,  
                                        restore_best_weights = True) 


    history=model.fit(
    normalized_ds,
    validation_data=val_ds,
    epochs=100,
    callbacks=[earlystopping]
    )

Setting the loss function to categorical_crossentropy gives me the following error:

    ValueError: Shapes (None, 1) and (None, 32) are incompatible

Here, 32 is the number of classes in my dataset, so the problem seems to involve my output layer.

However, the error does not occur when I run the model with sparse_categorical_crossentropy.

How do I make it work with categorical_crossentropy, given that I have so few classes?

Edit:

I have tried something like this, but I am still getting errors similar to the original.

    val_imgs, val_labels = next(iter(val_ds))
    val_labels_one_hot=tf.one_hot(labels_batch,num_classes)

    model.compile(optimizer='adam',
                loss='categorical_crossentropy',
                metrics=['accuracy'])


    model.summary()

    earlystopping = callbacks.EarlyStopping(monitor ="val_loss",  
                                        mode ="min", patience = 7,  
                                        restore_best_weights = True) 

    history=model.fit(
    train_ds,
    validation_data=[val_imgs,val_labels_one_hot],
    epochs=100,
    callbacks=[earlystopping]
    )

Upvotes: 0

Views: 511

Answers (1)

runDOSrun

Reputation: 10985

sparse_categorical_crossentropy (documentation) expects integer class indices, whereas categorical_crossentropy (documentation) expects one-hot encoded vectors. Either loss can be used, but sparse_categorical_crossentropy works with your current pipeline because each label is a single integer, which Keras sees as shape (None, 1), while your softmax output has shape (None, 32).
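To make the difference concrete, here is a minimal NumPy sketch (not the Keras implementation) that computes cross-entropy from integer labels and from one-hot labels. The probabilities and labels are made-up illustration values; the point is only that both label formats describe the same target and yield the same loss.

```python
import numpy as np

# Hypothetical softmax outputs for 2 samples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

# Integer labels, the format sparse_categorical_crossentropy expects.
int_labels = np.array([0, 1])

# One-hot labels, the format categorical_crossentropy expects.
one_hot_labels = np.eye(3)[int_labels]

# "Sparse" form: pick the predicted probability of the true class by index.
sparse_loss = -np.log(probs[np.arange(len(int_labels)), int_labels])

# "Categorical" form: sum over classes, weighted by the one-hot vector.
categorical_loss = -np.sum(one_hot_labels * np.log(probs), axis=1)

print(one_hot_labels.shape)                        # (2, 3)
print(np.allclose(sparse_loss, categorical_loss))  # True
```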

In summary, if you want to use categorical_crossentropy, you'll need to convert your target tensors to one-hot encodings, so that they have shape (None, 32) and match the output of the final softmax layer. This applies to both the training and the validation labels.
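One way to do the conversion is to map tf.one_hot over the dataset. The sketch below uses a small synthetic dataset as a stand-in for the output of image_dataset_from_directory (the zero-filled images and the label values are assumptions for illustration), and to_one_hot is a hypothetical helper name.

```python
import tensorflow as tf

num_classes = 32  # class count taken from the question

# Stand-in for train_ds / val_ds: batches of (images, integer labels).
images = tf.zeros([8, 100, 100, 3])
labels = tf.constant([0, 1, 2, 3, 4, 5, 6, 31], dtype=tf.int32)
int_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

def to_one_hot(x, y):
    # Leave the images untouched and expand each integer label
    # into a vector of length num_classes.
    return x, tf.one_hot(y, depth=num_classes)

one_hot_ds = int_ds.map(to_one_hot)

x_batch, y_batch = next(iter(one_hot_ds))
print(y_batch.shape)  # (4, 32)
```

In your code, the same map would be applied to both normalized_ds and val_ds before calling model.fit. Alternatively, image_dataset_from_directory accepts label_mode="categorical", which should produce one-hot labels directly, so no extra map step is needed.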

Upvotes: 1
