faiza rashid
faiza rashid

Reputation: 765

ValueError: Shapes (None, 1) and (None, 2) are incompatible

I am training a facial expression (angry vs happy) model. Last dense output layer was previously 1 but when i predict an image it's output was always 1 with 64 % accuracy. So i changed it to 2 for 2 outputs. But now i am getting this error::

Epoch 1/15

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-54-9c7272c38dcb> in <module>()
     11     epochs=epochs,
     12     validation_data = val_data_gen,
---> 13     validation_steps = validation_steps,
     14 
     15 )

10 frames

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    966           except Exception as e:  # pylint:disable=broad-except
    967             if hasattr(e, "ag_error_metadata"):
--> 968               raise e.ag_error_metadata.to_exception(e)
    969             else:
    970               raise

ValueError: in user code:

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:571 train_function  *
        outputs = self.distribute_strategy.run(
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:951 run  **
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
        return fn(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:533 train_step  **
        y, y_pred, sample_weight, regularization_losses=self.losses)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/compile_utils.py:205 __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:143 __call__
        losses = self.call(y_true, y_pred)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:246 call
        return self.fn(y_true, y_pred, **self._fn_kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:1527 categorical_crossentropy
        return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:4561 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py:1117 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (None, 1) and (None, 2) are incompatible

The relevant code is :

    model = Sequential([
    Conv2D(32,3, activation='relu', input_shape=(48,48,1)),
    BatchNormalization(),
    MaxPooling2D(pool_size=(3, 3)),
  
    Flatten(),
    Dense(512, activation='relu'),
    Dense(2,activation='softmax')
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])


model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_6 (Conv2D)            (None, 46, 46, 32)        320       
_________________________________________________________________
batch_normalization_4 (Batch (None, 46, 46, 32)        128       
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 15, 15, 32)        0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 7200)              0         
_________________________________________________________________
dense_8 (Dense)              (None, 512)               3686912   
_________________________________________________________________
dense_9 (Dense)              (None, 2)                 1026      
=================================================================
Total params: 3,688,386
Trainable params: 3,688,322
Non-trainable params: 64
_________________________________________________________________


epochs = 15
steps_per_epoch = train_data_gen.n//train_data_gen.batch_size
validation_steps = val_data_gen.n//val_data_gen.batch_size



history = model.fit(
    x=train_data_gen,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    validation_data = val_data_gen,
    validation_steps = validation_steps,
    
)

Upvotes: 66

Views: 209016

Answers (8)

MohitHustler
MohitHustler

Reputation: 71

Changing from 'categorical_crossentropy' to 'sparse_categorical_crossentropy' worked for me in case of multilabel classification

Upvotes: 7

hungtran273
hungtran273

Reputation: 1357

If your dataset was load with image_dataset_from_directory, use label_mode='categorical'

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  path,
  label_mode='categorical'
)

Or load with flow_from_directory, flow_from_dataframe then use class_mode='categorical'

train_ds = ImageDataGenerator.flow_from_directory(
  path,
  class_mode='categorical'
)

Upvotes: 31

arilwan
arilwan

Reputation: 3983

As @Akash pointed out, should convert your labels to one-hot encoded, like so:

y = keras.utils.to_categorical(y, num_classes=num_classes_in_your_case)

Upvotes: 3

Masum
Masum

Reputation: 283

I encountered this problem myself and in my case, the problem was in the declaration of the model. I was trying to use VGG16 for transfer learning and I used the wrong layer in place of the output. Instead of using the prediction layer that I created, I used another layer. So look in your model if you misplaced any layer when you encounter this error.

Upvotes: 0

zakk616
zakk616

Reputation: 1542

i was facing the same problem my shapes were

shape of X (271, 64, 64, 3)
shape of y (271,)
shape of trainX (203, 64, 64, 3)
shape of trainY (203, 1)
shape of testX (68, 64, 64, 3)
shape of testY (68, 1)

and

loss="categorical_crossentropy"

i changed it to

loss="sparse_categorical_crossentropy"

and it worked like a charm for me

Upvotes: 89

Akash Kumar
Akash Kumar

Reputation: 483

You can change the labels from binary values to categorical and continue with the same code. For example,

from keras.utils import to_categorical
one_hot_label = to_cateorical(input_labels)
# change to [1, 0, 0,..., 0]  --> [[0, 1], [1, 0], ..., [1, 0]]

You can go through this link to understand better Keras API.

If you want to use categorical crossentropy for two classes, use softmax and do one hot encoding. Either for binary classification, you can use binary crossentropy as in previous answer mentioned by using sigmoid activation function.

  1. Categorical Cross entropy:
model = Sequential([
    Conv2D(32,3, activation='relu', input_shape=(48,48,1)),
    BatchNormalization(),
    MaxPooling2D(pool_size=(3, 3)),

    Flatten(),
    Dense(512, activation='relu'),
    Dense(2,activation='softmax')  # activation change
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy', # Loss
              metrics=['accuracy'])
  1. Binary Crossentropy
model = Sequential([
    Conv2D(32,3, activation='relu', input_shape=(48,48,1)),
    BatchNormalization(),
    MaxPooling2D(pool_size=(3, 3)),

    Flatten(),
    Dense(512, activation='relu'),
    Dense(1,activation='sigmoid') #activation change
])
model.compile(optimizer='adam',
              loss='binary_crossentropy', # Loss
              metrics=['accuracy'])

Upvotes: 10

Anirudh R.Huilgol.
Anirudh R.Huilgol.

Reputation: 859

Even I was facing the same problem I changed class_mode='categorical' instead of class_mode='binary' in flow_from_directory method that worked for me

Upvotes: 3

Mike
Mike

Reputation: 1539

Change Categorical Cross Entropy to Binary Cross Entropy since your output label is binary. Also Change Softmax to Sigmoid since Sigmoid is the proper activation function for binary data

Upvotes: 75

Related Questions