Reputation: 969
I am trying to create a CNN with TensorFlow. My images are 64x64x1, and I have an array of 3662 images that I am using for training. I have 5 labels in total, which I have one-hot encoded. I get this error every time:
InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [3662,5] and labels shape [18310]
[[{{node loss_2/dense_5_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]
My neural network structure is this:
from tensorflow.keras import layers, models

def cnn_model():
    model = models.Sequential()
    # model.add(layers.Dense(128, activation='relu', ))
    model.add(layers.Conv2D(128, (3, 3), activation='relu', input_shape=(64, 64, 1)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(layers.Dense(64, activation='relu'))
    # Flatten layer, as listed in the model summary (flatten_2)
    model.add(layers.Flatten())
    model.add(layers.Dense(5, activation='softmax'))
    return model
My model summary is this:
Model: "sequential_3"
Layer (type)                   Output Shape           Param #
conv2d_9 (Conv2D)              (None, 62, 62, 128)    1280
max_pooling2d_6 (MaxPooling2   (None, 31, 31, 128)    0
conv2d_10 (Conv2D)             (None, 31, 31, 64)     73792
max_pooling2d_7 (MaxPooling2   (None, 15, 15, 64)     0
conv2d_11 (Conv2D)             (None, 15, 15, 64)     36928
dense_4 (Dense)                (None, 15, 15, 64)     4160
flatten_2 (Flatten)            (None, 14400)          0
dense_5 (Dense)                (None, 5)              72005
Total params: 188,165
Trainable params: 188,165
Non-trainable params: 0
My output (label) array has the shape (3662, 5, 1). I have seen other answers to similar questions, but I can't figure out the problem with mine. Where am I going wrong?
Edit: My labels are one-hot encoded using this code:
df = pd.get_dummies(df)
diag = np.array(df)
diag = np.reshape(diag,(3662,5,1))
I have tried passing the labels both as a NumPy array and after converting them to a tensor (and the same for the input, as per the documentation).
Upvotes: 0
Views: 941
Reputation: 5555
The problem lies in the choice of the loss function, tf.keras.losses.SparseCategoricalCrossentropy(). Given what you are trying to achieve, you should use tf.keras.losses.CategoricalCrossentropy() instead. Namely, the documentation of tf.keras.losses.SparseCategoricalCrossentropy() says:
Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided as integers.
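To illustrate where the labels shape [18310] in the error may come from, here is a minimal NumPy sketch. It uses random stand-in labels (class_ids is hypothetical, since the real data is not shown in the question) and reshapes them the way the question does. Flattening the one-hot array gives 3662 * 5 = 18310 values, whereas integer labels have one entry per sample.

```python
import numpy as np

num_samples, num_classes = 3662, 5

# Hypothetical stand-in for the real labels: one integer class id per sample.
rng = np.random.default_rng(0)
class_ids = rng.integers(0, num_classes, size=num_samples)

# One-hot encode, then reshape to (3662, 5, 1) as in the question.
one_hot = np.eye(num_classes, dtype=np.float32)[class_ids]
diag = one_hot.reshape(num_samples, num_classes, 1)

# Flattened one-hot labels contain 3662 * 5 values, which matches the
# labels shape reported in the error message.
flat_size = diag.reshape(-1).shape[0]
print(flat_size)

# Integer labels (the form the sparse loss expects): one entry per sample.
sparse_labels = diag.squeeze(-1).argmax(axis=1)
print(sparse_labels.shape)
```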
On the other hand, the documentation of tf.keras.losses.CategoricalCrossentropy() says:
We expect labels to be provided in a one_hot representation.
And because your labels are one-hot encoded, you should use tf.keras.losses.CategoricalCrossentropy().
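A minimal sketch of that change, assuming TensorFlow 2.x. The data here is random stand-in data (x, class_ids and diag are hypothetical), and the model is deliberately much smaller than the one in the question. The sketch also drops the trailing axis of the labels so that they have shape (N, 5) like the model output; that step is my assumption about what the (3662, 5, 1) array needs, not something the error message states.

```python
import numpy as np
import tensorflow as tf

num_samples, num_classes = 32, 5  # small stand-in for the 3662 images

# Hypothetical random data in place of the real images and labels.
rng = np.random.default_rng(0)
x = rng.random((num_samples, 64, 64, 1)).astype("float32")
class_ids = rng.integers(0, num_classes, size=num_samples)
diag = np.eye(num_classes, dtype="float32")[class_ids]
diag = diag.reshape(num_samples, num_classes, 1)  # shape used in the question

# Drop the trailing axis so the labels are (N, 5), matching the model output.
y = diag.squeeze(-1)

# Reduced model for illustration only; not the architecture from the question.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(8, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

# One-hot labels go with CategoricalCrossentropy.
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=["accuracy"],
)
history = model.fit(x, y, epochs=1, batch_size=8, verbose=0)
```

Alternatively, keeping SparseCategoricalCrossentropy and passing integer labels, for example y.argmax(axis=1), should work as well.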
Upvotes: 1