Reputation: 65
I want to do a simple test of the SparseCategoricalCrossentropy function, to see what exactly it does to an output. For that I use the output of the last layer of a MobileNetV2.
import keras.backend as K
import tensorflow as tf
import numpy as np
full_model = tf.keras.applications.MobileNetV2(
input_shape=(224,224,3),
alpha=1.0,
include_top=True,
weights="imagenet",
input_tensor=None,
pooling=None,
classes=1000,
classifier_activation="softmax",)
func = K.function(full_model.layers[1].input, full_model.layers[155].output)
conv_output = func([processed_image])
y_pred = np.single(conv_output)
y_true = np.zeros(1000).reshape(1,1000)
y_true[0][282] = 1
scce = tf.keras.losses.SparseCategoricalCrossentropy()
scce(y_true, y_pred).numpy()
processed_image is a 1x224x224x3 array created previously.
I'm getting the error ValueError: Shape mismatch: The shape of labels (received (1000,)) should equal the shape of logits except for the last dimension (received (1, 1000)).
I tried reshaping the arrays to match the dimensions mentioned in the error, but that doesn't seem to work either. What shapes does this loss function accept?
Upvotes: 1
Views: 366
Reputation: 26708
Since you are using the SparseCategoricalCrossentropy loss function, the shape of y_true should be [batch_size] and the shape of y_pred should be [batch_size, num_classes]. Furthermore, y_true should consist of integer class indices, not one-hot vectors. See the documentation. In your concrete example, you could try something like this:
import keras.backend as K
import tensorflow as tf
import numpy as np
full_model = tf.keras.applications.MobileNetV2(
input_shape=(224,224,3),
alpha=1.0,
include_top=True,
weights="imagenet",
input_tensor=None,
pooling=None,
classes=1000,
classifier_activation="softmax",)
batch_size = 1
processed_image = tf.random.uniform(shape=[batch_size,224,224,3])
func = K.function(full_model.layers[1].input,
full_model.layers[155].output)
conv_output = func([processed_image])
y_pred = np.single(conv_output)
# Generates an integer between 0 and 999 representing a class index.
# Note that np.random.randint's high bound is exclusive, so use 1000.
y_true = np.random.randint(low=0, high=1000, size=batch_size)
# [984]
scce = tf.keras.losses.SparseCategoricalCrossentropy()
scce(y_true, y_pred).numpy()
# y_pred encodes a probability distribution here and the calculated loss is 10.69202
You can experiment with the batch_size to see how everything works; in the example above, I just used a batch_size of 1.
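To make the shape difference concrete, here is a small sketch (with made-up toy probabilities) showing that SparseCategoricalCrossentropy with integer labels gives the same result as CategoricalCrossentropy with the equivalent one-hot labels, which is essentially what your y_true with y_true[0][282] = 1 was trying to express:

```python
import numpy as np
import tensorflow as tf

# Toy predicted probabilities for a batch of 2 examples over 4 classes.
y_pred = np.array([[0.10, 0.70, 0.10, 0.10],
                   [0.25, 0.25, 0.25, 0.25]], dtype=np.float32)

# Sparse labels: one integer class index per example, shape [batch_size].
y_true_sparse = np.array([1, 3])

# One-hot labels: shape [batch_size, num_classes].
y_true_onehot = tf.one_hot(y_true_sparse, depth=4)

scce = tf.keras.losses.SparseCategoricalCrossentropy()
cce = tf.keras.losses.CategoricalCrossentropy()

# Both losses agree when the labels encode the same classes:
# mean of -log(0.7) and -log(0.25), roughly 0.8715.
print(scce(y_true_sparse, y_pred).numpy())
print(cce(y_true_onehot, y_pred).numpy())
```

So instead of building a one-hot y_true, you can either pass the class index directly (y_true = np.array([282])) to SparseCategoricalCrossentropy, or keep the one-hot array and switch to CategoricalCrossentropy.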
Upvotes: 1