Reputation: 423
I'm using a custom training loop. The loss returned by tf.keras.losses.categorical_crossentropy
is an array, which I assume has shape (1, batch_size)
. Is that what it is supposed to return, or should it be a single value?
If it should be a single value, any idea what I could be doing wrong?
Upvotes: 5
Views: 1237
Reputation: 5797
If your predictions have the shape (batch_size, classes)
, tf.keras.losses.categorical_crossentropy
returns the losses with the shape (batch_size,)
.
So, if your labels are:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
And your predictions are:
[[0.9 0.05 0.05]
[0.5 0.89 0.6 ]
[0.05 0.01 0.94]]
You will get a loss like:
[0.10536055 0.8046684 0.06187541]
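You can reproduce this with a short snippet (a minimal sketch that just hard-codes the labels and predictions above):
import tensorflow as tf

y_true = tf.constant([[1.  , 0.  , 0.  ],
                      [0.  , 1.  , 0.  ],
                      [0.  , 0.  , 1.  ]])
y_pred = tf.constant([[0.9 , 0.05, 0.05],
                      [0.5 , 0.89, 0.6 ],
                      [0.05, 0.01, 0.94]])

losses = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(losses)  # shape (3,): one loss value per sample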
In most cases, the mean of these values is used to update the model parameters. So if you do the updates manually, you can use:
loss = tf.keras.backend.mean(losses)
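As a minimal sketch of how that fits into a custom training loop (model, optimizer, x_batch and y_batch are assumed to already exist):
import tensorflow as tf

with tf.GradientTape() as tape:
    y_pred = model(x_batch, training=True)
    losses = tf.keras.losses.categorical_crossentropy(y_batch, y_pred)  # shape (batch_size,)
    loss = tf.keras.backend.mean(losses)  # reduce to a scalar before taking gradients
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))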
Upvotes: 2
Reputation: 86600
Most of the usual losses return a tensor with the original shape minus the last axis.
So, if your original y_pred
shape was (samples, ..., ..., classes)
, then the resulting loss shape will be (samples, ..., ...)
.
This is probably because Keras may use this tensor in further calculations, such as applying sample weights, among other things.
In a custom loop, if you don't need these dimensions, you can simply take K.mean(loss_result)
before calculating the gradients (where K
is either keras.backend
or tensorflow.keras.backend
).
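A quick sketch to see this shape reduction with extra dimensions (random data and a segmentation-style output shape, assumed purely for illustration):
import tensorflow as tf

y_true = tf.one_hot(tf.random.uniform((2, 4, 4), maxval=3, dtype=tf.int32), depth=3)  # (2, 4, 4, 3)
y_pred = tf.random.uniform((2, 4, 4, 3))

loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(loss.shape)                   # (2, 4, 4): the last (class) axis is gone
print(tf.keras.backend.mean(loss))  # single scalar, ready for the gradient step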
Upvotes: 2