Reputation: 65
I'm looking for a cross entropy loss function in PyTorch that behaves like tf.keras.losses.CategoricalCrossentropy in TensorFlow.
My labels are one hot encoded and the predictions are the outputs of a softmax layer. For example (every sample belongs to one class):
targets = [0, 0, 1]
predictions = [0.1, 0.2, 0.7]
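For these values the loss should simply be -(0*log(0.1) + 0*log(0.2) + 1*log(0.7)) = -log(0.7) ≈ 0.357.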
I want to compute the (categorical) cross entropy directly on the softmax values, not take the argmax of the predictions as a label and then calculate the cross entropy on that. Unfortunately, I have not found an appropriate solution: PyTorch's CrossEntropyLoss is not what I want, and its BCELoss is also not exactly what I need (or is it?).
Does anyone know which loss function to use in PyTorch, or how else to deal with this? Many thanks in advance!
Upvotes: 1
Views: 4969
Reputation: 40668
I thought TensorFlow's CategoricalCrossentropy was equivalent to PyTorch's CrossEntropyLoss, but it seems not: the former takes one-hot encoded targets, while the latter takes integer class labels. The difference, however, is that torch.nn.CrossEntropyLoss is a combination of torch.nn.LogSoftmax and torch.nn.NLLLoss():
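    loss(x, class) = -log( exp(x[class]) / sum_j exp(x[j]) )
                   = -x[class] + log( sum_j exp(x[j]) )

i.e. it expects raw logits and applies the log-softmax internally (this is the formula given in the PyTorch docs).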
tf.keras.losses.CategoricalCrossentropy, by contrast, expects probabilities (with its default from_logits=False) and is something like:
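    loss(p, y) = -sum_i y[i] * log(p[i])

which, for one-hot y, reduces to -log(p[class]).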
Your predictions have already been through a softmax, so only the negative log-likelihood needs to be applied. Based on what was discussed here, you could try this:
import torch.nn as nn
import torch.nn.functional as F

class CategoricalCrossEntropyLoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, y_hat, y):
        # y_hat: softmax probabilities, y: one-hot targets
        return F.nll_loss(y_hat.log(), y.argmax(dim=1))
Above, the one-hot encoded target y is converted to a class label with torch.Tensor.argmax, and the softmax output is turned back into log probabilities with torch.Tensor.log.
If that's correct, why not just use torch.nn.CrossEntropyLoss in the first place? You would just have to remove the softmax from your model's last layer and convert your one-hot targets to class labels.
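As a quick sanity check (a minimal sketch using the example values from the question), both routes should give -log(0.7) ≈ 0.357:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Softmax outputs and one-hot targets from the question (batch of one)
predictions = torch.tensor([[0.1, 0.2, 0.7]])
targets = torch.tensor([[0.0, 0.0, 1.0]])

# Custom loss: log of the probabilities + NLL on the label indices
print(F.nll_loss(predictions.log(), targets.argmax(dim=1)))  # tensor(0.3567)

# Same result with CrossEntropyLoss fed logits whose softmax equals `predictions`
logits = predictions.log()
print(nn.CrossEntropyLoss()(logits, targets.argmax(dim=1)))  # tensor(0.3567)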
Upvotes: 4