joni

Reputation: 65

PyTorch - (Categorical) Cross-Entropy Loss using one-hot encoding and softmax

I'm looking for a cross-entropy loss function in PyTorch that works like CategoricalCrossentropy in TensorFlow.

My labels are one-hot encoded and the predictions are the outputs of a softmax layer. For example (every sample belongs to exactly one class):

targets = [0, 0, 1]
predictions = [0.1, 0.2, 0.7]

I want to compute the (categorical) cross entropy directly on the softmax values, not take the argmax of the predictions as a label and then compute the cross entropy on that. Unfortunately, I have not found an appropriate solution, since PyTorch's CrossEntropyLoss is not what I want and its BCELoss is also not exactly what I need (or is it?).
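To make the expected result concrete, this is the value I am after for the example above (just a hand computation of -Σ y_i · log(p_i), not the training-ready loss function I am looking for):

import torch

targets = torch.tensor([0., 0., 1.])          # one-hot label
predictions = torch.tensor([0.1, 0.2, 0.7])   # softmax output

# categorical cross entropy: -sum(y * log(p)) = -log(0.7) ≈ 0.357
print(-(targets * predictions.log()).sum())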

Does anyone know which loss function to use in PyTorch or how to deal with this? Many thanks in advance!

Upvotes: 1

Views: 4969

Answers (1)

Ivan

Reputation: 40668

I thought TensorFlow's CategoricalCrossentropy was equivalent to PyTorch's CrossEntropyLoss, but it seems it is not. The former takes one-hot-encoded (OHE) targets, while the latter takes class-index labels. The difference seems to be the following (a quick numerical check is sketched after the list):

  • torch.nn.CrossEntropyLoss is a combination of torch.nn.LogSoftmax and torch.nn.NLLLoss():

    loss(x, class) = -log( exp(x[class]) / Σ_j exp(x[j]) ) = -x[class] + log( Σ_j exp(x[j]) )

  • tf.keras.losses.CategoricalCrossentropy is something like:

    loss(y, ŷ) = -Σ_i y_i · log(ŷ_i),  where ŷ are the softmax probabilities and y the one-hot targets
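As a sanity check of that relationship, both forms can be compared on arbitrary made-up logits (the tensor values below are illustrative only):

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[1.0, 2.0, 3.0]])   # raw model outputs, no softmax yet
onehot = torch.tensor([[0., 0., 1.]])      # one-hot target
labels = onehot.argmax(dim=1)              # class-index target: tensor([2])

# PyTorch's CrossEntropyLoss: log-softmax + NLL on logits and class indices
ce = nn.CrossEntropyLoss()(logits, labels)

# TF-style categorical cross entropy: -sum(y * log(softmax(x)))
tf_style = -(onehot * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(ce.item(), tf_style.item())  # both ≈ 0.4076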

Your predictions have already been through a softmax. So only the negative log-likelihood needs to be applied. Based on what was discussed here, you could try this:

import torch
import torch.nn as nn
import torch.nn.functional as F

class CategoricalCrossEntropyLoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, y_hat, y):
        # y_hat: softmax probabilities; y: one-hot targets.
        # log() gives log-probabilities; argmax turns one-hot targets into class indices.
        return F.nll_loss(y_hat.log(), y.argmax(dim=1))

Above, the one-hot target vector is converted to a class-index label with torch.Tensor.argmax, and the softmax output is turned into log-probabilities with .log() so that F.nll_loss can be applied.
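For example, with the values from the question (note the batch dimension, which F.nll_loss expects):

targets = torch.tensor([[0., 0., 1.]])         # one-hot, shape (batch, classes)
predictions = torch.tensor([[0.1, 0.2, 0.7]])  # softmax probabilities

criterion = CategoricalCrossEntropyLoss()
print(criterion(predictions, targets))         # tensor(0.3567), i.e. -log(0.7)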


If that works for you, why not just use torch.nn.CrossEntropyLoss in the first place? You would just have to remove the softmax from your model's last layer and convert your one-hot targets to class-index labels.
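A minimal sketch of that setup, with a made-up model and dummy data (the Linear layer and tensor shapes are placeholders, not the asker's actual model):

import torch
import torch.nn as nn

# hypothetical model: note there is no softmax after the final Linear layer
model = nn.Sequential(nn.Linear(10, 3))
criterion = nn.CrossEntropyLoss()

x = torch.randn(4, 10)                        # dummy batch of inputs
onehot_targets = torch.eye(3)[[0, 2, 1, 2]]   # dummy one-hot targets
labels = onehot_targets.argmax(dim=1)         # one-hot -> class indices

loss = criterion(model(x), labels)            # softmax is applied internally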

Upvotes: 4
