Is label encoding enough for output labels?

Question

For ordinal features it makes sense to use label encoding, while for categorical features we use one-hot encoding. These are the conventions for input features, though. For output variables, is it necessary to use one-hot encoding if the output labels are categorical, or may I use label encoding as well? Which one is preferable?

I am training a fruit classifier with 120 classes. I am using a ResNet50 model pre-trained on ImageNet as a feature extractor, and I am training a Logistic Regression classifier on these features (transfer learning). As there are 120 classes, label encoding gives labels ranging from 0 to 119. Is it okay to train the model with the labels kept label-encoded? I am asking because the following sklearn documentation allows me to do so:

sklearn.preprocessing.LabelEncoder

Here they are saying:

..."This transformer should be used to encode target values, i.e. y, and not the input X."

But I am confused about why this is okay: with label encoding, the output classes do not all get the same priority as they would with one-hot encoding.

Szymon Maszke · Accepted Answer

But for output variable is it necessary to use one hot encoding if the output labels are categorical?

No, it's not necessary, and it won't matter in your case. That said, not every algorithm can output its predictions directly as label-encoded values:

RandomForest can classify using label encoding, as it is "just" returning one of N target values based on internal if-like conditions (simplified).
ResNet50, being a neural network, returns a matrix of shape [samples, labels] holding logits (unnormalized scores) or probabilities, from which the loss is calculated. It cannot return values like [0, 2, 18, 25] (for 4 samples) directly, because an operation like argmax breaks the gradient. (argmax is taken along the label axis to get the index of the predicted label when calculating things like accuracy, but I wouldn't consider it part of the network.)
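To make the [samples, labels] point concrete, here is a minimal NumPy sketch. The logits are made up for illustration (they do not come from a real ResNet50); the sketch only shows how a score matrix is reduced to label-encoded predictions with argmax outside the network.

```python
import numpy as np

n_samples, n_classes = 4, 120

# Hypothetical logits: all zeros, with one dominant class per sample.
logits = np.zeros((n_samples, n_classes))
dominant = np.array([0, 2, 18, 25])
logits[np.arange(n_samples), dominant] = 5.0

# Softmax over the label axis turns logits into probabilities.
shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

# argmax over the label axis gives label-encoded predictions.
# This step is not differentiable, so it is used for metrics, not for the loss.
predicted = probs.argmax(axis=1)
print(predicted)  # [ 0  2 18 25]
```

The loss is computed from `logits` or `probs`; `predicted` is only needed when you want accuracy or a final class label.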

Still, many frameworks allow you to encode labels as ordinal (integer) values, as this is more memory efficient. For example, PyTorch's torch.nn.CrossEntropyLoss takes targets stored as integer class indices.
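As a rough illustration of the PyTorch case, the sketch below feeds the same random logits to torch.nn.CrossEntropyLoss twice: once with integer class indices and once with one-hot float targets. It assumes a PyTorch version that accepts class-probability targets (1.10 or later, as far as I know); the tensor values are arbitrary.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_samples, n_classes = 4, 120

# Arbitrary logits standing in for a network's output.
logits = torch.randn(n_samples, n_classes, requires_grad=True)

# Label-encoded targets: one int64 class index per sample.
targets = torch.tensor([0, 2, 18, 25])

loss_fn = torch.nn.CrossEntropyLoss()
loss_from_indices = loss_fn(logits, targets)

# The same targets, one-hot encoded and passed as class probabilities.
one_hot = F.one_hot(targets, num_classes=n_classes).float()
loss_from_one_hot = loss_fn(logits, one_hot)

print(torch.allclose(loss_from_indices, loss_from_one_hot))
print(targets.nelement(), "values vs", one_hot.nelement(), "values stored")

# Gradients flow through the loss either way.
loss_from_indices.backward()
```

Both calls should give the same loss; the integer form simply stores one value per sample instead of one per class.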

As pointed out in the comments, both are just encodings of the labels and can easily be converted into one another as needed.
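For the sklearn setup in the question, a small sketch of the round trip might look like this. The fruit names and the feature matrix X are invented stand-ins (X is not real ResNet50 output), and three classes are used instead of 120 to keep it short.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

# Hypothetical string labels for six images.
names = np.array(["apple", "banana", "cherry", "banana", "apple", "cherry"])

# String labels -> integer labels (classes are sorted alphabetically).
encoder = LabelEncoder()
y_int = encoder.fit_transform(names)
n_classes = len(encoder.classes_)

# Integer labels -> one-hot -> integer labels again.
y_onehot = np.eye(n_classes, dtype=int)[y_int]
y_back = y_onehot.argmax(axis=1)

# Made-up, well-separated features standing in for extracted embeddings.
rng = np.random.default_rng(0)
X = 5.0 * np.eye(n_classes)[y_int] + rng.normal(scale=0.1, size=(len(y_int), n_classes))

# LogisticRegression is trained directly on the label-encoded targets.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y_int)

# Integer predictions -> original string labels.
pred_names = encoder.inverse_transform(clf.predict(X))
print(pred_names)
```

The classifier treats the integers purely as class identifiers, so 0 to 119 carries no ordering or priority for the targets.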
