Benison Sam

Reputation: 2825

Best Loss Function for Multi-Class Multi-Target Classification Problem

I have a classification problem that I don't know how to categorize. As I understand it:

A multiclass classification problem is one with multiple mutually exclusive classes, where each data point in the dataset can be labelled with only one class. For example, in an image classification task for fruits, a data point labelled as an apple cannot also be an orange, an orange cannot also be a banana, and so on. Each data point can be exactly one of the fruit classes and is labelled accordingly.

Whereas ...

A multilabel classification problem is one with multiple sets of mutually exclusive classes, where a data point can be labelled from each set simultaneously. For example, in an image classification task for cars, a data point labelled as a sedan cannot be a hatchback, a hatchback cannot be an SUV, and so on for the type of car. At the same time, the same data point can be labelled with one of VW, Ford, Mercedes, etc. as the car manufacturer. So in this case, the data point is labelled from two different sets of mutually exclusive classes.

Please correct me if my understanding is wrong.

Now to my problem: I have a classification problem with multiple classes, let's say A, B, C, D and E. Each data point can have one or more classes from this set, as shown below on the left:

|-------------|----------|              |-------------|-----------------|
|      X      |     y    |              |      X      |    One-Hot-Y    |
|-------------|----------|              |-------------|-----------------|
|     DP1     |   A, B   |              |     DP1     | [1, 1, 0, 0, 0] |
|-------------|----------|              |-------------|-----------------|
|     DP2     |   C      |              |     DP2     | [0, 0, 1, 0, 0] |
|-------------|----------|              |-------------|-----------------|
|     DP3     |   B, E   |              |     DP3     | [0, 1, 0, 0, 1] |
|-------------|----------|              |-------------|-----------------|
|     DP4     |   A, C   |              |     DP4     | [1, 0, 1, 0, 0] |
|-------------|----------|              |-------------|-----------------|
|     DP5     |   D      |              |     DP5     | [0, 0, 0, 1, 0] |
|-------------|----------|              |-------------|-----------------|

I one-hot encoded the labels for training, as shown above on the right. My questions are:

  1. What Loss function (preferably in PyTorch) can I use for training the model to optimize for the One-Hot encoded output?
  2. What do we call such a classification problem? Multi-label or Multi-class?

Thank you for your answers!

Upvotes: 3

Views: 6537

Answers (1)

Szymon Maszke

Reputation: 24874

What Loss function (preferably in PyTorch) can I use for training the model to optimize for the One-Hot encoded output

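You can use torch.nn.BCEWithLogitsLoss (or torch.nn.MultiLabelSoftMarginLoss; with default settings the two are equivalent) and see how it works out. This is the standard approach. Another possibility is torch.nn.MultiLabelMarginLoss, although it expects targets as class indices padded with -1 rather than as a 0/1 vector.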
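
As a rough sketch of how BCEWithLogitsLoss could be applied to the targets from the question: the linear model, the 8 input features and the random inputs below are placeholders invented for illustration, not part of the original problem. The snippet also computes MultiLabelSoftMarginLoss on the same logits so the two values can be compared.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Targets for DP1..DP5 from the question (columns: A, B, C, D, E).
# BCEWithLogitsLoss expects float targets with the same shape as the logits.
targets = torch.tensor(
    [
        [1, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 0, 1],
        [1, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
    ],
    dtype=torch.float32,
)

# Hypothetical model: 8 made-up input features -> 5 raw logits.
# No sigmoid at the end, because the loss applies it internally.
model = nn.Linear(8, 5)
inputs = torch.randn(5, 8)  # placeholder features, one row per data point

logits = model(inputs)

bce = nn.BCEWithLogitsLoss()
soft_margin = nn.MultiLabelSoftMarginLoss()

loss = bce(logits, targets)
loss_alt = soft_margin(logits, targets)  # same value with default settings

loss.backward()  # gradients for an optimizer step

# At inference time, apply a sigmoid and threshold each label independently.
with torch.no_grad():
    probs = torch.sigmoid(logits)
    predictions = (probs > 0.5).int()

print(loss.item(), loss_alt.item())
print(predictions)
```

The 0.5 threshold is only a common default; it can be tuned per label on a validation set.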

What do we call such a classification problem? Multi-label or Multi-class?

It is multilabel (as multiple labels can be present simultaneously). In one-hot encoding:

[1, 1, 0, 0, 0], [0, 1, 0, 0, 1] - multilabel
[0, 0, 1, 0, 0] - multiclass
[1], [0] - binary (special case of multiclass)

A multiclass target cannot contain more than one 1, because its classes are mutually exclusive.
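
The rule above can be sketched as a small check over the encoded targets. The helper name `problem_type` is made up for this illustration, and rows containing no 1 at all are not covered by the rule, so they are not treated specially here.

```python
def problem_type(rows):
    """Classify a set of 0/1 target rows using the rule from the answer."""
    # A single output column is the binary case.
    if all(len(row) == 1 for row in rows):
        return "binary"
    # Any row with more than one 1 means labels can co-occur.
    if any(sum(row) > 1 for row in rows):
        return "multilabel"
    # Otherwise at most one class is active per row.
    return "multiclass"


# Targets from the question: DP1, DP3 and DP4 carry two labels each.
question_targets = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]

print(problem_type(question_targets))
print(problem_type([[0, 0, 1, 0, 0], [1, 0, 0, 0, 0]]))
print(problem_type([[1], [0]]))
```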

Upvotes: 5
