Reputation: 97
In my experiment I am trying to train a neural network to detect whether patients exhibit symptoms A, B, C, and D. My data consists of photos of each patient taken from different angles, along with labels for whether or not they have each of symptoms A, B, C, and D.
Right now, in PyTorch, I am using MSELoss and calculating my test error as the number of correct classifications divided by the total number of classifications. I'm guessing this is too naive, and maybe even inappropriate.
An example of the test error computation: suppose we have two patients with two images each. Then there would be 16 total classifications (one for whether patient 1 has symptom A in photo 1, one for symptom B in photo 1, and so on). If the model correctly predicted that patient 1 exhibited symptom A in photo 1, that would add 1 to the number of correct classifications.
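For concreteness, here is a minimal sketch of that accuracy computation, assuming the model outputs one probability per symptom and predictions are thresholded at 0.5 (the tensors below are made-up placeholders):

```python
import torch

# `outputs` are assumed model probabilities in [0, 1], shape
# (num_images, num_symptoms); `targets` are the 0/1 labels.
outputs = torch.tensor([[0.9, 0.2, 0.7, 0.1],   # photo 1 of patient 1
                        [0.8, 0.3, 0.6, 0.2],   # photo 2 of patient 1
                        [0.1, 0.9, 0.4, 0.7],   # photo 1 of patient 2
                        [0.2, 0.8, 0.3, 0.9]])  # photo 2 of patient 2
targets = torch.tensor([[1., 0., 1., 0.],
                        [1., 0., 1., 0.],
                        [0., 1., 0., 1.],
                        [0., 1., 0., 1.]])

preds = (outputs > 0.5).float()            # threshold each label at 0.5
correct = (preds == targets).sum().item()  # count over all 16 label slots
accuracy = correct / targets.numel()       # correct / total classifications
print(accuracy)
```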
Upvotes: 2
Views: 1404
Reputation: 5050
I suggest using binary cross-entropy for multi-class, multi-label classification. This may seem counterintuitive for multi-label classification, but keep in mind that the goal here is to treat each output label as an independent distribution (or class).
In PyTorch you can use torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean'). This creates a criterion that measures the binary cross-entropy between the target and the output.
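For illustration, here is a minimal sketch of this setup. BCELoss expects probabilities in [0, 1], so the network's final layer should end in a sigmoid (not a softmax, which would force the four labels to compete); the layer sizes and batch below are placeholders, not your actual architecture:

```python
import torch
import torch.nn as nn

# Toy multi-label setup: 4 independent symptom labels per image.
model = nn.Sequential(
    nn.Linear(128, 64),   # placeholder feature dimensions for illustration
    nn.ReLU(),
    nn.Linear(64, 4),
    nn.Sigmoid(),         # one independent probability per symptom
)

criterion = nn.BCELoss()

features = torch.randn(8, 128)                 # dummy batch of 8 images
targets = torch.randint(0, 2, (8, 4)).float()  # 0/1 label per symptom

loss = criterion(model(features), targets)
loss.backward()
print(loss.item())
```

In practice, torch.nn.BCEWithLogitsLoss, which combines the sigmoid and the loss in one numerically stable step, is often preferred; in that case you would drop the final nn.Sigmoid() from the model and feed it raw logits.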
Upvotes: 5