Alex Colombari

Reputation: 159

Multilabel with binary classification in Keras

At the moment, I am working on image classification using Keras, scikit-learn, etc.

I will try to explain the whole problem. As I said, it is multi-label image classification. My dataframe contains 4000 microscopic oil samples, and the labels represent the particles present in each sample. I give an example below.

All the images in the dataframe are labeled. Each image has an array of 13 binary values: 1 for positive and 0 for negative.

e.g.

[0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0]

That means each image can have multiple outputs. The objective is to give one oil sample to the CNN and have it return which particles are present in the image.

I don't know if that is clear enough, sorry. Now I will explain my actual problem.

In my CNN, I already set the output layer to 13 units (matching the number of labels per image). I don't know why, but when I train the model, the predicted Y returns only one value, for example:

Y predicted (sample 14): 3
Y predicted (sample 65): 11

I need to get predictions with multiple outputs, like:

Y predicted (sample 14): 3, 7, 9, 12
Y predicted (sample 65): 5, 8, 9, 11

I need help solving this problem, because I have been stuck on it for a long time. I would appreciate it if someone knows a strategy for this.

Thanks in advance!

Upvotes: 0

Views: 1049

Answers (1)

Dr. Snoopy

Reputation: 56407

Your problem is called multi-label classification. It means more than one class in the output of the model can be present at a time, not just one.

Given a vector of predictions, you can obtain individual classes by applying thresholding:

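thresh = 0.5
# predict returns one row per input sample; this assumes some_input
# holds a single sample, so take the first row (13 probabilities)
p = model.predict(some_input)[0]
classes = []
# enumerate yields (index, value) pairs, in that order
for idx, prob in enumerate(p):
    if prob > thresh:
        classes.append(idx)

print(classes)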

After executing this, you will get a variable-sized list of the class indices predicted by the model. The threshold (thresh) is a parameter you have to tune using a binary classification performance metric applied to each class. You can also use a different threshold for each class.
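As a rough sketch of what tuning a threshold per class could look like, the snippet below picks, for each class, the candidate threshold with the best F1 score on held-out data. F1 is only one possible metric, and the helper names (`f1`, `tune_thresholds`, `predict_classes`), the candidate values, and the validation data are all made up for illustration. In practice `val_prob` would come from `model.predict` on a validation set.

```python
def f1(y_true, y_pred):
    """F1 score for one class, given 0/1 lists of equal length."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(y_true, y_prob, candidates):
    """Pick, for each class, the candidate threshold with the highest F1.

    y_true: list of 0/1 label rows (one row per validation sample)
    y_prob: list of probability rows with the same shape
    """
    n_classes = len(y_true[0])
    thresholds = []
    for c in range(n_classes):
        true_c = [row[c] for row in y_true]
        prob_c = [row[c] for row in y_prob]
        # max() keeps the first candidate on ties
        best = max(candidates,
                   key=lambda t: f1(true_c, [int(p > t) for p in prob_c]))
        thresholds.append(best)
    return thresholds


def predict_classes(prob_row, thresholds):
    """Indices of the classes whose probability exceeds their own threshold."""
    return [i for i, (p, t) in enumerate(zip(prob_row, thresholds)) if p > t]


# Hypothetical validation data with 3 classes (the real problem has 13).
val_true = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 1, 0]]
val_prob = [[0.9, 0.2, 0.35], [0.4, 0.8, 0.45], [0.7, 0.1, 0.2], [0.2, 0.6, 0.25]]

thresholds = tune_thresholds(val_true, val_prob, candidates=[0.3, 0.5, 0.7])
print(thresholds)
print(predict_classes([0.8, 0.1, 0.4], thresholds))
```

A coarse grid of candidates keeps the example short. A finer grid, or a different metric such as precision at a fixed recall, may suit the data better.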

The threshold is something you have to tune. Now you will get a vector of 0s and 1s, where 0 in
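For comparison with the 13-value label arrays in the question, the same thresholding can produce a 0/1 vector instead of a list of indices. This is a minimal sketch: the probabilities are invented, and `to_binary` and `to_indices` are hypothetical helper names.

```python
def to_binary(prob_row, thresh=0.5):
    """Turn a row of probabilities into a 0/1 vector of the same length."""
    return [1 if p > thresh else 0 for p in prob_row]


def to_indices(binary_row):
    """List the positions that are set to 1."""
    return [i for i, v in enumerate(binary_row) if v == 1]


# Invented probabilities for one sample with 13 classes.
probs = [0.1, 0.9, 0.2, 0.05, 0.3, 0.8, 0.1, 0.7, 0.6, 0.2, 0.1, 0.0, 0.4]

binary = to_binary(probs)
print(binary)
print(to_indices(binary))
```

The 0/1 form can be compared element by element with the ground-truth labels, while the index form is easier to read.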

Upvotes: 2
