Reputation: 91
I'm currently trying to create a neural network to classify the objects in an image. An image can be a composition of several objects, so each multi-label sums to 1; for example, one multi-label could be [0.3, 0.0, 0.7] for label_1, label_2, and label_3, respectively. These are not probabilities but rather the fractional amount of each object in the image.
Here's where most of my confusion comes in. Activations such as sigmoid and softmax convert logits into probabilities. The multi-label problems I have found seem to deal with Boolean labels, in which case one can use a sigmoid activation (where the probabilities across labels do not need to sum to 1) and perhaps set a probability threshold that defines a positive.
Could a similar approach work for my problem? I originally thought of using softmax and interpreting its outputs as fractions, since they also sum to 1, but I'm not sure how mathematically sound that is, particularly once I account for very small fractions in the labels. And that's before even choosing an accuracy metric.
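For what it's worth, here is a minimal NumPy sketch of the idea I'm considering: a softmax output trained with cross-entropy against soft (fractional) targets. The helper names are my own; the point is that cross-entropy with a soft target is minimized exactly when the prediction equals the target fractions (where it equals the target's entropy), which seems to suggest interpreting the softmax outputs as fractions is at least consistent with the loss:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: maps logits to a distribution summing to 1."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def soft_cross_entropy(p, t, eps=1e-12):
    """Cross-entropy between prediction p and a soft (fractional) target t."""
    return -np.sum(t * np.log(p + eps))

# Fractional target from my example: 30% label_1, 0% label_2, 70% label_3.
target = np.array([0.3, 0.0, 0.7])

# Loss at a perfect prediction (equals the entropy of the target) versus a
# uniform prediction: the perfect match scores strictly lower.
ce_match = soft_cross_entropy(target, target)
ce_uniform = soft_cross_entropy(np.ones(3) / 3, target)
```

The gradient of this loss with respect to the logits is simply `softmax(z) - target`, the same convenient form as in the hard-label case.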
Or perhaps I'm approaching this from the wrong angle entirely. The problem started as a single-label multi-class problem, taking into account only the majority class in each image, so I have only looked into classification methods. Should I instead treat it as regression, with a loss function such as MSE?
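The regression framing I have in mind would keep a softmax head (so predictions stay non-negative and sum to 1) but fit it with MSE. A toy single-example gradient-descent loop, with made-up hyperparameters, just to check the idea works:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

target = np.array([0.3, 0.0, 0.7])  # fractional multi-label
logits = np.zeros(3)                # start from a uniform prediction
lr = 1.0

for _ in range(2000):
    p = softmax(logits)
    # Backpropagate the squared error through the softmax Jacobian.
    grad_p = 2.0 * (p - target)
    jac = np.diag(p) - np.outer(p, p)
    logits -= lr * (jac @ grad_p)

final_p = softmax(logits)
final_loss = np.sum((final_p - target) ** 2)
```

One thing this makes visible: a softmax can only approach a zero fraction asymptotically (it would need a logit of minus infinity), which may matter for those very small fractions I mentioned.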
What do you guys think?
Upvotes: 0
Views: 38