Reputation: 914
When solving a binary classification problem, I think there are two possible approaches in Caffe.
The first is to use "SigmoidCrossEntropyLossLayer" with one output unit.
The other is to use "SoftmaxWithLossLayer" with two output units.
My question is what’s the difference between these two approaches?
Which one should I use?
Thank you very much!
Upvotes: 3
Views: 1498
Reputation: 114876
If you play a bit with the math, you can "duplicate" the single raw output x_i that feeds the "Sigmoid" layer into two scores: 0.5*x_i for class 1 and -0.5*x_i for class 0. Applying the "SoftmaxWithLoss" layer to these two scores then amounts to applying "SigmoidCrossEntropyLoss" to the single output prediction x_i.
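To spell out the math, here is a short derivation of why the two scores reproduce the sigmoid. The symbol \sigma for the logistic sigmoid and the notation P(y = k | x_i) are introduced here for illustration and do not come from the original posts:

```latex
\begin{align*}
P(y = 1 \mid x_i) &= \frac{e^{0.5 x_i}}{e^{0.5 x_i} + e^{-0.5 x_i}}
                   = \frac{1}{1 + e^{-x_i}} = \sigma(x_i), \\
P(y = 0 \mid x_i) &= \frac{e^{-0.5 x_i}}{e^{0.5 x_i} + e^{-0.5 x_i}}
                   = \frac{1}{1 + e^{x_i}} = 1 - \sigma(x_i).
\end{align*}
```

Since both class probabilities match, the cross-entropy computed from them matches as well.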
So I believe the two methods can be regarded as equivalent for predicting binary outputs.
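A quick numerical check of this claim, as a minimal NumPy sketch rather than Caffe code. The helper names `sigmoid_cross_entropy` and `softmax_cross_entropy` are hypothetical, the data is random, and the sketch compares per-sample losses only, ignoring whatever batch normalization the Caffe layers apply:

```python
import numpy as np

def sigmoid_cross_entropy(x, y):
    """Binary cross-entropy on a single raw score x with label y in {0, 1}."""
    # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)], s = sigmoid(x).
    return np.maximum(x, 0) - x * y + np.log1p(np.exp(-np.abs(x)))

def softmax_cross_entropy(scores, y):
    """Cross-entropy on two raw scores per sample; column k scores class k."""
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y]

rng = np.random.default_rng(0)
x = rng.normal(scale=3.0, size=1000)   # single-unit raw outputs x_i
y = rng.integers(0, 2, size=1000)      # binary labels

# "Duplicate" x_i into two scores: -0.5*x_i for class 0, 0.5*x_i for class 1.
two_scores = np.stack([-0.5 * x, 0.5 * x], axis=1)

loss_sigmoid = sigmoid_cross_entropy(x, y)
loss_softmax = softmax_cross_entropy(two_scores, y)

# Largest per-sample difference between the two losses.
max_gap = np.abs(loss_sigmoid - loss_softmax).max()
print(max_gap)
```

The printed gap should be at the level of floating-point rounding error, which is what the equivalence predicts.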
Upvotes: 1