Reputation: 359
Each element of my dataset has a multi-label target tensor like [1, 0, 0, 1], with varying combinations of 1's and 0's. Since there are 4 labels, I set the output layer of my neural network to have 4 neurons. With BCEWithLogitsLoss, calling model(inputs) gives me an output tensor like [3, 2, 0, 0], with values in the range (0, 3), since I specified 4 output neurons. This does not match the format the target is expected to have, yet when I change the number of output neurons to 2, I get a shape mismatch error. What needs to be done to fix this?
Upvotes: 4
Views: 5234
Reputation: 107
When you call
y_preds = model(input)
with BCEWithLogitsLoss as your loss/criterion, the output y_pred is a logit. A logit can be negative or positive: it is z = w1*x1 + w2*x2 + ... + wn*xn (plus a bias term). So, to get predictions while using BCEWithLogitsLoss, you need to pass this output (y_pred) through a sigmoid, 1 / (1 + e^(-z)), e.g. torch.sigmoid(y_pred). And then you would be all set :)
Hope this helps!!!
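A minimal sketch of that last step, using a hard-coded logit tensor in place of your actual model(inputs) output:

```python
import torch

# Hypothetical raw output of model(inputs), i.e. logits.
logits = torch.tensor([3.0, 2.0, 0.0, 0.0])

probs = torch.sigmoid(logits)   # elementwise 1 / (1 + e^(-z)), values in (0, 1)
preds = (probs > 0.5).int()     # threshold to a multi-hot label vector
```

Here sigmoid(0) is exactly 0.5, so a 0 logit falls on the decision boundary; you can pick the threshold that suits your problem.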
Upvotes: 0
Reputation: 114786
When using BCEWithLogitsLoss
you make a 1-D prediction per output binary label.
In your example, you have 4 binary labels to predict; therefore, your model outputs a 4-d vector, where each entry represents the prediction of one of the binary labels.
Using BCEWithLogitsLoss
you implicitly apply a Sigmoid to your outputs:
This loss combines a Sigmoid layer and the BCELoss in one single class.
Therefore, if you want to get the predicted probabilities of your model, you need to add a torch.sigmoid
on top of your prediction. The sigmoid
function will convert your predicted logits to probabilities.
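Putting this together, a minimal sketch of the whole setup, where the input size, batch size, and nn.Linear model are placeholders for your own network:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 4)           # 4 output neurons, one per binary label
criterion = nn.BCEWithLogitsLoss()

x = torch.randn(2, 10)             # batch of 2 hypothetical inputs
target = torch.tensor([[1., 0., 0., 1.],
                       [0., 1., 1., 0.]])  # float multi-hot targets, same shape as output

logits = model(x)                  # raw scores, may be any real number
loss = criterion(logits, target)   # sigmoid is applied internally here

probs = torch.sigmoid(logits)      # apply sigmoid yourself to inspect probabilities
```

Note the targets are floats of the same shape as the output, which is why 2 output neurons produce a shape mismatch against 4-label targets.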
Upvotes: 4