Reputation: 163
I am using a model-training program I built for a toy example and am trying to apply it to another problem. The only difference is that the original model was used for regression, so I used MSE as the criterion, while this one is used for binary classification, so I am using BCEWithLogitsLoss.
The model is very simple:
class Model(nn.Module):
    def __init__(self, input_size, output_size):
        super(Model, self).__init__()
        self.fc1 = nn.Sequential(
            nn.Linear(input_size, 8*input_size),
            nn.PReLU()  # parametric ReLU - like leaky ReLU except the slope is learned
        )
        self.fc2 = nn.Sequential(
            nn.Linear(8*input_size, 80*input_size),
            nn.PReLU()
        )
        self.fc3 = nn.Sequential(
            nn.Linear(80*input_size, 32*input_size),
            nn.PReLU()
        )
        self.fc4 = nn.Sequential(
            nn.Linear(32*input_size, 4*input_size),
            nn.PReLU()
        )
        self.fc = nn.Sequential(
            nn.Linear(4*input_size, output_size),
            nn.PReLU()
        )

    def forward(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        x = self.fc4(x)
        x = self.fc(x)
        return x
And this is where I run it:
model = Model(input_size, output_size)

if loss == 'MSE':
    criterion = nn.MSELoss()
if loss == 'BCELoss':
    criterion = nn.BCEWithLogitsLoss()

optimizer = torch.optim.SGD(model.parameters(), lr=lr)

model.train()
for epoch in range(num_epochs):
    # Forward pass and loss
    train_predictions = model(train_features)
    print(train_predictions)
    print(train_targets)
    loss = criterion(train_predictions, train_targets)  # note: shadows the `loss` string above

    # Backward pass and update
    loss.backward()
    optimizer.step()

    # zero grad before new step
    optimizer.zero_grad()

    train_size = len(train_features)
    train_loss = criterion(train_predictions, train_targets).item()
    pred = train_predictions.max(1, keepdim=True)[1]
    correct = pred.eq(train_targets.view_as(pred)).sum().item()
    #train_loss /= train_size
    accuracy = correct / train_size

    print('\nTrain set: Loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        train_loss, correct, train_size,
        100. * accuracy))
However, when I print the loss, for some reason it already starts very low (around 0.6) before I have done any backward pass! It stays this low for all subsequent epochs. The prediction vector, however, looks like random garbage:
tensor([[-0.0447],
[-0.0640],
[-0.0564],
...,
[-0.0924],
[-0.0113],
[-0.0774]], grad_fn=<PreluBackward>)
tensor([[0.],
[0.],
[0.],
...,
[0.],
[0.],
[1.]])
epoch: 1, loss = 0.6842
I have no clue why it is doing this and would appreciate any help. Thanks!
EDIT: I added the params in case they help anyone figure this out:
if dataset == 'adult_train.csv':
    input_size = 9
    print_every = 1
    output_size = 1
    lr = 0.001
    num_epochs = 10
    loss = 'BCELoss'
EDIT2: Added accuracy calculation in the middle block
Upvotes: 1
Views: 360
Reputation: 22214
BCELoss is not an error metric, so its starting value should not be read the way you would read MSE.
The entropy of a Bernoulli distribution with p=0.5 is -ln(0.5) = 0.693. This is the loss you would expect if

- your labels were an even 50/50 split of 0s and 1s and your network's output carried no information about them (the best constant guess is then a probability of 0.5), or
- your network predicted a probability of roughly 0.5 (i.e. a logit near 0) for every sample, regardless of how the labels are distributed.
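Both cases are easy to verify numerically. Here is a minimal, self-contained sketch (the tensor shapes and the roughly 80/20 label mix are made up for illustration):

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# Second case: logits of 0 mean sigmoid(0) = 0.5 for every sample, so the
# loss is -ln(0.5) ~= 0.693 no matter how the labels are distributed.
targets = (torch.rand(1000, 1) < 0.2).float()    # imbalanced labels, mostly 0
print(criterion(torch.zeros(1000, 1), targets))  # tensor(0.6931)

# First case: with a perfect 50/50 label split, no constant guess can do
# better than a probability of 0.5, i.e. a logit of 0, which again gives 0.693.
balanced = torch.cat([torch.zeros(500, 1), torch.ones(500, 1)])
for logit in (-2.0, 0.0, 2.0):
    const = torch.full((1000, 1), logit)
    print(logit, criterion(const, balanced).item())  # 1.127, 0.693, 1.127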
Your model is in the second case. The network is currently guessing slightly negative logits for every prediction; those get interpreted as class-0 predictions. Since your data seems to be imbalanced towards 0 labels, your accuracy will be the same as that of a model that always predicts 0. This is just an artifact of random weight initialization; if you keep re-initializing your model, you will find that it sometimes always predicts 1 instead.
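The initialization artifact is also easy to reproduce: re-initialize the network a few times and look at the sign of its initial logits. Below is a sketch that mirrors the asker's layer widths for input_size=9 and output_size=1 (the random inputs are made up for illustration):

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1000, 9)  # random inputs, 9 features as in the question

def fresh_model():
    # same layer widths as the asker's Model with input_size=9, output_size=1
    return nn.Sequential(
        nn.Linear(9, 72), nn.PReLU(),
        nn.Linear(72, 720), nn.PReLU(),
        nn.Linear(720, 288), nn.PReLU(),
        nn.Linear(288, 36), nn.PReLU(),
        nn.Linear(36, 1), nn.PReLU(),
    )

for trial in range(5):
    with torch.no_grad():
        logits = fresh_model()(x)
    # sigmoid(logit) > 0.5 exactly when logit > 0, so the sign picks the class
    frac_pos = (logits > 0).float().mean().item()
    print(f"init {trial}: {100 * frac_pos:.0f}% of samples predicted as class 1")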
Upvotes: 2