Guy

Reputation: 163

Why am I getting a low error before doing any optimization?

I am reusing a model-training program I built for a toy example on another problem. The only difference is that the original model was used for regression, so I used MSE as the error criterion, while the new one is used for binary classification, so I am using BCEWithLogitsLoss.

The model is very simple:

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, input_size, output_size):
        super(Model, self).__init__()
        self.fc1 = nn.Sequential( 
            nn.Linear(input_size, 8*input_size),
            nn.PReLU() #parametric relu - same as leaky relu except the slope is learned
        )
        self.fc2 = nn.Sequential( 
            nn.Linear(8*input_size, 80*input_size),
            nn.PReLU()
        )
        self.fc3 = nn.Sequential( 
            nn.Linear(80*input_size, 32*input_size),
            nn.PReLU()
        )
        self.fc4 = nn.Sequential( 
            nn.Linear(32*input_size, 4*input_size),
            nn.PReLU()
        )                   
        self.fc = nn.Sequential( 
            nn.Linear(4*input_size, output_size),
            nn.PReLU()
        )
                        

    def forward(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        x = self.fc4(x)
        x = self.fc(x)

        return x

And this is where I run it:

model = Model(input_size, output_size)

if loss == 'MSE':
    criterion = nn.MSELoss()
elif loss == 'BCELoss':
    criterion = nn.BCEWithLogitsLoss()

optimizer = torch.optim.SGD(model.parameters(), lr=lr)

model.train()
for epoch in range(num_epochs):
    # Forward pass and loss
    train_predictions = model(train_features)
    print(train_predictions)
    print(train_targets)


    loss = criterion(train_predictions, train_targets)
    
    # Backward pass and update
    loss.backward()
    optimizer.step()

    # zero grad before new step
    optimizer.zero_grad()


    train_size = len(train_features)
    train_loss = criterion(train_predictions, train_targets).item() 
    pred = train_predictions.max(1, keepdim=True)[1] 
    correct = pred.eq(train_targets.view_as(pred)).sum().item()
    #train_loss /= train_size
    accuracy = correct / train_size
    print('\nTrain set: Loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        train_loss, correct, train_size,
        100. * accuracy))

However, when I print the loss, it already starts very low (around 0.6) before I have done any backward pass, and it stays at that level for all subsequent epochs. The prediction vector, however, looks like random garbage:

tensor([[-0.0447],
        [-0.0640],
        [-0.0564],
        ...,
        [-0.0924],
        [-0.0113],
        [-0.0774]], grad_fn=<PreluBackward>)
tensor([[0.],
        [0.],
        [0.],
        ...,
        [0.],
        [0.],
        [1.]])
epoch: 1, loss = 0.6842

I have no clue why it is doing that, and would appreciate any help. Thanks!

EDIT: I added the params in case they help anyone figure this out:

if (dataset == 'adult_train.csv'):
    input_size=9
    print_every = 1
    output_size = 1
    lr = 0.001
    num_epochs = 10
    loss='BCELoss'

EDIT2: Added the accuracy calculation to the middle block.

Upvotes: 1

Views: 360

Answers (1)

jodag

Reputation: 22214

BCE loss is not the same thing as classification error.

The entropy of a Bernoulli distribution with p=0.5 is -ln(0.5) ≈ 0.693. This is the loss you would expect if

  1. Your data is evenly distributed
  2. Your network is guessing randomly

or

  1. Your network always predicts a uniform distribution

Your model is in the second case. The network currently outputs slightly negative logits for every input, and those are interpreted as class-0 predictions. Since your data appears to be imbalanced towards the 0 label, your accuracy will be the same as that of a model which always predicts 0. This is just an artifact of the random weight initialization; if you keep reinitializing the model, you'll find that sometimes it always predicts 1 instead.
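To see this numerically, here is a minimal sketch (not your exact setup: the logit value of -0.05 and the 25% positive-label rate are made-up stand-ins for the numbers you printed). It shows that BCEWithLogitsLoss on near-zero logits comes out at about -ln(0.5) ≈ 0.693, and that thresholding the logits at 0 gives the same accuracy as a constant "always 0" predictor:

import torch
import torch.nn as nn

# BCE with logits: loss = -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))]
# When z ≈ 0, sigmoid(z) ≈ 0.5, so the loss is -ln(0.5) ≈ 0.693 regardless of y.
criterion = nn.BCEWithLogitsLoss()

logits = torch.full((1000, 1), -0.05)           # small negative logits, like an untrained net
targets = (torch.rand(1000, 1) < 0.25).float()  # hypothetical 25% positive labels

print(criterion(logits, targets))  # ~0.69, close to what you see before any training

# For a single-logit binary classifier the predicted class is sigmoid(z) > 0.5,
# i.e. z > 0, so slightly negative logits all map to class 0.
pred = (logits > 0).float()
accuracy = pred.eq(targets).float().mean()
print(accuracy)  # equals the fraction of 0 labels, i.e. the "always predict 0" accuracy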

Upvotes: 2
