varun

Reputation: 13

custom BCE loss giving undefined results

I tried to implement a logistic regression model in PyTorch on a dataset that has 2 features.
I had already done the same thing using only NumPy.
But since I am learning PyTorch, I thought of letting PyTorch calculate the gradients itself instead of computing them manually.

CODE:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
from sklearn.datasets import make_blobs

# making dataset
dataset = make_blobs(n_samples=200)
X = dataset[0]
y = dataset[1]

# for plotting the points
# plt.plot(X[y==0][:,0], X[y==0][:,1], '.')
# plt.plot(X[y==1][:,0], X[y==1][:,1], '.')

# making data the right shape
# X -> (n, m)
# y -> (1, m)
X = dataset[0].T
y = dataset[1].reshape(1, -1)

# converting to tensors 
X = torch.from_numpy(X)
y = torch.from_numpy(y)
X = X.type(torch.float32)
y = y.type(torch.float32)

# initializing weights, bias
n = X.shape[0]
m = X.shape[1]
W = torch.randn(n, 1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

def sigmoid(x):
    return 1/(1 + torch.exp(-x))

def cost(X, y, W, b, m):
    # binary cross-entropy: -(1/m) * sum( y*log(fn) + (1-y)*log(1-fn) )
    fn = sigmoid(torch.matmul(W.T, X) + b)  # predictions, shape (1, m)
    cost1 = y*torch.log(fn)
    cost2 = (1-y)*torch.log(1-fn)
    return (-1/m) * torch.sum(cost1 + cost2)

def logistic_regression(X, y, W, b, epochs=1000, learning_rate=0.0001):
    m = X.shape[1]
    losses = []
    for i in range(epochs):
        loss = cost(X, y, W, b, m)
        losses.append(loss.item())
        
        loss.backward()
        
        with torch.no_grad():
            W -= learning_rate * W.grad
            b -= learning_rate * b.grad
            
        W.grad.zero_()
        b.grad.zero_()
        
    return losses

losses = logistic_regression(X, y, W, b)

The problem is that sometimes this works fine, but sometimes the values inside losses are nan or -inf, and sometimes the loss is negative.
Can anyone tell me what the error is here?

Some code for plotting decision boundary:

# To plot the decision boundary

theta = W.detach().numpy()
bias = b.detach().numpy()

X = dataset[0]
y = dataset[1]

# Adjust the range of test_points according to the dataset
test_points = np.linspace(-10, 10, 100)
# the equation is -> theta0 * x1 + theta1 * x2 + b = 0
# we are plotting x1 (x-axis) vs x2 (y-axis)
# so x2 = (-theta0 * x1 - b)/theta1

y_points = (-theta[0]*test_points - bias)/theta[1]
plt.plot(X[y==0][:,0], X[y==0][:,1], '.')
plt.plot(X[y==1][:,0], X[y==1][:,1], '.')
plt.plot(test_points, y_points)

Upvotes: 0

Views: 118

Answers (1)

Alexey Birukov

Reputation: 1660

The probable source of trouble here is the sigmoid implementation.

def sigmoid(x):
    return 1/(1 + torch.exp(-x))

If x is very negative (below roughly -88 in float32), torch.exp(-x) overflows to inf and sigmoid returns 1/inf, i.e. exactly 0. Theoretically the sigmoid never reaches 0, but in floating point it does, and then torch.log(fn) in the cost is -inf (and y*torch.log(fn) can be 0 * -inf = nan). The symmetric problem hits torch.log(1-fn) when fn rounds to exactly 1 for moderately large positive x. See: optimal way of defining a numerically stable sigmoid function for a list in python
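
A sketch of two common workarounds, not from the original post (the names cost_with_logits, cost_clamped, and eps are my own): either compute the loss directly from the raw logits with torch.nn.functional.binary_cross_entropy_with_logits, which handles these extremes internally, or clamp the sigmoid output away from exactly 0 and 1 before taking the log:

import torch
import torch.nn.functional as F

# Variant 1: let PyTorch fuse sigmoid + BCE in a numerically stable way.
# The function takes the raw scores (logits), so the explicit sigmoid() is dropped.
def cost_with_logits(X, y, W, b):
    logits = torch.matmul(W.T, X) + b  # shape (1, m)
    # default reduction='mean' averages over the m samples,
    # matching the original (-1/m) * sum
    return F.binary_cross_entropy_with_logits(logits, y)

# Variant 2: keep the hand-written formula, but clamp the sigmoid
# output so that log() never sees exactly 0.
def cost_clamped(X, y, W, b, m, eps=1e-7):
    fn = torch.sigmoid(torch.matmul(W.T, X) + b)
    fn = torch.clamp(fn, eps, 1 - eps)  # avoid log(0) = -inf
    return (-1/m) * torch.sum(y*torch.log(fn) + (1-y)*torch.log(1-fn))

Either function can replace cost in the training loop above with no other changes.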

Upvotes: 1
