I tried to implement a logistic regression model in PyTorch on a dataset that has 2 features.
I had already done the same thing in plain NumPy.
But since I am learning PyTorch, I wanted to let it compute the gradients itself instead of calculating them manually.
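Just to show the pattern I mean by "calculate the gradients itself", here is a toy autograd example (unrelated to the actual dataset below):

import torch

w = torch.tensor([2.0], requires_grad=True)  # leaf tensor tracked by autograd
loss = (3.0 * w).sum()                       # forward pass records the graph
loss.backward()                              # backward pass fills w.grad
print(w.grad)                                # tensor([3.]), i.e. d(3w)/dw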
CODE:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
from sklearn.datasets import make_blobs

# making the dataset
dataset = make_blobs(n_samples=200)
X = dataset[0]
y = dataset[1]

# for plotting the points
# plt.plot(X[y==0][:,0], X[y==0][:,1], '.')
# plt.plot(X[y==1][:,0], X[y==1][:,1], '.')

# getting the data into the right shape
# X -> (n, m)
# y -> (1, m)
X = dataset[0].T
y = dataset[1].reshape(1, -1)

# converting to float32 tensors
X = torch.from_numpy(X).type(torch.float32)
y = torch.from_numpy(y).type(torch.float32)

# initializing weights and bias
n = X.shape[0]
m = X.shape[1]
W = torch.randn(n, 1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

def cost(X, y, W, b, m):
    fn = sigmoid(torch.matmul(W.T, X) + b)
    cost1 = y * torch.log(fn)
    cost2 = (1 - y) * torch.log(1 - fn)
    return (-1 / m) * torch.sum(cost1 + cost2)

def logistic_regression(X, y, W, b, epochs=1000, learning_rate=0.0001):
    m = X.shape[1]
    losses = []
    for i in range(epochs):
        loss = cost(X, y, W, b, m)
        losses.append(loss.item())
        loss.backward()
        with torch.no_grad():
            W -= learning_rate * W.grad
            b -= learning_rate * b.grad
            W.grad.zero_()
            b.grad.zero_()
    return losses

losses = logistic_regression(X, y, W, b)
The problem is that sometimes this works fine, but sometimes the values inside losses are nan or -inf, and sometimes the loss comes out negative.
Can anyone tell me what the error is here?
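For reference, a quick sanity check I can run on the recorded values (separate from the model itself):

import numpy as np

# indices of the epochs where the recorded loss is nan or +/-inf
bad = [i for i, l in enumerate(losses) if not np.isfinite(l)]
print(bad[:10])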
Some code for plotting the decision boundary:
# To plot the decision boundary
theta = W.detach().numpy()
bias = b.detach().numpy()
X = dataset[0]
y = dataset[1]
# Do change the value of test_points according to the dataset
test_points = np.linspace(-10, 10, 100)
# the equation is -> theta0 * x1 + theta1 * x2 + b = 0
# we are plotting x1 (x-axis) vs x2 (y-axis)
# so x2 = (-theta0 * x1 - b)/theta1
y_points = (-theta[0]*test_points - bias)/theta[1]
plt.plot(X[y==0][:,0], X[y==0][:,1], '.')
plt.plot(X[y==1][:,0], X[y==1][:,1], '.')
plt.plot(test_points, y_points)
Answer:
The probable source of trouble here is the sigmoid implementation:
def sigmoid(x):
    return 1/(1 + torch.exp(-x))
If x < ~-20, torch.exp(-x) overflows and you get 1/inf. Theoretically that is fine (it is just 0), but in practice it is not: once the sigmoid output saturates to exactly 0 or exactly 1 in float32, the cost evaluates torch.log(0) = -inf, and 0 * -inf produces nan, which matches the values you see in losses. See optimal way of defining a numerically stable sigmoid function for a list in python.
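A minimal sketch of two possible fixes (an assumption about how you might restructure the code, not the only option): swap in the built-in torch.sigmoid, which avoids the overflow in exp, or, more robustly, compute the loss directly from the raw scores with torch.nn.functional.binary_cross_entropy_with_logits, which fuses the sigmoid and the log so that log(0) is never evaluated:

import torch
import torch.nn.functional as F

# drop-in replacement: the built-in op handles large |x| without overflowing
def sigmoid(x):
    return torch.sigmoid(x)

# safer still: work with the raw scores ("logits") and let the fused loss
# handle sigmoid + log internally; m is kept only so the call site in the
# training loop stays unchanged
def cost(X, y, W, b, m):
    logits = torch.matmul(W.T, X) + b
    return F.binary_cross_entropy_with_logits(logits, y)  # mean over samples

Note that torch.sigmoid alone can still saturate to exactly 0 or 1 in float32 for large |x|, so the hand-written log terms can still hit log(0); the logits-based loss is the reliable fix.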