iamnottheway

Reputation: 226

regression line doesn't fit the data and gradient descent gives inaccurate weights - python3

I've implemented linear regression and gradient descent from scratch, and it gives me weird results, like weights that are very small negative numbers.

Sample data

   609.0,241.0
   629.0,222.0
   620.0,233.0
   564.0,207.0
   645.0,247.0
   493.0,189.0
   606.0,226.0
   672.0,231.0
   778.0,263.0

Gray Kangaroos dataset

Sample Data can be found at : http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/slr/frames/frame.html
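
A note on the loading step, which is not shown in the code below: one minimal way to get a data array where data[0] is the x column and data[1] is the y column, assuming the two-column sample above is saved as a comma-separated file (the filename here is made up), is:

    import numpy as np
    # "kangaroos.csv" is a placeholder name for a file holding the two comma-separated columns above
    data = np.genfromtxt("kangaroos.csv", delimiter=",").T   # data[0] -> x values, data[1] -> y values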

import numpy as np 
import matplotlib.pyplot as plt

# loading data from a csv file
x_dataset = np.array(data[0],dtype = np.float64)
y_dataset = np.array(data[1],dtype = np.float64)
m = len(y_dataset)
theta  = np.array([ 0 for z in range(len(x_dataset))],dtype = np.float64)
theta[0] = 0.5
theta[1] = 0.3

def hypothesis(x,theta_hyp):
    hyp =  np.dot(theta_hyp.T,x)
    return hyp

def gradient(theta,x,y,numIter = 30,alpha = 0.00000001):
    for i in range(numIter):
        loss = y - hypothesis(x,theta)
        error = np.sum(loss**2)/2*m
        print("Cost : {0} at {1} itertion".format(error,i))
        # just to plot the cost function 
        #cost_list.append(error)
        #iter_list.append(i)
        gradientD = np.dot(x.T,loss)
        # here if I subtract it gives me negative results
        theta = theta - alpha*gradientD
    return theta

After playing with the problem I figured out that if theta is negative, the cost function increases, and if theta is positive, the cost function decreases. I wanted the cost function to decrease, so I changed the code a bit, which gave me a positive theta and a decreasing cost function.

  # adding gives +ve theta
  theta = theta + alpha*gradientD 

I plotted the graph of the cost function

[plot of the cost function J(theta): x axis - iterations, y axis - cost]

After training it gives me some weights. When I use those weights to predict y, the predictions aren't good, and when I plot the regression line on the graph, it doesn't fit the data at all.

[plot of the regression line on the data]

I'm still learning about this stuff and I'm not sure if my implementation is right. Also, my learning rate is really small; everywhere else I've seen learning rates no smaller than 0.001. When I used 0.001 as the learning rate, the cost function came out wrong.

I'm not sure if I've explained this clearly, but I'd really appreciate your help.

Upvotes: 0

Views: 264

Answers (1)

T3am5hark

Reputation: 866

You've got error and loss defined backwards: the error is the difference between the prediction and the data, and the loss function maps that error onto an objective for a fit routine. Your gradient calculation is roughly correct (although it's not scaled consistently with how you've defined the loss function, and the "loss" term in the gradient calculation is actually the error).
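
To make that concrete, here is a minimal sketch of one update step with the naming described above. It assumes x is an (m x n) design matrix and the prediction is x.dot(theta) (the question's code does not build a design matrix), and the 1/m scaling is just one common convention for keeping the gradient consistent with the loss:

    import numpy as np

    def gradient_step(theta, x, y, alpha):
        # x: (m, n) design matrix, y: (m,) targets, theta: (n,) weights
        m = len(y)
        error = x.dot(theta) - y             # error: prediction minus data
        loss = np.sum(error**2) / (2.0 * m)  # loss: objective built from that error
        grad = x.T.dot(error) / m            # gradient of this loss w.r.t. theta
        return theta - alpha * grad, loss    # subtracting the gradient steps the loss downhill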

However, your value of alpha (step size) is extremely small, which will impact speed of convergence. Because you only allow 30 iterations, it may not converge (it clearly starts in a really bad place with loss=6e7 - it's not clear from the scale of the graph how close to zero it gets by the 30th iteration). Try upping the alpha value to see if it gets closer to its final value in the 30 iterations allowed (based on the end-state loss value). Right now your graph of loss vs. iteration is getting swamped by the very high value of initial-state loss (plotting the log of the loss or log10 of the loss may make it easier to compare across experiments).
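
And a small sketch of the plotting suggestion, assuming cost_list holds the per-iteration cost values (as in the commented-out lines of the question's code):

    import numpy as np
    import matplotlib.pyplot as plt

    # plot log10 of the cost so the later iterations aren't swamped by the huge initial value
    plt.plot(np.log10(cost_list))
    plt.xlabel("iteration")
    plt.ylabel("log10(cost)")
    plt.show()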

Upvotes: 0
