hue.y

Reputation: 9

Gradient Descent algorithm not converging for linear regression

I am trying to implement a gradient descent algorithm for an ML class that finds the parameters (intercept b_0 and slope b_1) of the line of best fit by minimizing the squared-error cost function. Here is what I have:

Here is the data:

    year    dipnet days fished  dipnet sockeye harvest
0   1996    10503   102821
1   1997    11023   114619
2   1998    10802   103847
3   1999    13738   149504
4   2000    12354   98262
5   2001    14772   150766
6   2002    14840   180028
7   2003    15263   223580
8   2004    18513   262831
9   2005    20977   295496
10  2006    12685   127630
11  2007    21908   291270
12  2008    20772   234109
13  2009    26171   339993
14  2010    28342   389552
15  2011    32818   537765
16  2012    34374   526992
17  2013    33193   347222
18  2014    36380   379823

and the code...

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("D:/Assignment 1/Exercise1/dip-har-eff.csv")
data.head()

    year days fished sockeye harvest
0   1996    10503   102821
1   1997    11023   114619
2   1998    10802   103847
3   1999    13738   149504
4   2000    12354   98262

np_data = data.values
harvest = np_data[:, 2]
days = np_data[:, 1]

plt.scatter(days, harvest)

start = np.array([0,0])         #the starting values for b_0 and b_1
step = .01                      #the gradient multiplier
iterations = 30                 #number of iterations for the algorithm
batch_size = 3                 #the batch size

X = days[0:batch_size]
Y = harvest[0:batch_size]

def del_cost(b_0, b_1):
    error_x = []
    error_y = []
    for i in range(0, batch_size):
        e = (b_1*X[i] + b_0) - Y[i]
        error_x.append(e)
        f = ((b_1*X[i] + b_0) - Y[i])*X[i]
        error_y.append(f)
    d_x = (1/batch_size)*np.sum(error_x)
    d_y = (1/batch_size)*np.sum(error_y)
    return np.array([d_x, d_y])

for i in range(iterations):
    temp = start
    start = start - step*(del_cost(temp[0], temp[1]))
    print(start[0])
    print(start[1])

The output is...

1070.95666667
11550431.6467
-1244672383.04
-1.3417834015e+13
1.44590456124e+15
1.55871598694e+19
-1.67967091608e+21
-1.81072110836e+25
1.9512314035e+27
2.10346911158e+31
-2.26669638292e+33
-2.44354709455e+37
2.63316410506e+39
2.83860712307e+43
-3.05888042899e+45
-3.29753840927e+49
3.55342436154e+51
3.83066732703e+55
-4.12792359372e+57
-4.44998976483e+61
4.79530488393e+63
5.16944104421e+67
-5.57058492188e+69
-6.00520947728e+73
6.47120821783e+75
6.97610061854e+79
-7.51743962004e+81
-8.10396040706e+85
8.73282029238e+87
9.41416672011e+91
-1.01446974121e+94
-1.09362003986e+98
1.17848395063e+100
1.27043085931e+104
-1.36901512729e+106
-1.47582753557e+110
1.590350397e+112
1.7144316818e+116
-1.84747000587e+118
-1.99161210964e+122
2.14615938035e+124
2.31360563233e+128
-2.4931393047e+130
-2.68765739877e+134
2.8962171447e+136
3.12218391597e+140
-3.36446252059e+142
-3.62696242817e+146
3.90841138178e+148
4.21335091378e+152
-4.54030307537e+154
-4.89454365031e+158
5.27435574267e+160
5.68586809762e+164
-6.12708624038e+166
-6.60512977987e+170
7.11768178497e+172
7.67301292606e+176
-8.26843168262e+178
-8.91354588413e+182

I don't know why the parameters are 1) growing instead of settling toward the minimum, and 2) alternating signs on each iteration. I have checked the calculations by hand for the first couple of iterations and they were correct. I can't figure out what is wrong, please help!

Upvotes: 0

Views: 549

Answers (2)

Faraz Gerrard Jamal

Reputation: 248

Scaling the data could also help. You can use scikit-learn for that.

Also, try reducing the step size and see whether the updates stop diverging.
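
As a rough illustration of the scaling suggestion, here is a minimal sketch that standardizes both columns by hand (scikit-learn's `StandardScaler` performs the same mean/standard-deviation scaling), runs the same update rule as the question, and maps the result back to the original units. It uses only the first three rows, as the question's batch does. The helper names `standardize` and `gradient`, the step of 0.1, and the 500 iterations are assumptions chosen for the example, not values from the question.

```python
from statistics import mean, pstdev

# First three rows of the question's table (the batch the question trains on).
days = [10503.0, 11023.0, 10802.0]
harvest = [102821.0, 114619.0, 103847.0]

def standardize(values):
    """Return (z-scores, mean, population std) for a list of numbers."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values], mu, sigma

def gradient(b0, b1, xs, ys):
    """Same gradient as the question's del_cost: mean error and mean error * x."""
    errors = [b1 * x + b0 - y for x, y in zip(xs, ys)]
    n = len(xs)
    return sum(errors) / n, sum(e * x for e, x in zip(errors, xs)) / n

x_scaled, x_mu, x_sigma = standardize(days)
y_scaled, y_mu, y_sigma = standardize(harvest)

b0, b1 = 0.0, 0.0
step = 0.1          # hypothetical value; workable because the inputs are now O(1)
for _ in range(500):
    g0, g1 = gradient(b0, b1, x_scaled, y_scaled)
    b0, b1 = b0 - step * g0, b1 - step * g1

# Map the fitted line back to the original units.
slope = b1 * y_sigma / x_sigma
intercept = y_mu + b0 * y_sigma - slope * x_mu
print(slope, intercept)
```

After standardizing, the feature has mean 0 and variance 1, so an ordinary step size no longer overshoots, and the recovered slope and intercept should agree with the closed-form least-squares line for these three points.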

Upvotes: 0

S. Heilliette

Reputation: 1

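Your step parameter is too large. You need to decrease it a lot. Values like 0.001 or 0.0001 are a natural next try, but with the unscaled days column (values around 10^4) the step likely has to be far smaller still, on the order of 1e-8, before the updates stop blowing up.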
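
How small the step has to be can be estimated from the data. The sketch below assumes the cost is the usual quadratic whose gradient `del_cost` computes, so its Hessian is [[1, mean(x)], [mean(x), mean(x^2)]] and plain gradient descent is stable only for a step below 2 divided by the largest eigenvalue. A larger step multiplies the error by a large negative factor on every iteration, which would explain both the growth and the alternating signs. The names `max_step` and `run` are made up for this example.

```python
import math

# Same three-row batch the question trains on.
days = [10503.0, 11023.0, 10802.0]
harvest = [102821.0, 114619.0, 103847.0]
n = len(days)

# Entries of the Hessian [[1, mean(x)], [mean(x), mean(x^2)]].
mean_x = sum(days) / n
mean_x2 = sum(x * x for x in days) / n

# Largest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, d]].
a, b, d = 1.0, mean_x, mean_x2
largest_eig = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)

# Gradient descent on a quadratic is stable only for step < 2 / largest_eig.
max_step = 2 / largest_eig
print(max_step)   # roughly 1.7e-08 for this batch

def run(step, iterations=30):
    """Repeat the question's update rule and return the final (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        errors = [b1 * x + b0 - y for x, y in zip(days, harvest)]
        g0 = sum(errors) / n
        g1 = sum(e * x for e, x in zip(errors, days)) / n
        b0, b1 = b0 - step * g0, b1 - step * g1
    return b0, b1

print(run(0.0001))          # still diverges
print(run(0.5 * max_step))  # stays bounded
```

Even with a stable step the intercept moves extremely slowly on the raw data, because the two directions of the cost surface differ in curvature by many orders of magnitude. That is the practical reason to rescale the feature rather than only shrinking the step.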

Upvotes: 0
