I am performing multivariate linear regression in pure Python, as shown in the code below. Can someone please tell me what's wrong with this code? I did the same thing for univariate linear regression and it worked well there!
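For context, the model I'm trying to fit is h(x) = X.dot(theta) (with a column of ones appended to X for the intercept), the cost is sum((h - y)**2) / (2*m), and the gradient descent update is theta := theta - (alpha/m) * np.dot(h - y, X). That is what the code below is meant to implement.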
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# two features per example, plus the target values
x_df = pd.DataFrame([[2.0,70.0],[3.0,30.0],[4.0,80.0],[4.0,20.0],[3.0,50.0],[7.0,10.0],[5.0,50,0],[3.0,90.0],[2.0,20.0]])
y_df = pd.DataFrame([79.4,41.5,97.5,36.1,63.2,39.5,69.8,103.5,29.5])
x_df = x_df.drop(x_df.columns[2:], axis=1)   # keep only the first two columns
#print(x_df)
m = len(y_df)
#print(m)
x_df['intercept'] = 1                        # column of ones for the intercept term
X = np.array(x_df)
#print(X)
#print(X.shape)
y = np.array(y_df).flatten()
#print(y.shape)
theta = np.array([0,0,0])
#print(theta)

def hypothesis(x, theta):
    return np.dot(x, theta)
#print(hypothesis(X,theta))

def cost(x, y, theta):
    m = y.shape[0]
    h = np.dot(x, theta)
    return np.sum(np.square(y - h)) / (2.0 * m)
#print(cost(X,y,theta))

def gradientDescent(x, y, theta, alpha=0.01, iter=1500):
    m = y.shape[0]
    for i in range(1500):
        h = hypothesis(x, theta)
        error = h - y
        update = np.dot(error, x)
        theta = np.subtract(theta, ((alpha * update) / m))
    print('theta', theta)
    print('hyp', h)
    print('y', y)
    print('error', error)
    print('cost', cost(x, y, theta))

print(gradientDescent(X, y, theta))
and the output I get is:
theta [ nan nan nan]
hyp [ nan nan nan nan nan nan nan nan nan]
y [ 79.4 41.5 97.5 36.1 63.2 39.5 69.8 103.5 29.5]
error [ nan nan nan nan nan nan nan nan nan]
cost nan
Can someone please help me solve this? I have been stuck on it for almost 5 hours!
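In case it helps, here is a stripped-down sketch of the same loop that I'm planning to use for debugging, printing the cost every 100 iterations to see whether the numbers grow without bound before turning into nan (it assumes the same X, y, theta, hypothesis() and cost() defined above, so the extra function name is only illustrative):

# Debug sketch: identical update to gradientDescent() above, but the cost
# is printed every 100 iterations so I can watch whether it diverges.
def gradientDescentDebug(x, y, theta, alpha=0.01, iters=1500):
    m = y.shape[0]
    for i in range(iters):
        h = hypothesis(x, theta)                         # current predictions
        error = h - y                                    # residuals
        theta = theta - (alpha / m) * np.dot(error, x)   # gradient step
        if i % 100 == 0:
            print('iteration', i, 'cost', cost(x, y, theta))
    return theta

print(gradientDescentDebug(X, y, theta))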