Reputation: 37
I am brand new to programming and am taking a course in Python. I was asked to do linear regression on a data set that my professor gave out. Below is the program I have written (it doesn't work).
    from math import *
    f=open("data_setshort.csv", "r")
    data = f.readlines()
    f.close()
    xvalues=[]; yvalues=[]
    for line in data:
        x,y=line.strip().split(",")
        x=float(x.strip())
        y=float(y.strip())
        xvalues.append(x)
        yvalues.append(y)
    def regression(x,y):
        n = len(x)
        X = sum(x)
        Y = sum(y)
        for i in x:
            A = sum(i**2)
            return A
        for i in x:
            for j in y:
                C = sum(x*y)
                return C
            return C
        D = (X**2)-nA
        m = (XY - nC)/D
        b = (CX - AY)/D
        return m,b
    print "xvalues:", xvalues
    print "yvalues:", yvalues
    regression(xvalues,yvalues)
I am getting an error that says: line 23, in regression, A = sum(i**2). TypeError: 'float' object is not iterable.
I need to eventually create a plot for this data set (which I know how to do) and for the line defined by the regression. But for now I am trying to do linear regression in Python.
Upvotes: 0
Views: 2784
Reputation: 35
You should probably use something like A += i**2 (with A set to 0 before the loop). As the error message says, you cannot iterate over a float: if i = 2, you can't iterate over it because it is not a list, and sum() needs something to iterate over. Since you need the sum of the squares of all elements of x, iterate over x with for i in x, add each square i**2 to A with A += i**2, and return A once the loop has finished.
Hope this helps!
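For illustration, the loop version might look like the sketch below. It assumes x is a list of floats; the helper name sum_of_squares is made up for this example and is not part of the original program.

```python
def sum_of_squares(x):
    """Return the sum of the squares of the values in x (hypothetical helper)."""
    A = 0.0            # start the running total at zero
    for i in x:        # i is a single float here, not a list
        A += i**2      # add this element's square to the total
    return A           # return only after the loop has finished

print(sum_of_squares([1.0, 2.0, 3.0]))  # 1 + 4 + 9
```

Note that the return sits outside the loop; returning inside it would stop after the first element.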
Upvotes: 0
Reputation: 2386
You can't sum over a single float, but you can sum over lists. You probably mean A = sum([xi**2 for xi in x]) to calculate the sum of each element of x raised to the power of 2. You also have various return statements in your code that don't make sense and can probably be removed completely, e.g. the return C after the loop. Additionally, in Python two variables a and b can only be multiplied by writing a*b. Simply writing ab does not work; it is treated as a single variable named "ab".
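Both points can be checked quickly in an interpreter. The snippet below is only a sketch with made-up values, not part of the original program:

```python
x = [1.0, 2.0, 3.0]

# sum() needs an iterable; a single float is not one.
try:
    sum(x[0]**2)
except TypeError as err:
    print("TypeError:", err)

# Summing a list of squares works.
A = sum([xi**2 for xi in x])
print(A)

# Writing two names side by side does not multiply them:
# Python looks for a variable called nA and fails to find it.
n = 3
try:
    nA
except NameError as err:
    print("NameError:", err)

# Explicit multiplication with * works.
print(n * A)
```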
The corrected code could look like this:
    def regression(x,y):
        n = len(x)
        X = sum(x)
        Y = sum(y)
        A = sum([xi**2 for xi in x])
        C = sum([xi*yi for xi, yi in zip(x,y)])
        D = X**2 - n*A
        m = (X*Y - n*C) / float(D)
        b = (C*X - A*Y) / float(D)
        return (m, b)
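As a quick sanity check, a function like this can be run on points that lie exactly on a known line. The sketch below repeats the function so that it runs on its own and uses made-up data on y = 2x + 1 (the names xs and ys are just for this example); the result should be a slope close to 2 and an intercept close to 1.

```python
def regression(x, y):
    n = len(x)
    X = sum(x)                                  # sum of x values
    Y = sum(y)                                  # sum of y values
    A = sum([xi**2 for xi in x])                # sum of squared x values
    C = sum([xi*yi for xi, yi in zip(x, y)])    # sum of x*y products
    D = X**2 - n*A
    m = (X*Y - n*C) / float(D)                  # slope
    b = (C*X - A*Y) / float(D)                  # intercept
    return (m, b)

# Made-up test data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

m, b = regression(xs, ys)
print("slope:", m)
print("intercept:", b)
```

If all x values are identical, D is zero and the division fails, so real data may need a check for that case.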
Upvotes: 1