Reputation:
Just as a disclaimer, this question is about schoolwork. Literally, though: my professor said to use this site for help.
I am taking machine learning, and while our professor is a brilliant mathematician, he may be a little lacking on the programming side of things.
The name of the game here is to read code portions and find/fix the mistake.
I spent hours on this part, and I reckon my issue is taking a dot product between a DataFrame and NumPy zeros.
Errors occur such as: unsupported operand type(s) for +: 'float' and 'str'
I tried reading the documentation and searching this site for a workaround, but I am very much a novice at programming, especially with libraries like NumPy and pandas.
This is Python code using pandas.
# Initialize the parameter vector theta with zeros, length equal to the number of columns in X
X = pd.DataFrame(X)
theta = np.zeros(X.shape[1], dtype = int)
print(theta)
def cost_function(X, y, theta):
"""
cost_function(X, y, theta) computes the cost of using theta as the
parameter for linear regression to fit the data points in X and y
"""
## number of training examples
m = len(y)
## Calculate the cost with the given parameters
J = 1/(2*m)*np.sum((X.dot(theta)-y)**2)
return J
#Initial cost
cost_function(X,y,theta)
Running the last line yields the error.
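For what it's worth, a minimal sketch of how that exact error can arise: if one column of X was read in as strings (the data here is hypothetical, made up to illustrate), multiplying a string by an integer zero from theta produces an empty string, and the subsequent float + str addition fails. Coercing the frame to numeric dtypes avoids it:

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "x1" was read in as strings, a common
# cause of "unsupported operand type(s) for +: 'float' and 'str'".
X = pd.DataFrame({"x0": [1.0, 2.0, 3.0], "x1": ["4", "5", "6"]})
theta = np.zeros(X.shape[1], dtype=int)

# X.dot(theta) would raise a TypeError on the mixed float/str columns.
# Coercing every column to a numeric dtype first avoids it:
X = X.apply(pd.to_numeric)
print(X.dot(theta).tolist())  # all zeros, since theta is all zeros
```

This is only a guess at the root cause; checking X.dtypes would confirm whether any column is object-typed.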
Upvotes: 0
Views: 314
Reputation: 51
Just use np.dot(X, theta) instead of X.dot(theta). Here is the edited code.
def cost_function(X, y, theta):
"""
cost_function(X, y, theta) computes the cost of using theta as the
parameter for linear regression to fit the data points in X and y
"""
## number of training examples
m = len(y)
## Calculate the cost with the given parameters
J = 1/(2*m)*np.sum((np.dot(X,theta)-y)**2)
return J
#Initial cost
cost_function(X,y,theta)
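As a sanity check, the corrected function runs on small numeric data (the toy values below are made up for illustration):

```python
import numpy as np
import pandas as pd

def cost_function(X, y, theta):
    """Half mean squared error cost for linear regression."""
    m = len(y)
    return 1 / (2 * m) * np.sum((np.dot(X, theta) - y) ** 2)

# Hypothetical toy data: a bias column plus one feature.
X = pd.DataFrame({"bias": [1.0, 1.0, 1.0], "x1": [1.0, 2.0, 3.0]})
y = np.array([2.0, 3.0, 4.0])
theta = np.zeros(X.shape[1])

# With theta all zeros the cost is sum(y**2) / (2*m) = 29/6.
print(cost_function(X, y, theta))
```

Note that np.dot only fixes things if every column of X is numeric; if any column holds strings, it needs to be converted (e.g. with pd.to_numeric) first.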
Hope this helps.
Upvotes: 1