user11141880

Python numpy and pandas for machine learning

As a disclaimer, this question is about schoolwork, but my professor explicitly told us to use this site for help.

I am taking machine learning, and while our professor is a brilliant mathematician, he may be a little lacking on the programming side of things.

The name of the game here is to read code portions and find/fix the mistake.

I have spent hours on this part, and I suspect my issue is taking a dot product between a DataFrame and a NumPy array of zeros.

I get errors like: unsupported operand type(s) for +: 'float' and 'str'

I tried reading the documentation and this site for a workaround, but I am very much a novice at programming, especially with libraries like numpy and pandas.

This is programming with python pandas

# Initialize the parameter vector theta with zeros; its length equals the number of columns in X
X = pd.DataFrame(X)
theta = np.zeros(X.shape[1], dtype = int)
print(theta)


def cost_function(X, y, theta):
    """
    cost_function(X, y, theta) computes the cost of using theta as the
    parameter for linear regression to fit the data points in X and y
    """
    ## number of training examples
    m = len(y) 

    ## Calculate the cost with the given parameters
    J = 1/(2*m)*np.sum((X.dot(theta)-y)**2)

    return J

#Initial cost
cost_function(X,y,theta)

Running the last line is what raises most of the errors.

Upvotes: 0

Views: 314

Answers (1)

Subhayan Samanta

Reputation: 51

Just use np.dot(X,theta) instead of X.dot(theta). Here is the edited code.

def cost_function(X, y, theta):
    """
    cost_function(X, y, theta) computes the cost of using theta as the
    parameter for linear regression to fit the data points in X and y
    """
    ## number of training examples
    m = len(y)

    ## Calculate the cost with the given parameters
    J = 1/(2*m)*np.sum((np.dot(X,theta)-y)**2)

    return J

#Initial cost
cost_function(X,y,theta)

Hope this will help you.
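
If the error still appears, the 'float' and 'str' in the message suggest that at least one column of X holds text rather than numbers, which can happen when a file is read in with stray non-numeric values. The sketch below is only an illustration under that assumption: the toy data and the names X_num, y_num and mask are made up here and are not from the question. It checks the column types, converts them with pd.to_numeric, and then computes the cost.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column "x1" was read in as text instead of numbers.
X = pd.DataFrame({"x0": [1.0, 1.0, 1.0], "x1": ["2", "3", "oops"]})
y = np.array([1.0, 2.0, 3.0])

# A non-numeric dtype in this listing points at the problem column.
print(X.dtypes)

# Convert every column to numbers; entries that cannot be parsed become NaN.
X_num = X.apply(pd.to_numeric, errors="coerce")

# Keep only the rows that converted cleanly, and keep y aligned with X.
mask = X_num.notna().all(axis=1).to_numpy()
X_num, y_num = X_num[mask], y[mask]

# Float zeros, one entry per column of X.
theta = np.zeros(X_num.shape[1])


def cost_function(X, y, theta):
    """Squared-error cost for linear regression, as in the question."""
    m = len(y)  # number of training examples
    return 1 / (2 * m) * np.sum((np.dot(X, theta) - y) ** 2)


J = cost_function(X_num, y_num, theta)
print(J)  # 1.25 for this toy data: (1**2 + 2**2) / (2 * 2)
```

Whether dropping the unparseable rows is appropriate depends on the assignment's data; it is used here only to keep the example short.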

Upvotes: 1
