Reputation: 6728
It seems that the following code performs gradient descent correctly:
import numpy as np

def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    for i in range(0, numIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        cost = np.sum(loss ** 2) / (2 * m)
        print("Iteration %d | Cost: %f" % (i, cost))
        # avg gradient per example
        gradient = np.dot(xTrans, loss) / m
        # update
        theta = theta - alpha * gradient
    return theta
Now suppose we have the following sample data:
For the 1st row of the sample data, we have x = [2104, 5, 1, 45], theta = [1, 1, 1, 1], and y = 460.
However, nowhere in the lines
hypothesis = np.dot(x, theta)
loss = hypothesis - y
do we specify which row of the sample data to consider. So how does this code work correctly?
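For reference, here is a minimal sketch (with made-up feature values) of how x, y, and theta fit together when the function is called on the whole data set at once:

```python
import numpy as np

# Hypothetical sample data: each row of x is one training example
# (e.g. size, bedrooms, floors, age); y holds the matching targets.
x = np.array([[2104.0, 5.0, 1.0, 45.0],
              [1416.0, 3.0, 2.0, 40.0],
              [1534.0, 3.0, 2.0, 30.0]])
y = np.array([460.0, 232.0, 315.0])
theta = np.array([1.0, 1.0, 1.0, 1.0])

# np.dot(x, theta) yields one hypothesis value per row, all at once
hypothesis = np.dot(x, theta)
print(hypothesis.shape)  # (3,) -- one value per training example
```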
Upvotes: 2
Views: 5419
Reputation: 1
A hypothesis y is represented by y = w0 + w1*x1 + w2*x2 + w3*x3 + ... + wn*xn, where w0 is the intercept. How is the intercept figured out in the hypothesis formula above, in np.dot(x, theta)?
I am assuming X is the data representing the features, and that theta can be an array like [1, 1, 1, ...] of rowSize(data).
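One common way the intercept is handled (assuming the usual linear-regression convention; the variable names here are made up) is to prepend a column of ones to X, so that theta[0] multiplies a constant 1 and acts as w0:

```python
import numpy as np

# Hypothetical design matrix without an intercept column
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0]])

# Prepend a column of ones so theta[0] acts as the intercept w0
X1 = np.hstack([np.ones((X.shape[0], 1)), X])

theta = np.array([10.0, 2.0, 3.0])  # [w0, w1, w2]
print(np.dot(X1, theta))  # each row: 10 + 2*x1 + 3*x2
```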
Upvotes: 0
Reputation: 545
This looks like a slide from Andrew Ng's excellent Machine Learning course!
The code works because you're using matrix types (from the numpy library?), and the basic operators (+, -, *, /) have been overloaded to perform matrix arithmetic - therefore you don't need to iterate over each row.
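A quick illustration of that overloading (not from the original post): numpy arrays apply arithmetic elementwise, with no explicit loop over rows:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

# Overloaded operators act on every element at once
print(a + b)   # elementwise sum
print(a * 2)   # elementwise scaling
```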
Upvotes: 2
Reputation: 752
First: Congrats on taking the course on Machine Learning on Coursera! :)
hypothesis = np.dot(x, theta)
will compute the hypothesis for all x(i) at the same time, saving each h_theta(x(i)) as a row of hypothesis. So there is no need to reference a single row.
The same is true for loss = hypothesis - y.
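To make that concrete, a small sketch (with hypothetical data) showing that the vectorized expressions give the same result as an explicit per-row loop:

```python
import numpy as np

# Hypothetical data: 3 examples, 4 features each
x = np.array([[2104.0, 5.0, 1.0, 45.0],
              [1416.0, 3.0, 2.0, 40.0],
              [1534.0, 3.0, 2.0, 30.0]])
theta = np.array([1.0, 1.0, 1.0, 1.0])
y = np.array([460.0, 232.0, 315.0])

# Vectorized: all h_theta(x(i)) and losses in one shot
hypothesis = np.dot(x, theta)
loss = hypothesis - y

# Equivalent explicit loop over rows
loop_loss = np.array([np.dot(x[i], theta) - y[i] for i in range(len(y))])
print(np.allclose(loss, loop_loss))  # True
```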
Upvotes: 3