ArekBulski

Reputation: 5078

Gradient Descent diverges, learning rate too high

The code below performs gradient descent one step at a time on a simple linear regression, but theta diverges instead of converging. What could be wrong?

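from numpy import arange, array
from numpy.random import uniform

# noisy samples of the line y = 50 + 4*x
X = arange(100)
Y = 50 + 4*X + uniform(-20, 20, X.shape)

theta = array([0.0, 0.0])  # [intercept, slope]
alpha = 0.001
# one step of GD; both updates use the old theta
theta0 = theta[0] - alpha * sum( theta[0]+theta[1]*x-y    for x,y in zip(X,Y))/len(X)
theta1 = theta[1] - alpha * sum((theta[0]+theta[1]*x-y)*x for x,y in zip(X,Y))/len(X)
theta = array([theta0, theta1])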

Upvotes: 0

Views: 116

Answers (1)

ArekBulski

Reputation: 5078

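The learning rate was too high. X goes up to 99, so the gradient with respect to theta[1] is large, and a step of 0.001 overshoots the minimum by more on every iteration. A smaller step converges: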

alpha = 0.0001
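
To see where the threshold lies, here is a rough sketch (not part of the original code) that estimates the largest stable learning rate for this data and runs both values of alpha. It assumes the usual least-squares cost (1/(2m)) * sum of squared residuals, whose Hessian is A.T @ A / m; for such a quadratic cost, plain gradient descent is stable only when alpha is below 2 divided by the largest eigenvalue of that Hessian. The names A, H, alpha_max and run_gd, as well as the fixed random seed, are illustrative choices of mine.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, only for reproducibility
X = np.arange(100)
Y = 50 + 4 * X + rng.uniform(-20, 20, X.shape)

# Design matrix with a column of ones for the intercept theta[0].
A = np.column_stack([np.ones_like(X, dtype=float), X])

# Hessian of the cost (1/(2m)) * sum((A @ theta - Y)**2) is A.T @ A / m.
H = A.T @ A / len(X)

# For a quadratic cost, gradient descent is stable only if alpha < 2 / lambda_max.
alpha_max = 2.0 / np.linalg.eigvalsh(H).max()


def run_gd(alpha, steps=200):
    """Run batch gradient descent from theta = 0 and return the final theta."""
    theta = np.zeros(2)
    for _ in range(steps):
        grad = A.T @ (A @ theta - Y) / len(X)  # same gradient as the two sums above
        theta = theta - alpha * grad
    return theta


print("largest stable alpha:", alpha_max)
print("alpha = 0.001  ->", run_gd(0.001))   # above the bound: values blow up
print("alpha = 0.0001 ->", run_gd(0.0001))  # below the bound: values stay bounded
```

For X = arange(100) the bound comes out at roughly 0.0006, which sits between the two learning rates tried here. Note that with alpha = 0.0001 the slope settles quickly while the intercept moves very slowly, because the smallest eigenvalue of the Hessian is tiny compared to the largest.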

Upvotes: 1
