Reputation: 289
I am studying the gradient descent method with "Deep Learning from Scratch". In the book's example there is some code that I find hard to understand. This is the code:
    import numpy as np

    # numerical_gradient(f, x) is defined earlier in the book;
    # it returns the numerical gradient of f at x as a NumPy array.

    def gradient_descent(f, init_x, lr=0.01, step_num=100):
        x = init_x
        x_hist = []
        for i in range(step_num):
            x_hist.append(x)  # plot with x_hist
            grad = numerical_gradient(f, x)
            x -= lr * grad
        return x, x_hist

    def function_2(x):
        return x[0]**2 + x[1]**2

    init_x = np.array([-3.0, 4.0])
    x, x_hist = gradient_descent(function_2, init_x, lr=0.1, step_num=100)
I am trying to plot x_hist to see how x decreases. But when I print x_hist, every entry comes out identical:
    x_hist
    [array([-6.11110793e-10, 8.14814391e-10]),
     array([-6.11110793e-10, 8.14814391e-10]),
     array([-6.11110793e-10, 8.14814391e-10]),
     ...
     array([-6.11110793e-10, 8.14814391e-10])]
I can fix this problem if I change x_hist.append(x) to x_hist.append(x.copy()). Unfortunately, I don't understand why this makes a difference. Can anyone tell me the difference between the two?
Upvotes: 0
Views: 84
Reputation: 173
Your list x_hist contains references to x, not copies of its value. The in-place update x -= lr * grad mutates the same NumPy array on every iteration, so after the loop every element of the list points at the same, final array. Appending x_hist.append(x.copy()) instead stores an independent snapshot of x at each step, which is why it fixes the problem.
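Here is a minimal sketch of the difference, not from the book; the names hist_ref and hist_copy are made up for the illustration:

    import numpy as np

    # Illustration only: hist_ref / hist_copy are hypothetical names.
    x = np.array([-3.0, 4.0])
    hist_ref = []   # will hold references to the one array object
    hist_copy = []  # will hold independent snapshots

    for _ in range(3):
        hist_ref.append(x)          # the same object is appended every time
        hist_copy.append(x.copy())  # a new array is appended each time
        x -= 0.1 * x                # in-place update: mutates x itself

    print(hist_ref[0] is hist_ref[1])  # True -- one shared object
    print(hist_ref)    # three identical arrays (the final value of x)
    print(hist_copy)   # three different arrays, one per step

Note that x -= lr * grad is the in-place form; writing x = x - lr * grad instead would bind x to a brand-new array on each iteration, which would also avoid the aliasing, at the cost of an extra allocation per step.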
Upvotes: 1