Reputation: 97
How can I get the gradients of a function using tf.gradients? The code below works when I use GradientDescentOptimizer.minimize() instead, but tf.gradients seems to just evaluate the derivative of x^2 + 2, which is 2x, at x = 1 (so it prints 2).
What am I missing?
import tensorflow as tf

x = tf.Variable(1.0, trainable=True)
y = x**2 + 2
grad = tf.gradients(y, x)
# grad = tf.train.GradientDescentOptimizer(0.1).minimize(y)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    grad_value = sess.run(grad)
    print(grad_value)  # prints [2.0]
Upvotes: 2
Views: 903
Reputation: 6499
If I understand your question correctly, you want to find the value of x which minimizes x^2 + 2.
To do so, you need to run the GradientDescentOptimizer update repeatedly until x converges to the value which minimizes the function. This is because gradient descent is an iterative technique.
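To make the iteration concrete, here is a minimal plain-Python sketch of the same update rule (no TensorFlow involved; it only uses the fact that the derivative of x**2 + 2 is 2*x):
x = 1.0
learning_rate = 0.2
for i in range(10):
    print('Iteration:', i, ', x:', x)
    x = x - learning_rate * (2 * x)  # one gradient descent step: x <- x - lr * dy/dx
This prints x shrinking from 1.0 towards the minimizer 0, which is exactly what the TensorFlow code below automates.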
Also, in TensorFlow, the minimize method of GradientDescentOptimizer both computes the gradients and applies them to the relevant variables (x in your case); a two-step equivalent is sketched after the output below. So the code should look like this (notice I commented out the grad variable, which is not needed unless you want to inspect the gradient values):
import tensorflow as tf

x = tf.Variable(1.0, trainable=True)
y = x**2 + 2
# grad = tf.gradients(y, x)
grad_op = tf.train.GradientDescentOptimizer(0.2).minimize(y)
init = tf.global_variables_initializer()
n_iterations = 10
with tf.Session() as sess:
    sess.run(init)
    for i in range(n_iterations):
        _, new_x = sess.run([grad_op, x])
        print('Iteration:', i, ', x:', new_x)
and you get:
Iteration: 0 , x: 1.0
Iteration: 1 , x: 0.6
Iteration: 2 , x: 0.36
Iteration: 3 , x: 0.216
Iteration: 4 , x: 0.1296
Iteration: 5 , x: 0.07776
Iteration: 6 , x: 0.046656
Iteration: 7 , x: 0.0279936
Iteration: 8 , x: 0.01679616
Iteration: 9 , x: 0.010077696
which you can see is converging to the true answer, which is 0. Each step multiplies x by (1 - 2 * learning_rate) = 0.6, so the convergence is geometric. If you increase the learning rate of GradientDescentOptimizer from 0.2 to 0.4, the factor drops to 0.2 and it converges to 0 much faster.
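For reference, minimize is just shorthand for calling compute_gradients followed by apply_gradients, so an equivalent and more explicit way to build grad_op for the same x and y is:
import tensorflow as tf

x = tf.Variable(1.0, trainable=True)
y = x**2 + 2
opt = tf.train.GradientDescentOptimizer(0.2)
grads_and_vars = opt.compute_gradients(y)      # list of (gradient, variable) pairs, here [(2*x, x)]
grad_op = opt.apply_gradients(grads_and_vars)  # same effect as opt.minimize(y)
The two-step form is useful when you want to inspect or transform the gradients (e.g. clipping) before they are applied.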
EDIT
OK, based on my new understanding of the question: to manually implement gradient descent, you cannot do x = x - alpha * gradient, because this is a Python assignment which simply rebinds the name x to a new object; the underlying variable is never changed. You need to tell TensorFlow to add the update op to the graph, and this can be done using x.assign (see the short demonstration of the pitfall after the output below). It will look like:
import tensorflow as tf

x = tf.Variable(1.0, trainable=True)
y = x**2 + 2
grad = tf.gradients(y, x)
# grad_op = tf.train.GradientDescentOptimizer(0.2).minimize(y)
update_op = x.assign(x - 0.2*grad[0])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(10):
        _, new_x = sess.run([update_op, x])
        print('Iteration:', i, ', x:', new_x)
and we get the same answer as the native GradientDescentOptimizer:
Iteration: 0 , x: 1.0
Iteration: 1 , x: 0.6
Iteration: 2 , x: 0.36
Iteration: 3 , x: 0.216
Iteration: 4 , x: 0.1296
Iteration: 5 , x: 0.077759996
Iteration: 6 , x: 0.046655998
Iteration: 7 , x: 0.027993599
Iteration: 8 , x: 0.01679616
Iteration: 9 , x: 0.010077696
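To see the rebinding pitfall concretely, here is a minimal sketch of what happens without x.assign: the subtraction only builds a new tensor in the graph, the variable itself never changes, and every run returns the same value:
import tensorflow as tf

x = tf.Variable(1.0, trainable=True)
y = x**2 + 2
grad = tf.gradients(y, x)
x_new = x - 0.2 * grad[0]  # a plain tensor; it does NOT update the variable x

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(3):
        print(sess.run(x_new))  # prints 0.6 every time, because x stays at 1.0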
Upvotes: 5