ZKA

Reputation: 97

Minimize a function in TensorFlow

How can I minimize a function using tf.gradients? The code below works when I use GradientDescentOptimizer.minimize(), but with tf.gradients I only get the derivative of x**2 + 2 (which is 2x) evaluated at x = 1, i.e. 2.

What am I missing ?

x = tf.Variable(1.0, trainable=True)
y = x**2 + 2

grad = tf.gradients(y, x)
#grad = tf.train.GradientDescentOptimizer(0.1).minimize(y)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    grad_value = sess.run(grad)
    print(grad_value)

Upvotes: 2

Views: 903

Answers (1)

Gerges

Reputation: 6499

If I understand your question correctly, you want to find the value of x which minimizes x^2 + 2.

To do so, you need to repeatedly call GradientDescentOptimizer until x converges to the value which minimizes the function. This is because gradient descent is an iterative technique.
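As a language-agnostic sketch of that loop (plain Python with a hand-coded derivative; the learning rate 0.2 is just an illustrative choice):

```python
# Gradient descent on f(x) = x**2 + 2, whose derivative is 2*x.
x = 1.0
learning_rate = 0.2
for i in range(10):
    grad = 2 * x                   # derivative of x**2 + 2 at the current x
    x = x - learning_rate * grad   # one gradient-descent step
print(x)  # close to 0 after 10 iterations
```

Each step multiplies x by (1 - 2 * 0.2) = 0.6, so x shrinks geometrically toward the minimizer 0.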

Also, in TensorFlow, the minimize method of GradientDescentOptimizer both computes the gradients and applies them to the relevant variables (x in your case). So the code should look like this (notice I commented out the grad line, which is not needed unless you want to inspect the gradient values):

x = tf.Variable(1.0, trainable=True)
y = x**2 + 2

# grad = tf.gradients(y, x)
grad_op = tf.train.GradientDescentOptimizer(0.2).minimize(y)

init = tf.global_variables_initializer()

n_iterations = 10
with tf.Session() as sess:
    sess.run(init)
    for i in range(n_iterations):
        _, new_x = sess.run([grad_op, x])
        print('Iteration:', i,', x:', new_x)

and you get:

Iteration: 0 , x: 1.0
Iteration: 1 , x: 0.6
Iteration: 2 , x: 0.36
Iteration: 3 , x: 0.216
Iteration: 4 , x: 0.07776
Iteration: 5 , x: 0.07776
Iteration: 6 , x: 0.046656
Iteration: 7 , x: 0.01679616
Iteration: 8 , x: 0.010077696
Iteration: 9 , x: 0.010077696

which you can see is converging to the true answer, which is 0.

If you increase the learning rate of GradientDescentOptimizer from 0.2 to 0.4, it will converge to 0 much faster.
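That is because, for this quadratic, the update is x ← x − lr·2x = (1 − 2·lr)·x, so each step shrinks x by a factor of 0.6 with lr = 0.2 but by a factor of 0.2 with lr = 0.4. A quick plain-Python check (gd is just a hypothetical helper name for this illustration):

```python
def gd(lr, steps=10, x=1.0):
    """Run gradient descent on f(x) = x**2 + 2 starting from x = 1.0."""
    for _ in range(steps):
        x -= lr * 2 * x  # gradient of x**2 + 2 is 2*x
    return x

print(gd(0.2))  # 0.6**10, about 6.0e-3
print(gd(0.4))  # 0.2**10, about 1.0e-7 -- much closer to 0
```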

EDIT

OK, based on my new understanding of the question: to implement gradient descent manually, you cannot write x = x - alpha * gradient, because that is a plain Python assignment which simply rebinds the name x to a new object. You need to tell TensorFlow to add an update op to the graph, which is done with x.assign. It will look like:

x = tf.Variable(1.0, trainable=True)
y = x**2 + 2

grad = tf.gradients(y, x)  # returns a list of gradients, hence grad[0] below
# grad_op = tf.train.GradientDescentOptimizer(0.2).minimize(y)

update_op = x.assign(x - 0.2*grad[0])

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for i in range(10):
        _, new_x = sess.run([update_op, x])
        print('Iteration:', i, ', x:', new_x)

and we get the same answer as with the built-in GradientDescentOptimizer:

Iteration: 0 , x: 1.0
Iteration: 1 , x: 0.6
Iteration: 2 , x: 0.36
Iteration: 3 , x: 0.1296
Iteration: 4 , x: 0.1296
Iteration: 5 , x: 0.077759996
Iteration: 6 , x: 0.046655998
Iteration: 7 , x: 0.027993599
Iteration: 8 , x: 0.01679616
Iteration: 9 , x: 0.010077696
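The rebinding issue mentioned above is ordinary Python behavior, not anything TensorFlow-specific. A minimal illustration, with a list standing in for a variable object:

```python
x = [1.0]      # stand-in for a mutable variable object
ref = x        # a second reference to the same object
x = [0.6]      # plain assignment rebinds the NAME x to a new object
print(ref)     # [1.0] -- the original object was never modified

y = [1.0]
ref = y
y[0] = 0.6     # in-place update, analogous to x.assign(...)
print(ref)     # [0.6] -- same object, updated value
```

This is why the graph needs an explicit assign op: only an in-place update is visible to everything else that holds a reference to the variable.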

Upvotes: 5
