Reputation: 31
import tensorflow as tf

def f(x):
    return tf.multiply(x, x)

x = tf.Variable([3.])

with tf.GradientTape() as test_tape:
    test_tape.watch(x)
    with tf.GradientTape() as train_tape:
        train_tape.watch(x)
        fx = f(x)
    gradient = train_tape.gradient(fx, x)  # df(x)/dx = d(x^2)/dx = 2x
    x_prime = x.__copy__()                 # x' = x
    x_prime = tf.subtract(x_prime, tf.multiply(gradient, 0.01))  # x' = x - 0.01 * 2x = 0.98x
    fx_prime = f(x_prime)
gradient = test_tape.gradient(fx_prime, x)  # df(x')/dx = d((0.98x)^2)/dx = 1.9208 * x = 5.7624
print(gradient)
I'm learning TensorFlow 2.0's GradientTape() and testing this code, which computes the derivative d f(x - 0.01 * df(x)/dx)/dx. Given x = 3 and f(x) = x*x, the correct result is 5.7624, and the code above gets the right answer. Then I tried to replace the line
x_prime = tf.subtract(x_prime, tf.multiply(gradient, 0.01))
by
optimizer = tf.optimizers.SGD()
optimizer.apply_gradients(zip([gradient], [x_prime]))
And got the wrong answer 5.88. I can't get my head around this; my guess is that GradientTape does not trace apply_gradients? Does anybody know why?
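For reference, here is my guess at where the two numbers come from (a quick arithmetic check, assuming x = 3 and the default chain-rule interpretation; not part of the TensorFlow code itself):

```python
x = 3.0

# Correct chain rule: x' = x - 0.01 * 2x = 0.98x, and dx'/dx = 0.98, so
# d f(x')/dx = 2 * x' * dx'/dx = 2 * (0.98 * x) * 0.98 = 1.9208 * x
correct = 2 * (0.98 * x) * 0.98
print(correct)  # 5.7624

# What apply_gradients gives: f is differentiated at the updated value
# x' = 0.98x, but x' is treated as independent of x (dx'/dx = 1):
wrong = 2 * (0.98 * x) * 1.0
print(wrong)  # 5.88
```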
python-3.7, tensorflow-2.0.0
Upvotes: 2
Views: 230
Reputation: 31
OK, I found the answer myself. The optimizer.apply_gradients operation does not create a node in the graph; it only changes the value in the variable's original memory, so there is no connection between x_prime and the previous nodes. Besides, some other in-place operations and functions also do not work inside GradientTape, e.g. tf.Variable().assign(), .assign_add(), .assign_sub(), tf.keras.layers.Layer.set_weights(), etc.
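To illustrate, here is a minimal sketch (not the original code above) contrasting an out-of-place update, which the tape records, with an in-place assign(), which it does not:

```python
import tensorflow as tf

x = tf.Variable([3.0])
x_prime = tf.Variable([3.0])

with tf.GradientTape(persistent=True) as tape:
    # Out-of-place update: x_prime_t is a new tensor wired to x in the tape.
    x_prime_t = x - 0.01 * 2.0 * x   # 0.98x
    f1 = x_prime_t * x_prime_t

    # In-place update: assign() only overwrites the variable's memory;
    # no node connects x_prime back to x.
    x_prime.assign(x - 0.01 * 2.0 * x)
    f2 = x_prime * x_prime

print(tape.gradient(f1, x))  # [5.7624] -- connected through the tape
print(tape.gradient(f2, x))  # None    -- connection broken by assign()
```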
Upvotes: 1