Reputation: 2124
How exactly does tf.gradients behave when passed a list of tensors as its first argument? Take this very small example:
import tensorflow as tf

a = tf.constant(5)
b = tf.constant(7)
c = a + 2 * b
If I compute the gradients of a single tensor, c, with respect to [a, b], I get the expected answer:
grads = tf.gradients(c, [a, b])
with tf.Session() as sess:
    sess.run(grads)  # returns [1, 2]
According to the TensorFlow documentation, if you pass in a list of tensors as your first argument ys, tf.gradients will sum the gradients over that list, returning sum_over_ys(dy/dx) for each x in your second argument xs. So I would expect:
tf.gradients([a, b, c], [a, b])
to behave the same way as:
tf.gradients(a + b + c, [a, b])
Am I reading the docs wrong? When I test this code, I get the expected result [2, 3] for the second expression (explicitly summing a + b + c), but [2, 1] for the first. Where is this [2, 1] coming from?
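For reference, the documented sum-over-ys semantics really do predict [2, 3] here. That can be sanity-checked without TensorFlow at all, using a plain-Python central-difference sketch (numeric_grads is a hypothetical helper, not a TensorFlow API):

```python
def numeric_grads(ys_fn, xs, eps=1e-6):
    """Approximate d(sum of ys)/dx_i by central differences."""
    total = lambda v: sum(ys_fn(v))
    grads = []
    for i in range(len(xs)):
        hi = list(xs); hi[i] += eps
        lo = list(xs); lo[i] -= eps
        grads.append((total(hi) - total(lo)) / (2 * eps))
    return grads

# ys = [a, b, c] with c = a + 2*b, matching the question.
ys = lambda v: [v[0], v[1], v[0] + 2 * v[1]]
print([round(g) for g in numeric_grads(ys, [5.0, 7.0])])  # [2, 3]
```

Since sum(ys) = a + b + (a + 2b) = 2a + 3b, the gradient with respect to [a, b] should be [2, 3], which is what the explicit sum a + b + c gives.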
Upvotes: 0
Views: 539
Reputation: 24581
This is due to the fact that you are using tf.constant, which in theory shouldn't be affected by inputs. If you repeat your experiment with anything else (e.g. tf.Variables), it works as expected.
When you apply an operator to a constant (be it addition, or even tf.identity), you obtain a new tensor that is not a constant, even though it depends only on constants -- and therefore you obtain the expected behavior.
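A minimal sketch of that last point, assuming TF 2.x with v1 compatibility mode (tf.compat.v1): wrapping each constant in tf.identity before taking gradients makes every tensor in ys a non-constant op output, so the documented summing applies.

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

a = tf.constant(5.0)
b = tf.constant(7.0)

# Any op over a constant (here tf.identity) yields a new, non-constant tensor.
ai, bi = tf.identity(a), tf.identity(b)
ci = ai + 2 * bi

# Gradients with respect to the wrapped tensors, not the raw constants.
grads = tf.gradients([ai, bi, ci], [ai, bi])
with tf.Session() as sess:
    result = sess.run(grads)
print(result)
```

Per the explanation above, this should print the documented sum, [2.0, 3.0], rather than the [2, 1] seen with bare constants.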
Upvotes: 1