Alex Lew

Reputation: 2124

How does tf.gradients behave when passed a list of `ys` tensors?

How exactly does tf.gradients behave when passed a list of tensors as its first argument? Take this very small example:

import tensorflow as tf

a = tf.constant(5)
b = tf.constant(7)
c = a + 2 * b

If I compute the gradients of a single tensor, c, with respect to [a,b], I get the expected answer:

grads = tf.gradients(c, [a, b])
with tf.Session() as sess:
    sess.run(grads) # returns (1, 2)

According to the TensorFlow documentation, if you pass a list of tensors as the first argument `ys`, tf.gradients sums the gradients over that list, returning sum_over_ys(dy/dx) for each x in the second argument `xs`. So I would expect:

tf.gradients([a, b, c], [a, b])

to behave the same way as:

tf.gradients(a + b + c, [a, b])

Am I reading the docs wrong? When I test this code, I get the expected result [2, 3] for the second expression (explicitly summing a + b + c): with respect to a the summed gradient should be da/da + db/da + dc/da = 1 + 0 + 1 = 2, and with respect to b it should be da/db + db/db + dc/db = 0 + 1 + 2 = 3. But the first expression gives [2, 1]. Where is this [2, 1] coming from?

Upvotes: 0

Views: 539

Answers (1)

P-Gn

Reputation: 24581

This is due to the fact that you are using tf.constant: a constant takes no inputs, so in theory it should not be affected by anything.

If you rerun your experiment with anything else (e.g. tf.Variable), it works as expected.
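A minimal sketch of that check (TF1-style graph mode; I'm reusing the question's values as floats, and the expected output is my reading of the documented summing behavior):

import tensorflow as tf

a = tf.Variable(5.0)
b = tf.Variable(7.0)
c = a + 2 * b

grads = tf.gradients([a, b, c], [a, b])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Summing dy/dx over ys = [a, b, c]:
    # wrt a: 1 + 0 + 1 = 2, wrt b: 0 + 1 + 2 = 3
    print(sess.run(grads))  # should print [2.0, 3.0]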

When you apply an operator to the constants (be it an addition, or even an identity), you obtain new tensors that are not constants, even though they depend only on constants, and therefore you get the expected behavior.
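For instance, a sketch of the identity case (same values as the question, floats assumed):

import tensorflow as tf

# Wrapping the constants in tf.identity yields new, non-constant tensors.
a = tf.identity(tf.constant(5.0))
b = tf.identity(tf.constant(7.0))
c = a + 2 * b

grads = tf.gradients([a, b, c], [a, b])

with tf.Session() as sess:
    print(sess.run(grads))  # should print [2.0, 3.0]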

Upvotes: 1
