Reputation: 6159
It seems that tf.gradients also allows computing Jacobians, i.e. the partial derivatives of each entry of one tensor w.r.t. each entry of another tensor, whereas tf.train.Optimizer.compute_gradients only computes actual gradients, e.g. the partial derivatives of a scalar value w.r.t. each entry of a particular tensor or w.r.t. one particular scalar. Why is there a separate function if tf.gradients already implements that functionality?
Upvotes: 4
Views: 2252
Reputation: 59731
tf.gradients does not allow you to compute Jacobians: for each input it aggregates the gradients over all outputs (something like the sum of each column of the actual Jacobian matrix). In fact, there is no "good" way of computing Jacobians in TensorFlow (basically you have to call tf.gradients once per output, see this issue).
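For illustration, here is a minimal sketch of that workaround (TF 1.x graph mode; the names x, y and jacobian_rows are just placeholders): one tf.gradients call per output entry yields the rows of the Jacobian, whereas a single call over all outputs only returns the column sums.

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[3])
y = tf.stack([x[0] * x[1], x[1] + x[2]])  # 2 outputs, 3 inputs

# One tf.gradients call per output entry: each call gives one row of the Jacobian.
jacobian_rows = [tf.gradients(y[i], x)[0] for i in range(2)]  # 2 = number of outputs
jacobian = tf.stack(jacobian_rows)  # shape [2, 3]

# A single call aggregates over the outputs: sum_i dy_i/dx_j, shape [3].
aggregated = tf.gradients(y, x)[0]

with tf.Session() as sess:
    jac, agg = sess.run([jacobian, aggregated], feed_dict={x: [1.0, 2.0, 3.0]})
    print(jac)  # [[2. 1. 0.], [0. 1. 1.]]
    print(agg)  # [2. 2. 1.]
```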
With respect to tf.train.Optimizer.compute_gradients, yes, its result is basically the same, but it takes care of some details automatically and returns a slightly more convenient output format. If you look at the implementation, you will see that, at its core, it is a call to tf.gradients (in this case aliased to gradients.gradients), but it is useful for optimizer implementations to have the surrounding logic already in place. Also, having it as a method allows for extensible behaviour in subclasses, either to implement some kind of optimization strategy (not very likely at the compute_gradients step, really) or for auxiliary purposes, like tracing or debugging.
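As a rough sketch of that relationship (TF 1.x; the names w, loss and opt are just for illustration): compute_gradients returns (gradient, variable) pairs ready to be fed to apply_gradients, while the underlying tf.gradients call returns the same gradient tensors as a plain list.

```python
import tensorflow as tf

w = tf.Variable([1.0, 2.0])
loss = tf.reduce_sum(w * w)  # gradient w.r.t. w is 2*w

opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Optimizer method: list of (gradient, variable) pairs, convenient for apply_gradients.
grads_and_vars = opt.compute_gradients(loss, var_list=[w])

# Core computation it wraps: a plain list of gradient tensors.
plain_grads = tf.gradients(loss, [w])

train_op = opt.apply_gradients(grads_and_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    g1, g2 = sess.run([grads_and_vars[0][0], plain_grads[0]])
    print(g1, g2)  # both [2. 4.]
```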
Upvotes: 5