Reputation: 378
I want to assemble a new Operation from a sub-graph (made of several connected operation nodes), and then apply a self-designed gradient to the new Operation. The point is to ignore the gradient flow inside the sub-graph and instead bridge gradients directly from the output tensor of the new op to its input tensor. Hope somebody can help!
Upvotes: 1
Views: 132
Reputation: 57893
You can wrap your subgraph into a TensorFlow function and specify a custom gradient for that function, as done in python/framework/function_test.py:
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import function
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn_ops

dtype = dtypes.float32  # the test parameterizes dtype; float32 works

@function.Defun(dtype, dtype, dtype)
def XentLossGrad(logits, labels, dloss):
  dlogits = array_ops.reshape(dloss, [-1, 1]) * (
      nn_ops.softmax(logits) - labels)
  dlabels = array_ops.zeros_like(labels)
  # Takes exp(dlogits) to differentiate it from the "correct" gradient.
  return math_ops.exp(dlogits), dlabels

# grad_func overrides the gradient of the whole forward subgraph.
@function.Defun(dtype, dtype, grad_func=XentLossGrad)
def XentLoss(logits, labels):
  return math_ops.reduce_sum(labels * math_ops.log(nn_ops.softmax(logits)),
                             1)
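To see the effect, you can call the wrapped function like a single op and ask tf.gradients for the gradient of its inputs; the custom grad_func is used instead of backpropagating through the internal softmax/log subgraph. The following is a minimal sketch (assuming TF 1.x graph mode; the placeholder shapes, feed values, and session setup are illustrative and not from the test file):

import numpy as np
import tensorflow as tf

logits = tf.placeholder(tf.float32, [None, 3])
labels = tf.placeholder(tf.float32, [None, 3])

loss = XentLoss(logits, labels)               # behaves like one fused op
grads = tf.gradients(loss, [logits, labels])  # calls XentLossGrad, not the
                                              # gradient of the inner subgraph

with tf.Session() as sess:
  g_logits, g_labels = sess.run(
      grads,
      feed_dict={logits: np.random.randn(2, 3).astype(np.float32),
                 labels: np.array([[1., 0., 0.], [0., 1., 0.]], np.float32)})
  # g_labels is all zeros, exactly as returned by XentLossGrad.

The key point is that grad_func receives the function's inputs plus the incoming gradient and returns one gradient per input, so whatever happens inside the forward Defun is bypassed and the gradient is bridged straight from the output tensor to the input tensors.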
Upvotes: 2