Reputation: 21
I keep thinking that I am about to understand custom gradients, but then I test out this example and I just cannot figure out what is going on. I am hoping somebody can walk me through exactly what is happening below. I think this essentially comes down to me not understanding specifically what "dy" is in the backward function.
v = tf.Variable(2.0)
with tf.GradientTape() as t:
    x = v*v
    output = x**2
print(t.gradient(output, v))
**tf.Tensor(32.0, shape=(), dtype=float32)**
Everything is good here and the gradient is as one would expect. I then test this example using a custom gradient which (given my understanding) could not possibly affect the gradient, since I have put such a massive threshold into clip_by_norm:
@tf.custom_gradient
def clip_gradients2(y):
    def backward(dy):
        return tf.clip_by_norm(dy, 20000000000000000000000000)
    return y**2, backward
v = tf.Variable(2.0)
with tf.GradientTape() as t:
    x = v*v
    output = clip_gradients2(x)
print(t.gradient(output, v))
tf.Tensor(4.0, shape=(), dtype=float32)
But it is reduced to 4, so this is somehow having an effect. How exactly is this resulting in a smaller gradient?
Upvotes: 2
Views: 134
Reputation: 11651
When writing a custom gradient, you must define the whole derivative calculation by yourself. Without your custom gradient, we have the following derivative:
d((v**2)**2)/dv = d(v**4)/dv = 4*v**3 = 32 when v = 2
When you override the gradient calculation, your backward just passes dy through (the clip threshold is far too large to change anything), so the derivative of y**2 never enters the chain and all that is left is
d(v**2)/dv = 2*v = 4 when v = 2
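You can see this directly by printing dy inside backward (a quick check, not part of the original code): the dy that arrives is just d(output)/d(output) = 1.0, the clip leaves it untouched, and nothing accounts for the derivative of y**2:

import tensorflow as tf

@tf.custom_gradient
def clip_gradients2(y):
    def backward(dy):
        tf.print("incoming dy:", dy)  # prints 1, i.e. d(output)/d(output)
        return tf.clip_by_norm(dy, 20000000000000000000000000)
    return y**2, backward

v = tf.Variable(2.0)
with tf.GradientTape() as t:
    x = v*v
    output = clip_gradients2(x)
# the tape multiplies the returned dy (still 1.0) by d(x)/dv = 2*v = 4
print(t.gradient(output, v))  # tf.Tensor(4.0, shape=(), dtype=float32)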
You need to calculate the derivative in your function, i.e.:
@tf.custom_gradient
def clip_gradients2(y):
    def backward(dy):
        # multiply the incoming gradient by the local derivative d(y**2)/dy = 2*y
        dy = dy * (2*y)
        return tf.clip_by_norm(dy, 20000000000000000000000000)
    return y**2, backward
to get the desired behavior.
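For reference (a quick check, using the same v = 2.0 setup as in the question), the corrected function restores the original gradient:

v = tf.Variable(2.0)
with tf.GradientTape() as t:
    x = v*v
    output = clip_gradients2(x)  # the corrected version above
# backward returns dy * 2*x = 1 * 8, then the tape applies d(x)/dv = 2*v = 4, giving 32
print(t.gradient(output, v))  # tf.Tensor(32.0, shape=(), dtype=float32)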
Upvotes: 2