Reputation: 73
I am fairly new to TensorFlow. I have seen some tutorials, but I don't know how tf.gradients() works. If I give it an input of two 2D matrices, how will it compute the partial derivatives? I am really confused; any help would be greatly appreciated.
import tensorflow as tf
import numpy as np
X = np.random.rand(3,3)
y = np.random.rand(2,2)
grad = tf.gradients(X,y)
with tf.Session() as sess:
    sess.run(grad)
    print(grad)
This gives an error:
Traceback (most recent call last):
  File "C:/Users/Sandeep IPK/PycharmProjects/tests/samples2.py", line 10, in <module>
    sess.run(grad)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 767, in run
    run_metadata_ptr)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 952, in _run
    fetch_handler = _FetchHandler(self._graph, fetches, feed_dict_string)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 408, in __init__
    self._fetch_mapper = _FetchMapper.for_fetch(fetches)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 230, in for_fetch
    return _ListFetchMapper(fetch)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 337, in __init__
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 337, in <listcomp>
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 227, in for_fetch
    (fetch, type(fetch)))
TypeError: Fetch argument None has invalid type <class 'NoneType'>
Process finished with exit code 1
Upvotes: 2
Views: 6530
Reputation: 848
TensorFlow uses reverse accumulation, which is based on the chain rule, to compute the gradient value at a point. In order to compute the gradient of a function with respect to a variable, you have to define both in the graph. You also have to specify the value at which you want to compute the gradient. In this example, you compute the gradient of y = x**2 + x - 1 with respect to x at x = 2:
#!/usr/bin/env python3
import tensorflow as tf
x = tf.Variable(2.0)
y = x**2 + x - 1
grad = tf.gradients(y, x)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    grad_value = sess.run(grad)
    print(grad_value)
# output: [5.0]
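The printed value matches what you get by hand: for y = x**2 + x - 1 the derivative is dy/dx = 2x + 1, so at x = 2 the gradient is 2*2 + 1 = 5.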
It is also possible to compute a gradient when your variable is a matrix; in that case the gradient will also be a matrix. Here is a simple case where the function depends on the sum of all matrix elements:
#!/usr/bin/env python3
import tensorflow as tf
X = tf.Variable(tf.random_normal([3, 3]))
X_sum = tf.reduce_sum(X)
y = X_sum**2 + X_sum - 1
grad = tf.gradients(y, X)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    grad_value = sess.run(grad)
    print(grad_value)
# output: [array([[ 9.6220665, 9.6220665, 9.6220665],
# [ 9.6220665, 9.6220665, 9.6220665],
# [ 9.6220665, 9.6220665, 9.6220665]], dtype=float32)]
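To connect this to the question about two 2D matrices: tf.gradients(ys, xs) only gives a useful result when ys is computed from xs inside the TensorFlow graph. Two unrelated NumPy arrays are not connected by any op, so the gradient comes back as None and the sess.run fetch fails with exactly the error shown in the question. Here is a minimal sketch where both tensors are 2D matrices; the 3x3 variable X, the constant W, and the matmul linking them are just illustrative choices:
#!/usr/bin/env python3
import tensorflow as tf

# X is a 3x3 variable, W a fixed 3x2 matrix; Y = X @ W is a 3x2 tensor
# that actually depends on X, so the gradient of Y with respect to X exists.
X = tf.Variable(tf.random_normal([3, 3]))
W = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Y = tf.matmul(X, W)

# tf.gradients implicitly sums Y over all its elements before
# differentiating, so the result has the same shape as X (3x3).
grad = tf.gradients(Y, X)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(grad))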
Upvotes: 9