Reputation: 73
I am fairly new to TensorFlow. I have seen some tutorials, but I don't know how tf.gradients() works. If I give it an input of two 2D matrices, how will it compute the partial derivatives? I am really confused; any help would be greatly appreciated.
import tensorflow as tf
import numpy as np
X = np.random.rand(3,3)
y = np.random.rand(2,2)
grad = tf.gradients(X,y)
with tf.Session() as sess:
    sess.run(grad)
    print(grad)
This gives an error:
Traceback (most recent call last):
  File "C:/Users/Sandeep IPK/PycharmProjects/tests/samples2.py", line 10, in <module>
    sess.run(grad)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 767, in run
    run_metadata_ptr)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 952, in _run
    fetch_handler = _FetchHandler(self._graph, fetches, feed_dict_string)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 408, in __init__
    self._fetch_mapper = _FetchMapper.for_fetch(fetches)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 230, in for_fetch
    return _ListFetchMapper(fetch)
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 337, in __init__
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 337, in <listcomp>
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "C:\Users\Sandeep IPK\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 227, in for_fetch
    (fetch, type(fetch)))
TypeError: Fetch argument None has invalid type <class 'NoneType'>
Process finished with exit code 1
Upvotes: 2
Views: 6530
Reputation: 848
TensorFlow uses reverse accumulation, which is based on the chain rule, to compute the gradient value at a point. In order to compute the gradient of a function with respect to a variable, you have to define both in the graph. You also have to specify the value at which you want to compute the gradient. In this example, you compute the gradient of y = x**2 + x - 1 with respect to x at x = 2:
#!/usr/bin/env python3
import tensorflow as tf
x = tf.Variable(2.0)
y = x**2 + x - 1
grad = tf.gradients(y, x)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    grad_value = sess.run(grad)
    print(grad_value)
# output: [5.0]
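The printed value matches what you get by hand: for y = x**2 + x - 1 the derivative is dy/dx = 2x + 1, so at x = 2 the gradient is 2*2 + 1 = 5.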
It is also possible to compute a gradient when your variable is a matrix; in that case the gradient will also be a matrix. Here is a simple case where the function depends on the sum of all matrix elements:
#!/usr/bin/env python3
import tensorflow as tf
X = tf.Variable(tf.random_normal([3, 3]))
X_sum = tf.reduce_sum(X)
y = X_sum**2 + X_sum - 1
grad = tf.gradients(y, X)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    grad_value = sess.run(grad)
    print(grad_value)
# output: [array([[ 9.6220665, 9.6220665, 9.6220665],
# [ 9.6220665, 9.6220665, 9.6220665],
# [ 9.6220665, 9.6220665, 9.6220665]], dtype=float32)]
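To connect this to the question about two 2D matrices: tf.gradients(ys, xs) only gives a useful result when ys is computed from xs inside the TensorFlow graph. Two unrelated NumPy arrays are not connected by any op, so the gradient comes back as None and the sess.run fetch fails with exactly the error shown in the question. Here is a minimal sketch where both tensors are 2D matrices; the 3x3 variable X, the constant W, and the matmul linking them are just illustrative choices:
#!/usr/bin/env python3
import tensorflow as tf

# X is a 3x3 variable, W a fixed 3x2 matrix; Y = X @ W is a 3x2 tensor
# that actually depends on X, so the gradient of Y with respect to X exists.
X = tf.Variable(tf.random_normal([3, 3]))
W = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Y = tf.matmul(X, W)

# tf.gradients implicitly sums Y over all its elements before
# differentiating, so the result has the same shape as X (3x3).
grad = tf.gradients(Y, X)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(grad))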
Upvotes: 9