Sharan Narasimhan

Reputation: 47

How to work with gradients obtained from Optimizer.compute_gradients?

Please excuse my vague explanation as I am very new to TensorFlow. Thanks a lot in advance for any help!

I want to compute the gradients w.r.t. the input variables using the compute_gradients() function of the optimizer class, and I seem to be able to run the op without errors. The gradients tuple I get back contains Tensor objects that I want to evaluate and convert to a list.

def get_gradients(checkpoint, x_test):


    model, predicted_y = load_and_predict(checkpoint, x_test)

    optimizer_here = model.gradients

    cost_here = model.cost

    gradients, variables = zip(*optimizer_here.compute_gradients(cost_here))

    opt = optimizer_here.apply_gradients(list(zip(gradients, variables)))

    with tf.Session() as sess:

        init = tf.global_variables_initializer()

        sess.run(init)
        test_state = sess.run(model.initial_state)


        feed = {model.inputs: x_test,
                model.labels: predicted_y[:, None],  # converting 1-D to 2-D array
                model.keep_prob: dropout,
                model.initial_state: test_state}

        sess.run(opt, feed_dict=feed)



        for i in range(len(gradients)):
            if i == 0:  # first object of the gradients tuple is always an IndexedSlices
                continue
            print(sess.run(gradients[i].eval()))

I think the opt operation is being evaluated successfully in my session and the variables are being updated, but when I try to evaluate a tensor from my gradients list, the following error is produced:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'inputs/inputs' with dtype int32 and shape [?,?]
     [[{{node inputs/inputs}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/sharan/attention-inter/ltsm_baseline.py", line 499, in <module>
    get_gradients(checkpoint, x_test[0:250])
  File "/Users/sharan/attention-inter/ltsm_baseline.py", line 114, in get_gradients
    print(sess.run(gradients[i].eval()))
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 695, in eval
    return _eval_using_default_session(self, feed_dict, self.graph, session)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 5181, in _eval_using_default_session
    return session.run(tensors, feed_dict)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'inputs/inputs' with dtype int32 and shape [?,?]
     [[node inputs/inputs (defined at /Users/sharan/attention-inter/ltsm_baseline.py:131) ]]

I figure the error is telling me that I need to feed inputs to the tensor in order to evaluate it, but shouldn't the gradients already have been evaluated after running opt?

Upvotes: 0

Views: 506

Answers (1)

y.selivonchyk

Reputation: 9900

There are a few things to lay out here. First, there is a strong distinction between an operation (op) and a value (an input, an output, the contents of a variable, etc.). An op is a meta construct describing how to calculate a value given some input. Optimizer.compute_gradients(), despite having "compute" in its name, does not compute anything; it constructs a set of gradient ops. These ops, of course, depend on the input. To reinforce this point, it is also worth mentioning that, apart from variable values, no data is stored between session.run() calls. That is why gradients[i].eval() fails: it starts a new run of the graph without a feed_dict, so the 'inputs/inputs' placeholder has no value.
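The op-versus-value distinction can be illustrated without TensorFlow at all. The following is a toy sketch in plain Python, not TensorFlow's API: the names Placeholder, Mul and run are hypothetical and exist only for this illustration. It shows that building a node computes nothing and that every run has to be fed again.

```python
class Placeholder:
    """A named input slot that has no value until one is fed at run time."""
    def __init__(self, name):
        self.name = name


class Mul:
    """An op node: it describes a multiplication but holds no result."""
    def __init__(self, a, b):
        self.a, self.b = a, b


def run(node, feed_dict=None):
    """Evaluate a node from scratch; nothing is cached between calls."""
    feed_dict = feed_dict or {}
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError(
                "You must feed a value for placeholder " + repr(node.name))
        return feed_dict[node]
    if isinstance(node, Mul):
        return run(node.a, feed_dict) * run(node.b, feed_dict)
    return node  # plain Python numbers act as constants


x = Placeholder("inputs")
grad_like = Mul(2.0, x)  # building the node computes nothing

# First run: the placeholder is fed, so the node can be evaluated.
first = run(grad_like, feed_dict={x: 3.0})

# Second run without a feed: the earlier value was not remembered.
try:
    run(grad_like)
    second_error = None
except ValueError as err:
    second_error = str(err)
```

The second call fails for the same reason as the traceback in the question: the node depends on a placeholder, and each run needs its own feed.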

Ideally, you would define your gradient modifications as TensorFlow operations between the compute_gradients() and apply_gradients() calls, as the middle line does here:

gradients, variables = zip(*optimizer_here.compute_gradients(cost_here))
gradients = [g + 1 for g in gradients]  # example modification
opt = optimizer_here.apply_gradients(list(zip(gradients, variables)))

Later, you can see what your gradients look like by fetching them along with the training op:

grads, _ = sess.run([gradients, opt], feed...)

If you want to modify the gradients on each batch in a more manual way (with NumPy or something similar), you will have to read the gradients in one session run (just like the one above, but without opt) and then create a set of operations that add your modified values to the variables in a separate session.run(assigns, feed_dict={modified_grads}). The caveat with this approach is that you cannot use the optimizer logic anymore, since it depends on TF operations.
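A minimal sketch of this manual route, assuming TensorFlow's tf.compat.v1 graph-mode API and NumPy are available. The toy model (a single weight vector w with a squared-error cost), the clipping step and the names modified_grad and assign are illustrative assumptions, not taken from the question's model.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

# Hypothetical toy model: one weight vector and a squared-error cost.
x = tf1.placeholder(tf.float32, shape=[None, 2], name="x")
y = tf1.placeholder(tf.float32, shape=[None], name="y")
w = tf1.Variable([1.0, -1.0], dtype=tf.float32, name="w")
pred = tf.reduce_sum(x * w, axis=1)
cost = tf.reduce_mean(tf.square(pred - y))

# compute_gradients() only builds gradient ops; nothing is evaluated yet.
optimizer = tf1.train.GradientDescentOptimizer(learning_rate=0.1)
grad_op = optimizer.compute_gradients(cost, var_list=[w])[0][0]

# Separate op that applies an externally supplied gradient to w.
modified_grad = tf1.placeholder(tf.float32, shape=[2], name="modified_grad")
assign = tf1.assign_sub(w, 0.1 * modified_grad)

feed = {x: np.array([[1.0, 2.0]], dtype=np.float32),
        y: np.array([0.0], dtype=np.float32)}

with tf1.Session() as sess:
    sess.run(tf1.global_variables_initializer())

    # Run 1: read the gradient values; the inputs must be fed here.
    grad_value = sess.run(grad_op, feed_dict=feed)

    # Modify the gradient outside the graph (clipping is just an example).
    clipped = np.clip(grad_value, -1.0, 1.0)

    # Run 2: feed the modified gradient to the assign op.
    sess.run(assign, feed_dict={modified_grad: clipped})
    new_w = sess.run(w)
```

Here the update rule is a hand-written gradient-descent step, which is exactly the trade-off described above: the optimizer's own update logic is bypassed.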

Upvotes: 1
