Reputation: 47
Please excuse my vague explanation as I am very new to TensorFlow. Thanks a lot in advance for any help!
I want to compute the gradients w.r.t. the input variables using the compute_gradients() function of the optimizer class, and I seem to be able to run the op without errors. The gradients tuple obtained from compute_gradients() is a tuple of Tensor objects that I want to evaluate and convert to a list.
def get_gradients(checkpoint, x_test):
    model, predicted_y = load_and_predict(checkpoint, x_test)
    optimizer_here = model.gradients
    cost_here = model.cost
    gradients, variables = zip(*optimizer_here.compute_gradients(cost_here))
    opt = optimizer_here.apply_gradients(list(zip(gradients, variables)))
    with tf.Session() as sess:
        init = tf.global_variables_initializer()
        sess.run(init)
        test_state = sess.run(model.initial_state)
        feed = {model.inputs: x_test,
                model.labels: predicted_y[:, None],  # converting 1d to 2d array
                model.keep_prob: dropout,
                model.initial_state: test_state}
        sess.run(opt, feed_dict=feed)
        for i in range(len(gradients)):
            if i == 0:  # first object of gradients tuple is always IndexedSlices
                continue
            print(sess.run(gradients[i].eval()))
I think the opt operation is being evaluated successfully in my session, and that the gradients and variables are being updated, but when I try to evaluate a tensor from my gradients list, the following error is produced:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'inputs/inputs' with dtype int32 and shape [?,?]
[[{{node inputs/inputs}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/sharan/attention-inter/ltsm_baseline.py", line 499, in <module>
get_gradients(checkpoint, x_test[0:250])
File "/Users/sharan/attention-inter/ltsm_baseline.py", line 114, in get_gradients
print(sess.run(gradients[i].eval()))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 695, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 5181, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'inputs/inputs' with dtype int32 and shape [?,?]
[[node inputs/inputs (defined at /Users/sharan/attention-inter/ltsm_baseline.py:131) ]]
I figure the error is telling me that I need to feed inputs in order to evaluate the tensor, but shouldn't the gradients already have been evaluated after running opt?
Upvotes: 0
Views: 506
Reputation: 9900
There are a few things to be laid out here. First, there is a strong distinction between an operation (op) and a concrete value (an input, output, variable value, etc.). An op is a symbolic construct describing how to calculate a value given some input. Optimizer.compute_gradients(), despite having "compute" in its name, does not compute anything; it only constructs a set of gradient ops. These ops, of course, depend on the input. To reinforce this point, it is also worth mentioning that no intermediate tensor values are stored between session.run() calls; only variable values persist. That is why gradients[i].eval() fails: it starts a fresh run without a feed_dict, so the placeholders the gradients depend on have no value.
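To make the op-versus-value distinction concrete, here is a minimal sketch on a toy model. It is not the model from the question: the names x, w and cost are made up for illustration, and it assumes a TensorFlow build that provides tensorflow.compat.v1 with graph mode. The gradient tensor can only be evaluated when the placeholder it depends on is fed.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # graph mode, as in the question

# Toy model (hypothetical, for illustration only): cost = sum((w * x)^2)
x = tf.placeholder(tf.float32, shape=[None], name="x")
w = tf.Variable(2.0, name="w")
cost = tf.reduce_sum(tf.square(w * x))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
# compute_gradients only builds ops; grad_w is a symbolic Tensor with no value yet
grad_w = optimizer.compute_gradients(cost, var_list=[w])[0][0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Without a feed_dict the placeholder has no value, so this run fails
    try:
        sess.run(grad_w)
        needs_feed = False
    except tf.errors.InvalidArgumentError:
        needs_feed = True

    # With the input fed, the gradient is 2 * w * sum(x^2) = 2 * 2 * 14 = 56
    grad_value = sess.run(grad_w, feed_dict={x: [1.0, 2.0, 3.0]})
    print(needs_feed, grad_value)
```

Applied to the code in the question, this suggests replacing sess.run(gradients[i].eval()) with sess.run(gradients[i], feed_dict=feed).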
Ideally, you would define your gradient modifications as TensorFlow operations between the first and last of these lines (the middle line, which adds 1 to every gradient, is just an example):
gradients, variables = zip(*optimizer_here.compute_gradients(cost_here))
gradients, variables = [g+1 for g in gradients], variables
opt = optimizer_here.apply_gradients(list(zip(gradients, variables)))
Later, you can see what your gradients look like by fetching them in the same run as the training op:
grads, _ = sess.run([gradients, opt], feed_dict=feed)
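As a rough, self-contained sketch of this pattern on a toy model (again not the model from the question; x, w, cost and train_op are illustrative names, and tensorflow.compat.v1 is assumed to be available), fetching the gradients together with the training op returns the values that were used for that update:

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# Toy model (hypothetical): cost = sum((w * x)^2)
x = tf.placeholder(tf.float32, shape=[None], name="x")
w = tf.Variable(2.0, name="w")
cost = tf.reduce_sum(tf.square(w * x))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
gradients, variables = zip(*optimizer.compute_gradients(cost, var_list=[w]))
train_op = optimizer.apply_gradients(list(zip(gradients, variables)))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # One run, one feed: the gradients are evaluated and applied together
    grads, _ = sess.run([gradients, train_op], feed_dict={x: [1.0, 2.0, 3.0]})
    w_after = sess.run(w)  # variables persist between runs, so no feed is needed

print(grads[0], w_after)  # gradient 56, w updated to 2 - 0.1 * 56 = -3.6
```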
If you want to modify the gradients on each batch in a more manual way (with NumPy or something similar), you will have to read the gradients in one session run (just like the one above, but without opt) and then create a set of operations that apply your modified values to the variables in a separate session.run(assigns, feed_dict={modified_grads}). The caveat with this approach is that you cannot use the optimizer logic anymore, since it depends on TF operations.
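A minimal sketch of that manual route, under the same assumptions as before (toy model, illustrative names, tensorflow.compat.v1 available). The clipping step and the hand-written SGD update are only stand-ins for whatever modification and update rule you actually need:

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# Toy model (hypothetical): cost = sum((w * x)^2)
x = tf.placeholder(tf.float32, shape=[None], name="x")
w = tf.Variable(2.0, name="w")
cost = tf.reduce_sum(tf.square(w * x))
grad_w = tf.gradients(cost, [w])[0]

# Placeholder through which the NumPy-modified gradient is fed back in
modified_grad = tf.placeholder(tf.float32, shape=[], name="modified_grad")
learning_rate = 0.1
# Hand-written SGD step; the optimizer's own update logic is bypassed here
assign_w = w.assign_sub(learning_rate * modified_grad)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Run 1: read the raw gradient (56 for this input)
    g = sess.run(grad_w, feed_dict={x: [1.0, 2.0, 3.0]})
    # Modify it outside the graph, here by clipping to [-10, 10]
    g_clipped = np.clip(g, -10.0, 10.0)
    # Run 2: apply the modified gradient to the variable
    sess.run(assign_w, feed_dict={modified_grad: g_clipped})
    w_after = sess.run(w)

print(g, g_clipped, w_after)  # w becomes 2 - 0.1 * 10 = 1.0
```

With a stateful optimizer such as Adam, this hand-written update would not reproduce the optimizer's behaviour, which is the caveat mentioned above.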
Upvotes: 1