Reputation: 48616
I understand roughly how a TensorFlow graph is evaluated when one of the Tensors it contains is evaluated: executing run or eval for that tensor triggers all of the cascading computations in the graph needed to compute its value. As a result, any tensors that "lead to it" in the graph will also have been computed, and any operations connecting them will have been run.
As a consequence, if I have a graph containing a tensor out_a whose computation involves (perhaps among many other things) operations that use int_b, which in turn (eventually) requires execution of the operation an_op, which itself (eventually) uses the input in_, then executing
a, b, o = sess.run([out_a, int_b, an_op], feed_dict={in_: x})
will evaluate out_a, int_b, and an_op just once: the computations of out_a and int_b both use the same execution of an_op, and the computation used to supply int_b is the same one used in computing out_a. (And if I later reference a, for example, I'm using the value of the already-evaluated tensor out_a, so no further execution occurs as a result.)
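To make the premise concrete, here is a minimal sketch of the kind of graph I have in mind (TF 1.x graph mode; the ops are hypothetical stand-ins, with a tf.assign_add counter attached as an observable side effect of an_op):
import tensorflow as tf

# Hypothetical stand-in graph: the counter is incremented every time `an_op`
# actually executes, so its value shows how many times the shared op ran.
in_ = tf.placeholder(tf.float32, shape=(), name="in_")
counter = tf.Variable(0, name="counter")
bump = tf.assign_add(counter, 1)
with tf.control_dependencies([bump]):
    an_op = tf.identity(in_)      # the shared upstream operation
int_b = an_op * 2.0               # uses an_op
out_a = int_b + 1.0               # uses int_b (and hence an_op)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    a, b, o = sess.run([out_a, int_b, an_op], feed_dict={in_: 3.0})
    print(sess.run(counter))      # 1: one combined run, one execution of an_op
After the combined call the counter reads 1, i.e. the shared op executed only once for all three fetches.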
But what happens if I don't combine my operations in this way:
o = sess.run(an_op, feed_dict={in_: x})
# ... and later, after I remember I need `int_b`:
b = sess.run(int_b, feed_dict={in_: x})
# ... and later yet, after I remember I need `out_a`:
a = sess.run(out_a, feed_dict={in_: x})
Is there any optimization that TensorFlow performs in this case to avoid computing an_op a second and third time, and int_b a second time, possibly triggering side effects of those computations?
Upvotes: 0
Views: 249
Reputation: 222999
Is there any optimization that TensorFlow performs in this case to avoid computing an_op a second and third time, and int_b a second time, possibly triggering side effects of those computations?
No; it is up to the developer to keep track of which computations will be needed and to put all of them in the fetch list of a single sess.run call (the way you described).
You can verify this by running the following code:
import tensorflow as tf
import numpy as np
from datetime import datetime

n = 3000
t = np.random.rand(n, n)
a = tf.Variable(t)
b = tf.matmul(a, a)   # expensive op fetched three times below

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # time three separate sess.run calls that fetch the same tensor
    startTime = datetime.now()
    _ = sess.run(b)
    print(datetime.now() - startTime)

    startTime = datetime.now()
    _ = sess.run(b)
    print(datetime.now() - startTime)

    startTime = datetime.now()
    _ = sess.run(b)
    print(datetime.now() - startTime)
which on my machine returns:
0:00:02.798704
0:00:02.940005
0:00:03.039798
If the result were cached, the second and third runs would return almost immediately.
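If you do need b together with other values that depend on it, put everything in one fetch list and reuse the returned numpy arrays afterwards. A sketch along the lines of the snippet above (the extra tf.reduce_sum consumer is just for illustration):
import tensorflow as tf
import numpy as np

n = 3000
a = tf.Variable(np.random.rand(n, n))
b = tf.matmul(a, a)
c = tf.reduce_sum(b)   # a second consumer of b

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # one run call: the matmul executes once and feeds both fetches
    b_val, c_val = sess.run([b, c])
    # later reuse of b_val / c_val is plain numpy; no further graph execution
    print(b_val.shape, c_val)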
Upvotes: 2