orome

Reputation: 48616

Does TensorFlow optimize to avoid unnecessary re-execution of graphs?

I understand roughly how a TensorFlow graph is evaluated when one of the Tensors it contains is evaluated: calling run or eval for that tensor triggers all of the cascading computations in the graph needed to compute its value, and as a result any tensors that "lead to it" in the graph will also have been computed, and any operations connecting them will have been run.

As a consequence, if I have a graph containing the tensor out_a, whose computation involves (perhaps among many other things) operations that use int_b, which in turn (eventually) requires execution of the operation an_op, which itself (eventually) uses in_, then executing

a, b, o = sess.run([out_a, int_b, an_op], feed_dict={in_: x})

will evaluate out_a, int_b and an_op just once: the computations of out_a and int_b both reuse the same execution of an_op, and the computation that supplies int_b is the same one used in computing out_a. (And if I later reference a, for example, I'm using the value of the already-evaluated tensor out_a, so no further execution occurs as a result.)
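
This can be checked with a small sketch (counter, inp and the toy ops below are stand-ins, not my real graph): a counter incremented as a side effect of an_op stays at 1 after the combined run, because an_op executes only once:

import tensorflow as tf

counter = tf.Variable(0)                      # incremented every time an_op runs
inp = tf.placeholder(tf.float32, shape=())

with tf.control_dependencies([tf.assign_add(counter, 1)]):
    an_op = tf.identity(inp * 2.0)            # stand-in for the shared operation

int_b = an_op + 1.0                           # depends on an_op
out_a = int_b * 3.0                           # depends on int_b (and hence an_op)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    a, b, o = sess.run([out_a, int_b, an_op], feed_dict={inp: 1.0})
    print(sess.run(counter))                  # 1: an_op executed once for all three fetches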

But what happens if I don't combine my operations in this way:

o = sess.run(an_op, feed_dict={in_: x})
# ... later, after I remember I need `int_b`:
b = sess.run(int_b, feed_dict={in_: x})
# ... later yet, after I remember I need `out_a`:
a = sess.run(out_a, feed_dict={in_: x})

Is there any optimization that TensorFlow performs in this case to avoid computing an_op a second and third time, and int_b a second time, possibly triggering side effects of those computations?

Upvotes: 0

Views: 249

Answers (1)

Salvador Dali

Reputation: 222999

Is there any optimization that TensorFlow performs in this case to avoid computing an_op a second and third time, and int_b a second time, possibly triggering side effects of those computations?

No. It is up to the developer to know which computations will be needed and to pass all of them in the fetch list of a single sess.run call (the way you described).

You can verify it by running the following code:

import tensorflow as tf
import numpy as np
from datetime import datetime

# A graph with a single expensive op: a 3000x3000 matrix product.
n = 3000
t = np.random.rand(n, n)
a = tf.Variable(t)
b = tf.matmul(a, a)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Time three identical sess.run calls. If TensorFlow cached the
    # result, the second and third calls would be nearly instantaneous.
    startTime = datetime.now()
    _ = sess.run(b)
    print(datetime.now() - startTime)

    startTime = datetime.now()
    _ = sess.run(b)
    print(datetime.now() - startTime)

    startTime = datetime.now()
    _ = sess.run(b)
    print(datetime.now() - startTime)

which on my machine returns:

0:00:02.798704
0:00:02.940005
0:00:03.039798

If the result were cached, the second and third runs would return almost instantly.
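
So, to avoid recomputation, fetch everything you need in a single sess.run call and keep the returned NumPy results around on the Python side. A minimal sketch, reusing a and b from the timing example above:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # One run call computes the product; the result is an ordinary NumPy
    # array, so reusing it afterwards never touches the graph again.
    result = sess.run(b)
    print(result.shape)    # (3000, 3000) -- computed once
    print(result.mean())   # reuses the cached array, no re-execution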

Upvotes: 2
