rd11

Reputation: 3084

Caching Computations in TensorFlow

Is there a canonical way to reuse computations from a previously-supplied placeholder in TensorFlow? My specific use case: feed a large fixed input X through an encoder to obtain representations, then train a computation defined over many different combinations of those representations, without recomputing the encoder for every batch.

Here is the goal in code, which is defective because the same computations are carried out again and again:

X_in = some_fixed_data                  # the same data on every iteration
combinations_in = large_set_of_combination_indices
for combination_batch_in in batches(combinations_in, batch_size=128):
    # X is re-fed on every run, so everything that depends only on X
    # is recomputed for each batch of combinations.
    session.run(train_op, feed_dict={X: X_in, combinations: combination_batch_in})

Thanks.

Upvotes: 4

Views: 2064

Answers (2)

Yaroslav Bulatov

Reputation: 57893

This is the kind of thing that should be solved automatically with CSE (common subexpression elimination). Not sure what the support in TensorFlow is like right now; it might be kind of spotty, but there's an optimizer_do_cse flag for Graph options which defaults to false, and you can set it to true using GraphConstructorOptions. Here's a C++ example of using GraphConstructorOptions (sorry, couldn't find a Python one).
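If your Python build exposes the graph optimizer options, a rough sketch of flipping the CSE switch through the session config (the do_common_subexpression_elimination field name comes from config.proto; this is an assumption, not the GraphConstructorOptions route above):

import tensorflow as tf

# Sketch only: enable CSE through the session's graph options.
# Assumes the OptimizerOptions proto fields are exposed in your build.
config = tf.ConfigProto(
    graph_options=tf.GraphOptions(
        optimizer_options=tf.OptimizerOptions(
            do_common_subexpression_elimination=True)))
session = tf.Session(config=config)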

If that doesn't work, you could do "manual CSE", i.e., figure out which part is being needlessly recomputed, factor it out into a separate Tensor, and reference that tensor in all the calculations, as sketched below.
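A minimal sketch of that manual factoring (the placeholder shape and the two losses are made up purely for illustration):

import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 256])
W = tf.Variable(tf.random_normal([256, 64]))

# Needlessly recomputed: building tf.matmul(X, W) in two places creates
# two identical ops, and both of them run.
# loss_a = tf.reduce_mean(tf.matmul(X, W))
# loss_b = tf.reduce_sum(tf.matmul(X, W))

# Manual CSE: build the shared subexpression once and reference the
# resulting tensor everywhere it is needed.
encoding = tf.matmul(X, W)
loss_a = tf.reduce_mean(encoding)
loss_b = tf.reduce_sum(encoding)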

Upvotes: 1

Ian Goodfellow

Reputation: 2604

The canonical way to share computed values across sess.run() calls is to use a Variable. In this case, you could set up your graph so that when the Placeholders are fed, they compute a new value of the representation that is saved into a Variable. A separate portion of the graph reads those Variables to compute the loss. This will not work if you need to compute gradients through the part of the graph that computes the representation. Computing those gradients will require recomputing every Op in the encoder.
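Here is a minimal sketch of that arrangement, reusing X_in, combinations_in, and batches() from the question; the sizes, the encoder, and the loss are made up purely for illustration:

import tensorflow as tf

n_rows, n_features, n_hidden = 10000, 256, 64  # made-up sizes

X = tf.placeholder(tf.float32, shape=[n_rows, n_features], name="X")
W_enc = tf.Variable(tf.random_normal([n_features, n_hidden]))  # encoder weights

# When X is fed, compute the representation and save it into a
# non-trainable Variable that later sess.run() calls can read.
representation = tf.Variable(tf.zeros([n_rows, n_hidden]), trainable=False)
cache_representation = tf.assign(representation, tf.matmul(X, W_enc))

# A separate part of the graph reads the cached Variable; only this
# cheap subgraph runs inside the training loop.
combinations = tf.placeholder(tf.int32, shape=[None, 2], name="combinations")
pairs = tf.gather(representation, combinations)             # [batch, 2, n_hidden]
W_out = tf.Variable(tf.random_normal([2 * n_hidden, 1]))    # trainable head
scores = tf.matmul(tf.reshape(pairs, [-1, 2 * n_hidden]), W_out)
loss = tf.reduce_mean(tf.square(scores))                    # stand-in loss
# Note: gradients reach W_out only; they do not flow back through the
# cached Variable into W_enc, which is exactly the caveat above.
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    session.run(cache_representation, feed_dict={X: X_in})  # encoder runs once
    for combination_batch_in in batches(combinations_in, batch_size=128):
        session.run(train_op, feed_dict={combinations: combination_batch_in})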

Upvotes: 6
