HY G

Reputation: 305

Does tensorflow create a new numpy array each time it calls compute_gradients()?

A typical training loop in TensorFlow may look as follows:

cg = opt.compute_gradients(loss)
grads = [None] * len(cg)
for i, gv in enumerate(cg):
    grads[i] = gv[0]
# ... do some processing on grads ...
apply_gradients = opt.apply_gradients(cg)
while (...):
    gradients = sess.run(grads)
    feed = dict()
    for i, grad_var in enumerate(cg):
        feed[grad_var[0]] = gradients[i]
    sess.run(apply_gradients, feed_dict=feed)

Each time sess.run(grads) is called, a new list of NumPy arrays (with newly allocated memory) is created for gradients. I want to reuse a fixed set of NumPy arrays across all training iterations; how can I do that?

Upvotes: 0

Views: 144

Answers (1)

mrry

Reputation: 126154

The tf.Optimizer.compute_gradients() method should not create any new NumPy arrays: instead it builds a graph of TensorFlow operations for computing the gradients of the loss with respect to some or all of the variables in your model. The return value is not a NumPy array; it is a list of pairs of gradient tf.Tensor objects and the corresponding tf.Variable to which that gradient should be applied.
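A minimal sketch of this return structure, using a hypothetical one-variable model (the names `v`, `loss`, and `opt` are illustrative, not from your code):

```python
import tensorflow.compat.v1 as tf  # TF1-style graph API
tf.disable_eager_execution()

# A tiny model: one variable and a quadratic loss.
v = tf.Variable(2.0, name="v")
loss = v * v
opt = tf.train.GradientDescentOptimizer(0.5)

# compute_gradients() only builds graph nodes; nothing is evaluated yet,
# and no NumPy arrays are allocated.
cg = opt.compute_gradients(loss)

# Each element is a (gradient Tensor, Variable) pair.
for grad, var in cg:
    assert isinstance(grad, tf.Tensor)
    assert isinstance(var, tf.Variable)
```

NumPy arrays only appear later, when you actually fetch those gradient tensors with `sess.run()`.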

Nevertheless, it is usually wasteful of memory to call opt.compute_gradients() inside a loop. It's hard to say whether this will work exactly without seeing more of your code, but you should be able to move the call to opt.compute_gradients() before the loop, since it does not seem to depend on anything computed inside the loop. This will avoid building a new segment of TensorFlow graph in each loop iteration, and should reduce the memory cost.
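The restructured loop might look like the following sketch. The model here (a single variable with a quadratic loss) is a stand-in for whatever produces your `loss`; the point is that `compute_gradients()` and `apply_gradients()` are called exactly once, before the loop:

```python
import tensorflow.compat.v1 as tf  # TF1-style graph API
tf.disable_eager_execution()

# Stand-in model: gradient of x**2 is 2*x.
x = tf.Variable(3.0)
loss = tf.square(x)
opt = tf.train.GradientDescentOptimizer(0.1)

# Build the gradient graph ONCE, outside the training loop.
cg = opt.compute_gradients(loss)          # list of (gradient, variable) pairs
grads = [g for g, _ in cg]
apply_gradients = opt.apply_gradients(cg)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        gradients = sess.run(grads)       # fetch gradient values as NumPy arrays
        # ... process gradients here ...
        feed = {g: val for (g, _), val in zip(cg, gradients)}
        sess.run(apply_gradients, feed_dict=feed)
    final = sess.run(x)
```

Feeding the gradient tensors back via `feed_dict` overrides their computed values, so the apply step uses your (possibly modified) NumPy arrays. The fetched arrays themselves are still freshly allocated by each `sess.run()` call, but the graph no longer grows per iteration.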

Upvotes: 2

Related Questions