Reputation: 305
A typical training loop in TensorFlow may look as follows:
cg = opt.compute_gradients(loss)
grads = [None] * len(cg)
for i, gv in enumerate(cg):
    grads[i] = gv[0]
# ... do some processing on grads ...
apply_gradients = opt.apply_gradients(cg)
while (...):
    gradients = sess.run(grads)
    feed = dict()
    for i, grad_var in enumerate(cg):
        feed[grad_var[0]] = gradients[i]
    sess.run(apply_gradients, feed_dict=feed)
Each call to sess.run(grads) generates a new NumPy array gradients (with newly allocated backing memory). I want to reuse a fixed NumPy array across all training iterations; how can I do that?
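For reference, the "fixed array" idea being asked about can be sketched with plain NumPy: preallocate buffers once and copy each iteration's fresh results into them in place with np.copyto. This is only an illustration of the copy-into-place pattern (the fresh arrays here stand in for what sess.run returns); it does not by itself stop sess.run from allocating new output arrays.

```python
import numpy as np

# Preallocate fixed buffers once, mirroring the shapes the graph produces.
fixed = [np.zeros(3), np.zeros((2, 2))]

for step in range(5):
    # Stand-in for sess.run(grads): a fresh list of arrays every call.
    fresh = [np.full(3, step, dtype=float),
             np.full((2, 2), step, dtype=float)]
    # Overwrite the fixed buffers in place instead of rebinding names.
    for dst, src in zip(fixed, fresh):
        np.copyto(dst, src)

# fixed still refers to the same arrays; only their contents changed.
```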
Upvotes: 0
Views: 144
Reputation: 126154
The tf.Optimizer.compute_gradients() method should not create any new NumPy arrays: instead it builds a graph of TensorFlow operations for computing the gradients of the loss with respect to some or all of the variables in your model. The return value is not a NumPy array; it is a list of pairs of gradient tf.Tensor objects and the corresponding tf.Variable to which each gradient should be applied.
Nevertheless, it is usually wasteful of memory to call opt.compute_gradients() inside a loop. It's hard to say whether this will work exactly without seeing more of your code, but you should be able to move the call to opt.compute_gradients() before the loop, since it does not seem to depend on anything computed inside the loop. This avoids building a new segment of TensorFlow graph in each loop iteration, and should reduce the memory cost.
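The restructuring being recommended is a "build once, run many times" pattern. A TensorFlow-free sketch of that structure, using a hypothetical stub in place of opt.compute_gradients() so the construction count is observable:

```python
# Stand-in for graph construction: counts how many times new ops are built.
graph_build_calls = 0

def compute_gradients_stub(loss_name):
    """Hypothetical stand-in for opt.compute_gradients(loss)."""
    global graph_build_calls
    graph_build_calls += 1
    # Return (gradient, variable) pairs, as the real method does.
    return [("grad_" + v, v) for v in ("w", "b")]

# Build the gradient ops ONCE, before the training loop ...
cg = compute_gradients_stub("loss")

for step in range(100):
    # ... and only *run* them inside the loop (stand-in for sess.run).
    _ = [g for g, v in cg]

# Construction happened exactly once, regardless of iteration count.
```

If compute_gradients were called inside the loop instead, each iteration would add a fresh segment of ops to the graph, which is where the growing memory cost comes from.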
Upvotes: 2