Reputation: 5753
self.solver = 'adam'
if self.solver == 'adam':
    optimizer = tf.train.AdamOptimizer(self.learning_rate_init)
if self.solver == 'sgd_nestrov':
    optimizer = tf.train.MomentumOptimizer(learning_rate=self.learning_rate_init, momentum=self.momentum,
                                           use_nesterov=True)
gradients, variables = zip(*optimizer.compute_gradients(self.loss))
clipped_gradients, self.global_norm = tf.clip_by_global_norm(gradients, self.max_grad_norm)
update_ops_ = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
optimizer_op = optimizer.apply_gradients(zip(clipped_gradients, variables))
control_ops = tf.group([self.ema_op] + update_ops_)
with tf.control_dependencies([optimizer_op]):
    self.optimizer = control_ops
I then run self.optimizer in a session.
The code above does not update the gradients. However, if I change the control-dependencies part of the code to the version below, it works perfectly fine, except that it misses the final exponential moving average update (self.ema_op), which is not desirable to me:
self.solver = 'adam'
if self.solver == 'adam':
    optimizer = tf.train.AdamOptimizer(self.learning_rate_init)
if self.solver == 'sgd_nestrov':
    optimizer = tf.train.MomentumOptimizer(learning_rate=self.learning_rate_init, momentum=self.momentum,
                                           use_nesterov=True)
gradients, variables = zip(*optimizer.compute_gradients(self.loss))
clipped_gradients, self.global_norm = tf.clip_by_global_norm(gradients, self.max_grad_norm)
update_ops_ = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
optimizer_op = optimizer.apply_gradients(zip(clipped_gradients, variables))
control_ops = tf.group([self.ema_op] + update_ops_)
# with tf.control_dependencies(optimizer_op):
#     self.optimizer = control_ops
with tf.control_dependencies([self.ema_op] + update_ops_):
    self.optimizer = optimizer.apply_gradients(zip(clipped_gradients, variables))
Please tell me what I am missing.
Upvotes: 0
Views: 54
Reputation: 32071
You need to create the TensorFlow operations inside the with
statement, not just assign a Python variable there. Writing self.optimizer = control_ops
has no effect because it does not create any new TensorFlow operations, so nothing picks up the dependency.
Without fully understanding your problem, I think you want something like this:
with tf.control_dependencies([optimizer_op]):
    control_ops = tf.group([self.ema_op] + update_ops_)
    self.optimizer = control_ops
The with statement opens a block; any new ops you create in TensorFlow within it will depend upon optimizer_op,
in this case. Note that tf.control_dependencies also expects a list of ops, hence [optimizer_op].
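To make that concrete, here is a minimal self-contained sketch (assuming TensorFlow 1.x; the variables v, inc, and pre_built are made up for illustration). Only the op built inside the control_dependencies block is forced to wait for inc; the pre-built op assigned inside the block is not:

import tensorflow as tf

v = tf.Variable(0.0)
inc = tf.assign_add(v, 1.0)        # stand-in for optimizer_op
pre_built = tf.assign(v, v * 2.0)  # created BEFORE the block, like control_ops above

with tf.control_dependencies([inc]):
    no_dep = pre_built                # plain Python assignment: no new op, no dependency
    with_dep = tf.assign(v, v * 2.0)  # new op created here: depends on inc

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(no_dep))    # 0.0 -- inc never ran
    sess.run(tf.global_variables_initializer())
    print(sess.run(with_dep))  # 2.0 -- inc ran first, then the doubling

That is exactly why your first snippet silently skipped the training step: control_ops already existed before the with block, so the dependency on optimizer_op was never attached.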
Upvotes: 1