Reputation: 5753
self.solver = 'adam'
if self.solver == 'adam':
    optimizer = tf.train.AdamOptimizer(self.learning_rate_init)
if self.solver == 'sgd_nestrov':
    optimizer = tf.train.MomentumOptimizer(learning_rate=self.learning_rate_init, momentum=self.momentum,
                                           use_nesterov=True)
gradients, variables = zip(*optimizer.compute_gradients(self.loss))
clipped_gradients, self.global_norm = tf.clip_by_global_norm(gradients, self.max_grad_norm)
update_ops_ = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
optimizer_op = optimizer.apply_gradients(zip(clipped_gradients, variables))
control_ops = tf.group([self.ema_op] + update_ops_)
with tf.control_dependencies([optimizer_op]):
    self.optimizer = control_ops
I then run self.optimizer in a session.
The code above does not update the gradients. However, if I change the control-dependencies part of the code to the version below, it works perfectly fine, except that it misses the final exponential moving average update (self.ema_op), which is not desirable to me:
self.solver = 'adam'
if self.solver == 'adam':
    optimizer = tf.train.AdamOptimizer(self.learning_rate_init)
if self.solver == 'sgd_nestrov':
    optimizer = tf.train.MomentumOptimizer(learning_rate=self.learning_rate_init, momentum=self.momentum,
                                           use_nesterov=True)
gradients, variables = zip(*optimizer.compute_gradients(self.loss))
clipped_gradients, self.global_norm = tf.clip_by_global_norm(gradients, self.max_grad_norm)
update_ops_ = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
optimizer_op = optimizer.apply_gradients(zip(clipped_gradients, variables))
control_ops = tf.group([self.ema_op] + update_ops_)
# with tf.control_dependencies(optimizer_op):
#     self.optimizer = control_ops
with tf.control_dependencies([self.ema_op] + update_ops_):
    self.optimizer = optimizer.apply_gradients(zip(clipped_gradients, variables))
Please tell me what I am missing.
Upvotes: 0
Views: 54
Reputation: 32071
You need to create the TensorFlow operations inside the with
statement, not just assign a Python variable there. Writing self.optimizer = control_ops
has no effect because it does not create any new TensorFlow operations, so nothing picks up the dependency.
Without fully understanding your problem, I think you want something like this:
with tf.control_dependencies([optimizer_op]):
    control_ops = tf.group([self.ema_op] + update_ops_)
    self.optimizer = control_ops
The with statement opens a block; any new ops you create in TensorFlow within it will depend upon optimizer_op,
in this case. Note that tf.control_dependencies also expects a list of ops, hence [optimizer_op].
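To make that concrete, here is a minimal self-contained sketch (assuming TensorFlow 1.x; the variables v, inc, and pre_built are made up for illustration). Only the op built inside the control_dependencies block is forced to wait for inc; the pre-built op assigned inside the block is not:

import tensorflow as tf

v = tf.Variable(0.0)
inc = tf.assign_add(v, 1.0)        # stand-in for optimizer_op
pre_built = tf.assign(v, v * 2.0)  # created BEFORE the block, like control_ops above

with tf.control_dependencies([inc]):
    no_dep = pre_built                # plain Python assignment: no new op, no dependency
    with_dep = tf.assign(v, v * 2.0)  # new op created here: depends on inc

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(no_dep))    # 0.0 -- inc never ran
    sess.run(tf.global_variables_initializer())
    print(sess.run(with_dep))  # 2.0 -- inc ran first, then the doubling

That is exactly why your first snippet silently skipped the training step: control_ops already existed before the with block, so the dependency on optimizer_op was never attached.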
Upvotes: 1