R zu

Reputation: 2064

Strange ordering of evaluations of variables and loss

I want to use tf.identity to copy the loss and variables either before or after the optimization step:

Here is the before case:

  1. Copy the current loss and variables (save variables and associated loss)
  2. Run one step of optimization (changes loss and values)
  3. Repeat

Here is the after case:

  1. Run one step of optimization (changes loss and values)
  2. Copy the current loss and variables (save variables and associated loss)
  3. Repeat

By "copy", I mean to create nodes in the computation graph to store the current values of loss and variables with tf.identity.


Somehow, this is what actually happens:

  1. Copy loss
  2. Run one step of optimization (changes loss and values)
  3. Copy variable (this value doesn't correspond to the loss saved in step 1)
  4. Repeat

How can I fix this?

I could evaluate the loss again right after step 3, but that wastes one evaluation of the loss in every cycle.
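In code, that workaround would look roughly like this (a sketch using the loss, x and train_op defined in the full example below; the second sess.run is the wasted evaluation):

sess.run(train_op)                     # one optimization step (changes x and the loss)
loss_val, x_val = sess.run([loss, x])  # re-evaluate so the saved loss matches the saved x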


Test:

  1. Copy loss and variables
  2. Run optimization step and copy loss and variables.

If copying the loss and variables always happens before the optimization step, then the copies made in step 1 and step 2 will be the same.

Otherwise, the copies made in step 1 and step 2 can differ.

import numpy as np
import tensorflow as tf

x = tf.get_variable('x', initializer=np.array([1], dtype=np.float64))
loss = x * x

optim = tf.train.AdamOptimizer(1)

## Control Dependencies ##
loss_ident = tf.identity(loss)  # <-- copy loss
x_ident = tf.identity(x)  # <-- copy variable
with tf.control_dependencies([loss_ident, x_ident]):
    train_op = optim.minimize(loss)

## Run ##
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    for i in range(1000):
        # step 1: read the current loss and the variable copy (no optimization)
        a_, x1_ = sess.run([loss, x_ident])
        # step 2: read the copies and run one optimization step in the same call
        b_, x2_ = sess.run([loss_ident, x_ident, train_op])[:-1]
        print("loss:", a_, b_)
        assert np.allclose(a_, b_)
        print("variables:", x1_, x2_)
        assert np.allclose(x1_, x2_)

Result:

           step 1    step 2
loss:      [1.]      [1.]
variables: [1.]      [1.58114875e-07]  # <-- not the same
AssertionError

Unfortunately, the copies of the variable in step 1 and step 2 are different. Therefore, copying the variable does not always happen before the optimization step.

Upvotes: 1

Views: 64

Answers (1)

thushv89

Reputation: 11333

I am not entirely clear why the control dependencies don't work with Tensors, but you can get it to work with Variables and tf.assign(). Here's my solution. From my understanding, all you need is for the copy to happen before train_op. From the quick few tests I did, this seems to work.

import numpy as np
import tensorflow as tf

tf.reset_default_graph()

x = tf.get_variable('x', initializer=np.array([1], dtype=np.float64))
x_ident = tf.get_variable('x_ident', initializer=np.array([1], dtype=np.float64))
loss = x * x
loss_ident = tf.get_variable('loss', initializer=np.array([1.0]), dtype=tf.float64)
optim = tf.train.AdamOptimizer(1)

## Control Dependencies ##
loss_ident = tf.assign(loss_ident, loss, name='loss_assign')  # <-- copy loss
x_ident = tf.assign(x_ident, x, name='x_assign')  # <-- copy variable
with tf.control_dependencies([x_ident, loss_ident]):  
    train_op = optim.minimize(loss)

## Run ##
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    for i in range(10):
        # step 1: run the assign ops to copy the current loss and variable
        a, x1 = sess.run([loss_ident, x_ident])
        # step 2: copy again and run one optimization step in the same call
        b, x2, _ = sess.run([loss_ident, x_ident, train_op])

        print('ab', a, b)
        print('x1x2', x1, x2)
        assert np.allclose(a, b)
        assert np.allclose(x1, x2)

Hopefully, this is what you're looking for.
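As a quick consistency check (a sketch that would go inside the training loop above): since loss = x * x and both assigns run before train_op within the same sess.run call, the saved pair should agree with each other.

b, x2, _ = sess.run([loss_ident, x_ident, train_op])
assert np.allclose(b, x2 * x2)  # the saved loss corresponds to the saved variable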

Upvotes: 1
