Reputation: 2064
I want to use tf.identity to copy the loss and variables either before or after the optimization step.
Here is the before case:
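Roughly something like this (a sketch of the same construction I use in the test below, where the identity copies are control dependencies of the training op):
loss_ident = tf.identity(loss)  # copy of the loss before the update
x_ident = tf.identity(x)        # copy of the variable before the update
with tf.control_dependencies([loss_ident, x_ident]):
    train_op = optim.minimize(loss)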
Here is the after case:
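And roughly like this (a sketch, assuming the copies are instead created under a control dependency on the training op):
train_op = optim.minimize(loss)
with tf.control_dependencies([train_op]):
    loss_ident = tf.identity(loss)  # copy of the loss after the update
    x_ident = tf.identity(x)        # copy of the variable after the update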
By "copy", I mean to create nodes in the computation graph to store the current values of loss and variables with tf.identity
.
Somehow, the copy ends up reflecting the value after the optimization step rather than before it (see the test below). How can I fix this?
I could evaluate the loss again right after the optimization step, but that wastes one evaluation of the loss in every cycle.
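For example, a sketch of that wasteful variant (the names loss_copy and x_copy are just placeholders):
sess.run(train_op)                       # optimization step
loss_copy, x_copy = sess.run([loss, x])  # extra forward pass just to read the values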
Test:
If copying of the loss and variables always happens before the optimization step, then the copies made in step 1 and step 2 should be the same.
Otherwise, the copies made in step 1 and step 2 can differ.
import numpy as np
import tensorflow as tf
x = tf.get_variable('x', initializer=np.array([1], dtype=np.float64))
loss = x * x
optim = tf.train.AdamOptimizer(1)
## Control Dependencies ##
loss_ident = tf.identity(loss) # <-- copy loss
x_ident = tf.identity(x) # <-- copy variable
with tf.control_dependencies([loss_ident, x_ident]):
    train_op = optim.minimize(loss)
## Run ##
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    for i in range(1000):
        # step 1
        a_, x1_ = sess.run([loss, x_ident])
        # step 2
        b_, x2_ = sess.run([loss_ident, x_ident, train_op])[:-1]
        print("loss:", a_, b_)
        assert np.allclose(a_, b_)
        print("variables:", x1_, x2_)
        assert np.allclose(x1_, x2_)
Result:
           step 1   step 2
loss:      [1.]     [1.]
variables: [1.]     [1.58114875e-07]   # <-- not the same
AssertionError
Unfortunately, the copies of the variable in step 1 and step 2 are different. Therefore, copying of the variable does not always happen before the optimization step.
Upvotes: 1
Views: 64
Reputation: 11333
I am not entirely sure why the control dependencies don't work with the Tensors, but you can get it to work with Variables and tf.assign(). Here's my solution. From my understanding, all you need is for the copy to happen before train_op. From the few quick tests I did, this seems to work.
import numpy as np
import tensorflow as tf

tf.reset_default_graph()

x = tf.get_variable('x', initializer=np.array([1], dtype=np.float64))
x_ident = tf.get_variable('x_ident', initializer=np.array([1], dtype=np.float64))
loss = x * x
loss_ident = tf.get_variable('loss', initializer=np.array([1.0]), dtype=tf.float64)
optim = tf.train.AdamOptimizer(1)

## Control Dependencies ##
loss_ident = tf.assign(loss_ident, loss, name='loss_assign')  # <-- copy loss
x_ident = tf.assign(x_ident, x, name='x_assign')              # <-- copy variable
with tf.control_dependencies([x_ident, loss_ident]):
    train_op = optim.minimize(loss)

## Run ##
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    for i in range(10):
        # step 1
        a, x1 = sess.run([loss_ident, x_ident])
        # step 2
        b, x2, _ = sess.run([loss_ident, x_ident, train_op])
        print('ab', a, b)
        print('x1x2', x1, x2)
        assert np.allclose(a, b)
        assert np.allclose(x1, x2)
Hopefully, this is what you're looking for.
Upvotes: 1