Reputation: 663
After recently upgrading my TensorFlow version, I am encountering this error, which I am not able to solve:
Traceback (most recent call last):
  File "cross_train.py", line 177, in <module>
    train_network(use_gpu=True)
  File "cross_train.py", line 46, in train_network
    with tf.control_dependencies([s_opt.apply_gradients(s_grads), s_increment_step]):
  ...
ValueError: Variable image-conv1-layer/weights/Adam/ already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "cross_train.py", line 34, in train_network
    with tf.control_dependencies([e_opt.apply_gradients(e_grads), e_increment_step]):
  File "cross_train.py", line 177, in <module>
    train_network(use_gpu=True)
My model architecture consists of three different convolutional neural network branches: M, E, and S. In training, I alternate steps: I propagate samples through M & E (dot-product distance of their embeddings) and update with Adam; then I propagate samples through M & S and update with Adam; and repeat. So M is shared and gets updated on every step, while the E and S branches alternate being updated.
As such, I created two instances of AdamOptimizer (e_opt and s_opt), but I get the error because the Adam slot variable M-conv1/weights/Adam/ already exists when I try to update the S branch. A condensed sketch of the pattern is below.
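To make this concrete, here is a condensed sketch of what I am doing (placeholder shared variable and loss tensors, not my actual cross_train.py code):

import tensorflow as tf

# One variable shared by both paths, two separate Adam instances (illustrative names).
with tf.variable_scope("M-conv1"):
    shared_w = tf.get_variable("weights", shape=[3, 3],
                               initializer=tf.truncated_normal_initializer(stddev=0.1))

e_loss = tf.reduce_sum(tf.square(shared_w))   # stands in for the M & E loss
s_loss = tf.reduce_sum(tf.abs(shared_w))      # stands in for the M & S loss

e_opt = tf.train.AdamOptimizer(1e-4)
s_opt = tf.train.AdamOptimizer(1e-4)

e_grads = e_opt.compute_gradients(e_loss)
s_grads = s_opt.compute_gradients(s_loss)

e_train = e_opt.apply_gradients(e_grads)   # creates the M-conv1/weights/Adam slot variables
s_train = s_opt.apply_gradients(s_grads)   # on my upgraded TensorFlow, this raises the ValueError above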
This was not happening to me before I updated my TensorFlow version. I know how to set reuse of variables generally in TensorFlow, for example:
with tf.variable_scope(name, values=[input_to_layer]) as scope:
    try:
        weights = tf.get_variable("weights", [height, width, input_to_layer.get_shape()[3], channels], initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        bias = tf.get_variable("bias", [channels], initializer=tf.constant_initializer(0.0, dtype=tf.float32))
    except ValueError:
        scope.reuse_variables()
        weights = tf.get_variable("weights", [height, width, input_to_layer.get_shape()[3], channels], initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        bias = tf.get_variable("bias", [channels], initializer=tf.constant_initializer(0.0, dtype=tf.float32))
But I'm not sure if I can do the same for Adam. Any ideas? Help would be much appreciated.
Upvotes: 4
Views: 2809
Reputation: 663
It turns out I didn't need to instantiate two different Adam optimizers at all. I just created a single instance, and there was no name conflict and no need to share variables. I use the same optimizer regardless of which network branches are being updated:
e_grads = opt.compute_gradients(e_loss)
with tf.control_dependencies([opt.apply_gradients(e_grads), e_increment_step]):
    e_train = tf.no_op(name='english_train')
and...
s_grads = opt.compute_gradients(s_loss)
with tf.control_dependencies([opt.apply_gradients(s_grads), s_increment_step]):
    s_train = tf.no_op(name='spanish_train')
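For completeness, here is a condensed, self-contained sketch of the single-optimizer setup (the shared variable, losses, and step counters below are placeholders, not my real graph):

import tensorflow as tf

# Placeholder shared variable and losses standing in for the real M/E/S branches.
with tf.variable_scope("M-conv1"):
    shared_w = tf.get_variable("weights", shape=[3, 3],
                               initializer=tf.truncated_normal_initializer(stddev=0.1))

e_loss = tf.reduce_sum(tf.square(shared_w))   # stands in for the M & E loss
s_loss = tf.reduce_sum(tf.abs(shared_w))      # stands in for the M & S loss

e_increment_step = tf.assign_add(tf.Variable(0, trainable=False), 1)
s_increment_step = tf.assign_add(tf.Variable(0, trainable=False), 1)

opt = tf.train.AdamOptimizer(1e-4)            # a single Adam instance for both paths

e_grads = opt.compute_gradients(e_loss)
with tf.control_dependencies([opt.apply_gradients(e_grads), e_increment_step]):
    e_train = tf.no_op(name='english_train')

s_grads = opt.compute_gradients(s_loss)
with tf.control_dependencies([opt.apply_gradients(s_grads), s_increment_step]):
    s_train = tf.no_op(name='spanish_train')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        sess.run(e_train)   # update via the M & E path
        sess.run(s_train)   # update via the M & S path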
Interestingly, with the older version of TensorFlow there was no issue with using two Adam instances, even though the M branch names conflicted...
Upvotes: 3