Reputation: 2312
I would like to know what happens in the background when we set reuse=True in TensorFlow.
Basically, when building models in TensorFlow for training and testing, I create the model first within a function and then call it inside a variable scope, as follows:
def model(inputs, return_top=True):
    # ... several conv layers here, producing `output`
    if return_top:
        output = tf.layers.dense(output, units=8, name='outputs')
    return output

with tf.variable_scope('model'):
    output_train = model(inputs_train)
    mse_train = cal_loss(output_train, labels_train)  # a function that calculates the loss
    train_step = optimize(mse_train)  # a function that applies the optimizer

with tf.variable_scope('model', reuse=True):
    output_validation = model(inputs_validation)
    mse_validation = cal_loss(output_validation, labels_validation)
When creating models in TensorFlow for training and testing, we usually create one model for training; let's assume we give it the name "model", i.e. we create the whole model under tf.variable_scope("model", reuse=False). We then reuse the model for testing, where we set reuse to True; thus we use tf.variable_scope("model", reuse=True). Now, if I look into TensorBoard, I find two copies of the whole model: one under the name "model" and the other under "model_1". I also found that "model_1" references "model", i.e. the weights of "model_1" are taken from "model" (that is my assumption; I would like to know if it is true). Finally, I found that the outputs of "model" go into the optimizer, which is not the case with "model_1", and I wonder why. In other words, if "model_1" references "model" and the optimizer modifies the weights of "model", should it also modify the weights of "model_1"?
Any help is much appreciated!!
Upvotes: 0
Views: 4738
Reputation: 5206
First, reuse and variable scopes in general are deprecated and will be removed in TF2. They can be very confusing, as you see here. We instead recommend you build your model with tf.keras layers, which you can reuse by simply reusing the layer objects.
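For instance, here is a minimal sketch of the object-based style (the layer size and placeholder shapes are made up for illustration):

import tensorflow as tf

# Create the layer object once; its weights live on the object.
dense = tf.keras.layers.Dense(units=8, name='outputs')

inputs_train = tf.placeholder(tf.float32, [None, 16])
inputs_validation = tf.placeholder(tf.float32, [None, 16])

# Calling the same object twice shares the same weights;
# no variable scopes or reuse flags are needed.
output_train = dense(inputs_train)
output_validation = dense(inputs_validation)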
tf.get_variable and tf.variable_scope together can be used to create and reuse the variables in your model. Inside a variable_scope, once you've called get_variable with a variable name, calling it again with the same name is ambiguous: TF cannot tell whether you mean to create a new variable or to reuse the existing one. With reuse=False, the default, TF raises an error. With reuse=True, TF gives you the same old variable back. However, if you call get_variable with a new variable name under reuse=True, TF also raises an error, since there is no variable to reuse. Finally, there is reuse=tf.AUTO_REUSE, which never raises an error: it returns the variable if it exists and creates it if not.
You can also pass reuse as a parameter to variable scopes, which means you'll pass it implicitly to all get_variable calls in that scope.
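To make those cases concrete, here is a minimal sketch (the scope and variable names are made up for illustration); the reuse flag is set on the scope, and every get_variable call inside inherits it:

import tensorflow as tf

with tf.variable_scope('scope'):             # reuse=False by default
    v = tf.get_variable('v', shape=[1])      # creates 'scope/v'
    # tf.get_variable('v', shape=[1])        # would raise: variable exists, reuse=False

with tf.variable_scope('scope', reuse=True):
    v2 = tf.get_variable('v', shape=[1])     # returns the existing 'scope/v'
    # tf.get_variable('w', shape=[1])        # would raise: reuse=True but 'scope/w' doesn't exist

with tf.variable_scope('scope', reuse=tf.AUTO_REUSE):
    v3 = tf.get_variable('v', shape=[1])     # reuses 'scope/v'
    w = tf.get_variable('w', shape=[1])      # creates 'scope/w'

assert v is v2 and v is v3                   # all three point to the same variable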
Upvotes: 4
Reputation: 425
First, you have a namespace conflict for the variable_scope. Since a variable_scope named 'model' already exists, the second variable_scope needs a unique name, so TensorFlow automatically uniquifies it as 'model_1'. If you repeat the definition again, it will create a 'model_2' variable_scope.
Second, reuse=True is not about the variable_scope name. It applies to the tf.Variables inside the variable_scope.
Suppose you want to use one tf.Variable across two layers. In this case, you will have two Python variables pointing to the same TF variable.
Without reuse=True, TF throws an error saying something like "Variable ... already exists". With reuse=True, it gives you a pass and hands back the existing variable, as in the sketch below.
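A minimal sketch of that situation (the scope, variable, and function names here are made up for illustration):

import tensorflow as tf

def layer(x):
    # Both calls ask for the variable named 'w' in the enclosing scope.
    w = tf.get_variable('w', shape=[4, 4])
    return tf.matmul(x, w)

x = tf.placeholder(tf.float32, [None, 4])

with tf.variable_scope('shared') as scope:
    y1 = layer(x)               # creates 'shared/w'
    scope.reuse_variables()     # from here on, reuse=True in this scope
    y2 = layer(x)               # reuses 'shared/w' -- no error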
Upvotes: 0