Reputation: 3536
I re-designed a GAN that I had previously built with Keras. No problem so far; however, I notice that my model doesn't train correctly depending on how I implement the scope reuse. Maybe someone could help me understand what is happening:
Working version:
with tf.variable_scope('g/h0') as scope:
    reuse_scope = scope if reuse else None
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None,
        reuse=reuse_scope,
        scope='g/h0'
    )
    h0 = tf.nn.relu(h0)
Not working version:
with tf.variable_scope('g/h0') as scope:
    if reuse:
        scope.reuse_variables()
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None
    )
    h0 = tf.nn.relu(h0)
Both versions produce a working network, but the second one leads to a network that never updates, and I don't understand why the first version behaves correctly.
In TensorBoard, the graph looks quite different depending on which version I pick. I suspect the gradients do not backpropagate correctly.
Is there any way to do this with the second version? I find it a lot more understandable.
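(For reference, the scope.reuse_variables() pattern itself does behave as expected when the variables are created directly with tf.get_variable. The sketch below is not from the original post; it is a minimal illustration of that mechanism, assuming TF 1.x and a hypothetical g_h0 helper.)

import tensorflow as tf

def g_h0(z, reuse=False):
    # Hypothetical block using raw variables instead of tf.contrib.layers,
    # only to show how scope.reuse_variables() affects tf.get_variable.
    with tf.variable_scope('g/h0') as scope:
        if reuse:
            scope.reuse_variables()
        w = tf.get_variable('weights', shape=(int(z.get_shape()[1]), 4*4*512))
        b = tf.get_variable('biases', shape=(4*4*512,))
        return tf.nn.relu(tf.matmul(z, w) + b)

z = tf.placeholder(tf.float32, (None, 100))
a = g_h0(z)              # creates g/h0/weights and g/h0/biases
b = g_h0(z, reuse=True)  # reuses the same two variables
print(len(tf.global_variables()))  # 2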
Upvotes: 3
Views: 2898
Reputation: 1
I defined reuse=True first, then:
with tf.compat.v1.variable_scope('layer_1'):
    ....
    ....
    ....
with tf.compat.v1.variable_scope('layer_2'):
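(A minimal sketch of how that pattern might be filled in; the layer sizes and variable names are hypothetical, and graph mode is assumed via tf.compat.v1.disable_eager_execution().)

import tensorflow as tf
tf.compat.v1.disable_eager_execution()

def two_layers(x, reuse):
    # With reuse=True, get_variable returns the variables that were created
    # under the same names on an earlier call with reuse=False.
    with tf.compat.v1.variable_scope('layer_1', reuse=reuse):
        w1 = tf.compat.v1.get_variable('w', shape=(4, 8))
        x = tf.matmul(x, w1)
    with tf.compat.v1.variable_scope('layer_2', reuse=reuse):
        w2 = tf.compat.v1.get_variable('w', shape=(8, 2))
        x = tf.matmul(x, w2)
    return x

x = tf.compat.v1.placeholder(tf.float32, (None, 4))
y1 = two_layers(x, reuse=False)  # creates layer_1/w and layer_2/w
y2 = two_layers(x, reuse=True)   # reuses them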
Upvotes: 0
Reputation: 12411
I think you should try defining your scope this way:
reuse = ...  # True or False
with tf.variable_scope('g/h0') as scope:
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None,
        reuse=reuse,
        scope='fully_connected',
    )
    h0 = tf.nn.relu(h0)
If you set reuse to False, your fully connected layer will be created "as usual". If you set it to True, no additional parameters will be created; the weights and biases will be reused from another scope (one with the same name, and where variables with the same names have already been created, of course).
The reuse parameter must be True or False (or None, naturally).
The scope parameter has nothing to do with reuse; it is just the internal scope name. For example, if you set scope = 'g/h0', the weight parameter inside the fully connected layer will be 'g/h0/g/h0/weights:0', but if you do not set it, it will be 'g/h0/fully_connected/weights:0'.
A similar concern is addressed in this answer. It is roughly the same context as in your question, except that a conv2d layer is used and the scope is not set explicitly.
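(To see the naming behavior described above, here is a small snippet, assuming TF 1.x, that prints the resulting variable names.)

import tensorflow as tf

z = tf.placeholder(tf.float32, (2, 1))

# Explicit scope='g/h0' inside variable_scope('g/h0') nests the two names.
with tf.variable_scope('g/h0'):
    tf.contrib.layers.fully_connected(z, 4*4*512, activation_fn=None, scope='g/h0')

print([v.name for v in tf.global_variables()])
# Expected: ['g/h0/g/h0/weights:0', 'g/h0/g/h0/biases:0']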
EDIT:
I do not know whether it is a bug or expected behavior, but to use reuse=True with tf.contrib.layers.fully_connected, you need to specify the scope...
The complete working example:
import tensorflow as tf

## A value for z that you did not specify in your question
z = tf.placeholder(tf.float32, (2, 1))

## First fully-connected layer and its result
with tf.variable_scope('g/h0'):
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None,
        reuse=None
    )
    h0 = tf.nn.relu(h0)

tf.global_variables()
# Returns [<tf.Variable 'g/h0/fully_connected/weights:0' shape=(1, 8192) dtype=float32_ref>, <tf.Variable 'g/h0/fully_connected/biases:0' shape=(8192,) dtype=float32_ref>]

## Second layer with reuse = True
with tf.variable_scope('g/h0'):
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None,
        reuse=True, scope='fully_connected'
    )
    h0 = tf.nn.relu(h0)

tf.global_variables()
# Returns [<tf.Variable 'g/h0/fully_connected/weights:0' shape=(1, 8192) dtype=float32_ref>, <tf.Variable 'g/h0/fully_connected/biases:0' shape=(8192,) dtype=float32_ref>]
# => the same parameters are used for both layers
Upvotes: 2