Reputation: 91
I am looking to add dropout to the tensorflow CIFAR10 tutorial example code, but am having some difficulty.
The Deep MNIST tensorflow tutorial includes an example of dropout; however, it uses an interactive graph, which is different from the approach used for the CIFAR10 tutorial. Also, the CIFAR10 tutorial does not make use of placeholders, nor does it use a feed_dict to pass values to the optimizer, which is how the MNIST model passes the dropout probability for training.
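For reference, the pattern from the Deep MNIST tutorial looks roughly like this (sketched from memory, so treat the exact variable names as illustrative rather than exact):

keep_prob = tf.placeholder(tf.float32)        # dropout probability enters the graph via a placeholder
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)  # dropout layer driven by that placeholder

# the actual value is only supplied at run time through feed_dict:
# 0.5 while training, 1.0 while evaluating
train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

That pattern does not map directly onto the CIFAR10 code, because its training loop never builds a feed_dict, which is why I am trying the approach below.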
What I am trying:
Within cifar10_train.train() I define the dropout probability placeholder, under the default graph; that is:
def train():
  """Train CIFAR-10 for a number of steps."""
  with tf.Graph().as_default():
    global_step = tf.Variable(0, trainable=False)
    keep_drop_prob = tf.placeholder(tf.float32)
Underneath, still within the train() module, when I build the compute graph by calling cifar10.inference() I also pass the keep_drop_prob placeholder, like so:
"""Build a Graph that computes the logits predictions from the
inference model."""
logits = cifar10.inference(images, keep_drop_prob)
Within the cifar10.inference() module, I now take the keep_drop_prob placeholder that was passed and use it to define my dropout layer, like so:
drop1 = tf.nn.dropout(norm1, keep_drop_prob)
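For context, this is roughly where that line sits inside inference() (the conv1/pool1/norm1 names come from the tutorial; the keep_drop_prob argument and drop1 are my additions, and the layer definitions themselves are elided):

def inference(images, keep_drop_prob):
  # conv1, pool1 and norm1 are built exactly as in the original tutorial code
  ...
  # my added dropout layer, driven by the passed-in placeholder
  drop1 = tf.nn.dropout(norm1, keep_drop_prob)
  # conv2 is then built on top of drop1 instead of norm1
  ...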
Now I define and pass a value for keep_drop_prob when calculating the loss, still within the train() module, like so:
"""Calculate loss."""
loss = cifar10.loss(logits, labels, keep_drop_prob = 0.5)
Then within the cifar10.loss() module, I use the passed keep_drop_prob value, when calculating my cross entropy, like so:
"""Calculate the average cross entropy loss across the batch."""
labels = tf.cast(labels, tf.int64)
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
logits, labels, keep_drop_prob, name='cross_entropy_per_example')
Now at this point, I am unsure if what I have done so far is correct, and what I need to do next.
Any help would be greatly appreciated!
Upvotes: 3
Views: 2680
Reputation: 91
I believe I have found a solution.
It seems I was on the right track, but went a bit overboard passing the keep_drop_prob placeholder around.
To add dropout I have done the following:
I added the keep_drop_prob placeholder within the cifar10_train.train() module as shown here:
def train():
  """Train CIFAR-10 for a number of steps."""
  with tf.Graph().as_default():
    global_step = tf.Variable(0, trainable=False)
    keep_drop_prob = tf.placeholder(tf.float32)
When building the graph in the cifar10_train.train() module, I pass it the keep_drop_prob argument, but ALSO define its value:
"""Build a Graph that computes the logits predictions from the
inference model."""
logits = cifar10.inference(images, keep_drop_prob=0.5)
Within the cifar10.inference() module, I now take the keep_drop_prob value that was passed in and use it to define my dropout layer, which I also pass to the activation summary for logging in tensorboard:
drop1 = tf.nn.dropout(norm1, keep_drop_prob)
_activation_summary(drop1)
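Putting the pieces together, my inference() now looks roughly like this (the keep_drop_prob argument, the drop layers and their summaries are my changes; the tutorial's own layers are elided, and exactly where the second dropout attaches within the conv2 block is just how I happened to do it):

def inference(images, keep_drop_prob):
  # conv1 / pool1 / norm1 built exactly as in the original tutorial
  ...
  drop1 = tf.nn.dropout(norm1, keep_drop_prob)
  _activation_summary(drop1)

  # conv2 / norm2 / pool2 built from drop1 instead of norm1
  ...
  drop2 = tf.nn.dropout(pool2, keep_drop_prob)
  _activation_summary(drop2)

  # local3, local4 and softmax_linear unchanged, apart from consuming
  # drop2 rather than pool2
  ...
  return softmax_linear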
When I look in the tensorboard graph I see my dropout op there. I can also interrogate the keep_prob variable in the dropout op and influence its value attribute by changing the value I pass when building the logits graph.
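Another way to interrogate the graph programmatically would be something like this (filtering on the op name is just a guess at how the dropout ops end up being named):

# inside the `with tf.Graph().as_default():` block, after building the logits:
# print every op whose name mentions dropout, to confirm the layers exist
for op in tf.get_default_graph().get_operations():
  if 'dropout' in op.name.lower():
    print(op.name)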
My next test will be to set keep_drop_prob to 1 and to 0 and ensure I get the expected results from my network.
I am not certain that this is the most efficient way of implementing dropout, but I am fairly certain it works.
Note, I only have one keep_drop_prob placeholder, which I pass to many layers of dropout (one after each convolution at the moment). I figure tensorflow uses a unique distribution for each dropout op, rather than needing a unique placeholder.
Edit: Don't forget to make the necessary changes to the eval module, but pass a value of 1 for dropout.
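For what it's worth, the corresponding call in cifar10_eval then becomes something like this, so that no activations are dropped at evaluation time:

# evaluation: a keep probability of 1.0 effectively disables dropout
logits = cifar10.inference(images, keep_drop_prob=1.0)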
Upvotes: 2