simongraham

Reputation: 133

Udacity Deep Learning Convolutional Neural Networks- TensorFlow

I have been working on Udacity's course on deep learning, which I must add is great! I am very happy with the assignments so far. But there are two lines of code that I am not quite understanding.

batch_size = 20
patch_size = 5
depth = 16
num_hidden = 64

graph = tf.Graph()

with graph.as_default():

  # Input data.
  tf_train_dataset = tf.placeholder(
    tf.float32, shape=(batch_size, image_size, image_size, num_channels))
  tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
  tf_valid_dataset = tf.constant(valid_dataset)
  tf_test_dataset = tf.constant(test_dataset)

  # Variables.
  layer1_weights = tf.Variable(tf.truncated_normal(
      [patch_size, patch_size, num_channels, depth], stddev=0.1))
  layer1_biases = tf.Variable(tf.zeros([depth]))
  layer2_weights = tf.Variable(tf.truncated_normal(
      [patch_size, patch_size, depth, depth], stddev=0.1))
  ***********************************************************
  layer2_biases = tf.Variable(tf.constant(1.0, shape=[depth]))
  ***********************************************************
  ***********************************************************
  layer3_weights = tf.Variable(tf.truncated_normal(
      [image_size // 4 * image_size // 4 * depth, num_hidden], stddev=0.1))
  ***********************************************************
  layer3_biases = tf.Variable(tf.constant(1.0, shape=[num_hidden]))
  layer4_weights = tf.Variable(tf.truncated_normal(
      [num_hidden, num_labels], stddev=0.1))
  layer4_biases = tf.Variable(tf.constant(1.0, shape=[num_labels]))

  # Model.
  def model(data):
    conv = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')
    hidden = tf.nn.relu(conv + layer1_biases)
    conv = tf.nn.conv2d(hidden, layer2_weights, [1, 2, 2, 1], padding='SAME')
    hidden = tf.nn.relu(conv + layer2_biases)
    shape = hidden.get_shape().as_list()
    reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])
    hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
    return tf.matmul(hidden, layer4_weights) + layer4_biases

  # Training computation.
  logits = model(tf_train_dataset)
  loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))

  # Optimizer.
  optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)

  # Predictions for the training, validation, and test data.
  train_prediction = tf.nn.softmax(logits)
  valid_prediction = tf.nn.softmax(model(tf_valid_dataset))
  test_prediction = tf.nn.softmax(model(tf_test_dataset))

I have put asterisks around the parts of the code that I do not quite understand. First, I am not quite sure why the first set of biases, between the input and the first convolutional layer, is initialised to zeros, while the biases in the second layer are all ones.

Next, I do not understand the following line of code:

layer3_weights = tf.Variable(tf.truncated_normal(
  [image_size // 4 * image_size // 4 * depth, num_hidden], stddev=0.1))

Specifically, I don't get why we have used image_size // 4 * image_size // 4 * depth, and I especially don't understand why we have used 4.

If you need more information then please let me know. This is taken from Udacity's deep learning course, where the notebooks can be cloned from GitHub.

Many thanks :)

Upvotes: 1

Views: 599

Answers (1)

Anuj Gupta

Reputation: 6562

To answer the image_size // 4 * image_size // 4 * depth part:

The code applies convolution twice, each time with SAME padding and a stride of 2.

In each convolution the output image is half the size of the input (since stride = 2), so the size of the output after the first two convolutional layers is:

(image_size / 4) * (image_size / 4) * depth
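
As a quick sanity check (my own sketch, not part of the course notebook), assuming the 28 x 28 notMNIST images used in the course, the spatial size with SAME padding works out as ceil(input / stride) per layer:

import math

# With padding='SAME', a conv's spatial output size is ceil(input_size / stride).
image_size = 28   # assumed: the 28 x 28 notMNIST images from the course
stride = 2

after_conv1 = math.ceil(image_size / stride)   # 28 -> 14
after_conv2 = math.ceil(after_conv1 / stride)  # 14 -> 7

print(after_conv1, after_conv2)   # 14 7
print(image_size // 4)            # 7, matching the image_size // 4 in layer3_weights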

The next layer is a fully connected layer, so its input has to be a flat vector rather than a 3-D volume. Hence, the weights of layer 3 are of dimensions:

[(image_size / 4) * (image_size / 4) * depth, num_hidden]
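
To make the shapes concrete, here is a rough NumPy stand-in (my own sketch using the hyperparameters from the question, not the course code) showing why layer3_weights needs that first dimension:

import numpy as np

# Assumed values matching the snippet above (image_size = 28 for notMNIST).
image_size, depth, num_hidden, batch_size = 28, 16, 64, 20

# After the two stride-2 'SAME' convolutions the feature map has shape
# (batch, image_size // 4, image_size // 4, depth).
hidden = np.zeros((batch_size, image_size // 4, image_size // 4, depth))

# Flatten each example into a row vector before the fully connected layer.
flat = hidden.reshape(batch_size, -1)
print(flat.shape)                                  # (20, 784)
print(image_size // 4 * image_size // 4 * depth)   # 784

# So layer3_weights must have shape (784, num_hidden) for the matmul to line up.
layer3_weights = np.zeros((image_size // 4 * image_size // 4 * depth, num_hidden))
print((flat @ layer3_weights).shape)               # (20, 64)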

Upvotes: 0
