user88383

Reputation: 197

Receptive Field Arithmetic on the TF CNN example

Below is the code TensorFlow provides in its CNN example. I will describe my current understanding of how the feature map sizes change from layer to layer, and I would greatly appreciate it if someone could point out where my misunderstanding is.

Overview: [28,28] -> 32 [24,24] -> 32 [12,12] -> 2048 [8,8]

Long version:

The subsequent code, however, does not end up with 2048 [8,8] feature maps (pool2 is flattened to 7 * 7 * 64). What is my error here? All guidance is appreciated.

  # Input Layer
  input_layer = tf.reshape(features["x"], [-1, 28, 28, 1])

  # Convolutional Layer #1
  conv1 = tf.layers.conv2d(
      inputs=input_layer,
      filters=32,
      kernel_size=[5, 5],
      padding="same",
      activation=tf.nn.relu)

  # Pooling Layer #1
  pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

  # Convolutional Layer #2 and Pooling Layer #2
  conv2 = tf.layers.conv2d(
      inputs=pool1,
      filters=64,
      kernel_size=[5, 5],
      padding="same",
      activation=tf.nn.relu)
  pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

  # Dense Layer
  pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
  dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
  dropout = tf.layers.dropout(
      inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

Upvotes: 0

Views: 488

Answers (1)

Stephen

Reputation: 824

The conv2d layers use padding="same", which zero-pads the input so that, at stride 1, the output has the same height and width as the input. That is why the 5x5 convolutions do not shrink [28,28] to [24,24]. To get the sizes you expect, you would use padding="valid", which applies no padding.
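
To make the arithmetic concrete, here is a minimal sketch in plain Python (no TensorFlow needed) that traces the shapes through the network in the question under both padding modes. The helper names `out_size` and `trace_shapes` are made up for illustration. It assumes TensorFlow's usual rules: "same" gives ceil(n / stride), "valid" gives ceil((n - kernel + 1) / stride), and the pooling layers use their default "valid" padding. It also shows that the second convolution produces 64 channels rather than 32 * 64 = 2048, since each of its filters spans all 32 input channels.

```python
import math


def out_size(n, kernel, stride=1, padding="valid"):
    """Output length along one spatial axis under TensorFlow-style padding."""
    if padding == "same":
        # Zero padding is added so only the stride reduces the size.
        return math.ceil(n / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((n - kernel + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")


def trace_shapes(conv_padding):
    """Return (height, width, channels) after each layer of the example CNN."""
    size, channels = 28, 1
    shapes = [(size, size, channels)]
    for filters in (32, 64):
        # 5x5 convolution, stride 1. The channel count becomes `filters`.
        size = out_size(size, kernel=5, stride=1, padding=conv_padding)
        channels = filters
        shapes.append((size, size, channels))
        # 2x2 max pooling, stride 2, default "valid" padding.
        size = out_size(size, kernel=2, stride=2, padding="valid")
        shapes.append((size, size, channels))
    return shapes


if __name__ == "__main__":
    for mode in ("same", "valid"):
        print(mode, trace_shapes(mode))
```

With "same" the last shape is (7, 7, 64), which matches the 7 * 7 * 64 reshape in the question's code. With "valid" the spatial sizes go 28, 24, 12, 8, 4, so the [24,24] and [12,12] steps from the question appear, but the last pooled map is 4x4 with 64 channels.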

Upvotes: 1
