Vajjhala

Reputation: 160

MNIST Tensorflow example

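import tensorflow as tf

# Convolution with stride 1 in every dimension; 'SAME' keeps height and width.
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# 2x2 max pooling with stride 2 over height and width.
def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')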

This is the code from the Deep MNIST for Experts tutorial on the TensorFlow website.

I have three questions:

1) The documentation says ksize is a list of ints of length >= 4 that gives the size of the max-pool window for each dimension of the input. Shouldn't it be just [2, 2], since the window is 2x2? Why is it [1, 2, 2, 1] instead of [2, 2]?

2) If we are taking a stride of size one, why do we need a vector of 4 values? Wouldn't one value suffice?

strides = [1]

3) If padding = 'SAME', why does the image size decrease by half (from 28x28 to 14x14 in the first convolutional process)?

Upvotes: 0

Views: 198

Answers (1)

Steven

Reputation: 5162

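  1. I'm not sure which documentation you're referring to in this question. The max-pool window is indeed 2x2. ksize has four entries because it gives the window size along each dimension of the 4-D input [batch, height, width, channels], so [1, 2, 2, 1] is a 2x2 window over height and width only.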

  2. The step size can differ per dimension. The 4-element vector is the most general case: one stride each for the batch, height, width, and channel dimensions, so you could in principle skip images in the batch, use different steps along height and width, and even skip channels. Strides other than 1 on the batch and channel dimensions are hardly ever used, but the option has been left in.

  3. If you have a stride of 2 along each spatial direction, the pooling window moves two pixels at a time, so it is evaluated at only every other position and the output is half the size in height and width. If you set the strides to [1, 1, 1, 1] with padding 'SAME', you would indeed get a result of the same size. The padding 'SAME' means the input is padded just enough that the output size is ceil(input size / stride), so the output matches the input size only when the stride is 1.
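To make the shape arithmetic concrete, here is a minimal sketch in plain Python (no TensorFlow needed) of how 'SAME' padding determines the output size along each dimension. The helper names and the example shape (a batch of 50 feature maps with 32 channels) are my own assumptions for illustration and are not part of the TensorFlow API. The 5x5 convolution window is also an assumption, since the window size is not shown in the question's snippet.

```python
import math

def same_output_size(input_size, stride):
    """Output length along one dimension with padding='SAME'."""
    return math.ceil(input_size / stride)

def same_total_padding(input_size, window, stride):
    """Total padding along one dimension needed to reach that output length."""
    out = same_output_size(input_size, stride)
    return max((out - 1) * stride + window - input_size, 0)

def same_output_shape(input_shape, strides):
    """Apply the rule to each dimension of a [batch, height, width, channels] shape."""
    return [same_output_size(n, s) for n, s in zip(input_shape, strides)]

# Hypothetical input shape: [batch, height, width, channels].
x_shape = [50, 28, 28, 32]

print(same_output_shape(x_shape, [1, 1, 1, 1]))  # conv2d strides  -> [50, 28, 28, 32]
print(same_output_shape(x_shape, [1, 2, 2, 1]))  # max_pool strides -> [50, 14, 14, 32]
print(same_total_padding(28, 5, 1))  # assumed 5x5 conv window, stride 1 -> 4
print(same_total_padding(28, 2, 2))  # 2x2 pool window, stride 2 -> 0
```

So the halving from 28x28 to 14x14 comes from the stride of 2 in max_pool_2x2, not from the padding. With stride 1 the same rule keeps the size at 28x28.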

Upvotes: 1
