Reputation: 25924
I am confused about how max-pooling is defined in TensorFlow. The documentation is vague and does not explain the parameters well.
In the pooling documentation it only says:
ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.
strides: A list of ints that has length >= 4. The stride of the sliding window for each dimension of the input tensor.
and
Each pooling op uses rectangular windows of size ksize separated by offset strides. For example, if strides is all ones every window is used, if strides is all twos every other window is used in each dimension, etc.
What is the TensorFlow equivalent of the following Caffe max-pooling layer?
layer {
  name: "pool"
  type: "Pooling"
  bottom: "relu"
  top: "pool"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
I'm not sure whether they mean overlapping pooling for all-ones strides [1,1,1,1] and non-overlapping pooling for [2,2,2,2] when they say:
if strides is all ones every window is used, if strides is all twos every other window is used in each dimension, etc.
Upvotes: 1
Views: 706
Reputation: 3673
To do max-pooling in TensorFlow, use:
tf.nn.max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)
where ksize defines the window used for max-pooling. Note that you must specify the window size for each dimension of your input. This is the biggest difference from Caffe, which does all the dimension calculations for you. For the default 'NHWC' layout (batch, height, width, channels), a 2x2 spatial window is ksize=[1, 2, 2, 1]. Note that the size of your input dimensions may vary with the number of outputs that come from the previous convolutional layer.
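As a rough sketch of that per-dimension bookkeeping, the helper below expands Caffe's scalar kernel_size and stride into the 4-element lists that tf.nn.max_pool expects. The name caffe_pool_to_tf_args is hypothetical (it is not part of TensorFlow or Caffe), and it assumes square windows that pool only over height and width:

```python
def caffe_pool_to_tf_args(kernel_size, stride, data_format="NHWC"):
    """Hypothetical helper: expand Caffe's scalar pooling parameters
    into the per-dimension lists used by tf.nn.max_pool."""
    if data_format == "NHWC":
        # Layout is (batch, height, width, channels): pool over H and W only.
        ksize = [1, kernel_size, kernel_size, 1]
        strides = [1, stride, stride, 1]
    elif data_format == "NCHW":
        # Layout is (batch, channels, height, width).
        ksize = [1, 1, kernel_size, kernel_size]
        strides = [1, 1, stride, stride]
    else:
        raise ValueError("unsupported data_format: %r" % (data_format,))
    return ksize, strides


# Parameters taken from the Caffe layer in the question.
ksize, strides = caffe_pool_to_tf_args(kernel_size=2, stride=2)
print(ksize, strides)

# The lists would then be passed on, for example (assuming `relu` is the
# tensor produced by the previous layer; padding='VALID' is an assumption):
# pool = tf.nn.max_pool(relu, ksize=ksize, strides=strides, padding='VALID')
```

Keeping the batch and channel entries at 1 means every example and every channel is pooled independently, which is what Caffe's pooling layer does implicitly. Caffe and TensorFlow may round the output size differently when the input size is not a multiple of the stride, so the padding choice is worth checking for odd-sized inputs.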
The stride has the same effect as in Caffe ("skipping" inputs). However, you must again specify the stride for each dimension of the input.
Both ksize and strides must be lists of at least 4 elements, one per dimension of the input.
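To make the overlapping versus non-overlapping question concrete, here is a minimal pure-Python reference of max-pooling over an NHWC input. It is only an illustration of the window and stride semantics, not TensorFlow's implementation: max_pool_nhwc is a hypothetical name, it assumes the batch and channel entries of ksize and strides are 1, and it drops incomplete windows at the border (the behaviour of 'VALID' padding).

```python
def max_pool_nhwc(x, ksize, strides):
    """Reference max-pooling over nested lists indexed as
    x[batch][row][col][channel]. Assumes ksize and strides are
    [1, k_h, k_w, 1] and [1, s_h, s_w, 1]; incomplete windows are dropped."""
    _, k_h, k_w, _ = ksize
    _, s_h, s_w, _ = strides
    output = []
    for image in x:
        height, width, channels = len(image), len(image[0]), len(image[0][0])
        out_h = (height - k_h) // s_h + 1
        out_w = (width - k_w) // s_w + 1
        pooled = [
            [
                [
                    # Maximum over the k_h x k_w window starting at (i*s_h, j*s_w).
                    max(image[i * s_h + di][j * s_w + dj][c]
                        for di in range(k_h) for dj in range(k_w))
                    for c in range(channels)
                ]
                for j in range(out_w)
            ]
            for i in range(out_h)
        ]
        output.append(pooled)
    return output


# One 4x4 single-channel image holding the values 0..15 row by row.
image = [[[[row * 4 + col] for col in range(4)] for row in range(4)]]

# Spatial stride 2 with a 2x2 window: windows do not overlap (4x4 -> 2x2).
non_overlapping = max_pool_nhwc(image, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1])
print(non_overlapping)  # [[[[5], [7]], [[13], [15]]]]

# Spatial stride 1 with a 2x2 window: neighbouring windows overlap (4x4 -> 3x3).
overlapping = max_pool_nhwc(image, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1])
print(overlapping)
```

So with a 2x2 window, a spatial stride of 1 gives overlapping pooling and a spatial stride of 2 gives non-overlapping pooling. The stride entries for the batch and channel dimensions are normally left at 1, so the Caffe layer in the question corresponds to strides=[1, 2, 2, 1] rather than [2, 2, 2, 2].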
See the documentation here:
https://www.tensorflow.org/api_docs/python/nn/pooling
Upvotes: 1