Reputation: 892
I found this information in TensorFlow's documentation:
tf.layers.max_pooling1d?
Max Pooling layer for 1D inputs.
Arguments:
inputs: The tensor over which to pool. Must have rank 3.
And:
tf.layers.max_pooling2d?
Max pooling layer for 2D inputs (e.g. images).
Arguments:
inputs: The tensor over which to pool. Must have rank 4.
What confuses me: why must the inputs have rank 3 and rank 4, respectively?
Upvotes: 0
Views: 175
Reputation: 531
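What might cause your confusion is that, besides the dimensions being pooled over, the input has one dimension for the batch and one for the channels. That gives rank 1 + 2 = 3 for 1D pooling and rank 2 + 2 = 4 for 2D pooling.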
For 2D inputs (let's say images), the 4 dimensions correspond to the following:
N refers to the number of images in a batch.
H refers to the number of pixels in the vertical (height) dimension.
W refers to the number of pixels in the horizontal (width) dimension.
C refers to the channels. For example, 1 for black and white or grayscale and 3 for RGB.
Depending on whether you want to have channels_first or channels_last, the dimensions are ordered NCHW or NHWC, respectively.
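To make the two orderings concrete, here is a small sketch in plain Python that only manipulates shape tuples. The helpers describe_shape and to_channels_first are hypothetical names made up for this illustration; they are not part of TensorFlow.

```python
def describe_shape(shape, data_format="channels_last"):
    """Label each axis of a rank-4 image batch shape (illustrative helper)."""
    if len(shape) != 4:
        raise ValueError("expected rank 4, got rank %d" % len(shape))
    if data_format == "channels_last":
        names = "NHWC"
    elif data_format == "channels_first":
        names = "NCHW"
    else:
        raise ValueError("unknown data_format: %r" % (data_format,))
    # Pair each axis name with its size, in the order the layout prescribes.
    return dict(zip(names, shape))


def to_channels_first(shape_nhwc):
    """Reorder an NHWC shape tuple into NCHW order."""
    n, h, w, c = shape_nhwc
    return (n, c, h, w)


batch = (32, 28, 28, 3)  # 32 RGB images of 28 x 28 pixels, channels_last
print(describe_shape(batch))
print(describe_shape(to_channels_first(batch), "channels_first"))
```

Both calls describe the same batch; only the position of the channel axis differs.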
For 1D inputs there is only one of H or W (I prefer thinking about it as W, but that's up to you), so you have NCW (channels_first) or NWC (channels_last).
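The rank requirement can be sketched as "number of pooled axes plus two". The following plain-Python helper, pooled_shape, is a hypothetical illustration and not a TensorFlow function; it assumes 'valid' padding, where each pooled axis shrinks to floor((size - pool) / stride) + 1.

```python
def pooled_shape(input_shape, pool_size, strides, data_format="channels_last"):
    """Output shape of max pooling with 'valid' padding (illustrative only)."""
    n_spatial = len(pool_size)     # 1 for 1D pooling, 2 for 2D pooling
    expected_rank = n_spatial + 2  # batch axis + pooled axes + channel axis
    if len(input_shape) != expected_rank:
        raise ValueError(
            "expected rank %d, got rank %d" % (expected_rank, len(input_shape))
        )
    if data_format == "channels_last":       # NWC or NHWC
        batch, spatial, channels = input_shape[0], input_shape[1:-1], input_shape[-1]
    elif data_format == "channels_first":    # NCW or NCHW
        batch, channels, spatial = input_shape[0], input_shape[1], input_shape[2:]
    else:
        raise ValueError("unknown data_format: %r" % (data_format,))
    # Only the pooled axes change size; batch and channels pass through.
    pooled = tuple(
        (size - p) // s + 1 for size, p, s in zip(spatial, pool_size, strides)
    )
    if data_format == "channels_last":
        return (batch,) + pooled + (channels,)
    return (batch, channels) + pooled


print(pooled_shape((8, 100, 16), (2,), (2,)))        # 1D pooling on an NWC input
print(pooled_shape((8, 28, 28, 3), (2, 2), (2, 2)))  # 2D pooling on an NHWC input
```

Passing a rank-3 shape together with a 2D pool size raises a ValueError, which mirrors the "Must have rank 4" note in the documentation.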
For more information about how the ordering (channels_first or channels_last) can affect the computation speed, you might want to take a look at the TensorFlow Performance Guide, where I got the above information from.
Upvotes: 2