hyang51

Reputation: 75

What's the difference between tf.nn.conv2d with strides = 2 and tf.nn.max_pool with 2x2 pooling?

As mentioned above, both tf.nn.conv2d with strides = 2 and tf.nn.max_pool with 2x2 pooling can halve the height and width of the input. I know the outputs may differ, but what I don't know is whether the choice affects the final training result. Any clues? Thanks.

Upvotes: 0

Views: 990

Answers (2)

j314erre

Reputation: 2827

In both of your examples, assume we have a [height, width] kernel applied with strides [2,2]. That means we apply the kernel to a 2-D window of size [height, width] on the 2-D inputs to get one output value, and then slide the window by 2 positions, horizontally or vertically, to get the next output value.

In both cases you end up with 4x fewer outputs than inputs (2x fewer in each spatial dimension), assuming padding='SAME'.
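As a rough check of the 2x reduction per dimension, here is a minimal sketch of the output-size arithmetic, following the padding rules described in the TensorFlow docs. `output_size` is a hypothetical helper written for this answer, not a TensorFlow function.

```python
import math

def output_size(input_size, window, stride, padding):
    """Output length along one spatial dimension (hypothetical helper)."""
    if padding == "SAME":
        # SAME pads the input so that only the stride determines the size.
        return math.ceil(input_size / stride)
    if padding == "VALID":
        # VALID uses no padding, so the window must fit entirely inside.
        return math.ceil((input_size - window + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# The same arithmetic applies to a strided conv2d and to max_pool.
for size in (28, 7):
    same = output_size(size, window=2, stride=2, padding="SAME")
    valid = output_size(size, window=2, stride=2, padding="VALID")
    print(size, same, valid)
```

For even input sizes both padding modes give exactly half; for odd sizes SAME rounds up while VALID drops the last partial window.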

The difference is how the output values are computed for each window:

conv2d

  • the output is a weighted sum of the input values in the window, with one weight for each cell in the [height, width] kernel (and for each input channel)
  • these weights become trainable parameters in your model

max_pool

  • the output is just selecting the maximum input value within the [height, width] window of input values
  • there are no weights and no trainable parameters introduced by this operation
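To make the contrast concrete, here is a small pure-Python sketch (no TensorFlow) that slides a 2x2 window with stride 2 over a single-channel 4x4 input. `apply_windows`, the sample `image`, and the `kernel` weights are made up for illustration; in a real model the kernel values would be learned during training.

```python
def apply_windows(image, reduce_fn, size=2, stride=2):
    """Slide a size x size window with the given stride and reduce each window."""
    out = []
    for r in range(0, len(image) - size + 1, stride):
        row = []
        for c in range(0, len(image[0]) - size + 1, stride):
            # Flatten the current window in row-major order.
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(reduce_fn(window))
        out.append(row)
    return out

image = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 6],
    [2, 0, 7, 8],
]

# Hypothetical kernel weights (row-major); training would adjust these.
kernel = [0.25, 0.25, 0.25, 0.25]

def weighted_sum(window):
    """What a convolution computes per window: a linear combination."""
    return sum(w * x for w, x in zip(kernel, window))

pooled = apply_windows(image, max)              # no parameters involved
convolved = apply_windows(image, weighted_sum)  # depends on the kernel weights

print(pooled)     # [[4, 1], [2, 8]]
print(convolved)  # [[2.5, 0.5], [0.75, 6.5]]
```

Both results are 2x2, but only the second one changes when the kernel changes, which is what gives the optimizer something to train.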

Upvotes: 2

Alex Mitrakow

Reputation: 11

The final training results can indeed differ. The convolution multiplies each window of the tensor by a filter, which you might not want: it costs extra computation, and because it adds weights to the model it can also make overfitting more likely.
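For a sense of scale, here is a back-of-the-envelope sketch of the extra weights and arithmetic. The helper names and the layer sizes (3x3 kernel, 64 channels, 14x14 output) are assumptions chosen for illustration, and the counts ignore any bias term, which tf.nn.conv2d does not add by itself.

```python
def conv2d_weight_count(kernel_h, kernel_w, in_channels, out_channels):
    """Trainable weights in a conv2d filter of shape [h, w, in, out]."""
    return kernel_h * kernel_w * in_channels * out_channels

def conv2d_multiply_adds(out_h, out_w, kernel_h, kernel_w, in_channels, out_channels):
    """One multiply-add per weight for every output position."""
    return out_h * out_w * conv2d_weight_count(kernel_h, kernel_w, in_channels, out_channels)

def max_pool_comparisons(out_h, out_w, channels, window_h, window_w):
    """Finding the max of n values takes n - 1 comparisons, per channel."""
    return out_h * out_w * channels * (window_h * window_w - 1)

# Assumed example: downsample a 28x28x64 feature map to 14x14x64.
conv_weights = conv2d_weight_count(3, 3, 64, 64)
conv_ops = conv2d_multiply_adds(14, 14, 3, 3, 64, 64)
pool_ops = max_pool_comparisons(14, 14, 64, 2, 2)

print(conv_weights, conv_ops, pool_ops)
```

Under these assumptions the strided convolution adds tens of thousands of weights, while the pooling layer adds none. Whether that extra capacity helps or hurts depends on the model and the data.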

Upvotes: 1
