Chris

Reputation: 31206

Tensorflow: How to Pool over Depth?

I defined the following parameters to max-pool over the depth (the RGB channels) of an image, to compress it before the dense layer and readout, but it fails with an error saying that I cannot pool over depth and the other dimensions at the same time:

sunset_poolmax_1x1x3_div_2x2x3_params = \
    {'pool_function':tf.nn.max_pool,
     'ksize':[1,1,1,3],
     'strides':[1,1,1,3],
     'padding': 'SAME'}

I changed the strides to [1,1,1,3] so that depth is the only dimension reduced by the pool, but it still doesn't work. To keep the colors, I have to compress everything down to a tiny image, and I can't get good results with it.

Actual Error:

ValueError: Current implementation does not support pooling in the batch and depth dimensions.

Upvotes: 9

Views: 8811

Answers (5)

MiniQuark

Reputation: 48436

You can use a custom Keras layer:

class DepthPool(tf.keras.layers.Layer):
    def __init__(self, pool_size=2, **kwargs):
        super().__init__(**kwargs)
        self.pool_size = pool_size
    
    def call(self, inputs):
        old_shape = tf.shape(inputs)
        num_channels = old_shape[-1]
        num_channel_groups = num_channels // self.pool_size
        new_shape = tf.concat(
            [old_shape[:-1], [num_channel_groups, self.pool_size]], axis=0)
        reshaped_inputs = tf.reshape(inputs, new_shape)
        return tf.reduce_max(reshaped_inputs, axis=-1)

Notes:

  • there's no strides argument: it is assumed to be equal to the pool size
  • TensorFlow's tf.nn.max_pool() operation supports depthwise pooling (see my other answer), but it only works on the CPU, so this custom layer is generally better
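
To see what the reshape-then-reduce trick in this layer computes, here is a minimal NumPy sketch on a tiny hand-made tensor. The input values and the names x, groups and pooled are made up for illustration, and NumPy stands in for TensorFlow only to keep the example small; it is not the layer itself.

```python
import numpy as np

# Hypothetical input: batch of 1, 1x1 spatial size, 6 channels (NHWC layout).
x = np.array([1., 5., 2., 4., 3., 0.]).reshape(1, 1, 1, 6)
pool_size = 2  # channels per group; assumed to divide the channel count

# Split the channel axis into (groups, pool_size), then take the max per group.
groups = x.shape[-1] // pool_size
pooled = x.reshape(x.shape[:-1] + (groups, pool_size)).max(axis=-1)

print(pooled.shape)    # (1, 1, 1, 3)
print(pooled.ravel())  # [5. 4. 3.]
```

Each output channel is the maximum of one group of adjacent input channels, which is why the stride is implicitly equal to the pool size.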

Upvotes: 0

devansh

Reputation: 89

This is an excerpt from the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow:

Keras does not include a depthwise max pooling layer, but TensorFlow’s low-level Deep Learning API does: just use the tf.nn.max_pool() function, and specify the kernel size and strides as 4-tuples (i.e., tuples of size 4). The first three values of each should be 1: this indicates that the kernel size and stride along the batch, height, and width dimensions should be 1. The last value should be whatever kernel size and stride you want along the depth dimension, for example 3 (this must be a divisor of the input depth; it will not work if the previous layer outputs 20 feature maps, since 20 is not a multiple of 3):

output = tf.nn.max_pool(images,
                        ksize=(1, 1, 1, 3),
                        strides=(1, 1, 1, 3),
                        padding="VALID")

If you want to include this as a layer in your Keras models, wrap it in a Lambda layer (or create a custom Keras layer):

depth_pool = keras.layers.Lambda(
    lambda X: tf.nn.max_pool(X, ksize=(1, 1, 1, 3), strides=(1, 1, 1, 3),
                             padding="VALID"))

Upvotes: 0

MiniQuark

Reputation: 48436

TensorFlow now supports depth-wise max pooling with tf.nn.max_pool(). For example, here is how to implement it using pooling kernel size 3, stride 3 and VALID padding:

import tensorflow as tf

output = tf.nn.max_pool(images,
                        ksize=(1, 1, 1, 3),
                        strides=(1, 1, 1, 3),
                        padding="VALID")

You can use this in a Keras model by wrapping it in a Lambda layer:

from tensorflow import keras

depth_pool = keras.layers.Lambda(
    lambda X: tf.nn.max_pool(X,
                             ksize=(1, 1, 1, 3),
                             strides=(1, 1, 1, 3),
                             padding="VALID"))

model = keras.models.Sequential([
    ..., # other layers
    depth_pool,
    ... # other layers
])

Alternatively, you can write a custom Keras layer:

class DepthMaxPool(keras.layers.Layer):
    def __init__(self, pool_size, strides=None, padding="VALID", **kwargs):
        super().__init__(**kwargs)
        if strides is None:
            strides = pool_size
        self.pool_size = pool_size
        self.strides = strides
        self.padding = padding
    def call(self, inputs):
        return tf.nn.max_pool(inputs,
                              ksize=(1, 1, 1, self.pool_size),
                              strides=(1, 1, 1, self.strides),
                              padding=self.padding)

You can then use it like any other layer:

model = keras.models.Sequential([
    ..., # other layers
    DepthMaxPool(3),
    ... # other layers
])

Upvotes: 2

xen

Reputation: 88

Here is a brief example for the original question, using TensorFlow. I tested it on a stock RGB image of size 225 x 225 with 3 channels.

Import the standard libraries and enable eager execution to view results quickly.

import tensorflow as tf
from scipy.misc import imread
import matplotlib.pyplot as plt
import numpy as np
tf.enable_eager_execution()

Read image and cast from uint8 to tf.float32

x = tf.cast(imread('tiger.jpeg'), tf.float32)
x = tf.reshape(x, shape=[-1, x.shape[0], x.shape[1], x.shape[2]])
print(x.shape)
input_channels = x.shape[3]

Create the filter for depthwise convolution

filters = tf.contrib.eager.Variable(tf.random_normal(shape=[3, 3, input_channels, 4]))
print(filters.shape)

Perform depthwise convolution with channel multiplier 4. Note that the padding is set to 'SAME'; it can be changed at will.

x = tf.nn.depthwise_conv2d(input=x, filter=filters, strides=[1, 1, 1, 1], padding='SAME', name='conv_1')
print(x.shape)

Perform the max pooling with max_pooling2d. Since the output size of the pooling layer is (input_size - pool_size + 2 * padding)/stride + 1 and the padding is 'valid' (that is, no padding), we should get an output size of (225 - 2 + 0)/1 + 1 = 224.

x = tf.layers.max_pooling2d(inputs=x, pool_size=2, strides=1,padding='valid', name='maxpool1')
print(x.shape)
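
As a quick sanity check of the output-size formula quoted above, here is a minimal pure-Python sketch. The helper name pool_output_size is hypothetical (it is not a TensorFlow function), and it assumes integer division, which is what 'valid' padding implies.

```python
def pool_output_size(input_size, pool_size, stride, padding=0):
    """Output length along one spatial axis: (n - k + 2p) // s + 1."""
    return (input_size - pool_size + 2 * padding) // stride + 1

# 225-pixel side, 2x2 pool, stride 1, no padding ('valid').
print(pool_output_size(225, 2, 1))  # 224
```

The same helper can be reused to predict the shape for other pool sizes and strides before running the graph.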

Plot the figures to confirm.

fig, ax = plt.subplots(nrows=4, ncols=3)
q = 0
for ii in range(4):
    for jj in range(3):
        ax[ii, jj].imshow(np.squeeze(x[:,:,:,q]))
        ax[ii,jj].set_axis_off()
        q += 1
plt.tight_layout()
plt.show()

Upvotes: 1

Benoit Steiner

Reputation: 1469

tf.nn.max_pool does not support pooling over the depth dimension, which is why you get an error.

You can use a max reduction instead to achieve what you're looking for:

tf.reduce_max(input_tensor, reduction_indices=[3], keep_dims=True)

The keep_dims parameter above ensures that the rank of the tensor is preserved. This ensures that the behavior of the max reduction will be consistent with what the tf.nn.max_pool operation would do if it supported pooling over the depth dimension.
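
In TensorFlow 2.x the same reduction is spelled with the axis and keepdims arguments. The following is a minimal sketch, assuming a TensorFlow 2.x install and an NHWC tensor; the variable names and the random input are made up for illustration.

```python
import tensorflow as tf

# Hypothetical batch: 2 images, 4x4 pixels, 3 channels (NHWC layout).
images = tf.random.uniform(shape=(2, 4, 4, 3))

# Max over the channel axis; keepdims=True keeps the tensor rank at 4.
pooled = tf.reduce_max(images, axis=3, keepdims=True)

print(pooled.shape)  # (2, 4, 4, 1)
```

This collapses all channels into one, so it corresponds to a depth pool whose size equals the full channel count.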

Upvotes: 11
