Dmytro Prylipko

Reputation: 5064

How to mimic Caffe's max pooling behavior in Keras/Tensorflow?

Suppose I have a MaxPooling2D layer with pool_size=(2,2), strides=(2,2) in Keras. Applied to a 3x3 input feature map, it produces a 1x1 spatial output. The same operation in Caffe (pool: MAX; kernel_size: 2; stride: 2) produces a 2x2 output.

It is well known that Caffe and TensorFlow/Keras behave differently when applying max pooling: Caffe rounds the output size up, while Keras with padding='valid' rounds it down.
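To make the size mismatch concrete, here is a small sketch of the output-size arithmetic along one axis. The helper names `caffe_pool_out` and `keras_valid_pool_out` are made up for illustration; the formulas reflect my understanding of Caffe's pooling layer and of Keras pooling with padding='valid'.

```python
import math


def caffe_pool_out(size, kernel, stride, pad=0):
    """Output length of Caffe pooling along one axis (the division is rounded up)."""
    out = math.ceil((size + 2 * pad - kernel) / stride) + 1
    # With padding, Caffe drops a last window that would start inside the padding.
    if pad > 0 and (out - 1) * stride >= size + pad:
        out -= 1
    return out


def keras_valid_pool_out(size, kernel, stride):
    """Output length of Keras pooling with padding='valid' (the division is rounded down)."""
    return (size - kernel) // stride + 1


# Compare both conventions for a 2x2 kernel with stride 2.
for size in (3, 4, 5):
    print(size, caffe_pool_out(size, 2, 2), keras_valid_pool_out(size, 2, 2))
# 3 2 1
# 4 2 2
# 5 3 2
```

The two frameworks agree for even sizes and differ by one for odd sizes, which is exactly the 3x3 case above.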

There is a workaround for 2D convolution: to avoid the asymmetric padding of Conv2D in TensorFlow, one can prepend an explicit zero padding and change the padding type from same to valid.
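For context, the asymmetry comes from how TensorFlow distributes 'same' padding: when the total padding is odd, the extra row/column goes to the bottom/right. A minimal sketch of that arithmetic, following the formula in the TensorFlow padding notes (the helper name `tf_same_padding` is hypothetical):

```python
import math


def tf_same_padding(size, kernel, stride):
    """Return (pad_before, pad_after) applied by TensorFlow 'same' padding along one axis."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    before = total // 2
    # Any odd remainder is added after the data (bottom/right).
    return before, total - before


print(tf_same_padding(4, 3, 2))  # (0, 1): padding on the bottom/right only
print(tf_same_padding(5, 3, 2))  # (1, 1): symmetric
```

Caffe's pad parameter is always symmetric, so padding explicitly and then using valid reproduces its behavior for convolution.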

Is there a similar workaround to change the MaxPooling2D behavior in Keras so that it behaves like Caffe? More precisely, I am looking for a wrapper around MaxPooling2D that is equivalent to 2x2 max pooling in Caffe.

Maybe pad the MaxPooling2D input with one pixel on the left and top?

I am using tf.keras from TensorFlow.

Upvotes: 0

Views: 624

Answers (1)

Dmytro Prylipko

Reputation: 5064

OK, I found the answer; let me save it here. One has to pad the input at the bottom/right with zeros. Here is a minimal working example:

import os
import numpy as np

import tensorflow as tf
# Use the public tf.keras API rather than the private tensorflow.python.keras modules.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, MaxPool2D
from tensorflow.keras import backend as K

import caffe
from caffe.model_libs import P
from caffe import layers as L
from caffe.proto import caffe_pb2


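def MaxPooling2DWrapper(pool_size=(2, 2), strides=None, padding='valid', data_format=None, **kwargs):
    """Return a callable that mimics Caffe's 2x2, stride-2 max pooling (NHWC input)."""

    def padded_pooling(inputs):
        _, h, w, _ = K.int_shape(inputs)
        interm_input = inputs
        # Caffe rounds the output size up, Keras 'valid' rounds it down, so an odd
        # height or width needs one extra row/column at the bottom/right.
        if h % 2 != 0 or w % 2 != 0:
            # The lambda must pad its own argument x, not the captured `inputs` tensor.
            # Zero padding is only guaranteed to match Caffe for non-negative inputs (e.g. after ReLU).
            interm_input = tf.keras.layers.Lambda(lambda x: tf.pad(x, [[0, 0], [0, 1], [0, 1], [0, 0]]),
                                                  name='input_pad')(inputs)
        return MaxPool2D(pool_size, strides, padding, data_format, **kwargs)(interm_input)

    return padded_pooling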


def build_caffe_model(h, w):
    caffe_spec = caffe.NetSpec()

    pool_config = {
        'pool': P.Pooling.MAX,
        'kernel_size': 2,
        'stride': 2
    }

    caffe_spec['input'] = L.Input(shape=caffe_pb2.BlobShape(dim=(1, 1, h, w)))
    caffe_spec['max_pool'] = L.Pooling(caffe_spec['input'], **pool_config)

    proto = str(caffe_spec.to_proto())
    with open('deploy.prototxt', 'w') as f:
        f.write(proto)
    net = caffe.Net('deploy.prototxt', caffe.TEST)

    return net


def build_keras_model(h, w):
    inputs = Input(shape=(h, w, 1))

    maxpool = MaxPooling2DWrapper()(inputs)
    return Model(inputs, maxpool)


def main():
    caffe.set_mode_cpu()
    os.environ['GLOG_minloglevel'] = '2'
    h = 3
    w = 3
    size_input = h * w

    caffe_net = build_caffe_model(h, w)
    keras_model = build_keras_model(h, w)
    keras_model.summary()

    keras_out = keras_model.predict(np.arange(size_input).reshape(1, h, w, 1))
    caffe_net.blobs['input'].data[...] = np.arange(size_input).reshape(1, 1, h, w)
    caffe_out = caffe_net.forward()['max_pool']

    print('Input:')
    print(np.arange(size_input).reshape(h, w))

    print('Caffe result:')
    print(np.squeeze(caffe_out))

    print('Keras result:')
    print(np.squeeze(keras_out))


if __name__ == '__main__':
    main()

The wrapper adds the padding only when the input height or width is odd. The output of this code:

Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 3, 3, 1)           0         
_________________________________________________________________
input_pad (Lambda)           (None, 4, 4, 1)           0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 2, 2, 1)           0         
=================================================================


Input:
[[0 1 2]
 [3 4 5]
 [6 7 8]]
Caffe result:
[[4. 5.]
 [7. 8.]]
Keras result:
[[4. 5.]
 [7. 8.]]
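One caveat worth checking: the test input above is non-negative. Caffe clips each pooling window at the border instead of padding, so zero padding can change the result when a border window contains only negative values. Below is a NumPy-only sketch of both behaviors for a single 2-D feature map; `caffe_max_pool` and `zero_padded_max_pool` are illustrative reference functions, not part of either framework.

```python
import numpy as np


def caffe_max_pool(x, kernel=2, stride=2):
    """Caffe-style max pooling (pad=0): output size rounded up, windows clipped at the border."""
    h, w = x.shape
    out_h = -(-(h - kernel) // stride) + 1  # ceil division
    out_w = -(-(w - kernel) // stride) + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # NumPy slicing stops at the array edge, like Caffe's clipped window.
            out[i, j] = x[i * stride:i * stride + kernel,
                          j * stride:j * stride + kernel].max()
    return out


def zero_padded_max_pool(x, kernel=2, stride=2):
    """Pad one zero row/column at the bottom/right, then pool with 'valid' (floor) semantics."""
    p = np.pad(x, ((0, 1), (0, 1)), mode='constant')
    out_h = (p.shape[0] - kernel) // stride + 1
    out_w = (p.shape[1] - kernel) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = p[i * stride:i * stride + kernel,
                          j * stride:j * stride + kernel].max()
    return out


positive = np.arange(9).reshape(3, 3)
negative = -1 - positive

print(caffe_max_pool(positive))        # same as the zero-padded result
print(zero_padded_max_pool(positive))
print(caffe_max_pool(negative))        # border windows keep their negative maxima
print(zero_padded_max_pool(negative))  # border windows become 0
```

So the wrapper is safe after a ReLU or on other non-negative activations. For signed inputs, passing a very negative `constant_values` to `tf.pad` should avoid the problem, although I have not verified that against Caffe.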

Upvotes: 1
