Reputation: 5064
Suppose I have a MaxPooling2D layer with pool_size=(2,2), strides=(2,2) in Keras. Applied to a 3x3 input feature map, it produces a 1x1 spatial output. The same operation in Caffe (pool: MAX; kernel_size: 2; stride: 2) produces a 2x2 output.
It is well known that Caffe and TensorFlow/Keras behave differently when applying max pooling.
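The size mismatch comes down to rounding. Here is a minimal sketch of the two output-size rules (the function names are made up for illustration and are not part of either library): Keras with padding='valid' rounds the number of windows down, while Caffe rounds it up, so a partial window at the bottom/right edge still produces an output.

```python
import math


def keras_valid_pool_size(size, pool, stride):
    """Output length of Keras pooling with padding='valid' (floor rounding)."""
    return (size - pool) // stride + 1


def caffe_pool_size(size, kernel, stride, pad=0):
    """Output length of Caffe pooling (ceil rounding)."""
    out = math.ceil((size + 2 * pad - kernel) / stride) + 1
    # Caffe drops the last window if it would start entirely inside the padding.
    if pad > 0 and (out - 1) * stride >= size + pad:
        out -= 1
    return out


# Compare both rules for a 2x2 pool with stride 2.
for size in (3, 4, 5):
    print(size, keras_valid_pool_size(size, 2, 2), caffe_pool_size(size, 2, 2))
# 3 1 2
# 4 2 2
# 5 2 3
```

The two rules agree for even sizes and differ by one for odd sizes, which is exactly the 3x3 case above.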
There is a workaround for 2D convolution: to avoid the asymmetric padding of Conv2D in TensorFlow, one can prepend it with an explicit zero-padding layer and change the padding type from same to valid.
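To see where the asymmetry comes from, here is a small sketch of how TensorFlow derives the padding='same' amounts along one axis (the helper name tf_same_padding is hypothetical, and dilation is ignored). When the total padding is odd, the extra pixel goes to the bottom/right, whereas Caffe's pad parameter pads both sides equally.

```python
import math


def tf_same_padding(size, kernel, stride):
    """Return (pad_before, pad_after) used by padding='same' along one axis."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    before = total // 2
    # Any odd remainder is added after the data (bottom/right).
    return before, total - before


print(tf_same_padding(4, 3, 2))  # (0, 1): asymmetric
print(tf_same_padding(5, 3, 2))  # (1, 1): symmetric
```

With an explicit ZeroPadding2D layer followed by Conv2D with padding='valid', the padding is fixed and symmetric regardless of the input size.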
Is there a similar workaround to change the MaxPooling2D behavior in Keras so that it behaves like Caffe? More precisely, I am looking for a wrapper around MaxPooling2D that is equivalent to 2x2 max pooling in Caffe.
Maybe pad the MaxPooling2D input with one pixel at the left and top?
I am using tf.keras from TensorFlow.
Upvotes: 0
Views: 624
Reputation: 5064
OK, I found the answer; let me save it here. One has to pad the input at the bottom/right with zeros. Here is a minimal working example:
import os

# Silence Caffe's glog output; this has to be set before caffe is imported.
os.environ['GLOG_minloglevel'] = '2'

import numpy as np
import tensorflow as tf
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Input, MaxPool2D
from tensorflow.python.keras import backend as K
import caffe
from caffe.model_libs import P
from caffe import layers as L
from caffe.proto import caffe_pb2


def MaxPooling2DWrapper(pool_size=(2, 2), strides=None, padding='valid', data_format=None, **kwargs):
    """Return a callable that mimics Caffe's 2x2, stride-2 max pooling."""
    def padded_pooling(inputs):
        # Assumes channels_last inputs with a static height and width.
        _, h, w, _ = K.int_shape(inputs)
        interm_input = inputs
        if h % 2 != 0 or w % 2 != 0:
            # Pad one row at the bottom and one column at the right with zeros.
            # With padding='valid', a surplus row/column on an even dimension is dropped.
            # Zero padding matches Caffe only for non-negative inputs (e.g. after ReLU).
            interm_input = tf.keras.layers.Lambda(
                lambda x: tf.pad(x, [[0, 0], [0, 1], [0, 1], [0, 0]]),
                name='input_pad')(inputs)
        return MaxPool2D(pool_size, strides, padding, data_format, **kwargs)(interm_input)
    return padded_pooling


def build_caffe_model(h, w):
    """Build a Caffe net with a single 2x2, stride-2 max pooling layer."""
    caffe_spec = caffe.NetSpec()
    pool_config = {
        'pool': P.Pooling.MAX,
        'kernel_size': 2,
        'stride': 2
    }
    caffe_spec['input'] = L.Input(shape=caffe_pb2.BlobShape(dim=(1, 1, h, w)))
    caffe_spec['max_pool'] = L.Pooling(caffe_spec['input'], **pool_config)
    proto = str(caffe_spec.to_proto())
    with open('deploy.prototxt', 'w') as f:
        f.write(proto)
    net = caffe.Net('deploy.prototxt', caffe.TEST)
    return net


def build_keras_model(h, w):
    """Build the equivalent Keras model using the padding wrapper."""
    inputs = Input(shape=(h, w, 1))
    maxpool = MaxPooling2DWrapper()(inputs)
    return Model(inputs, maxpool)


def main():
    caffe.set_mode_cpu()
    h = 3
    w = 3
    size_input = h * w
    caffe_net = build_caffe_model(h, w)
    keras_model = build_keras_model(h, w)
    keras_model.summary()
    # Keras expects NHWC input, Caffe expects NCHW.
    keras_out = keras_model.predict(np.arange(size_input).reshape(1, h, w, 1))
    caffe_net.blobs['input'].data[...] = np.arange(size_input).reshape(1, 1, h, w)
    caffe_out = caffe_net.forward()['max_pool']
    print('Input:')
    print(np.arange(size_input).reshape(h, w))
    print('Caffe result:')
    print(np.squeeze(caffe_out))
    print('Keras result:')
    print(np.squeeze(keras_out))


if __name__ == '__main__':
    main()
The wrapper will add the padding only when needed. The output of this code:
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 3, 3, 1)           0
_________________________________________________________________
input_pad (Lambda)           (None, 4, 4, 1)           0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 2, 2, 1)           0
=================================================================
Input:
[[0 1 2]
[3 4 5]
[6 7 8]]
Caffe result:
[[4. 5.]
[7. 8.]]
Keras result:
[[4. 5.]
[7. 8.]]
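For readers without a Caffe installation, the padding idea can also be checked with NumPy alone. This is only a sketch under stated assumptions: pool2x2 and pad_bottom_right are hypothetical helpers, and the comparison with -inf padding assumes that Caffe clips each pooling window to the input bounds, so padded positions never win the max.

```python
import numpy as np


def pool2x2(x):
    """2x2 max pooling with stride 2 over a 2D array with even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def pad_bottom_right(x, value):
    """Pad each odd dimension by one row/column at the bottom/right."""
    h, w = x.shape
    return np.pad(x, ((0, h % 2), (0, w % 2)), mode='constant', constant_values=value)


x = np.arange(9, dtype=np.float32).reshape(3, 3)
print(pool2x2(pad_bottom_right(x, 0.0)))
# [[4. 5.]
#  [7. 8.]]

# With negative inputs, zero padding can win the max in the edge windows,
# while -inf padding leaves the result to the real input values.
neg = -x - 1.0
print(pool2x2(pad_bottom_right(neg, 0.0)))
# [[-1.  0.]
#  [ 0.  0.]]
print(pool2x2(pad_bottom_right(neg, -np.inf)))
# [[-1. -3.]
#  [-7. -9.]]
```

So zero padding is safe when the pooled tensor is non-negative (for example, right after a ReLU); otherwise padding with a very negative constant may be the closer match.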
Upvotes: 1