Reputation: 1390
I'm trying to replace the max pooling layers in a pre-trained network with average pooling layers using the Keras APIs, but somehow it doesn't work for me. I would appreciate help figuring out how to implement it.
Below is my current solution:
def replace_max_by_average_pooling(model):
    input_layer, *other_layers = model.layers
    assert isinstance(input_layer, keras.layers.InputLayer)
    x = input_layer.output
    for layer in other_layers:
        if isinstance(layer, keras.layers.MaxPooling2D):
            layer = keras.layers.AveragePooling2D(
                pool_size=layer.pool_size,
                strides=layer.strides,
                padding=layer.padding,
                data_format=layer.data_format,
                name=f"{layer.name}_av",
            )
        x = layer(x)
    return keras.models.Model(inputs=input_layer.input, outputs=x)
When I try to use this function on VGG net:
vgg = keras.applications.vgg19.VGG19(include_top=False, weights="imagenet")
vgg_av = replace_max_by_average_pooling(vgg)
If I print the summary it looks good:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
_________________________________________________________________
block1_pool_av (AveragePooli (None, None, None, 64)    0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
_________________________________________________________________
block2_pool_av (AveragePooli (None, None, None, 128)   0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
...
However, if I try to build a new model based on a few layers from vgg_av:
layer = vgg_av.get_layer("block3_conv1")
keras.models.Model(inputs=vgg_av.layers[0].input, outputs=layer.output).summary()
somehow the average pooling layers are again replaced by max pooling layers:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
=================================================================
Total params: 555,328
Trainable params: 555,328
Non-trainable params: 0
Am I doing something wrong? Why and where?
My guess is that on the line x = layer(x), new operations are added to the computational graph with names like *name of an old operation*_1, and when I call vgg_av.get_layer("block3_conv1") it still fetches a subgraph from vgg. But if I print the layer names in vgg_av, they are the same as in vgg. And why does it only fail when I try to get a subset of layers? I thought about completely rebuilding the computational graph, but maybe there are Keras APIs I'm not aware of, or I'm missing something conceptually.
Upvotes: 1
Views: 1209
Reputation: 86650
The cause is that whenever you reuse a layer (you're reusing layers when creating the new branch with avg poolings), you create a new node in the graph.
The original model still exists, and uses the nodes of index 0 for all layers, while your new model uses the nodes of index 1.
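A minimal illustration of that node mechanism (a sketch with a toy layer instead of VGG; `_inbound_nodes` is a private attribute, so use it for debugging only):

```python
import keras

inp = keras.Input(shape=(4,))
dense = keras.layers.Dense(2)

model_a = keras.models.Model(inp, dense(inp))  # first call creates node 0
model_b = keras.models.Model(inp, dense(inp))  # reusing the layer creates node 1

# The layer is shared by both models, with one node per call:
print(len(dense._inbound_nodes))  # 2
```

This is exactly what happens to the conv layers reused inside replace_max_by_average_pooling: they keep their node 0 in vgg and gain a node 1 in vgg_av.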
Layers have the method get_output_at(index), to which you pass the index of the node you want the output from. From past experience I would expect a plain layer.output to raise an error when the layer has more than one node (but surprisingly your code accepts it; Keras versions vary, I guess).
So, you should achieve your goal by using:
layer = vgg_av.get_layer("block3_conv1")
output = layer.get_output_at(1)
keras.models.Model(inputs=vgg_av.layers[0].input, outputs=output).summary()
It would also be a good idea to count the nodes of the last layer inside replace_max_by_average_pooling after calling it, in case you are going to build more models like this from the same original model (meaning even more nodes, and index 1 would no longer be the right one).
Saving and loading models in Keras offers a system (initially meant for custom layers and custom functions) in which you define what keras should use for class names and function names it doesn't know.
Loading a model is "creating the model again from saved parameters". So, if you use this system to "replace" an existing name, it should swap the layers during the model reconstruction:
custom_objects = {'MaxPooling2D': AveragePooling2D}
vgg.save(filename)
vgg_av = keras.models.load_model(filename, custom_objects=custom_objects)
If this doesn't work, you can write a custom function that builds an average pooling layer from the given parameters:

def createAvgFromMax(**params):
    # MaxPooling2D and AveragePooling2D share the same config keys
    # (pool_size, strides, padding, data_format, name), so the
    # parameters can be forwarded unchanged
    return AveragePooling2D(**params)
And custom_objects = { 'MaxPooling2D': createAvgFromMax }
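The forwarding works because both pooling classes are built from the same config keys; a quick sanity check (sketch):

```python
from keras.layers import MaxPooling2D, AveragePooling2D

cfg = MaxPooling2D(pool_size=3, strides=1, padding="same").get_config()
# get_config() returns pool_size, strides, padding, data_format (plus name),
# all of which AveragePooling2D accepts unchanged:
avg = AveragePooling2D.from_config(cfg)
print(avg.pool_size, avg.padding)  # (3, 3) same
```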
Upvotes: 2