J. Doe

Reputation: 11

Adding layers to the top of resnet[keras]: ValueError: Input 0 is incompatible with layer conv2d_transpose_1: expected ndim=4, found ndim=2

We have imported a ResNet50 model pretrained on ImageNet and want to add some deconvolutional layers on top of it to achieve semantic segmentation.

We're using Google Colaboratory with Keras and TensorFlow as the backend.

import keras
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, Activation, Conv2DTranspose, Reshape, UpSampling2D
from keras.regularizers import l2
from keras.models import Model
from keras import backend as K

height = 224 #dimensions of image
width = 224
channel = 3

# Importing the ResNet architecture pretrained on ImageNet
resnet_model = ResNet50(weights = 'imagenet', input_shape=(height, width, channel))
# Removing the classification layer and the last average pooling layer
resnet_model.layers.pop()   
resnet_model.layers.pop()
#resnet_model.summary() 


# Upsampling
conv1 = Conv2DTranspose(28, (3,3), strides=(2,2), activation = None, kernel_regularizer=l2(0.))(resnet_model.outputs)
model = Model(inputs=resnet_model.input, outputs=conv1)

We get the following error:

"ValueError: Input 0 is incompatible with layer conv2d_transpose_1: expected ndim=4, found ndim=2"

It seems like the output of our ResNet model (without the last two layers) is a one-dimensional vector, but we expect it to be a three-dimensional tensor (height, width, channels).
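To check, one can inspect the output tensor of the popped model directly (a quick diagnostic sketch, using the backend K imported above):

print(resnet_model.output)               # which tensor is still wired as the model output?
print(K.int_shape(resnet_model.output))  # a (None, 1000) vector, not a (None, H, W, C) feature map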

This is the final part of the output of "resnet_model.summary()" after the pops:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_10 (InputLayer)           (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_10[0][0]                   
__________________________________________________________________________________________________
.
.
.
.
.          
__________________________________________________________________________________________________
bn5c_branch2b (BatchNormalizati (None, 7, 7, 512)    2048        res5c_branch2b[0][0]             
__________________________________________________________________________________________________
activation_489 (Activation)     (None, 7, 7, 512)    0           bn5c_branch2b[0][0]              
__________________________________________________________________________________________________
res5c_branch2c (Conv2D)         (None, 7, 7, 2048)   1050624     activation_489[0][0]             
__________________________________________________________________________________________________
bn5c_branch2c (BatchNormalizati (None, 7, 7, 2048)   8192        res5c_branch2c[0][0]             
__________________________________________________________________________________________________
add_160 (Add)                   (None, 7, 7, 2048)   0           bn5c_branch2c[0][0]              
                                                                 activation_487[0][0]             
__________________________________________________________________________________________________
activation_490 (Activation)     (None, 7, 7, 2048)   0           add_160[0][0]                    
==================================================================================================
Total params: 23,587,712
Trainable params: 23,534,592
Non-trainable params: 53,120
__________________________________________________________________________________________________

How do we solve this?

Upvotes: 1

Views: 1615

Answers (1)

Dr. Snoopy

Reputation: 56377

Do not do this:

resnet_model.layers.pop()   

Pop is kind of meaningless for a functional model, because the layers are no longer sequential, especially with ResNet, which uses residual connections. If you check after the pops, summary() shows that the layers were removed, but the computational graph still has them:

>>> resnet_model.output
<tf.Tensor 'fc1000/Softmax:0' shape=(?, 1000) dtype=float32>
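To illustrate (a minimal sketch, assuming the same 224x224x3 ImageNet build as in the question): popping entries from model.layers only shortens the Python list that summary() walks, it does not rewire the graph that model.output points to.

from keras.applications.resnet50 import ResNet50

resnet_model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
n_before = len(resnet_model.layers)
resnet_model.layers.pop()
resnet_model.layers.pop()
print(n_before, len(resnet_model.layers))  # the layer list is two entries shorter ...
print(resnet_model.output)                 # ... but the output is still fc1000's softmax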

A supported way to have a model without the classification layers is to use include_top=False:

resnet_model = ResNet50(weights = 'imagenet', input_shape=(224,224,3), include_top=False)

You can confirm that, when the model is instantiated this way, the output tensor has the expected shape and semantics:

>>> resnet_model.output
<tf.Tensor 'activation_98/Relu:0' shape=(?, 7, 7, 2048) dtype=float32>

One additional thing: prefer model.output over model.outputs, as this specific model has only a single output (model.outputs is a list of tensors, model.output is the tensor itself).
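Putting it together, a minimal sketch of the corrected setup (the Conv2DTranspose parameters are taken from the question and are only illustrative):

from keras.applications.resnet50 import ResNet50
from keras.layers import Conv2DTranspose
from keras.models import Model
from keras.regularizers import l2

# include_top=False ends the model at the last activation, shape (None, 7, 7, 2048)
resnet_model = ResNet50(weights='imagenet', input_shape=(224, 224, 3), include_top=False)

# Build the upsampling head on model.output (a single 4D tensor, not a list)
conv1 = Conv2DTranspose(28, (3, 3), strides=(2, 2), activation=None,
                        kernel_regularizer=l2(0.))(resnet_model.output)
model = Model(inputs=resnet_model.input, outputs=conv1)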

Upvotes: 2
