Reputation: 11
We have imported a ResNet50 model pretrained on ImageNet and want to add some deconvolutional (transposed convolution) layers on top of it to perform semantic segmentation.
We're using Google Colaboratory with Keras and TensorFlow as the backend.
import keras
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, Activation, Conv2DTranspose, Reshape, UpSampling2D
from keras.models import Model
from keras.regularizers import l2
from keras import backend as K
height = 224  # dimensions of the input image
width = 224
channel = 3
# Importing the ResNet architecture pretrained on ImageNet
resnet_model = ResNet50(weights = 'imagenet', input_shape=(height, width, channel))
# Removing the classification layer and the last average pooling layer
resnet_model.layers.pop()
resnet_model.layers.pop()
#resnet_model.summary()
# Upsampling
conv1 = Conv2DTranspose(28, (3,3), strides=(2,2), activation = None, kernel_regularizer=l2(0.))(resnet_model.outputs)
model = Model(inputs=resnet_model.input, outputs=conv1)
We get the following error:
"ValueError: Input 0 is incompatible with layer conv2d_transpose_1: expected ndim=4, found ndim=2"
It seems that the output of our ResNet model (without the last two layers) is a one-dimensional vector per sample, whereas we expected a three-dimensional feature map.
This is the final part of the output of "resnet_model.summary()" after the pop:
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_10 (InputLayer)           (None, 224, 224, 3)  0
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_10[0][0]
__________________________________________________________________________________________________
.
.
.
.
.
__________________________________________________________________________________________________
bn5c_branch2b (BatchNormalizati (None, 7, 7, 512)    2048        res5c_branch2b[0][0]
__________________________________________________________________________________________________
activation_489 (Activation)     (None, 7, 7, 512)    0           bn5c_branch2b[0][0]
__________________________________________________________________________________________________
res5c_branch2c (Conv2D)         (None, 7, 7, 2048)   1050624     activation_489[0][0]
__________________________________________________________________________________________________
bn5c_branch2c (BatchNormalizati (None, 7, 7, 2048)   8192        res5c_branch2c[0][0]
__________________________________________________________________________________________________
add_160 (Add)                   (None, 7, 7, 2048)   0           bn5c_branch2c[0][0]
                                                                 activation_487[0][0]
__________________________________________________________________________________________________
activation_490 (Activation)     (None, 7, 7, 2048)   0           add_160[0][0]
==================================================================================================
Total params: 23,587,712
Trainable params: 23,534,592
Non-trainable params: 53,120
__________________________________________________________________________________________________
How do we solve this?
Upvotes: 1
Views: 1615
Reputation: 56377
Do not do this:
resnet_model.layers.pop()
Popping from the layer list is not meaningful for a functional model, because the layers are no longer arranged sequentially, especially in ResNet, which uses residual connections. After the pop, summary() reports that the layers were removed, but the computational graph still contains them:
>>> resnet_model.output
<tf.Tensor 'fc1000/Softmax:0' shape=(?, 1000) dtype=float32>
A supported way to get a model without the classification layers is to pass include_top=False:
resnet_model = ResNet50(weights = 'imagenet', input_shape=(224,224,3), include_top=False)
After instantiating the model this way, you can confirm that the output tensor has the expected shape and semantics:
>>> resnet_model.output
<tf.Tensor 'activation_98/Relu:0' shape=(?, 7, 7, 2048) dtype=float32>
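Since the backbone now ends in a 7x7 feature map, it can help to work out how large the output of each transposed convolution will be before stacking layers. The sketch below is plain Python and not part of Keras. The helper names transposed_conv_output_size and upsampling_path are made up for illustration, and the arithmetic reflects my understanding of how Keras sizes Conv2DTranspose outputs for 'valid' and 'same' padding when no output_padding is given.

```python
def transposed_conv_output_size(size, kernel, stride, padding="valid"):
    """Spatial size after one transposed convolution (hypothetical helper).

    Assumes no output_padding and no dilation.
    """
    if padding == "same":
        return size * stride
    if padding == "valid":
        return size * stride + max(kernel - stride, 0)
    raise ValueError("padding must be 'valid' or 'same'")


def upsampling_path(size, num_layers, kernel=3, stride=2, padding="same"):
    """Sizes reached by stacking `num_layers` identical transposed convolutions."""
    sizes = [size]
    for _ in range(num_layers):
        sizes.append(transposed_conv_output_size(sizes[-1], kernel, stride, padding))
    return sizes


# The layer from the question: 3x3 kernel, stride 2, default 'valid' padding.
print(transposed_conv_output_size(7, kernel=3, stride=2, padding="valid"))

# Five stride-2 layers with 'same' padding bring 7x7 back to the input size.
print(upsampling_path(7, num_layers=5))
```

Under these assumptions the first call gives 15 and the second gives [7, 14, 28, 56, 112, 224], so a single 'valid' layer as in the question would produce a 15x15 map rather than a clean doubling.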
One more thing: prefer model.output over model.outputs. The latter is a list of tensors, and since this model has only a single output, model.output gives you that tensor directly.
Upvotes: 2