Reputation: 149
I'm using Keras to build a semantic segmentation model. The model is fully convolutional, so although I train on inputs of a fixed size, i.e. (224, 224, 3), it should be able to accept input of any size at prediction time, right? However, when I try to predict on an image of a different resolution, I get an error about mismatched shapes in my merge layers. Here is the error (the tuple printed first is the shape of the input I'm predicting on):
(1, 896, 1200, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "predict.py", line 60, in main
n = base_model.predict(x)
File "/home/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1790, in predict
verbose=verbose, steps=steps)
File "/home/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1299, in _predict_loop
batch_outs = f(ins_batch)
File "/home/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2357, in __call__
**self.session_kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 778, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 982, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1032, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1052, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,56,74,512] vs. [1,56,75,512]
[[Node: add_1/add = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](conv2d_3/Relu, block5_conv3/Relu)]]
[[Node: conv2d_14/div/_813 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_157_conv2d_14/div", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op u'add_1/add', defined at:
File "<stdin>", line 1, in <module>
File "predict.py", line 42, in <module>
base_model = models.load_model('mod.h5', custom_objects={'loss':loss})
File "/home/.local/lib/python2.7/site-packages/keras/models.py", line 240, in load_model
model = model_from_config(model_config, custom_objects=custom_objects)
File "/home/.local/lib/python2.7/site-packages/keras/models.py", line 314, in model_from_config
return layer_module.deserialize(config, custom_objects=custom_objects)
File "/home/.local/lib/python2.7/site-packages/keras/layers/__init__.py", line 55, in deserialize
printable_module_name='layer')
File "/home/.local/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 140, in deserialize_keras_object
list(custom_objects.items())))
File "/home/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2500, in from_config
process_node(layer, node_data)
File "/home/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2459, in process_node
layer(input_tensors, **kwargs)
File "/home/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 603, in __call__
output = self.call(inputs, **kwargs)
File "/home/.local/lib/python2.7/site-packages/keras/layers/merge.py", line 146, in call
return self._merge_function(inputs)
File "/home/.local/lib/python2.7/site-packages/keras/layers/merge.py", line 210, in _merge_function
output += inputs[i]
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 821, in binary_op_wrapper
return func(x, y, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 73, in add
result = _op_def_lib.apply_op("Add", x=x, y=y, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Incompatible shapes: [1,56,74,512] vs. [1,56,75,512]
[[Node: add_1/add = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](conv2d_3/Relu, block5_conv3/Relu)]]
[[Node: conv2d_14/div/_813 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_157_conv2d_14/div", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
My model architecture is the following: I take VGG16, strip the top, and stack the layers in reverse order on top, with skip connections between the last convolutional layers of each block. Basically, I'm implementing SegNet. I don't really understand why I'm getting Incompatible shapes: [1,56,74,512] vs. [1,56,75,512]
. I understand that an add merge requires both inputs to have the same dimensions, but why does Keras's padding not take care of this?
Here is also the code that builds my model:
from keras.applications.vgg16 import VGG16
from keras.layers import Input, Conv2D, UpSampling2D, BatchNormalization, add
from keras.models import Model

# vgg_train and dims are arguments of the enclosing function (not shown here)
input_tensor = Input(shape=(None, None, 3))
vgg = VGG16(weights='imagenet', include_top=False, input_shape=(None, None, 3))
# vgg.summary()
if vgg_train is False:
    # Freeze VGG layers
    for layer in vgg.layers:
        layer.trainable = False
l1_1 = vgg.get_layer('block1_conv1')
l1_2 = vgg.get_layer('block1_conv2')
l1_p = vgg.get_layer('block1_pool')
l2_1 = vgg.get_layer('block2_conv1')
l2_2 = vgg.get_layer('block2_conv2')
l2_p = vgg.get_layer('block2_pool')
l3_1 = vgg.get_layer('block3_conv1')
l3_2 = vgg.get_layer('block3_conv2')
l3_3 = vgg.get_layer('block3_conv3')
l3_p = vgg.get_layer('block3_pool')
l4_1 = vgg.get_layer('block4_conv1')
l4_2 = vgg.get_layer('block4_conv2')
l4_3 = vgg.get_layer('block4_conv3')
l4_p = vgg.get_layer('block4_pool')
l5_1 = vgg.get_layer('block5_conv1')
l5_2 = vgg.get_layer('block5_conv2')
l5_3 = vgg.get_layer('block5_conv3')
l5_p = vgg.get_layer('block5_pool')
#Encoder: Basically re-building VGG layer by layer, because Keras's concat only takes tensors, not layers
x = l1_1(input_tensor)
o1 = l1_2(x)
x = l1_p(o1)
x = l2_1(x)
o2 = l2_2(x)
x = l2_p(o2)
x = l3_1(x)
x = l3_2(x)
o3 = l3_3(x)
x = l3_p(o3)
x = l4_1(x)
x = l4_2(x)
o4 = l4_3(x)
x = l4_p(o4)
x = l5_1(x)
x = l5_2(x)
o5 = l5_3(x)
x = l5_p(o5)
#Decoder layers: VGG architecture in reverse with skip connections and dropout layers
#Block 1
up1 = UpSampling2D()(x)
conv1 = Conv2D(512, 3, activation='relu', padding='same')(up1)
conv1 = Conv2D(512, 3, activation='relu', padding='same')(conv1)
conv1 = Conv2D(512, 3, activation='relu', padding='same')(conv1)
conv1 = add([conv1,o5])
batch1 = BatchNormalization()(conv1)
#Block 2
up2 = UpSampling2D()(batch1)
conv2 = Conv2D(512, 3, activation='relu', padding='same')(up2)
conv2 = Conv2D(512, 3, activation='relu', padding='same')(conv2)
conv2 = Conv2D(512, 3, activation='relu', padding='same')(conv2)
conv2 = add([conv2,o4])
batch2 = BatchNormalization()(conv2)
#Block 3
up3 = UpSampling2D()(batch2)
conv3 = Conv2D(256, 3, activation='relu', padding='same')(up3)
conv3 = Conv2D(256, 3, activation='relu', padding='same')(conv3)
conv3 = Conv2D(256, 3, activation='relu', padding='same')(conv3)
conv3 = add([conv3,o3])
batch3 = BatchNormalization()(conv3)
#Block 4
up4 = UpSampling2D()(batch3)
conv4 = Conv2D(128, 3, activation='relu', padding='same')(up4)
conv4 = Conv2D(128, 3, activation='relu', padding='same')(conv4)
conv4 = add([conv4,o2])
batch4 = BatchNormalization()(conv4)
#Block 5
up5 = UpSampling2D()(batch4)
conv5 = Conv2D(64, 3, activation='relu', padding='same')(up5)
conv5 = Conv2D(64, 3, activation='relu', padding='same')(conv5)
conv5 = add([conv5,o1])
batch5 = BatchNormalization()(conv5)
#Final prediction layer
soft5 = Conv2D(dims, kernel_size=8, strides=8, activation='softmax', padding='same')(batch5)
model = Model(input_tensor,soft5)
model.summary()
return model
Upvotes: 1
Views: 2173
Reputation: 149
Found the solution, in case anyone else runs into this issue. Merge layers need their inputs to have the same dimensionality, for obvious reasons, and the problem arises from downsampling and then upsampling an odd dimension. Take a width of 115, for example: pooling by a factor of 2 rounds it to 57, and upsampling that gives 114, which can no longer be merged with the 115-wide skip tensor (the error above shows exactly this mismatch, 74 vs. 75). The solution in my case was pretty simple. Since all of my training data is the same size, the issue only occurs during inference. At that point, if the image has a dimension not divisible by 32 (the encoder pools by 2 five times), I just resize and then crop it so that it is.
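For anyone who wants a concrete starting point, here is a minimal sketch of that preprocessing step. It does a plain centre crop to the nearest multiple of 32; the helper name and the choice to crop (rather than resize and then crop, as I described above) are just one way to do it, acceptable when losing a few border pixels doesn't matter.
import numpy as np

def crop_to_multiple(img, multiple=32):
    """Centre-crop an H x W x C image so H and W are divisible by `multiple`.

    The encoder pools by 2 five times (2**5 = 32), so inputs whose sides are
    multiples of 32 survive the pool/upsample round trip with matching shapes.
    """
    h, w = img.shape[:2]
    new_h, new_w = (h // multiple) * multiple, (w // multiple) * multiple
    top, left = (h - new_h) // 2, (w - new_w) // 2
    return img[top:top + new_h, left:left + new_w]

# Usage at inference time, e.g. for the 896 x 1200 image above
# (1200 is not a multiple of 32, so it gets cropped to 1184):
# x = crop_to_multiple(image)[np.newaxis, ...]
# pred = base_model.predict(x)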
Upvotes: 2