ValueError: Operands could not be broadcast together with shapes (54, 54, 128) (54, 54, 64)

Question

I wrote a ResNet block with three convolutional layers:

def res_net_block(input_data, filters, kernel_size):
    kernel_middle = kernel_size + 2
    filters_last_layer = filters * 2

    x = Conv2D(filters, kernel_size, activation = 'relu', padding = 'same')(input_data)   #64, 1x1
    x = BatchNormalization()(x)

    x = Conv2D(filters, kernel_middle, activation = 'relu', padding = 'same')(x)          #64, 3x3
    x = BatchNormalization()(x)

    x = Conv2D(filters_last_layer, kernel_size, activation = None, padding = 'same')(x)   #128, 1x1
    x = BatchNormalization()(x)

    x = Add()([x, input_data])

    x = Activation('relu')(x)
    return x

When I add it to my model, I receive this error: ValueError: Operands could not be broadcast together with shapes (54, 54, 128) (54, 54, 64)

Here is my model so far:

inputs = Input(shape = (224, 224, 3))
model = Conv2D(filters = 64, kernel_size = 7, strides = 2, activation = 'relu')(inputs)
model = BatchNormalization()(model)
model = MaxPool2D(pool_size = 3, strides = 2)(model)
for i in range(num_res_net_blocks):
    model = res_net_block(model, 64, 1)

I believe the problem comes from this line in the ResNet block:

x = Add()([x, input_data])

input_data has a different shape from x, but I don't know how to fix this. I would really appreciate some help.

Balraj Ashwath · Accepted Answer

The error comes from adding two tensors with different shapes: (54, 54, 128) and (54, 54, 64). The Add() layer performs element-wise addition, so its inputs must have the same size along every axis. The Keras documentation for Add() says the same:

Quote: "keras.layers.Add() ... It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape)"
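To see where the two shapes come from, here is a minimal sketch that traces them by hand in plain Python, without Keras. The helper conv_out is a hypothetical name introduced for illustration; it assumes the usual output-size rules for 'valid' and 'same' padding, and the layer parameters are copied from the question.

```python
import math

def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a conv or pooling layer (hypothetical helper)."""
    if padding == "same":
        return math.ceil(size / stride)
    # 'valid' padding: the window must fit entirely inside the input
    return (size - kernel) // stride + 1

# Stem of the model from the question (both layers use the default 'valid' padding)
size = conv_out(224, kernel=7, stride=2)   # Conv2D(64, 7, strides=2)
size = conv_out(size, kernel=3, stride=2)  # MaxPool2D(3, strides=2)
identity_shape = (size, size, 64)          # what the block receives as input_data

# Residual path: three stride-1 convolutions with padding='same'
for kernel in (1, 3, 1):
    size = conv_out(size, kernel, padding="same")
residual_shape = (size, size, 64 * 2)      # last Conv2D uses filters * 2

print(identity_shape)   # (54, 54, 64)
print(residual_shape)   # (54, 54, 128)
```

The spatial sizes agree, so the only mismatch is the channel axis, which the last Conv2D doubles.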

For the residual addition to work, the two tensors, one on the identity path and one on the residual path, must have the same shape. As a simple fix, pass filters instead of filters_last_layer to the final Conv2D, so that both the residual tensor (x) and the identity tensor (input_data) have shape (54, 54, 64).
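A sketch of the block with that change applied might look like the following. It assumes the tensorflow.keras import path, which the question does not show, and it feeds the block a stand-in Input with the shape that the stem of the question's model produces.

```python
from tensorflow.keras.layers import (
    Activation, Add, BatchNormalization, Conv2D, Input,
)

def res_net_block(input_data, filters, kernel_size):
    kernel_middle = kernel_size + 2

    x = Conv2D(filters, kernel_size, activation='relu', padding='same')(input_data)
    x = BatchNormalization()(x)

    x = Conv2D(filters, kernel_middle, activation='relu', padding='same')(x)
    x = BatchNormalization()(x)

    # Changed: use `filters` here so the channel count matches input_data
    x = Conv2D(filters, kernel_size, activation=None, padding='same')(x)
    x = BatchNormalization()(x)

    # Both tensors now have the same shape, so Add() accepts them
    x = Add()([x, input_data])
    return Activation('relu')(x)

# Stand-in for the output of the stem: (54, 54, 64)
inputs = Input(shape=(54, 54, 64))
outputs = res_net_block(inputs, 64, 1)
print(outputs.shape)  # (None, 54, 54, 64)
```

Note that this only works while input_data already has `filters` channels, as it does in the question's model. If the block really needs to output 128 channels, a common alternative is to pass input_data through its own 1x1 Conv2D with 128 filters before the Add(), so that the shortcut is projected to the same shape.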

Hope this helps! :)
