Reputation: 127
I am tinkering with a Keras implementation (that I did not build myself) of a VGG16 convolutional network, using the TensorFlow backend. The input image sizes vary, so I specified the first layer using None for the variable width and height:
model.add(ZeroPadding2D((1, 1), input_shape=(3, None, None)))
The problem is that, at one point while building the losses, I need to get the output_shape of one of my convolutional layers, and of course it comes back with some undefined dimensions.
I wonder if there is a way to set the input width and height of the first layer just for the purpose of calculating this output_shape from the middle of my layer stack. I am not good enough at the arithmetic to calculate it myself through the chain of layers.
I should say I am a noob at this and so will appreciate verbose answers.
Upvotes: 3
Views: 1920
Reputation: 14619
Instead of using the output_shape of a layer, you can use the shape of the output tensor from that layer. K.shape(x) gives you the shape of the tensor x. The dynamic axes (i.e., the None axes) will be filled with the corresponding width and height at runtime.
Here's an example showing how to use the output shape of a middle layer in a self-defined loss (the loss function itself is meaningless, it just shows that the shape evaluates to different values according to the input array):
import numpy as np
from keras import backend as K
from keras.layers import Input, Conv2D, GlobalMaxPooling2D
from keras.models import Model

# Assumes image_data_format='channels_first', matching the (3, None, None) input shape.
input_tensor = Input(shape=(3, None, None))
middle_tensor = Conv2D(100, 1)(input_tensor)          # per-sample shape = (100, None, None)
output_tensor = GlobalMaxPooling2D()(middle_tensor)   # not important

model = Model(input_tensor, output_tensor)

def get_loss(shape):
    def dummy_loss(y_true, y_pred):
        # Product of the dynamic shape, evaluated at runtime.
        return K.cast(K.prod(shape), K.floatx())
    return dummy_loss

dummy_loss = get_loss(K.shape(middle_tensor))
model.compile(loss=dummy_loss, optimizer='sgd')

print(model.evaluate(np.zeros((1, 3, 2, 2)), np.zeros((1, 1))))
=> 400.0
print(model.evaluate(np.zeros((1, 3, 224, 224)), np.zeros((1, 1))))
=> 5017600.0
As you can see, in the first call K.shape(middle_tensor) evaluates to (100, 2, 2), so K.prod(shape) is 400. In the second call, K.shape(middle_tensor) evaluates to (100, 224, 224), so K.prod(shape) becomes 5017600.
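If you only need the concrete shape for one particular input size, another option is to build a throwaway model of the same architecture with a fixed input_shape and read the static output_shape from the layer you care about. A minimal sketch, assuming the same channels_first data format (the 224x224 size and the probe_* names are just for illustration):
from keras.layers import Input, Conv2D
from keras.models import Model

# Same layer stack, but with a concrete spatial size, built only to inspect shapes.
probe_input = Input(shape=(3, 224, 224))
probe_middle = Conv2D(100, 1)(probe_input)
probe_model = Model(probe_input, probe_middle)

print(probe_model.layers[-1].output_shape)  # (None, 100, 224, 224)
This model never has to be compiled or trained; it only lets Keras do the shape arithmetic through the chain of layers for you.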
Upvotes: 3
Reputation: 115
If you want to use convolutional layers (like VGG16) you have to resize your images to the right dimensions, and if you want to use the pretrained weights you have to use the same size they were trained on (224x224 for the ImageNet weights).
Your ImageDataGenerator() can do this resizing for you (see img_height and img_width below):
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator()
validation_datagen = ImageDataGenerator()

train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')
train_filenames = train_generator.filenames
train_samples = len(train_filenames)

validation_generator = validation_datagen.flow_from_directory(
    valid_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)  # Must be False so the classes and filenames stay in the order they are predicted
validation_filenames = validation_generator.filenames
validation_samples = len(validation_filenames)
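A minimal sketch of how these generators could then feed a compiled model (model, epochs, and batch_size are assumed to be defined elsewhere; this uses Keras 2's fit_generator API):
# Train on the resized images; the step counts come from the sample counts above.
model.fit_generator(
    train_generator,
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_samples // batch_size)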
Upvotes: 0