Reputation: 127
I am tinkering with a Keras implementation (that I did not build myself) of a VGG16 convolutional network, using the TensorFlow backend. The input image sizes vary, so I specified the first layer using None for the variable width and height:
model.add(ZeroPadding2D((1, 1), input_shape=(3, None, None)))
The problem is that, at one point while building the losses, I need to get the output_shape of one of my convolutional layers, and of course it comes back with some undefined dimensions.
I wonder if there is a way to set the input width and height of the first layer just for the purpose of calculating this output_shape from the middle of my layer stack. I am not good enough at the arithmetic to calculate it myself through the chain of layers.
I should say I am a noob at this and so will appreciate verbose answers.
Upvotes: 3
Views: 1920
Reputation: 14619
Instead of using the output_shape of a layer, you can use the shape of the output tensor from that layer. K.shape(x) gives you the shape of the tensor x. The dynamic axes (i.e., the None axes) will be filled with the corresponding width and height at runtime.
Here's an example showing how to use the output shape of a middle layer in a self-defined loss (the loss function itself is meaningless, it just shows that the shape evaluates to different values according to the input array):
import numpy as np
from keras import backend as K
from keras.layers import Input, Conv2D, GlobalMaxPooling2D
from keras.models import Model

# Assumes image_data_format='channels_first', matching the (3, None, None) input shape.
input_tensor = Input(shape=(3, None, None))
middle_tensor = Conv2D(100, 1)(input_tensor)          # per-sample shape = (100, None, None)
output_tensor = GlobalMaxPooling2D()(middle_tensor)   # not important

model = Model(input_tensor, output_tensor)

def get_loss(shape):
    def dummy_loss(y_true, y_pred):
        # Product of the dynamic shape, evaluated at runtime.
        return K.cast(K.prod(shape), K.floatx())
    return dummy_loss

dummy_loss = get_loss(K.shape(middle_tensor))
model.compile(loss=dummy_loss, optimizer='sgd')

print(model.evaluate(np.zeros((1, 3, 2, 2)), np.zeros((1, 1))))
=> 400.0
print(model.evaluate(np.zeros((1, 3, 224, 224)), np.zeros((1, 1))))
=> 5017600.0
As you can see, in the first call K.shape(middle_tensor) evaluates to (100, 2, 2), so K.prod(shape) is 400. In the second call, K.shape(middle_tensor) evaluates to (100, 224, 224), so K.prod(shape) becomes 5017600.
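If you only need the concrete shape for one particular input size, another option is to build a throwaway model of the same architecture with a fixed input_shape and read the static output_shape from the layer you care about. A minimal sketch, assuming the same channels_first data format (the 224x224 size and the probe_* names are just for illustration):
from keras.layers import Input, Conv2D
from keras.models import Model

# Same layer stack, but with a concrete spatial size, built only to inspect shapes.
probe_input = Input(shape=(3, 224, 224))
probe_middle = Conv2D(100, 1)(probe_input)
probe_model = Model(probe_input, probe_middle)

print(probe_model.layers[-1].output_shape)  # (None, 100, 224, 224)
This model never has to be compiled or trained; it only lets Keras do the shape arithmetic through the chain of layers for you.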
Upvotes: 3
Reputation: 115
If you want to use convolutional layers (like VGG16) you have to resize your images to the right dimensions, and if you want to use the pretrained weights you have to use the same size they were trained on (224x224 for the ImageNet weights).
Your ImageDataGenerator() can do this resizing for you (see img_height and img_width below):
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator()
validation_datagen = ImageDataGenerator()

train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')
train_filenames = train_generator.filenames
train_samples = len(train_filenames)

validation_generator = validation_datagen.flow_from_directory(
    valid_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)  # Must be False so the classes and filenames stay in the order they are predicted
validation_filenames = validation_generator.filenames
validation_samples = len(validation_filenames)
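A minimal sketch of how these generators could then feed a compiled model (model, epochs, and batch_size are assumed to be defined elsewhere; this uses Keras 2's fit_generator API):
# Train on the resized images; the step counts come from the sample counts above.
model.fit_generator(
    train_generator,
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_samples // batch_size)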
Upvotes: 0