user2489835

Reputation: 37

Keras Flatten Layer - Invalid Argument Error, matrix not flattening?

Explanation, then code, then output, then error:

It looks like the Flatten layer is not doing its job: the flattened shape depends on the batch size (when I set BATCH_SIZE=32, the [1,32768] in the error becomes [1,16384]). I cannot for the life of me understand what I'm doing wrong or how to fix it. I've looked at the Keras documentation for the Flatten and Dense layers. I'm using the TensorFlow backend, and this is reflected in the keras.json file.

This is my code:

import keras
from keras.layers import Flatten, Dense
from keras.models import Model

BATCH_SIZE = 64
EPOCHS = 1000
EPOCH_STEP = 50

vgg = keras.applications.vgg16.VGG16(include_top=False,weights='imagenet',input_shape=(48,48,3))
vgg_input = vgg.inputs
vgg_output = vgg.outputs


#freeze the vgg layers
for layer in vgg.layers:
    layer.trainable = False

print('~~~~~~~~~~~~~~~~~~~~~~Tensors~~~~~~~~~~~~~~')
print('vgg_output tensor:')
print(vgg_output)
print()
model_tensor = Flatten()(vgg_output)
print('flattened vgg_output tensor:')
print(model_tensor)
print()
model_tensor = Dense(32, activation='relu')(model_tensor)
print('dense FC flattened vgg_output tensor:')
print(model_tensor)
print('~~~~~~~~~~~~~~~~~~~~~~Tensors~~~~~~~~~~~~~~')
model_tensor = Dense(2, activation='softmax')(model_tensor)

model = Model(inputs=vgg_input,outputs=model_tensor)
print('Model architecture made')
#CHOSEN ARBITRARILY FOR NOW
model.compile(optimizer='rmsprop',
            loss='binary_crossentropy',
            metrics=['accuracy'])
print('Model Compiled')             
print(model.summary())

#train top model
val_batch, val_labels = dataGenerator.generateDataBatch(256)
print('validation batch loaded')
batch, labels = dataGenerator.generateDataBatch(2048)
print('training batch loaded')
print('t-batch shape: ' + str(batch.shape))
print('t-batch lable shape: ' + str(labels.shape))
model.fit(x=batch, y=labels, batch_size=BATCH_SIZE, epochs=EPOCHS, verbose=2,
          validation_data=(val_batch, val_labels), shuffle=True)

The print outputs:

Printed Info:
~~~~~~~~~~~~~~~~~~~~~~Tensors~~~~~~~~~~~~~~
vgg_output tensor:
[<tf.Tensor 'block5_pool/MaxPool:0' shape=(?, 1, 1, 512) dtype=float32>]

flattened vgg_output tensor:
Tensor("flatten_1/Reshape:0", shape=(?, ?), dtype=float32)

dense FC flattened vgg_output tensor:
Tensor("dense_1/Relu:0", shape=(?, 32), dtype=float32)
~~~~~~~~~~~~~~~~~~~~~~Tensors~~~~~~~~~~~~~~
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 48, 48, 3)         0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 48, 48, 64)        1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 48, 48, 64)        36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 24, 24, 64)        0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 24, 24, 128)       73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 24, 24, 128)       147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 12, 12, 128)       0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 12, 12, 256)       295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 12, 12, 256)       590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 12, 12, 256)       590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 6, 6, 256)         0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 6, 6, 512)         1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 6, 6, 512)         2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 6, 6, 512)         2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 3, 3, 512)         0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 3, 3, 512)         2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 3, 3, 512)         2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 3, 3, 512)         2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 1, 1, 512)         0
_________________________________________________________________
flatten_1 (Flatten)          (None, 512)               0
_________________________________________________________________
dense_1 (Dense)              (None, 32)                16416
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 66
=================================================================
Total params: 14,731,170
Trainable params: 16,482
Non-trainable params: 14,714,688
_________________________________________________________________
None
Model Compiled
Model architecture made
Model Compiled
validation batch loaded
training batch loaded
t-batch shape: (2048, 48, 48, 3)
t-batch lable shape: (2048, 2)

Last part of the error message (I can add all of it but the rest seems unhelpful):

tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [1,32768], In[1]: [512,32]
     [[Node: dense_1/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](flatten_1/Reshape, dense_1/kernel/read)]]

Edit: Added the output of print(model.summary()) immediately after model.compile().

Upvotes: 1

Views: 902

Answers (1)

today

Reputation: 33410

The outputs attribute of a Keras model returns a list of the model's output tensors (likewise, inputs returns a list of its input tensors). You can confirm this in your logs:

vgg_output tensor:
[<tf.Tensor 'block5_pool/MaxPool:0' shape=(?, 1, 1, 512) dtype=float32>] <-- this is a list
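
The extra list level also seems to account for the shapes in your error message. Here is a minimal sketch of the arithmetic in plain Python. It assumes that Flatten keeps the first axis and multiplies the remaining ones together, and that the one-element list ends up being treated as a tensor with an extra leading axis of size 1 (flatten_shape is a hypothetical helper for illustration, not a Keras function):

```python
from math import prod

def flatten_shape(shape):
    """Shape after keeping axis 0 and collapsing every other axis into one."""
    return (shape[0], prod(shape[1:]))

for batch_size in (64, 32):
    pooled = (batch_size, 1, 1, 512)   # block5_pool output for one batch
    wrapped = (1,) + pooled            # assumption: the list adds a leading axis
    print(batch_size, flatten_shape(pooled), flatten_shape(wrapped))
```

Under those assumptions the wrapped case gives (1, 32768) for a batch of 64 and (1, 16384) for a batch of 32, matching the numbers you reported, while the unwrapped tensor gives the (batch, 512) shape that dense_1 expects.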

VGG16 has a single output (its layers form a simple linear stack), so you need to explicitly pass that one output tensor (i.e. the first element of the returned list) to the next layer, which is Flatten:

model_tensor = Flatten()(vgg_output[0])  # pass the first element of output
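
If you would rather not index the list, the singular output attribute of a single-output Keras model returns the tensor itself. Here is a small sketch. It assumes the tensorflow.keras import path, and it uses weights=None only to skip the ImageNet download while checking shapes (keep weights='imagenet' in your real code):

```python
from tensorflow import keras
from tensorflow.keras.layers import Flatten

# weights=None is an assumption made here to avoid downloading ImageNet weights
vgg = keras.applications.VGG16(include_top=False, weights=None,
                               input_shape=(48, 48, 3))

print(type(vgg.outputs))      # plural attribute: a list holding one tensor
flat = Flatten()(vgg.output)  # singular attribute: the tensor itself
print(flat.shape)             # batch axis stays unknown, features collapse to 512
```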

As a side note, if you would like to see the training progress bar, either don't pass the verbose argument to the fit method or set it to 1 (i.e. verbose=1).

Upvotes: 1
