B. Klein

Reputation: 55

Keras - Conv2D expects (1,1,1) and got (258, 540, 3)

I'm trying to recreate the results from the article Cleaning Up Dirty Scanned Documents with Deep Learning, using ImageDataGenerator to read in the images and model.fit_generator to train my model, but when I try to fit, I get the following error and stack trace:

---------------------------------------------------------------------------

ValueError Traceback (most recent call last)

<ipython-input-34-2dedade68c7a> in <module>()
      4     epochs=epochs,
      5     validation_data=validation_generator,
----> 6     validation_steps=nb_validation_samples // batch_size)

/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
     89                 warnings.warn('Update your `' + object_name + '` call to the ' +
     90                               'Keras 2 API: ' + signature, stacklevel=2)
---> 91             return func(*args, **kwargs)
     92         wrapper._original_function = func
     93         return wrapper

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   1656             use_multiprocessing=use_multiprocessing,
   1657             shuffle=shuffle,
-> 1658             initial_epoch=initial_epoch)
   1659 
   1660     @interfaces.legacy_generator_methods_support

/usr/local/lib/python3.6/dist-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
    213                 outs = model.train_on_batch(x, y,
    214                                             sample_weight=sample_weight,
--> 215                                             class_weight=class_weight)
    216 
    217                 outs = to_list(outs)

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in train_on_batch(self, x, y, sample_weight, class_weight)
   1441             x, y,
   1442             sample_weight=sample_weight,
-> 1443             class_weight=class_weight)
   1444         if self._uses_dynamic_learning_phase():
   1445             ins = x + y + sample_weights + [1.]

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    793                 feed_output_shapes,
    794                 check_batch_axis=False,  # Don't enforce the batch size.
--> 795                 exception_prefix='target')
    796 
    797             # Generate sample-wise weight values given the `sample_weight` and

/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    139                             ': expected ' + names[i] + ' to have shape ' +
    140                             str(shape) + ' but got array with shape ' +
--> 141                             str(data_shape))
    142     return data
    143 

ValueError: Error when checking target: expected conv2d_10 to have shape (1, 1, 1) but got array with shape (258, 540, 3)

Here is my script:

from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Conv2D, Activation, BatchNormalization, LeakyReLU, MaxPooling2D, UpSampling2D
import numpy as np
from keras import backend as K

img_width, img_height = 258, 540

train_data_dir = 'drive/My Drive/train'
validation_data_dir = 'drive/My Drive/train_cleaned'
nb_train_samples = 144
nb_validation_samples = 144
epochs = 200
batch_size = 20

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

model = Sequential([
    Conv2D(input_shape=input_shape, filters=64, kernel_size=(258, 540), padding='same'),
    LeakyReLU(),
    BatchNormalization(),
    Conv2D(filters=64, kernel_size=(258, 540), padding='same'),
    LeakyReLU(),
    MaxPooling2D((2,2), padding='same'),
    Conv2D(filters=64, kernel_size=(129, 270), padding='same'),
    LeakyReLU(),
    BatchNormalization(),
    Conv2D(filters=64, kernel_size=(129, 270), padding='same'),
    LeakyReLU(),
    UpSampling2D((2, 2)),
    Conv2D(filters=1, kernel_size=(258,540), activation='sigmoid')
])
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    class_mode='input',
    batch_size=batch_size)

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    class_mode='input',
    batch_size=batch_size)

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

The output of model.summary() can be found below:

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_6 (Conv2D)            (None, 258, 540, 64)      26749504  
_________________________________________________________________
leaky_re_lu_5 (LeakyReLU)    (None, 258, 540, 64)      0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 258, 540, 64)      256       
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 258, 540, 64)      570654784 
_________________________________________________________________
leaky_re_lu_6 (LeakyReLU)    (None, 258, 540, 64)      0         
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 129, 270, 64)      0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 129, 270, 64)      142663744 
_________________________________________________________________
leaky_re_lu_7 (LeakyReLU)    (None, 129, 270, 64)      0         
_________________________________________________________________
batch_normalization_4 (Batch (None, 129, 270, 64)      256       
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 129, 270, 64)      142663744 
_________________________________________________________________
leaky_re_lu_8 (LeakyReLU)    (None, 129, 270, 64)      0         
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 258, 540, 64)      0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 1, 1, 1)           8916481   
=================================================================
Total params: 891,648,769
Trainable params: 891,648,513
Non-trainable params: 256
_________________________________________________________________
None

I'm not very familiar with Keras, and it seems like the error involves my last layer's shape, but I don't know how to change the shape that's expected. I tried passing the parameter input_shape=input_shape to the last layer, and that didn't change anything.

I'd appreciate any other critique on the code that I've written, even if it's not directly related to answering my question. Thanks!

EDIT: Here is an image of the network I'm trying to recreate: Neural Net To Recreate

Upvotes: 1

Views: 69

Answers (1)

Nicolas Gervais

Reputation: 36584

Change all these:

kernel_size=(258, 540)

to this:

kernel_size=(3, 3)

Keras is confused because your convolutional filters are the same size as your images. After a single convolution, your output is effectively (1, 1, 1). That is because each filter slides over the input image, multiplying the overlapping pixels by its weights and summing the results. When the filter is the same size as the image, it covers every pixel at once, so the entire image is multiplied and summed down to a single number.
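
A minimal sketch of the resulting shape arithmetic (assuming padding='same' is also added to the final Conv2D, which the original code leaves at the default 'valid'; without it, even a 3x3 kernel would trim the output to (256, 538, 1)):

from keras.models import Sequential
from keras.layers import Conv2D

# With small 3x3 kernels and padding='same', every Conv2D preserves the
# spatial dimensions, so the output stays image-sized instead of
# collapsing to (1, 1, 1).
model = Sequential([
    Conv2D(64, kernel_size=(3, 3), padding='same',
           input_shape=(258, 540, 3)),
    Conv2D(1, kernel_size=(3, 3), padding='same', activation='sigmoid'),
])
model.summary()
# conv2d_1 (Conv2D)  output shape: (None, 258, 540, 64)
# conv2d_2 (Conv2D)  output shape: (None, 258, 540, 1)

Note that the last layer still outputs a single channel while flow_from_directory with class_mode='input' yields three-channel targets, so the targets would likewise need to be single-channel (for example via color_mode='grayscale') for the shapes to match end to end.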

Upvotes: 2
