piccolo

Reputation: 2217

Keras value error for convolutional autoencoder

I am trying to build a convolutional autoencoder, but I am having issues with the decoder part. My input images are 32 by 32 by 3 (RGB).

import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation, Dropout

def deep_autoencoder(img_shape, code_size):

    #### encoder ######
    encoder = keras.models.Sequential()
    encoder.add(keras.layers.InputLayer(img_shape))

    encoder.add(Conv2D(32, kernel_size=(3, 3), strides=1,
                 activation='elu', padding='same'))
    encoder.add(MaxPooling2D(pool_size=(3, 3), padding='same'))

    encoder.add(Conv2D(64, kernel_size=(3, 3), strides=1,
                 activation='elu', padding='same'))
    encoder.add(MaxPooling2D(pool_size=(3, 3), padding='same'))

    encoder.add(Conv2D(128, kernel_size=(3, 3), strides=1,
                 activation='elu', padding='same'))
    encoder.add(MaxPooling2D(pool_size=(3, 3), padding='same'))

    encoder.add(Conv2D(256, kernel_size=(3, 3), strides=1,
                 activation='elu', padding='same'))
    encoder.add(MaxPooling2D(pool_size=(3, 3), padding='same'))

    encoder.add(Flatten())
    encoder.add(Dense(code_size, activation='relu'))


    ##### decoder#####
    decoder = keras.models.Sequential()
    decoder.add(keras.layers.InputLayer((code_size,)))

    decoder.add(Dense(code_size, activation='relu'))
    decoder.add(keras.layers.Reshape([16,16])) #???

    decoder.add(keras.layers.Conv2DTranspose(filters=128, kernel_size=(3, 3), strides=2, activation='elu', padding='same'))
    decoder.add(keras.layers.Conv2DTranspose(filters=64, kernel_size=(3, 3), strides=2, activation='elu', padding='same'))
    decoder.add(keras.layers.Conv2DTranspose(filters=32, kernel_size=(3, 3), strides=2, activation='elu', padding='same'))
    decoder.add(keras.layers.Conv2DTranspose(filters=3, kernel_size=(3, 3), strides=2, padding='same'))


    return encoder, decoder

I assumed that my decoder should start off with 16*16, as the flattened layer at the end of my encoder has 256 units. However, when I run encoder, decoder = deep_autoencoder(IMG_SHAPE, code_size=32) I get the error:

---> 34     decoder.add(keras.layers.Reshape([16,16]))
...
ValueError: total size of new array must be unchanged

I can add the full error traceback if it's helpful, but I feel like I have gotten something very basic wrong: in order to apply the deconvolutional filters, I need to convert the flattened output of the encoder back into a matrix.
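For reference, the mismatch can be reproduced from the shapes alone (a minimal sketch; the numbers come from the call above):

import numpy as np

code_size = 32           # output size of the final Dense layer in the encoder
target_shape = (16, 16)  # shape requested by the Reshape layer

# Reshape requires the total number of elements per sample to stay the same:
print(np.prod(target_shape))  # 256 values needed
print(code_size)              # only 32 available -> ValueError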

For ease of reading the network, I have added the model summary for the encoder part, which I get if I comment out the decoder part and run encoder.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 32, 32, 3)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 32)        896       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 11, 11, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 64)          0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 128)         73856     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 2, 128)         0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 2, 2, 256)         295168    
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 1, 1, 256)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 256)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 32)                8224      
=================================================================

Upvotes: 1

Views: 172

Answers (1)

DocDriven

Reputation: 3974

Two main things about your model bother me. First, the asymmetry of your autoencoder: you use conv and pooling layers during encoding, but omit the corresponding upsampling (inverse pooling) layers during decoding. Upsampling is already implemented in Keras as UpSampling2D. Furthermore, you should use the same strides in the conv and deconv layers.
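For illustration, here is a minimal sketch of what such a symmetric decoder could look like. The upsampling sizes are chosen here simply so the output lands back at 32x32x3; they are not an exact mirror of the pool_size=(3, 3) pooling, and the input is the 1x1x256 encoding discussed below:

from keras.models import Sequential
from keras.layers import Conv2D, UpSampling2D, InputLayer

decoder = Sequential()
decoder.add(InputLayer((1, 1, 256)))    # the encoded representation, no Flatten needed
decoder.add(UpSampling2D(size=(2, 2)))  # 1x1 -> 2x2
decoder.add(Conv2D(128, kernel_size=(3, 3), activation='elu', padding='same'))
decoder.add(UpSampling2D(size=(2, 2)))  # 2x2 -> 4x4
decoder.add(Conv2D(64, kernel_size=(3, 3), activation='elu', padding='same'))
decoder.add(UpSampling2D(size=(2, 2)))  # 4x4 -> 8x8
decoder.add(Conv2D(32, kernel_size=(3, 3), activation='elu', padding='same'))
decoder.add(UpSampling2D(size=(4, 4)))  # 8x8 -> 32x32
decoder.add(Conv2D(3, kernel_size=(3, 3), activation='sigmoid', padding='same'))  # back to 3 channels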

Secondly, after pooling for the fourth time, you end up with a compressed representation of 1x1x256. Why would you try to convert this into a 16x16x1 representation for the decoding part? This is also about symmetry. There is no need to flatten the encoded layer; you can use the 1x1x256 representation directly as input to the decoding model. As you are creating the encoder and decoder as separate models, you can stack them like this:

encoder = Sequential()
encoder.add ...
...

decoder = Sequential()
decoder.add(encoder)
decoder.add ...
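Equivalently to the nesting above, you can wrap both in a third model and train it end to end. A minimal sketch, assuming x_train is your array of 32x32x3 input images:

autoencoder = Sequential()
autoencoder.add(encoder)
autoencoder.add(decoder)
autoencoder.compile(optimizer='adam', loss='mse')

# an autoencoder reconstructs its own input, so x_train is both input and target
autoencoder.fit(x_train, x_train, epochs=10, batch_size=128)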

There's also a tutorial on how to create autoencoders written by Francois Chollet: https://blog.keras.io/building-autoencoders-in-keras.html. It might help you with your implementation.

Upvotes: 1
