Myra

Reputation: 21

How to build an autoencoder in TensorFlow using my own dataset of images?

I am a beginner in TensorFlow and I want to create a simple autoencoder for images. I tried some examples that I found on the net, but all of them work on the MNIST dataset, which makes it easy to preprocess the images.

I want to create an autoencoder for my own dataset images. My question is:

how do I create a simple autoencoder in TensorFlow using my own dataset of images?

Upvotes: 1

Views: 1173

Answers (1)

Rajith Thennakoon

Reputation: 4130

I think this answer will be helpful. Keep in mind that there is no general solution for every problem: you need to try different architectures, different combinations of hyper-parameters, and different image preprocessing techniques to train your network.

  1. For data preprocessing and loading images into the network, you can use the Keras image preprocessing utilities: the ImageDataGenerator class handles image augmentation and loads images into the network.

First, create an ImageDataGenerator with the required configuration:

 from tensorflow.keras.preprocessing.image import ImageDataGenerator

 datagen = ImageDataGenerator(width_shift_range=0.1,
                              height_shift_range=0.1,
                              horizontal_flip=False,
                              vertical_flip=False,
                              rescale=1/255) # scale pixel values to [0, 1]

Second, load images from a directory through the generator:

trainGene = datagen.flow_from_directory(train_path,
                                        color_mode="grayscale", #use grayscale images
                                        target_size=(image_height,image_width), #image size
                                        shuffle=True,
                                        class_mode="input", #targets are the inputs themselves (autoencoder)
                                        batch_size=batch_size,
                                        save_to_dir=None)

You can create a validation data generator in the same way and load the validation dataset from its own directory, for example valGene, as sketched below.
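A minimal sketch of such a validation generator (val_path is a hypothetical variable pointing at your validation directory; it is not defined in the original answer):

    valGene = datagen.flow_from_directory(val_path, #hypothetical validation directory
                                          color_mode="grayscale",
                                          target_size=(image_height, image_width),
                                          shuffle=True,
                                          class_mode="input",
                                          batch_size=batch_size,
                                          save_to_dir=None)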

  2. Build the convolutional autoencoder model and fit it to the generators.

This depends on the use case; you need to try different layers, loss functions, and architectures to reach the required threshold. For example, start with a simple architecture such as the one summarized below.

Layer (type)                 Output Shape              Param #   
=================================================================
input0 (InputLayer)          (None, 64, 32, 1)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 64, 32, 32)        320       
_________________________________________________________________
activation_1 (Activation)    (None, 64, 32, 32)        0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 32, 16, 32)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 16384)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 16)                262160    
_________________________________________________________________
dense_2 (Dense)              (None, 16384)             278528    
_________________________________________________________________
reshape_1 (Reshape)          (None, 32, 16, 32)        0         
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 64, 32, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 64, 32, 32)        9248      
_________________________________________________________________
activation_2 (Activation)    (None, 64, 32, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 64, 32, 1)         289       
_________________________________________________________________
activation_3 (Activation)    (None, 64, 32, 1)         0         
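A minimal sketch of a model that produces the summary above (the 3x3 kernels, "same" padding, relu/sigmoid activations, and the mse loss are assumptions; the summary does not show them):

    from tensorflow.keras.layers import (Input, Conv2D, Activation, MaxPooling2D,
                                         Flatten, Dense, Reshape, UpSampling2D)
    from tensorflow.keras.models import Model

    inp = Input(shape=(64, 32, 1), name="input0")
    x = Conv2D(32, (3, 3), padding="same")(inp)   # conv2d_1
    x = Activation("relu")(x)                     # activation_1
    x = MaxPooling2D((2, 2))(x)                   # max_pooling2d_1
    x = Flatten()(x)                              # flatten_1
    x = Dense(16)(x)                              # dense_1: 16-dim bottleneck
    x = Dense(32 * 16 * 32)(x)                    # dense_2: back to 16384 units
    x = Reshape((32, 16, 32))(x)                  # reshape_1
    x = UpSampling2D((2, 2))(x)                   # up_sampling2d_1
    x = Conv2D(32, (3, 3), padding="same")(x)     # conv2d_2
    x = Activation("relu")(x)                     # activation_2
    x = Conv2D(1, (3, 3), padding="same")(x)      # conv2d_3
    out = Activation("sigmoid")(x)                # activation_3

    model = Model(inp, out)
    model.compile(optimizer="adam", loss="mse")   # optimizer/loss are assumptions
    model.summary()

Then fit the model to the generators: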

model.fit_generator(trainGene,
                      steps_per_epoch=trainGene.n // batch_size, # steps must be an integer
                      validation_data=valGene,
                      validation_steps=valGene.n // batch_size,
                      epochs=epochs, # number of epochs
                      verbose=True)
  3. Predict the reconstructed images: create another generator for the test set (testGene; a sketch follows below) and pass shuffle=False so the predictions stay in file order.

  import math
  restored = model.predict_generator(testGene, steps=math.ceil(testGene.n / batch_size)) # ceil covers the last partial batch
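The test generator itself can be built like the others; a minimal sketch (test_path is a hypothetical variable for your test directory, and shuffle=False keeps the file order aligned with the predictions):

    testGene = datagen.flow_from_directory(test_path, #hypothetical test directory
                                           color_mode="grayscale",
                                           target_size=(image_height, image_width),
                                           shuffle=False, #preserve file order
                                           class_mode="input",
                                           batch_size=batch_size,
                                           save_to_dir=None)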
  4. Get the difference.

    Now you have the reconstructed images, in the same order as the test generator yielded the originals.

    difference = reconstructed_image - original_image

    for example,

    if you want the root mean squared error for each image:

       rmse = np.sqrt(np.mean(np.square(restored - x_test), axis=(1, 2, 3)))
       # x_test holds the original images used for prediction; averaging over
       # the height, width, and channel axes gives one score per image
    

    You can collect x_test from testGene like this (use a freshly created testGene here, since total_batches_seen keeps counting across earlier uses such as predict_generator):

        import numpy as np

        x_test = np.zeros((0, image_height, image_width, image_channels), dtype=float)
        for x, _ in testGene: # class_mode="input" yields (batch, batch) pairs
            x_test = np.r_[x_test, x]
            if testGene.total_batches_seen >= testGene.n / batch_size: # stop after one full pass
                break
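As a usage example, a sketch that ranks the test images by reconstruction error (it assumes testGene was created with shuffle=False, so testGene.filenames lines up with the rows of restored and x_test):

        scores = np.sqrt(np.mean(np.square(restored - x_test), axis=(1, 2, 3)))
        worst = sorted(zip(testGene.filenames, scores), key=lambda p: p[1], reverse=True)
        print(worst[:5]) # the five images the model reconstructs worst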
    

Upvotes: 3
