RandomGuy

Reputation: 1207

Keras ImageDataGenerator: how to use data augmentation with image paths

I am working on a CNN model and I would like to use some data augmentation, but two problems arise:

  1. My labels are images (my model is a kind of autoencoder, but the expected output images differ from my input images), so I cannot use functions such as ImageDataGenerator.flow_from_directory(). I was thinking of ImageDataGenerator.flow(train_list, y = labels_list), but that leads to my second issue:
  2. Both my input and label datasets are really huge, so I'd prefer to work with image paths (which the flow() function does not handle) rather than loading the whole dataset into a single array and blowing up my RAM.

How can I properly deal with these two issues? From what I've found, there might be two solutions:

  1. Create my own generator: I've heard of the __getitem__ method of the Keras Sequence class, but can it be used together with the ImageDataGenerator class?
  2. Work with tf.data or TFRecords, but they seem pretty difficult to use, and the data augmentation would still have to be implemented.

Is there an easier way to overcome this simple problem? A simple trick would be to force ImageDataGenerator.flow() to work with a NumPy array of image paths rather than a NumPy array of images, but I fear that modifying the Keras/TensorFlow files would have unexpected consequences (since some functions are called in other classes, a local change can quickly become a global change across my whole notebook library).

Upvotes: 2

Views: 3165

Answers (1)

RandomGuy

Reputation: 1207

OK, so I finally found out how to deal with these issues thanks to this article. My mistake was to keep using ImageDataGenerator despite its lack of flexibility. The solution is thus simple: use another data augmentation tool.

The author's method can be summarized as follows:

  1. First, create a custom batch generator as a subclass of the Keras Sequence class (which means implementing __len__ and a __getitem__ method that loads the images from their respective paths).
  2. Use the albumentations data augmentation library. It offers more transformation functions than Imgaug or ImageDataGenerator, while being faster. Moreover, this website lets you test some of its augmentation methods, even with your own images! See this one for the exhaustive list.
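The two steps above can be sketched roughly as follows. This is a minimal illustration, not the article's code: the class name `PairedImageSequence` and the `load_fn` and `augment_fn` parameters are hypothetical, and the loader and augmentation are passed in as callables so that the path handling stays independent of the augmentation library.

```python
import math

import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:  # fallback so the sketch can be read and run without TensorFlow
    Sequence = object


class PairedImageSequence(Sequence):
    """Hypothetical generator yielding (input_batch, target_batch) from file paths."""

    def __init__(self, input_paths, target_paths, batch_size, load_fn, augment_fn=None):
        super().__init__()
        if len(input_paths) != len(target_paths):
            raise ValueError("input_paths and target_paths must have the same length")
        self.input_paths = list(input_paths)
        self.target_paths = list(target_paths)
        self.batch_size = batch_size
        self.load_fn = load_fn        # path -> image array (e.g. a cv2.imread wrapper)
        self.augment_fn = augment_fn  # (image, target) -> (image, target), optional

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.input_paths) / self.batch_size)

    def __getitem__(self, idx):
        # Only the images of the requested batch are loaded into memory.
        start = idx * self.batch_size
        stop = start + self.batch_size
        inputs, targets = [], []
        for x_path, y_path in zip(self.input_paths[start:stop],
                                  self.target_paths[start:stop]):
            x, y = self.load_fn(x_path), self.load_fn(y_path)
            if self.augment_fn is not None:
                x, y = self.augment_fn(x, y)
            inputs.append(x)
            targets.append(y)
        return np.stack(inputs), np.stack(targets)
```

An instance can then be passed to model.fit() in place of the arrays. With albumentations, `augment_fn` could presumably wrap a Compose built with `additional_targets={'target': 'image'}` and called as `aug(image=x, target=y)`, so that the input and the expected output receive the same spatial transform; treat that wiring as an assumption to check against the library's documentation.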

The drawback of this library is that, as it is relatively new, little documentation can be found online, and I spent several hours trying to resolve an issue I encountered.

Indeed, when I tried to visualize some augmentation functions, the results were entirely black images. Strangely, this happened only when I modified the pixel intensities, with methods like RandomGamma or RandomBrightnessContrast; transformation functions such as HorizontalFlip or ShiftScaleRotate worked normally.

After half a day of trying to find what was wrong, I eventually came up with this solution, which might help you if you try this library: the images have to be loaded with OpenCV (I had been using the load_img and img_to_array functions from tf.keras.preprocessing.image for loading and processing). If anyone can explain why the latter doesn't work, I'd be glad to hear it.
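One possible explanation, offered as an assumption rather than a confirmed diagnosis: img_to_array() returns float32 pixels in [0, 255], whereas cv2.imread() returns uint8. If the intensity transforms treat float input as lying in [0, 1] and clip to that range, nearly every pixel saturates at 1, and the later division by 255 leaves values close to 0. The NumPy sketch below only mimics that arithmetic (it does not call albumentations), and `to_uint8` is a hypothetical helper showing the cast that would sidestep the problem.

```python
import numpy as np


def to_uint8(img):
    """Round and cast a [0, 255] float image to uint8 (the dtype cv2.imread() returns)."""
    return np.clip(np.rint(img), 0, 255).astype(np.uint8)


# Stand-in for img_to_array(load_img(path)): float32 pixels in [0, 255].
float_img = np.array([[[12.0, 130.0, 250.0]]], dtype=np.float32)

# ASSUMPTION: an intensity transform treats float input as [0, 1] and clips to it.
clipped = np.clip(float_img * 1.2, 0.0, 1.0)

# ToFloat(max_value=255) followed by the "* 255" step used for display.
displayed = (clipped / 255.0 * 255).astype(np.uint8)
print(displayed.max())  # at most 1, i.e. visually black

# Casting to uint8 before augmenting avoids the assumed clipping.
uint8_img = to_uint8(float_img)
print(uint8_img.dtype, uint8_img.tolist())
```

Geometric transforms such as HorizontalFlip only move pixels around, which would be consistent with them being unaffected.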

Anyway, here is my final code to display an augmented image:

!pip install -U git+https://github.com/albu/albumentations > /dev/null && echo "All libraries are successfully installed!"
from albumentations import Compose, HorizontalFlip, RandomBrightnessContrast, ToFloat, RGBShift
import cv2
import matplotlib.pyplot as plt
import numpy as np
from google.colab.patches import cv2_imshow # I work in Google Colab, so I cannot use cv2.imshow()


augmentation = Compose([HorizontalFlip(p=0.5),
                        RandomBrightnessContrast(p=1),
                        ToFloat(max_value=255),  # Scale the pixel values into the [0, 1] interval.
                        # Feel free to add more!
                        ])

img = cv2.imread('Your_path_here.jpg')
if img is None:  # cv2.imread() returns None instead of raising when the file cannot be read.
    raise FileNotFoundError('Could not read Your_path_here.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # cv2.imread() loads images in BGR order, so convert to RGB before applying any transformation.
img = augmentation(image=img)['image']  # Apply the augmentations to the image.
plt.figure(figsize=(7, 7))
plt.imshow((img * 255).astype(np.uint8))  # Put the pixel values back into [0, 255]. Use plt.imshow(img) instead if ToFloat is not used.
plt.show()


'''
If you want to display with cv2_imshow() instead, replace the last three lines with:

img = cv2.normalize(img, None, 255, 0, cv2.NORM_MINMAX, cv2.CV_8UC1) # If ToFloat is used inside Compose(), the pixel values must be put back into [0, 255] before displaying them with cv2_imshow(). I couldn't try cv2.imshow(), but according to the documentation this line seems unnecessary with that function.
cv2_imshow(img)

I don't recommend it though: cv2_imshow() interprets the array as BGR, so an RGB image is shown with its red and blue channels swapped, and the result of methods such as RGBShift looks wrong.
'''
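If a BGR display function is needed anyway, one option is to undo both steps explicitly: scale by 255 (NORM_MINMAX instead stretches the image's own minimum and maximum to the ends of the range, which changes the contrast) and reverse the channel order. The helper `rgb_float_to_bgr_uint8` below is a hypothetical name, and the sketch uses only NumPy.

```python
import numpy as np


def rgb_float_to_bgr_uint8(img):
    """Convert an RGB float image in [0, 1] to a BGR uint8 image in [0, 255]."""
    scaled = np.clip(np.rint(img * 255.0), 0, 255).astype(np.uint8)
    # Reverse the last axis (RGB -> BGR); copy so the result is contiguous for OpenCV.
    return np.ascontiguousarray(scaled[..., ::-1])


rgb = np.array([[[1.0, 0.5, 0.0]]], dtype=np.float32)  # one pixel, RGB order
bgr = rgb_float_to_bgr_uint8(rgb)
print(bgr.tolist())  # [[[0, 128, 255]]]
```

The returned array could then be handed to cv2_imshow() in place of the normalized image.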

EDIT:

I've encountered several issues with the albumentations library (which I described in this question on GitHub, but so far I have had no answers), so I'd rather recommend using Imgaug for your data augmentation: it works just fine and is almost as easy to use as albumentations, even though it offers slightly fewer transformation functions.

Upvotes: 3
