Reputation: 15
How exactly do the preprocessing layers in Keras work, especially when they are used as part of the model itself? This is in contrast to applying the preprocessing outside the model and then feeding the results in for training.
I'm trying to understand how data augmentation runs in Keras models. Let's say I have 1000 images for training. Outside the model I can apply augmentation 10x and get 10000 resulting images for training.
But I don't understand what happens when you use a preprocessing layer for augmentation. Does this layer (or these layers, if you use several) take each image and apply the transformations before training? Does this mean the total number of images used for training (and validation, I assume) is the number of epochs times the original number of images?
Is one option better than the other? Does that depend on the number of images one originally has before augmentation?
Upvotes: 0
Views: 1269
Reputation: 376
I had the same question. Essentially there are two recommended ways of applying augmentation. The first is to make the augmentation layers part of the model itself:
import tensorflow as tf

# Option 1: the augmentation layers are part of the model itself.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(512, 512, 3)),
    # Preprocessing/augmentation layers run as part of the forward pass.
    tf.keras.layers.Rescaling(scale=1/255.0),
    tf.keras.layers.RandomFlip(),
    tf.keras.layers.RandomRotation(factor=0.5),
    # Regular model layers follow.
    tf.keras.layers.Conv2D(256, kernel_size=(5, 5), padding='valid', activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Dropout(rate=0.2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
With this method, the number of images stays the same as before. The augmentation changes how the images look, but the layers transform each image only as it passes through the model, so no extra copies are created. I couldn't find any concrete documentation for this, but building a model consisting only of augmentation layers demonstrated the same behaviour.
import matplotlib.pyplot as plt
import tensorflow as tf

# A model consisting only of the augmentation layer, for demonstration.
augmentation_layer = tf.keras.models.Sequential([tf.keras.layers.RandomRotation(factor=0.5)])

nrows = 4
ncols = 4

# Take one batch from the dataset and augment it. training=True is needed
# so that the random layer is active outside of model.fit().
for images, labels in training_set.as_numpy_iterator():
    augmented_images = augmentation_layer(images, training=True)
    break

# Plot the augmented batch in its own figure.
plt.figure(figsize=(8, 8))
for i, image in enumerate(augmented_images[:nrows * ncols]):
    plt.subplot(nrows, ncols, i + 1)
    plt.imshow(image / 255.0)
plt.suptitle('Augmented Images')

# Plot the original batch in a second figure for comparison.
plt.figure(figsize=(8, 8))
for i, image in enumerate(images[:nrows * ncols]):
    plt.subplot(nrows, ncols, i + 1)
    plt.imshow(image / 255.0)
plt.suptitle('Actual Images')
plt.show()
PS: The training_set is a _PrefetchDataset created with tf.keras.utils.image_dataset_from_directory. You may have to edit the code to get it to work with your dataset.
The results show that each image receives only a single random transformation per pass.
This method is great if you have lots of data and so do not need to increase the dataset size. If, however, you wish to create additional data, then option 2 is better:
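A minimal sketch of what the dataset-side option can look like (assuming option 2 means mapping the augmentation over the tf.data.Dataset and keeping both the original and the augmented copies, which genuinely grows the number of training examples; training_set and augmentation_layer are the objects from the code above):

import tensorflow as tf

def augment(images, labels):
    # training=True keeps the random layer active outside of model.fit().
    return augmentation_layer(images, training=True), labels

# One augmented copy of every batch, produced on the fly by the input pipeline.
augmented_set = training_set.map(augment, num_parallel_calls=tf.data.AUTOTUNE)

# Originals plus augmented copies: 1000 images become 2000 per epoch.
# Concatenating further mapped copies grows the dataset even more.
combined_set = training_set.concatenate(augmented_set).prefetch(tf.data.AUTOTUNE)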
Upvotes: 1
Reputation: 398
The benefit of preprocessing layers is that the model is truly end-to-end, i.e. raw data comes in and a prediction comes out. It makes your model portable since the preprocessing procedure is included in the SavedModel.
However, with this approach the preprocessing runs on the GPU together with the rest of the model. Usually it makes sense to have CPU worker(s) load and preprocess the data in the background while the GPU optimizes the model.
Alternatively, you can apply the preprocessing layers outside of the model, inside a Dataset. The benefit of that is that you can still easily create an inference-only model that includes the layers, which gives you the portability at inference time while keeping the speedup during training.
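A rough sketch of that workflow (the names augmentation, core_model, train_ds and inference_model are illustrative, not from this answer; training_set is assumed to be a batched image dataset as in the other answer):

import tensorflow as tf

# Preprocessing/augmentation kept outside the core model.
augmentation = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1/255.0),
    tf.keras.layers.RandomFlip(),
])

core_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# During training the preprocessing runs on the CPU inside the tf.data
# pipeline, so the GPU only receives ready-to-use batches.
train_ds = training_set.map(
    lambda x, y: (augmentation(x, training=True), y),
    num_parallel_calls=tf.data.AUTOTUNE,
).prefetch(tf.data.AUTOTUNE)

core_model.compile(optimizer='adam', loss='binary_crossentropy')
core_model.fit(train_ds, epochs=5)

# For deployment, wrap the preprocessing and the trained model into one
# end-to-end model so that it accepts raw images at inference time.
inputs = tf.keras.Input(shape=(512, 512, 3))
outputs = core_model(augmentation(inputs, training=False))
inference_model = tf.keras.Model(inputs, outputs)
inference_model.save('end_to_end_model')  # exact save format depends on your Keras version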
For more information, see the Keras guide.
Upvotes: 3