Reputation: 19
I have a dataset of images, each containing a 1- to 5-letter word. I want to use deep learning to classify the characters that make up the word in each image. The labels for these images are formatted as follows:
totalcharacter_indexoffirstchar_indexofsecondchar_.._indexoflastchar
I'm trying to load these images with TensorFlow data pipelines because of memory constraints. Below is my code for loading and processing the images and labels from the directory:
import tensorflow as tf

def process_img(file_path):
    # get_label, encode_label and urdu_alphabets are defined elsewhere
    label = get_label(file_path)
    image = tf.io.read_file(file_path)
    image = tf.image.decode_png(image, channels=1)
    image = tf.image.convert_image_dtype(image, tf.float32)
    target_shape = [695, 1204]
    image = tf.image.resize_with_crop_or_pad(image, target_shape[0], target_shape[1])
    # Encode the label
    encoded_label = tf.py_function(func=encode_label, inp=[label], Tout=tf.float32)
    encoded_label.set_shape([5, len(urdu_alphabets)])
    return image, encoded_label
input_dir = '/kaggle/input/dataset/Data/*'
images_ds = tf.data.Dataset.list_files(input_dir, shuffle=True)
train_count = int(tf.math.round(len(images_ds) * 0.8))
train_ds = images_ds.take(train_count)
test_ds = images_ds.skip(train_count)
train_ds = train_ds.map(process_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
test_ds = test_ds.map(process_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
test_ds = test_ds.batch(32)
train_ds = train_ds.cache()
test_ds = test_ds.cache()
train_ds = train_ds.shuffle(len(train_ds))
test_ds = test_ds.prefetch(tf.data.AUTOTUNE)
print(train_ds)
print(test_ds)
The train_ds looks like this:
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 695, 1204, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None, 5, 39), dtype=tf.float32, name=None))>
Now, I want to apply simple augmentations on the images such as rotation, shear, erosion, and dilation. I initially used the following function:
def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    image = tf.keras.preprocessing.image.random_rotation(image, rg=15, row_axis=0, col_axis=1, channel_axis=2, fill_mode='nearest', cval=0.0, interpolation_order=1)
    image = tf.image.random_zoom(image, [0.85, 0.85])
    image = tf.image.random_shear(image, 0.3)
    image = tf.image.random_shift(image, 0.1, 0.1)
    return image, label
train_augmented_ds = train_ds.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
train_augmented_ds = train_augmented_ds.prefetch(buffer_size=tf.data.AUTOTUNE)
However, many of these functions are deprecated, and some (random_zoom, random_shear, random_shift) actually belong to the deprecated tf.keras.preprocessing.image module rather than tf.image. How can I apply these augmentations to images in a TensorFlow pipeline efficiently?
Note: I could perform these augmentations outside the TensorFlow pipeline on NumPy arrays, but my dataset is very large (1.1 million images), so I need an efficient way to do this.
Upvotes: 0
Views: 61
Reputation: 1
You can use ImageDataGenerator:
from keras.preprocessing.image import ImageDataGenerator
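For example, here is a minimal sketch of how it could be configured to mirror the augmentations from the question. The rotation/shear/zoom/shift values are illustrative, and x_train / y_train are assumed NumPy arrays, since ImageDataGenerator does not plug directly into a tf.data pipeline:

from keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings, roughly matching the question
datagen = ImageDataGenerator(
    rotation_range=15,       # random rotation up to +/- 15 degrees
    shear_range=0.3,         # random shear intensity
    zoom_range=0.15,         # random zoom in/out
    width_shift_range=0.1,   # random horizontal shift
    height_shift_range=0.1,  # random vertical shift
    fill_mode='nearest',
    rescale=1.0 / 255.0,
)

# x_train / y_train: assumed NumPy arrays of images and encoded labels
# train_gen = datagen.flow(x_train, y_train, batch_size=32)
# model.fit(train_gen, epochs=10)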
Upvotes: 0
Reputation: 1833
You can use Keras preprocessing layers, e.g. the RandomRotation layer. I think every operation you listed except for random shear is available as a TensorFlow layer now. Random shear is available as a layer in the keras-cv package.
You could add these layers at the beginning of your model directly, or create a separate model with these preprocessing layers, which you can add as a sub-model. By default, these augmentations are only applied during training, so your test set (or the training set in model.evaluate(train)) will not be affected.
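A minimal sketch of this approach, assuming TensorFlow 2.x preprocessing layers plus the optional keras_cv package for shear (the layer choices and factors below are illustrative, not tuned for your data):

import tensorflow as tf
import keras_cv  # only needed for RandomShear

# Augmentation sub-model; these layers are only active when called with training=True
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(15 / 360),  # ~15 degrees, as a fraction of a full turn
    tf.keras.layers.RandomZoom(height_factor=0.15),
    tf.keras.layers.RandomTranslation(height_factor=0.1, width_factor=0.1),
    keras_cv.layers.RandomShear(x_factor=0.3, y_factor=0.3),
])

# Option 1: put the augmentation in front of your network
model = tf.keras.Sequential([
    tf.keras.Input(shape=(695, 1204, 1)),
    data_augmentation,
    # ... rest of the model ...
])

# Option 2: apply it inside the tf.data pipeline instead
# train_augmented_ds = train_ds.map(
#     lambda image, label: (data_augmentation(image, training=True), label),
#     num_parallel_calls=tf.data.AUTOTUNE,
# ).prefetch(tf.data.AUTOTUNE)

Option 1 keeps augmentation inside the model (and on the GPU, if one is used) and disables it automatically at inference time; option 2 keeps the model graph clean, but you have to pass training=True yourself.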
Upvotes: 0