Reputation: 11
I'm trying to find an easy way to get my data into a TensorFlow dataset without first having to load it all and process it as a NumPy array.
I'm working on a segmentation model, and my data is structured so that one directory holds the training images and a different directory holds the masks, which are themselves images.
With tf.keras.preprocessing.image_dataset_from_directory, I can load an image dataset, and it will either use the directory names as labels or let me pass the labels myself through a function argument. It does not, however, offer an argument for taking the labels from a different directory. I'm reading through the documentation, but I don't see an easy way to load this kind of dataset, where the labels are images themselves.
Upvotes: 1
Views: 2205
Reputation: 190
This is one possible way to do it in Keras:
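from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Assumed to be defined elsewhere: `train` (a DataFrame with 'filename' and
# 'segmentation' columns), `image_size` as (width, height), `batchSize`, `seed`.
# The arguments of the two generators are not shown in the original snippet;
# the defaults below are placeholders.
image_data_generator_train = ImageDataGenerator()
mask_data_generator_train = ImageDataGenerator()

# class_mode=None makes each generator yield batches of images only, no labels.
# target_size expects (height, width).
train_generator_images = image_data_generator_train.flow_from_dataframe(
    dataframe=train,
    directory='..//VOCdevkit/VOC2009/JPEGImages',
    x_col='filename',
    class_mode=None,
    color_mode="rgb",
    target_size=(image_size[1], image_size[0]),
    batch_size=batchSize,
    seed=seed)
train_generator_mask = mask_data_generator_train.flow_from_dataframe(
    dataframe=train,
    directory='..//VOCdevkit/VOC2009/SegmentationClass',
    x_col='segmentation',
    class_mode=None,
    color_mode="grayscale",
    target_size=(image_size[1], image_size[0]),
    batch_size=batchSize,
    seed=seed)

# Each step of the zipped generator yields (image_batch, mask_batch).
train_generator = zip(train_generator_images, train_generator_mask)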
It is important to set the same seed on both generators so that they shuffle in the same order and each image stays matched with its mask. I copied this from an older project of mine, so it might not be up to date.
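Since the question asks for a tf.data pipeline, here is a minimal sketch of the same pairing idea without generators. It is only a sketch under some assumptions: the helper names `pair_image_and_mask_paths` and `make_segmentation_dataset` are made up for illustration, each mask is assumed to share its file name (without extension) with its image, and masks are assumed to be single-channel images. Because every dataset element already carries both paths, shuffling cannot separate an image from its mask, so no shared seed is needed.

```python
from pathlib import Path


def pair_image_and_mask_paths(image_dir, mask_dir):
    """Match files from the two directories by file name stem (assumed convention)."""
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}
    return [
        (str(p), str(masks[p.stem]))
        for p in sorted(Path(image_dir).iterdir())
        if p.is_file() and p.stem in masks  # images without a mask are skipped
    ]


def make_segmentation_dataset(image_dir, mask_dir, image_size=(256, 256), batch_size=8):
    """Build a tf.data.Dataset of (image, mask) batches; image_size is (height, width)."""
    import tensorflow as tf  # imported here so the pairing helper works without TensorFlow

    pairs = pair_image_and_mask_paths(image_dir, mask_dir)
    if not pairs:
        raise ValueError("no matching image/mask files found")
    image_paths = [image for image, _ in pairs]
    mask_paths = [mask for _, mask in pairs]

    def load_pair(image_path, mask_path):
        image = tf.io.decode_image(
            tf.io.read_file(image_path), channels=3, expand_animations=False)
        mask = tf.io.decode_image(
            tf.io.read_file(mask_path), channels=1, expand_animations=False)
        # Bilinear resize for the image, scaled to [0, 1].
        image = tf.image.resize(image, image_size) / 255.0
        # Nearest-neighbour resize so mask values are not blended.
        mask = tf.image.resize(mask, image_size, method="nearest")
        return image, mask

    dataset = tf.data.Dataset.from_tensor_slices((image_paths, mask_paths))
    dataset = dataset.shuffle(len(pairs))
    dataset = dataset.map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

The returned dataset could then be passed directly to `model.fit`. If the masks are stored in a format where pixel values are not class ids (palette PNGs, for example), the decoding step would need to be adapted.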
Upvotes: 1