André Marques

Reputation: 23

How to extract data without label from tensorflow dataset

I have a tf dataset called train_ds:

import numpy as np
import tensorflow as tf

directory = 'Data/dataset_train'

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    directory,
    validation_split=0.2,
    subset="training",
    color_mode='grayscale',
    seed=123,
    image_size=(28, 28),
    batch_size=32)

This dataset is composed of 20000 "Fake" images and 20000 "Real" images. I want to extract X_train and y_train from it as NumPy arrays, but so far I have only managed to get the labels out with:

y_train = np.concatenate([y for x, y in train_ds], axis=0)

I also tried this, but it doesn't seem to iterate through all 20000 images:

for images, labels in train_ds.take(-1):  
    X_train = images.numpy()
    y_train = labels.numpy()

I really want to extract the images to X_train and the labels to y_train but I can't figure it out! I apologize in advance for any mistake I've made and appreciate all the help I can get :)

Upvotes: 2

Views: 2110

Answers (2)

Youcef4k

Reputation: 396

You can use the tf.data.Dataset method unbatch() to undo the batching, then collect the images and the labels one sample at a time:

data, targets = [], []
for image, label in train_ds.unbatch():
    data.append(image.numpy())
    targets.append(label.numpy())
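
As a rough, self-contained sketch of the unbatch() approach: the snippet below builds a small synthetic dataset (demo_ds, a hypothetical stand-in for train_ds, since the image directory is not available here) and stacks the unbatched samples into NumPy arrays. The sizes and names are assumptions for illustration only.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for train_ds: 100 grayscale 28x28 images, batch size 32.
images = np.random.rand(100, 28, 28, 1).astype("float32")
labels = np.random.randint(0, 2, size=100).astype("int32")
demo_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(32)

# unbatch() yields one (image, label) pair at a time;
# as_numpy_iterator() converts each tensor to a NumPy value.
pairs = list(demo_ds.unbatch().as_numpy_iterator())

# Stack the individual samples back into two arrays.
X_train = np.stack([img for img, _ in pairs])
y_train = np.array([lbl for _, lbl in pairs])

print(X_train.shape, y_train.shape)
```

Collecting both values in the same pass keeps every image paired with its own label.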

Upvotes: 1

Frightera

Reputation: 5079

If you did not apply further transformations to the dataset, it will be a BatchDataset, so iterating over it yields batches rather than single images. You can collect those batches in two lists. In my example I have 2936 images in total.

x_train, y_train = [], []

for images, labels in train_ds:
  x_train.append(images.numpy())
  y_train.append(labels.numpy())

len(x_train)  # 92 batches

The lists hold batches, not individual images. You can use np.concatenate to join them along the first axis:

x_train = np.concatenate(x_train, axis=0)
y_train = np.concatenate(y_train, axis=0)
x_train.shape  # (2936, 28, 28, 3)

Or you can unbatch the dataset and iterate over the individual samples, starting from empty lists:

x_train, y_train = [], []

for image, label in train_ds.unbatch():
  x_train.append(image.numpy())
  y_train.append(label.numpy())

x_train = np.array(x_train)
y_train = np.array(y_train)
x_train.shape  # (2936, 28, 28, 3)
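
One thing worth checking, as far as I know: image_dataset_from_directory shuffles by default, and a shuffled dataset is reshuffled on each pass. If that applies to your dataset, images and labels should be collected in the same loop rather than in two separate passes. Below is a minimal sketch on a synthetic shuffled dataset (demo_ds is a hypothetical stand-in for train_ds); each image is filled with its own label value so the pairing can be verified afterwards.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: every image is filled with its label (0.0 or 1.0).
labels = np.arange(100) % 2
images = np.zeros((100, 28, 28, 1), dtype="float32")
images[labels == 1] = 1.0

demo_ds = (tf.data.Dataset.from_tensor_slices((images, labels))
           .shuffle(100)   # order changes on every pass over the dataset
           .batch(32))

# Collect images and labels in ONE pass so the pairs stay aligned.
x_batches, y_batches = [], []
for x, y in demo_ds:
    x_batches.append(x.numpy())
    y_batches.append(y.numpy())

X_train = np.concatenate(x_batches, axis=0)
y_train = np.concatenate(y_batches, axis=0)

# Each image's pixel value should still match its label.
aligned = np.array_equal(X_train[:, 0, 0, 0].astype(int), y_train)
print(X_train.shape, y_train.shape, aligned)
```

If the order has to be reproducible across passes, passing shuffle=False when creating the dataset is another option.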

Upvotes: 2
