niha

Reputation: 11

How to resize MNIST images without running out of RAM?

I'm trying to preprocess my data by resizing the training set images to 224 x 224 with 3 channels, to use them as input to a VGG16 model, but I'm running out of RAM. How do I resolve this?

import tensorflow as tf

new_size = (224, 224)
new_x_train = []
for image in x_train:
    image = tf.constant(image)                          # (28, 28) grayscale image
    image = tf.expand_dims(image, axis=-1)              # -> (28, 28, 1)
    image = tf.concat([image, image, image], axis=-1)   # -> (28, 28, 3)
    image = tf.image.resize(image, new_size)            # -> (224, 224, 3)
    new_x_train.append(image)

new_x_train = tf.stack(new_x_train)

This works for a single image. However, when I try to do the same thing for all 60,000 images in a loop, I run out of RAM.

Upvotes: 1

Views: 393

Answers (1)

Naufal Suryanto

Reputation: 92

Your current approach loads all of the resized images into memory at once, which is inefficient. Consider using a Python generator or the TensorFlow Dataset API (tf.data) to preprocess your data on the fly.
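For example, here is a minimal generator sketch; the name image_generator is just illustrative, and it assumes x_train holds 28x28 grayscale NumPy images:

import tensorflow as tf

def image_generator(x_train, new_size=(224, 224)):
    # Yield one resized 3-channel image at a time instead of
    # materializing all 60,000 preprocessed images in memory.
    for image in x_train:
        image = tf.expand_dims(tf.constant(image), axis=-1)  # (28, 28) -> (28, 28, 1)
        image = tf.repeat(image, 3, axis=-1)                 # -> (28, 28, 3)
        yield tf.image.resize(image, new_size)               # -> (224, 224, 3)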

For your current case, here is an example using tf.data.Dataset, assuming your x_train is a NumPy array:

new_size = (224, 224)

def resize_image(image):
    image = tf.expand_dims(image, axis=-1)    # add a channel dimension
    image = tf.repeat(image, 3, axis=-1)      # replicate grayscale into 3 channels
    image = tf.image.resize(image, new_size)  # resize to 224 x 224
    return image

x_train_ds = tf.data.Dataset.from_tensor_slices(x_train)
x_train_ds = x_train_ds.map(resize_image)

With tf.data, the resize_image function is called on the fly for each element as you iterate, instead of loading everything into memory at once. If you do want to keep the preprocessed data in memory, you can still do so by calling x_train_ds = x_train_ds.cache(), but I wouldn't recommend it if you have limited memory.
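For instance, a typical way to consume the dataset during training might look like this; the labels dataset, batch size, and model variable here are assumptions for illustration:

# Pair images with labels, then batch and prefetch the pipeline.
y_train_ds = tf.data.Dataset.from_tensor_slices(y_train)
train_ds = tf.data.Dataset.zip((x_train_ds, y_train_ds))
train_ds = train_ds.batch(32).prefetch(tf.data.AUTOTUNE)

model.fit(train_ds, epochs=5)  # 'model' is assumed to be a compiled VGG16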

Furthermore, I encourage you to learn about it in more detail from this link:

Upvotes: 1
