jruivo

Reputation: 497

Out of memory converting image files to numpy array

I'm trying to run a loop that iterates through an image folder and returns two numpy arrays: x, which stores the images, and y, which stores the labels.

A folder can easily hold over 40,000 RGB images of size (224, 224). I have around 12 GB of memory, but after some iterations the memory usage spikes and everything stops.

What can I do to fix this issue?

import glob

import cv2
import numpy as np

def create_set(path, quality):
    x_file = glob.glob(path + '*')
    x = []

    for i, img in enumerate(x_file):
        image = cv2.imread(img, cv2.IMREAD_COLOR)
        x.append(np.asarray(image))
        if i % 50 == 0:
            print('{} - {} images processed'.format(path, i))

    x = np.asarray(x)
    x = x/255

    y = np.zeros((x.shape[0], 2))
    if quality == 0:
        y[:,0] = 1
    else:
        y[:,1] = 1 

    return x, y

Upvotes: 1

Views: 2375

Answers (1)

Tim Bradley

Reputation: 183

You just can't load that many images into memory. Every file in the given path is appended to x, so the whole folder sits in memory at once, and np.asarray(x) then makes a second full copy of it.

Try processing them in batches or, if you're doing this for a TensorFlow application, try writing them to .tfrecords files first.

If you want to save some memory, leave the images as np.uint8 rather than casting them to float. The cast happens automatically when you normalise them in the line x = x/255, which produces a float64 array that uses 8 bytes per value instead of 1.
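To put rough numbers on this, here is a small sketch that estimates the array size for the figures given in the question (40,000 images of 224x224 with 3 channels). The helper name array_gigabytes is made up for illustration, and the estimate ignores the extra copies made by the list and by np.asarray. It comes out at about 6.0 GB for uint8, 24.1 GB for float32 and 48.2 GB for float64.

```python
import numpy as np

# Figures taken from the question; adjust to your own data set.
n_images, height, width, channels = 40_000, 224, 224, 3


def array_gigabytes(dtype):
    """Size in GB (10**9 bytes) of an (n, h, w, c) array of the given dtype."""
    n_values = n_images * height * width * channels
    return n_values * np.dtype(dtype).itemsize / 1e9


for dtype in (np.uint8, np.float32, np.float64):
    print(np.dtype(dtype).name, round(array_gigabytes(dtype), 1), "GB")

# Dividing a uint8 array by 255 silently produces float64.
sample = np.zeros((2, 224, 224, 3), dtype=np.uint8)
print((sample / 255).dtype)
```

If you do need floats, normalising one batch at a time, for example with batch.astype(np.float32) / 255, keeps the float copy small.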

You also don't need np.asarray in your x.append(np.asarray(image)) line. image is already an array; np.asarray is for converting lists, tuples, etc. to arrays.
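A quick way to check this yourself: when the input is already an ndarray and no dtype is requested, np.asarray hands back the very same object. The zero-filled array below is only a stand-in for what cv2.imread would return.

```python
import numpy as np

# Stand-in for an image returned by cv2.imread (assumed shape and dtype).
image = np.zeros((224, 224, 3), dtype=np.uint8)

same = np.asarray(image)
print(same is image)  # True: no conversion and no copy took place
```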

Edit: a very rough batching example:

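def batching_function(imlist, batchsize):
    # Take the first `batchsize` items off the list and return them
    # together with the remaining, not yet processed items.
    ims = []
    batch = imlist[:batchsize]

    for image in batch:
        ims.append(image)
        other_processing()  # placeholder for any per-image work

    new_imlist = imlist[batchsize:]
    return ims, new_imlist

def main():
    batchsize = 32  # example value
    imlist = all_the_globbing_here()  # placeholder: returns the list of files
    # Loop until the list is empty so the last, smaller batch is not skipped.
    while imlist:
        ims, imlist = batching_function(imlist, batchsize)
        process_images(ims)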
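The same idea can also be written as a generator, so that only one batch of pixel data exists at a time. This is a minimal sketch under some assumptions: the names iter_batches, load_with_opencv and fake_load are invented for illustration, all images are assumed to share one shape, and the demo uses a fake loader so that it runs without any image files.

```python
import numpy as np


def iter_batches(paths, batch_size, load_fn):
    """Yield arrays of shape (n, H, W, C) with n <= batch_size."""
    for start in range(0, len(paths), batch_size):
        chunk = paths[start:start + batch_size]
        # Only this chunk is held in memory; np.stack needs equal shapes.
        yield np.stack([load_fn(p) for p in chunk])


def load_with_opencv(path):
    """Read one image as uint8; cv2.imread returns None on failure."""
    import cv2  # imported here so the demo below runs without OpenCV

    image = cv2.imread(path, cv2.IMREAD_COLOR)
    if image is None:
        raise ValueError("could not read image: {}".format(path))
    return image


def fake_load(path):
    """Demo loader: returns a blank 224x224 RGB image for any path."""
    return np.zeros((224, 224, 3), dtype=np.uint8)


shapes = [b.shape for b in iter_batches(["a", "b", "c", "d", "e"], 2, fake_load)]
print(shapes)  # [(2, 224, 224, 3), (2, 224, 224, 3), (1, 224, 224, 3)]
```

With real files you would pass glob.glob(path + '*') and load_with_opencv instead, and do the normalisation and training step inside the loop over batches.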

Upvotes: 3
