ZSS

Reputation: 43

Running out of RAM in Google Colab while importing dataset into an array

I want to store about 2400 images of size 2000*2000*3 in an array to feed a convolutional neural network, but the Google Colab session keeps crashing because it runs out of RAM.
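For scale, a quick back-of-the-envelope check (assuming the images are stored as float32, as in the code below):

bytes_needed = 2400 * 2000 * 2000 * 3 * 4  # 4 bytes per float32 value
print(bytes_needed / 1024**3)              # ~107 GiB, far more than a Colab session provides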

My code for importing the image dataset:

import glob
import numpy as np
import matplotlib.image as mpimg

Train_data = []
for img in sorted(glob.glob("path/*.jpg")):
    # Load the image and scale pixel values to [0, 1]
    image = mpimg.imread(img).astype('float32')
    image /= 255.
    Train_data.append(image)
Train_data = np.array(Train_data)

Upvotes: 4

Views: 902

Answers (2)

ZSS

Reputation: 43

Thanks a lot for your great answer. I tried the generator and it works. However, when I run the code below, the session does not crash:

import glob
import numpy as np
import matplotlib.image as mpimg

# Preallocate the full array: num images of m x n pixels with c channels
# (num, m, n, c are defined elsewhere to match the dataset)
Train_data = np.empty(shape=(num, m, n, c), dtype=np.float32)
for i, img in enumerate(sorted(glob.glob("path/*.jpg"))):
    image = mpimg.imread(img).astype('float32')
    image /= 255.
    Train_data[i] = image

Can anyone compare this code with my first code, which uses append, in terms of space complexity?
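A rough way to compare them is to measure peak allocation with tracemalloc. Below is a minimal sketch with small synthetic images (hypothetical sizes, and it assumes a NumPy version that reports its allocations to tracemalloc):

import tracemalloc
import numpy as np

def peak_mib(build):
    tracemalloc.start()
    build()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 1024**2

N, M = 200, 100  # 200 small 100x100x3 images instead of the real dataset
imgs = [np.random.rand(M, M, 3).astype(np.float32) for _ in range(N)]

def with_append():
    data = []
    for im in imgs:
        data.append(im.copy())  # .copy() stands in for loading from disk
    return np.array(data)       # builds a second full-size copy while the list still exists

def with_prealloc():
    data = np.empty((N, M, M, 3), dtype=np.float32)
    for i, im in enumerate(imgs):
        data[i] = im.copy()     # the temporary copy is freed each iteration
    return data

print(f"append   : {peak_mib(with_append):.0f} MiB")
print(f"prealloc : {peak_mib(with_prealloc):.0f} MiB")

If this reasoning is right, the append version peaks at roughly twice the dataset size, because np.array(Train_data) has to build a complete second copy while the list still references every image, whereas the preallocated version only ever holds the one big buffer plus the image currently being copied in.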

Upvotes: 0

Prakash Dahal

Reputation: 4875

There are two possible ways to avoid the RAM error:

First option: Resize the images to a lower resolution.

import glob
import cv2
import numpy as np
import matplotlib.image as mpimg

Train_data = []
for img in sorted(glob.glob("path/*.jpg")):
    image = mpimg.imread(img).astype('float32')
    # Downsample from 2000x2000 to 150x150 before storing
    image = cv2.resize(image, (150, 150))
    image /= 255.
    Train_data.append(image)
Train_data = np.array(Train_data)
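For scale: at 150×150, the whole set is about 2400 × 150 × 150 × 3 × 4 bytes ≈ 0.6 GiB of float32 data, which fits comfortably in Colab's RAM; the trade-off is losing the detail of the original 2000×2000 resolution.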

Second option: Use a generator, which loads and yields one image at a time so the whole dataset never has to sit in memory at once. Note that appending every yielded image to a list would use just as much RAM as the original code, so consume the images incrementally (see the sketch after the code).

import glob
import matplotlib.image as mpimg

def gen_images():
    for img in sorted(glob.glob("path/*.jpg")):
        # Load and normalize one image, then hand it to the caller
        image = mpimg.imread(img).astype('float32')
        image /= 255.
        yield image

# Consume the images one at a time; collecting them back into a list
# would use just as much RAM as before
for image in gen_images():
    # do the per-image work here (train on it, write it to disk, etc.)
    ...
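To feed the generator to a model without materializing the dataset, one option (a sketch assuming TensorFlow/Keras and a fixed 2000×2000×3 image size) is tf.data.Dataset.from_generator, which streams batches on demand:

import tensorflow as tf

# Wrap the generator in a streaming dataset; images are loaded lazily,
# batch by batch, instead of all at once
dataset = tf.data.Dataset.from_generator(
    gen_images,
    output_signature=tf.TensorSpec(shape=(2000, 2000, 3), dtype=tf.float32),
)
dataset = dataset.batch(4)

# model.fit(dataset, epochs=10)  # hypothetical Keras model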

Upvotes: 4
