Reputation: 43
I want to store about 2400 images of size 2000*2000*3 in an array to feed a convolutional neural network, but my Google Colab session keeps crashing due to running out of RAM.
My code for importing the image dataset:
import glob
import numpy as np
import matplotlib.image as mpimg

Train_data = []
for img in sorted(glob.glob("path/*.jpg")):
    image = mpimg.imread(img)
    image = np.array(image, dtype='float32')
    image /= 255.
    Train_data.append(image)

Train_data = np.array(Train_data)
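As a rough check of the memory footprint (a quick calculation based on the numbers above, not something Colab reports directly):

import numpy as np

# Approximate memory needed to hold the whole dataset as float32.
num_images = 2400
bytes_needed = num_images * 2000 * 2000 * 3 * np.dtype(np.float32).itemsize
print(f"{bytes_needed / 2**30:.0f} GiB")   # ~107 GiB, far beyond Colab's RAM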
Upvotes: 4
Views: 902
Reputation: 43
Thanks a lot for your great answer. I tried the generator and it works. However, when I try the code below, I also don't face any crashing:
# num = number of images, (m, n, c) = image height, width, channels
Train_data = np.empty(shape=(num, m, n, c), dtype=np.float32)
i = 0
for img in sorted(glob.glob("path/*.jpg")):
    image = mpimg.imread(img)
    image = np.array(image, dtype='float32')
    image /= 255.
    Train_data[i, :, :, :] = image
    i += 1
Can anyone compare this code with my first code, which uses append, in terms of space complexity?
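To make the comparison concrete, the memory used by each version can be inspected directly. A small sketch with toy sizes (the 10 images of 100x100x3 are illustrative assumptions, not my real data):

import sys
import numpy as np

# Toy data: 10 images of 100x100x3, float32.
imgs = [np.zeros((100, 100, 3), dtype=np.float32) for _ in range(10)]

# First version: a Python list of arrays; np.array() then copies them all,
# so the list and the new stacked array exist in memory at the same time.
as_list_bytes = sum(a.nbytes for a in imgs) + sys.getsizeof(imgs)
stacked = np.array(imgs)

# Second version: one preallocated array filled in place, no extra copy.
prealloc = np.empty((10, 100, 100, 3), dtype=np.float32)
print(as_list_bytes, stacked.nbytes, prealloc.nbytes)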
Upvotes: 0
Reputation: 4875
There are two possible ways to avoid the RAM error:
First option: Resize the images to a smaller size.
import glob
import cv2
import numpy as np
import matplotlib.image as mpimg

Train_data = []
for img in sorted(glob.glob("path/*.jpg")):
    image = mpimg.imread(img)
    image = np.array(image, dtype='float32')
    image = cv2.resize(image, (150, 150))
    image /= 255.
    Train_data.append(image)

Train_data = np.array(Train_data)
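With the images resized to 150x150, the same back-of-the-envelope footprint calculation gives well under 1 GiB, which fits comfortably in Colab's RAM (a rough estimate based on the sizes above):

import numpy as np

# Approximate memory for 2400 resized float32 images of 150x150x3.
bytes_needed = 2400 * 150 * 150 * 3 * np.dtype(np.float32).itemsize
print(f"{bytes_needed / 2**30:.2f} GiB")   # ~0.60 GiB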
Second option: Use a generator, which consumes less memory than building the whole list at once because it yields one image at a time instead of storing them all.
Train_data = []

def gen_images():
    for img in sorted(glob.glob("path/*.jpg")):
        image = mpimg.imread(img)
        image = np.array(image, dtype='float32')
        image /= 255.
        yield image

for image in gen_images():
    Train_data.append(image)
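Note that appending every yielded image back into a list still ends up holding the whole dataset in RAM; the generator only pays off if the images are consumed lazily, batch by batch. A minimal sketch of that, assuming TensorFlow/Keras is the framework in use and reusing the gen_images() generator above (the 2000x2000x3 shape and the batch size of 8 are illustrative assumptions):

import tensorflow as tf

# Wrap the generator in a tf.data pipeline so images are loaded lazily,
# one batch at a time, instead of all at once.
train_ds = (
    tf.data.Dataset.from_generator(
        gen_images,
        output_signature=tf.TensorSpec(shape=(2000, 2000, 3), dtype=tf.float32),
    )
    .batch(8)
    .prefetch(tf.data.AUTOTUNE)
)

# model.fit(train_ds, epochs=10)  # the dataset can be fed to Keras directly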
Upvotes: 4