assad rasheed

Reputation: 1

Loading a dataset without running out of memory

I have a dataset of 60,000 images, each 227x227x3. I run out of memory while loading these images. I need suggestions for loading the images without running out of memory. Below is the Python code I am using to load them. Can anyone tell me how I can improve the snippet below?

def loadImages(fnames,is_test):
    path = '/home/assad/Desktop/grandfinal/grandfinalv2/dataset/test_images/'
    if is_test:
        path = '/home/assad/Desktop/grandfinal/grandfinalv2/dataset/test_images/'
    loadedImages = []
    #loadedImages = np.empty((N, 3, 227, 227), dtype=np.uint8)
    for image in fnames:
        tmp = Image.open(path + image)
        img = tmp.copy()
        loadedImages.append(img)
        tmp.close()
    return loadedImages



def get_pixels(fnames,is_test):
    imgs = loadImages(fnames, is_test)
    #print imgs
    pixel_list = []
    for img in imgs:
        img = img.resize((227, 227), Image.ANTIALIAS)
        arr = np.array(img, dtype="uint8")
        arr=np.rollaxis(arr,2)
        arr=arr.reshape(-1)
        pixel_list.append(list(arr))
    return np.array(pixel_list)


def label_from_category(category_id=None):
    label_list = np.zeros(4)
    label_list[category_id]=1
    return list(label_list)
#print(label_from_category())


def features_from_data(data, is_test=True):
    pixels = get_pixels(data.FILENAME, is_test)
    labels = data["CATEGORY_ID"]
    return pixels, labels

test_data = get_data(is_test=True)



iX_test, iY_test = features_from_data(test_data, is_test=True)
print(iX_test.shape)
iY_test = iY_test.tolist()
print(iY_test)

Upvotes: 0

Views: 36

Answers (1)

salparadise

Reputation: 5805

This looks like a textbook use case for a generator to me.

Change the loadImages function to yield one image at a time instead of loading all of them into a list.

Try this:

def loadImages(fnames,is_test):
    path = '/home/assad/Desktop/grandfinal/grandfinalv2/dataset/test_images/'
    if is_test:
        path = '/home/assad/Desktop/grandfinal/grandfinalv2/dataset/test_images/'
    for image in fnames:
        tmp = Image.open(path + image)
        img = tmp.copy()
        tmp.close()
        yield img

And the rest of your code should stay the same.
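
One caveat worth noting: the generator only removes the list of PIL images. get_pixels still collects every image in pixel_list, and list(arr) stores each pixel as a separate Python object, which takes far more than one byte per pixel. Even as a plain uint8 array, 60,000 images of 227x227x3 come to 60,000 x 154,587 bytes, roughly 9.3 GB. If whatever consumes the data can work on batches, something like the following may help. This is a minimal sketch, not a drop-in replacement: the names load_images, iter_pixel_batches and batch_size are hypothetical, and it takes full file paths rather than the path + image concatenation used in the question.

```python
import numpy as np
from PIL import Image

IMG_SIZE = (227, 227)  # target width and height, as in the question


def load_images(paths):
    """Yield one RGB PIL image at a time instead of building a list."""
    for p in paths:
        with Image.open(p) as tmp:
            # convert() returns a new, fully loaded image, so the file
            # can be closed as soon as the with block ends.
            yield tmp.convert("RGB")


def iter_pixel_batches(paths, batch_size=256):
    """Yield uint8 arrays of shape (n, 3 * 227 * 227), with n <= batch_size.

    batch_size is an assumed parameter; pick whatever fits in memory.
    """
    n_features = 3 * IMG_SIZE[0] * IMG_SIZE[1]
    # One reusable buffer: memory use is bounded by batch_size, not by
    # the size of the whole dataset.
    batch = np.empty((batch_size, n_features), dtype=np.uint8)
    filled = 0
    for img in load_images(paths):
        # Image.LANCZOS is the filter that older Pillow releases also
        # exposed under the name Image.ANTIALIAS.
        arr = np.asarray(img.resize(IMG_SIZE, Image.LANCZOS), dtype=np.uint8)
        # (H, W, 3) -> (3, H, W), then flatten, as in the original get_pixels.
        batch[filled] = np.rollaxis(arr, 2).reshape(-1)
        filled += 1
        if filled == batch_size:
            yield batch.copy()
            filled = 0
    if filled:
        # Last, possibly smaller, batch.
        yield batch[:filled].copy()
```

You would then loop with `for X in iter_pixel_batches(paths): ...` and feed each batch to your model, so only one batch of pixels is in memory at any time.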

Upvotes: 1
