Reputation: 353
I have a dataset of 40,000 examples with shape (40000, 2048). After processing, I would like to store and load this dataset efficiently. The dataset is in numpy format.
I used pickle to store this dataset, but it takes a long time to store and even longer to load, and I even get a memory error. I tried to split the dataset into several chunks as follows:
import pickle

# Split train_frames into chunks and pickle each chunk separately
with open('dataset_10000.sav', 'wb') as handle:
    pickle.dump(train_frames[:10000], handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('dataset_20000.sav', 'wb') as handle:
    pickle.dump(train_frames[10000:20000], handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('dataset_30000.sav', 'wb') as handle:
    pickle.dump(train_frames[20000:30000], handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('dataset_35000.sav', 'wb') as handle:
    pickle.dump(train_frames[30000:35000], handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('dataset_40000.sav', 'wb') as handle:
    pickle.dump(train_frames[35000:], handle, protocol=pickle.HIGHEST_PROTOCOL)
However, I still get a memory error, and the files are too heavy. What is the best/optimized way to save such a huge dataset to disk and load it back?
Upvotes: 0
Views: 4716
Reputation: 96127
For numpy.ndarray objects, use numpy.save, which you should prefer over pickle anyway, since it is more portable. It should be faster and require less memory during serialization. You can then load the data with numpy.load, which even provides a memmap option, allowing you to work with arrays that are larger than can fit into memory.
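A minimal sketch of that workflow (the array name train_frames and the shape come from the question; the file name and dtype are assumptions for illustration):

import numpy as np

# Stand-in for the 40,000 x 2048 array from the question (dtype is assumed)
train_frames = np.random.rand(40000, 2048).astype(np.float32)

# Save the whole array to a single .npy file
np.save('train_frames.npy', train_frames)

# Load it back fully into memory
loaded = np.load('train_frames.npy')

# Or memory-map it: the data stays on disk and slices are read on demand,
# so the full array never has to fit in RAM at once
mapped = np.load('train_frames.npy', mmap_mode='r')
first_batch = mapped[:10000]  # only this slice is actually read from disk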
Upvotes: 1