Reputation: 6787
I have a list of 100000 cases, each having 42 rows and 400 columns.
I tried saving it using numpy.save, but it gave me a memory error. I tried pickle and it hung my computer; it took forever and I had to restart it. h5py is not available for 64-bit Python 3.3.5.
I want to save the whole list as-is on disk and later load it completely back into a list for further processing. I don't intend to access a specific index from memory.
Is there an efficient way to store the list...
Or would it be better to extract the indices of the ones from each row and store those instead (there would be around eight 1s in a row of 400 bits)? If I store just the indices of the ones, I will later have to convert those indices back into 400-bit arrays.
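For reference, that index round trip is easy in NumPy, and since the rows are bits you can also pack them eight-to-a-byte with np.packbits. A minimal sketch (the variable names are illustrative, not from the question):

```python
import numpy as np

# One hypothetical 400-bit row with ~8 ones.
rng = np.random.default_rng(0)
row = np.zeros(400, dtype=np.uint8)
row[rng.choice(400, size=8, replace=False)] = 1

# Option A: store only the indices of the ones (~8 small ints per row)
idx = np.flatnonzero(row)            # positions of the 1s
restored = np.zeros(400, dtype=np.uint8)
restored[idx] = 1                    # rebuild the 400-bit row

# Option B: pack the bits themselves, 400 bits -> 50 bytes
packed = np.packbits(row)            # shape (50,)
unpacked = np.unpackbits(packed)     # 400 divides by 8, so no padding

print(np.array_equal(row, restored), np.array_equal(row, unpacked))
```

Either way the full dataset shrinks by roughly a factor of 8 (uint8) to 64 (packed bits) compared with float64.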
Upvotes: 1
Views: 216
Reputation: 5551
To minimize overhead, you can dump the raw binary data from memory to disk with numpy.tofile:
import numpy as np

fname = "/tmp/aa.bin"
shape = (100, 100)
aa = np.random.randn(*shape)   # make an array
dtyp = aa.dtype                # remember the data type (here: np.float64)
aa.tofile(fname)               # dump raw bytes to file

with open(fname, 'rb') as f:   # read back from file
    bb = np.fromfile(f, dtype=dtyp).reshape(shape)

print(np.all(aa == bb))        # prints True
Be aware of compatibility issues like endianness, storage order, etc. See SciPy's Cookbook / InputOutput for more information.
Upvotes: 0
Reputation: 443
numpy.save should work for this. Maybe you are calling it wrong? The following code works for me:
import numpy as np

a = np.ones((100000, 400))
np.save('output', a)
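One thing worth checking is the dtype: np.save writes the array as-is, and float64 is 8x larger than needed for 0/1 data. A hedged round-trip sketch (smaller shape and temp-file path chosen here just for illustration):

```python
import os
import tempfile
import numpy as np

# Save a compact uint8 array and load it back; .npy records shape and dtype.
a = np.ones((1000, 400), dtype=np.uint8)
fname = os.path.join(tempfile.gettempdir(), "output.npy")
np.save(fname, a)
b = np.load(fname)

print(b.shape == a.shape and b.dtype == a.dtype)  # True
```

If the save itself runs out of memory, the usual culprit is building the full float64 array first rather than np.save.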
Upvotes: 1