XDXK

Reputation: 21

How to write many ndarrays to a file, with each ndarray saved on a single line?

I have many ndarray objects to save, and I want to save them one by one, so that each ndarray occupies a single line in a file. np.savez does not seem suited to this case. How can I do this? Thanks!

I have tried this:

When saving these ndarrays:

with open(file, 'a') as f:
  for i in range(n):
    f.write(str(ndarray[i].tostring()) + '\n')

And when loading and recovering them:

list_array = []
with open(file, 'r') as f:
  line = f.readline().strip('\n')
  while line:
    ndarray = np.fromstring(line, dtype=np.int64).reshape((2,3))
    list_array.append(ndarray)
    line = f.readline().strip('\n')

But I got "ValueError: string size must be a multiple of element size"

Upvotes: 2

Views: 53

Answers (1)

hpaulj

Reputation: 231385

Did you try to debug the line write/read for just one array? Look at the steps in detail?

In [568]: arr = np.arange(3)                                                                           
In [569]: arr                                                                                          
Out[569]: array([0, 1, 2])
In [570]: arr.tostring()                                                                               
Out[570]: b'\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00'

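The b prefix tells us this is a bytestring (like old Python 2 strings). You then wrap it in a Python 3 str, which stores the printable representation of the bytes (backslashes, quotes and the b prefix included) rather than the bytes themselves: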

In [571]: str(arr.tostring())+'\n'                                                                     
Out[571]: "b'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\n"

Now try the read:

In [572]: _.strip('\n')                                                                                
Out[572]: "b'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x00\\x00\\x00\\x00'"
In [573]: np.fromstring(_, np.int64)                                                                   
/usr/local/bin/ipython3:1: DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
  #!/usr/bin/python3
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-573-fa8feb7879b7> in <module>
----> 1 np.fromstring(_, np.int64)

ValueError: string size must be a multiple of element size
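A quick way to see why the size check fails is to compare the length of the raw bytes with the length of their str() representation. This is only a sketch: it uses tobytes(), the current name for tostring(), and fixes the dtype to int64 so the numbers do not depend on the platform.

```python
import numpy as np

arr = np.arange(3, dtype=np.int64)
raw = arr.tobytes()   # same data that tostring() returns
text = str(raw)       # printable repr, e.g. "b'\\x00\\x00...'"

print(len(raw))                  # 3 elements * 8 bytes each
print(len(text))                 # length of the repr, not of the data
print(len(text) % arr.itemsize)  # non-zero, so it cannot be split into int64 items
```

The repr spends four characters on each escaped byte and adds the b prefix and two quotes, so its length has nothing to do with the element size.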

I can recover the original array from the tostring output:

In [574]: np.frombuffer(arr.tostring(), np.int64)                                                      
Out[574]: array([0, 1, 2])

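Wrapping the binary string in str() and appending a newline before writing it to the file is what breaks the read: the file ends up holding the printable representation of the bytes, not the bytes themselves.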
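If one array per line is the real requirement, one possible approach is to write the values as text instead of raw bytes. This is a sketch rather than the only option: it assumes, as in the question, that every array is int64 with shape (2, 3), and the file name and sample data are invented for the example.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample data: three int64 arrays of shape (2, 3).
arrays = [np.arange(6, dtype=np.int64).reshape(2, 3) * (i + 1) for i in range(3)]
path = os.path.join(tempfile.mkdtemp(), "arrays.txt")  # illustrative file name

# Save: flatten each array and write its values space-separated on one line.
with open(path, "a") as f:
    for a in arrays:
        f.write(" ".join(str(v) for v in a.ravel().tolist()) + "\n")

# Load: parse each line back into integers and restore the known shape.
list_array = []
with open(path) as f:
    for line in f:
        values = [int(tok) for tok in line.split()]
        list_array.append(np.array(values, dtype=np.int64).reshape(2, 3))
```

The shape and dtype are not stored in the file, so the reader has to know them in advance, just as the reshape((2,3)) call in the question does.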

Upvotes: 0
