Reputation: 533
In my program I'm working with various numpy arrays of varying sizes. I need to store them in XML files for later use. I chose not to write them to binary files so that all my data stays in one place (the XML file) instead of being scattered across 200 files.
So I tried to use numpy's array_str() function to turn an array into a string. The resulting XML looks like this:
<Test date="2013-07-10-17:19">
<Neurons>5</Neurons>
<Errors>[7.7642140551985428e-06, 7.7639131137987232e-06]</Errors>
<Iterations>5000</Iterations>
<Weights1>[[ 0.99845902 -0.70780512 0.26981375 -0.6077122 0.09639695] [ 0.61856711 -0.74684913 0.20099992 0.99725171 -0.41826754] [ 0.79964397 0.56620812 -0.64055346 -0.50572793 -0.50100635]]</Weights1>
<Weights2>[[-0.1851452 -0.22036027] [ 0.19293429 -0.1374252 ] [-0.27638478 -0.38660974] [ 0.30441414 -0.01531598] [-0.02478953 0.01823584]]</Weights2>
</Test>
The weights are the values I want to store. The problem is that numpy's fromstring() function apparently can't reload them: I get "ValueError: string size must be a multiple of element size".
I wrote them with "np.array_str(w1)" and tried to read them back with "np.fromstring(w_str1)". Even when it works, the result is apparently only a 1D array, so I would have to restore the shape manually. That is already a pain, since I would also have to store the shape somewhere.
What is the best way to do this properly? Preferably one that also saves my array's shape and datatype without manual housekeeping for every little thing.
Upvotes: 8
Views: 7756
Reputation: 521
You can use numpy.ndarray.tostring() to convert the array into a string (actually a bytes array).
The result can later be read back into an array using numpy.fromstring().
In [138]: x = np.arange(12).reshape(3,4)
In [139]: x.tostring()
Out[139]: '\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00'
In [140]: np.fromstring(x.tostring(), dtype=x.dtype).reshape(x.shape)
Out[140]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
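Two caveats apply to the approach above. In recent NumPy releases tostring() is a deprecated alias of tobytes(), and reading binary data with fromstring() is deprecated in favor of frombuffer(). The raw bytes also contain characters that cannot appear in XML text. The following is a minimal sketch of one way around both issues, assuming the bytes are base64-encoded and that the dtype and shape are stored next to them (for example as XML attributes). The helper names array_to_text and text_to_array are hypothetical, not part of NumPy.

```python
import base64

import numpy as np


def array_to_text(arr):
    """Return the array's raw bytes as base64 ASCII text (safe inside XML)."""
    return base64.b64encode(arr.tobytes()).decode("ascii")


def text_to_array(text, dtype, shape):
    """Rebuild an array from base64 text plus its stored dtype and shape."""
    raw = base64.b64decode(text)
    # frombuffer gives a read-only 1D view; reshape and copy to get a
    # writable array with the original shape.
    return np.frombuffer(raw, dtype=dtype).reshape(shape).copy()


x = np.arange(12).reshape(3, 4)
encoded = array_to_text(x)

# dtype.str (e.g. '<i8') and shape are what would have to be saved alongside
restored = text_to_array(encoded, dtype=x.dtype.str, shape=x.shape)
print(restored)
```

The raw bytes carry no shape or dtype information, so both still have to be stored separately, which is the housekeeping the question hoped to avoid.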
Upvotes: 0
Reputation: 27575
My suggestion, if you really want to preserve the initial XML formatting you had, is to use the json module to convert between ndarray and string.
Check the following:
import json
import numpy
w1 = numpy.array([[ 0.99845902, -0.70780512,  0.26981375, -0.6077122,   0.09639695],
                  [ 0.61856711, -0.74684913,  0.20099992,  0.99725171, -0.41826754],
                  [ 0.79964397,  0.56620812, -0.64055346, -0.50572793, -0.50100635]])
print(w1)
print()
# Serialize the nested-list form of the array to a JSON string
w1string = json.dumps(w1.tolist())
# Now "w1string" can be written to / read from the XML file
print(w1string)
print()
# Rebuild the array from the JSON string
w1back = numpy.array(json.loads(w1string))
print(w1back)
print()
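To show how the JSON string could travel through the XML file itself, here is a rough sketch using the standard library's xml.etree.ElementTree. The element names follow the question's XML; storing the dtype in a "dtype" attribute is an assumption made for illustration, not something the question's file format already does. The nested JSON list preserves the shape on its own.

```python
import json
import xml.etree.ElementTree as ET

import numpy as np

w1 = np.array([[0.99845902, -0.70780512],
               [0.61856711, -0.74684913]])

# Write: the nested list keeps the shape, the attribute keeps the dtype
test = ET.Element("Test")
node = ET.SubElement(test, "Weights1", dtype=w1.dtype.str)
node.text = json.dumps(w1.tolist())
xml_text = ET.tostring(test, encoding="unicode")
print(xml_text)

# Read: parse the XML, then rebuild the array with the stored dtype
parsed = ET.fromstring(xml_text).find("Weights1")
w1_back = np.array(json.loads(parsed.text), dtype=parsed.get("dtype"))
print(w1_back)
```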
Upvotes: 1
Reputation: 67427
Unfortunately, there is no easy way to read your current output back into numpy. The output won't look as nice in your XML file, but you could create a version of your arrays that numpy can read back, as follows:
>>> import numpy as np
>>> import cStringIO  # Python 2 only; on Python 3 use io.StringIO from the io module instead
>>> a = np.array([[ 0.99845902, -0.70780512, 0.26981375, -0.6077122, 0.09639695], [ 0.61856711, -0.74684913, 0.20099992, 0.99725171, -0.41826754], [ 0.79964397, 0.56620812, -0.64055346, -0.50572793, -0.50100635]])
>>> out_f = cStringIO.StringIO()
>>> np.savetxt(out_f, a, delimiter=',')
>>> out_f.getvalue()
'9.984590199999999749e-01,-7.078051199999999543e-01,2.698137500000000188e-01,-6.077122000000000357e-01,9.639694999999999514e-02\n6.185671099999999756e-01,-7.468491299999999722e-01,2.009999199999999986e-01,9.972517100000000134e-01,-4.182675399999999932e-01\n7.996439699999999817e-01,5.662081199999999814e-01,-6.405534600000000189e-01,-5.057279300000000477e-01,-5.010063500000000447e-01\n'
And load it back as:
>>> in_f = cStringIO.StringIO(out_f.getvalue())
>>> np.loadtxt(in_f, delimiter=',')
array([[ 0.99845902, -0.70780512,  0.26981375, -0.6077122 ,  0.09639695],
       [ 0.61856711, -0.74684913,  0.20099992,  0.99725171, -0.41826754],
       [ 0.79964397,  0.56620812, -0.64055346, -0.50572793, -0.50100635]])
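The session above relies on cStringIO, which exists only in Python 2. A minimal sketch of the same round trip on Python 3, assuming a reasonably recent NumPy that accepts text streams in savetxt(), might look like this. The smaller array is just an illustration.

```python
import io

import numpy as np

a = np.array([[0.99845902, -0.70780512, 0.26981375],
              [0.61856711, -0.74684913, 0.20099992]])

# Write the array as comma-separated text into an in-memory buffer
out_f = io.StringIO()
np.savetxt(out_f, a, delimiter=",")
text = out_f.getvalue()  # this string is what would go into the XML element

# Read it back; ndmin=2 keeps single-row arrays two-dimensional
a_back = np.loadtxt(io.StringIO(text), delimiter=",", ndmin=2)
print(a_back)
```

The text form keeps the row and column layout of 2D arrays but not the dtype, and savetxt() only handles 1D and 2D input.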
Upvotes: 3
Reputation: 58885
Numpy provides an easy way to store many arrays in a compressed file:
import numpy as np
a = np.arange(10)
b = np.arange(10)
np.savez_compressed('file.npz', a=a, b=b)
You can even change the array names when saving, for example: np.savez_compressed('file.npz', newa=a, newb=b).
To read the saved file use np.load(), which returns an NpzFile instance that works like a dictionary:
loaded = np.load('file.npz')
To load the arrays:
a_loaded = loaded['a']
b_loaded = loaded['b']
or:
from operator import itemgetter
g = itemgetter( 'a', 'b' )
a_loaded, b_loaded = g(np.load('file.npz'))
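When the number of arrays is not known in advance, the same calls can be driven from a dictionary. The sketch below is one possible arrangement: the array names "w1" and "w2" and the temporary file path are placeholders chosen for the example. It also shows that shape and dtype survive the round trip without extra bookkeeping.

```python
import os
import tempfile

import numpy as np

arrays = {
    "w1": np.arange(15.0).reshape(3, 5),
    "w2": np.arange(10, dtype=np.int32).reshape(5, 2),
}

# Placeholder location; any writable path works
path = os.path.join(tempfile.mkdtemp(), "weights.npz")

# Each dictionary key becomes the name of an array inside the archive
np.savez_compressed(path, **arrays)

# The context manager closes the underlying file when done;
# data.files lists the stored array names
with np.load(path) as data:
    restored = {name: data[name] for name in data.files}

for name, arr in restored.items():
    print(name, arr.shape, arr.dtype)
```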
Upvotes: 16