Richard

Reputation: 61259

Save & Retrieve Numpy Array From String

I would like to convert a multi-dimensional Numpy array into a string and, later, convert that string back into an equivalent Numpy array.

I do not want to save the Numpy array to a file (e.g. via the savetxt and loadtxt interface).

Is this possible?

Upvotes: 6

Views: 12543

Answers (4)

Adrian Fischer

Reputation: 387

I use JSON to do that:

1. Encode to JSON
The first step is to encode it to JSON:

import json
import numpy as np
np_array = np.array(
      [[[0.2123842 , 0.45560746, 0.23575005, 0.40605248],
        [0.98393952, 0.03679023, 0.6192098 , 0.00547201],
        [0.13259942, 0.69461942, 0.8781533 , 0.83025555]],

       [[0.8398132 , 0.98341709, 0.25296835, 0.84055815],
        [0.27619265, 0.55544911, 0.56615598, 0.058845  ],
        [0.76205113, 0.18001961, 0.68206229, 0.47252472]]])

json_array = json.dumps(np_array.tolist())
print("converted to: " + str(type(json_array)))
print("looks like:")
print(json_array)

Which results in this:

converted to: <class 'str'>
looks like:
[[[0.2123842, 0.45560746, 0.23575005, 0.40605248], [0.98393952, 0.03679023, 0.6192098, 0.00547201], [0.13259942, 0.69461942, 0.8781533, 0.83025555]], [[0.8398132, 0.98341709, 0.25296835, 0.84055815], [0.27619265, 0.55544911, 0.56615598, 0.058845], [0.76205113, 0.18001961, 0.68206229, 0.47252472]]]

2. Decode back to Numpy
To convert it back to a numpy array you can use:

list_from_json = json.loads(json_array)
restored_array = np.array(list_from_json)  # rebuild the ndarray from the nested lists
print("converted to: " + str(type(list_from_json)))
print("converted to: " + str(type(restored_array)))
print(restored_array)

Which gives you:

converted to: <class 'list'>
converted to: <class 'numpy.ndarray'>
[[[0.2123842  0.45560746 0.23575005 0.40605248]
  [0.98393952 0.03679023 0.6192098  0.00547201]
  [0.13259942 0.69461942 0.8781533  0.83025555]]

 [[0.8398132  0.98341709 0.25296835 0.84055815]
  [0.27619265 0.55544911 0.56615598 0.058845  ]
  [0.76205113 0.18001961 0.68206229 0.47252472]]]

I like this method because the string is easy to read. Although you don't need to store it in a file in this case, the same format works for that as well.
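One caveat: tolist() drops the dtype, so np.array(json.loads(...)) rebuilds the array with whatever dtype NumPy infers from the nested lists. A minimal sketch that keeps the dtype and shape inside the JSON string follows; the helper names array_to_json and json_to_array and the key names "dtype", "shape" and "data" are my own choices for illustration, not part of NumPy or json:

```python
import json
import numpy as np

def array_to_json(arr):
    """Serialize an array to a JSON string, keeping dtype and shape (hypothetical helper)."""
    payload = {
        "dtype": str(arr.dtype),       # e.g. "float32"
        "shape": list(arr.shape),      # e.g. [2, 3, 4]
        "data": arr.ravel().tolist(),  # flat list of plain Python scalars
    }
    return json.dumps(payload)

def json_to_array(text):
    """Rebuild the array from a string produced by array_to_json (hypothetical helper)."""
    payload = json.loads(text)
    flat = np.array(payload["data"], dtype=payload["dtype"])
    return flat.reshape(payload["shape"])

original = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
text = array_to_json(original)
restored = json_to_array(text)
```

This only covers simple numeric dtypes; structured or object arrays would need a different encoding.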

Upvotes: 0

Soum

Reputation: 63

np.tostring and np.fromstring do NOT work anymore. Their replacement is tobytes, but it serializes the array to bytes, not to a string. To read a real string back into Python objects, use ast.literal_eval.

When the elements are nested lists of floats, ast.literal_eval() can struggle to read a very complex, deeply nested list back.

Therefore, it is better to convert the nested list to a dict and dump that as a string.

When loading a saved dump, ast.literal_eval() handles a dict in string form more reliably. Convert the string to a dict, and then the dict back to an array.

import ast
import numpy as np

k = np.array([[[0.09898942, 0.22804536],[0.06109612, 0.19022354],[0.93369348, 0.53521671],[0.64630094, 0.28553219]],[[0.94503154, 0.82639528],[0.07503319, 0.80149062],[0.1234832 , 0.44657691],[0.7781163 , 0.63538195]]])

# tolist() gives plain Python floats; NumPy 2 scalars print as np.float64(...), which literal_eval rejects
d = dict(enumerate(k.flatten().tolist(), 1))
d = str(d) ## dump as string  (pickle and other packages parse the dump as bytes)

m = ast.literal_eval(d) ### convert the dict as str to dict

m = np.fromiter(m.values(), dtype=float).reshape(k.shape) ## convert m to nparray and restore the original shape
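The flattened dict does not carry the array's shape, so the string alone only gives back a 1-D array. A minimal sketch of one way around that, assuming the shape is stored in the same dict; the keys "shape" and "values" are hypothetical names chosen for this example:

```python
import ast
import numpy as np

k = np.arange(24, dtype=float).reshape(2, 4, 3) / 7.0

# tolist() yields plain Python floats, whose repr ast.literal_eval can parse.
payload = {
    "shape": k.shape,                                    # hypothetical key
    "values": dict(enumerate(k.flatten().tolist(), 1)),  # hypothetical key
}
s = str(payload)  # a real str, not bytes

# Parse the string back into a dict, then rebuild and reshape the array.
loaded = ast.literal_eval(s)
m = np.fromiter(loaded["values"].values(), dtype=float).reshape(loaded["shape"])
```

The dtype is still assumed to be float here; it would have to be stored the same way if it can vary.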

Upvotes: 1

unutbu

Reputation: 879361

You could use np.tostring and np.fromstring:

In [138]: x = np.arange(12).reshape(3,4)

In [139]: x.tostring()
Out[139]: '\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00'

In [140]: np.fromstring(x.tostring(), dtype=x.dtype).reshape(x.shape)
Out[140]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

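Note that the string returned by tostring saves neither the dtype nor the shape of the original array. You have to re-supply those yourself. (In current NumPy, tostring is deprecated in favor of the identical tobytes, and np.frombuffer replaces np.fromstring for binary data.)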
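For newer NumPy versions, the same round trip can be sketched with ndarray.tobytes and np.frombuffer. The dtype is fixed to int32 here only so that the byte count is the same on every platform:

```python
import numpy as np

x = np.arange(12, dtype=np.int32).reshape(3, 4)

raw = x.tobytes()                # raw buffer only: 12 elements * 4 bytes
dtype, shape = x.dtype, x.shape  # must be kept separately by the caller

# frombuffer returns a read-only 1-D view of `raw`; reshape restores the layout.
y = np.frombuffer(raw, dtype=dtype).reshape(shape)
y = y.copy()                     # copy if a writable array is needed
```

As with tostring, getting the dtype or shape wrong on the way back silently produces a different array, so both have to travel with the bytes.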


Another option is to use np.save, np.savez or np.savez_compressed to write to an io.BytesIO object (instead of a file):

import numpy as np
import io

x = np.arange(12).reshape(3,4)
output = io.BytesIO()
np.savez(output, x=x)

The string is given by

content = output.getvalue()

And given the string, you can load it back into an array using np.load:

data = np.load(io.BytesIO(content))
x = data['x']

This method stores the dtype and shape as well.

For large arrays, np.savez_compressed will give you the smallest string.
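The np.save variant and the size difference can be sketched as follows. The helper names to_npy_bytes and from_npy_bytes are hypothetical, and the all-zeros array is chosen only because it compresses well:

```python
import io
import numpy as np

def to_npy_bytes(arr):
    """Write a single array in .npy format into memory (hypothetical helper)."""
    buf = io.BytesIO()
    np.save(buf, arr)
    return buf.getvalue()

def from_npy_bytes(content):
    """Load an array from bytes produced by to_npy_bytes (hypothetical helper)."""
    return np.load(io.BytesIO(content))

x = np.zeros((200, 200))
content = to_npy_bytes(x)
restored = from_npy_bytes(content)

# Compare the in-memory size of the uncompressed and compressed npz formats.
plain, packed = io.BytesIO(), io.BytesIO()
np.savez(plain, x=x)
np.savez_compressed(packed, x=x)
sizes = (len(plain.getvalue()), len(packed.getvalue()))
```

How much savez_compressed saves depends on the data; random floats compress far less than this array does.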


Similarly, you could use np.savetxt and np.loadtxt:

import numpy as np
import io

x = np.arange(12).reshape(3,4)
output = io.BytesIO()
np.savetxt(output, x)
content = output.getvalue()
# '0.000000000000000000e+00 1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00\n4.000000000000000000e+00 5.000000000000000000e+00 6.000000000000000000e+00 7.000000000000000000e+00\n8.000000000000000000e+00 9.000000000000000000e+00 1.000000000000000000e+01 1.100000000000000000e+01\n'

x = np.loadtxt(io.BytesIO(content))
print(x)
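np.savetxt only accepts 1-D and 2-D arrays, so an array with more dimensions has to be reshaped first. A minimal sketch, assuming the shape is remembered separately and using io.StringIO so that the result is a str rather than bytes:

```python
import io
import numpy as np

x = np.arange(24, dtype=float).reshape(2, 3, 4)

out = io.StringIO()
# Collapse the leading axes into rows; the last axis becomes the columns.
np.savetxt(out, x.reshape(-1, x.shape[-1]))
text = out.getvalue()

# loadtxt returns a 2-D float array; the original shape must be re-supplied.
restored = np.loadtxt(io.StringIO(text)).reshape(x.shape)
```

Note that loadtxt returns floats by default, so an integer array comes back as float64 unless a dtype is passed.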

Summary:

  • tostring gives you the underlying data as a string, with no dtype or shape
  • save is like tostring except it also saves dtype and shape (.npy format)
  • savez saves the array in npz format (uncompressed)
  • savez_compressed saves the array in compressed npz format
  • savetxt formats the array in a human-readable format

Upvotes: 16

Max Linke

Reputation: 1735

If you want to save the dtype as well, you can also use Python's pickle module.

import pickle
import numpy as np

a = np.ones(4)
string = pickle.dumps(a)
pickle.loads(string)
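A short check that the round trip keeps dtype and shape, sketched with an int16 array chosen only for illustration. In Python 3, pickle.dumps returns bytes rather than str, and pickle.loads should only be used on data you trust:

```python
import pickle
import numpy as np

a = np.arange(6, dtype=np.int16).reshape(2, 3)

blob = pickle.dumps(a)  # bytes in Python 3
b = pickle.loads(blob)  # only unpickle trusted data

# dtype and shape come back without being re-supplied.
same = (b.dtype == a.dtype) and (b.shape == a.shape) and np.array_equal(a, b)
```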

Upvotes: 2
