Reputation: 61259
I would like to convert a multi-dimensional Numpy array into a string and, later, convert that string back into an equivalent Numpy array.
I do not want to save the Numpy array to a file (e.g. via the savetxt and loadtxt interface).
Is this possible?
Upvotes: 6
Views: 12543
Reputation: 387
I use JSON to do that:
1. Encode to JSON
The first step is to encode it to JSON:
import json
import numpy as np
np_array = np.array(
[[[0.2123842 , 0.45560746, 0.23575005, 0.40605248],
[0.98393952, 0.03679023, 0.6192098 , 0.00547201],
[0.13259942, 0.69461942, 0.8781533 , 0.83025555]],
[[0.8398132 , 0.98341709, 0.25296835, 0.84055815],
[0.27619265, 0.55544911, 0.56615598, 0.058845 ],
[0.76205113, 0.18001961, 0.68206229, 0.47252472]]])
json_array = json.dumps(np_array.tolist())
print("converted to: " + str(type(json_array)))
print("looks like:")
print(json_array)
Which results in this:
converted to: <class 'str'>
looks like:
[[[0.2123842, 0.45560746, 0.23575005, 0.40605248], [0.98393952, 0.03679023, 0.6192098, 0.00547201], [0.13259942, 0.69461942, 0.8781533, 0.83025555]], [[0.8398132, 0.98341709, 0.25296835, 0.84055815], [0.27619265, 0.55544911, 0.56615598, 0.058845], [0.76205113, 0.18001961, 0.68206229, 0.47252472]]]
2. Decode back to Numpy
To convert it back to a numpy array you can use:
list_from_json = json.loads(json_array)
restored_array = np.array(list_from_json)  # rebuild the array from the nested lists
print("converted to: " + str(type(list_from_json)))
print("converted to: " + str(type(restored_array)))
print(restored_array)
Which gives you:
converted to: <class 'list'>
converted to: <class 'numpy.ndarray'>
[[[0.2123842 0.45560746 0.23575005 0.40605248]
[0.98393952 0.03679023 0.6192098 0.00547201]
[0.13259942 0.69461942 0.8781533 0.83025555]]
[[0.8398132 0.98341709 0.25296835 0.84055815]
[0.27619265 0.55544911 0.56615598 0.058845 ]
[0.76205113 0.18001961 0.68206229 0.47252472]]]
I like this method because the string is easy to read. Although you do not need to store it in a file in this case, the same format can also be written to a file.
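One thing the plain tolist() round trip does not carry is the dtype: a float32 or int16 array comes back with NumPy's default type. A minimal sketch of one way to keep it, assuming you are free to wrap the data in a small JSON object (the helper names array_to_json and array_from_json and the keys "dtype", "shape" and "data" are made up for this example):

```python
import json
import numpy as np

def array_to_json(arr):
    # Hypothetical helper: store dtype and shape next to the nested-list data.
    return json.dumps({
        "dtype": str(arr.dtype),
        "shape": arr.shape,      # the tuple is written as a JSON list
        "data": arr.tolist(),
    })

def array_from_json(text):
    # Hypothetical helper: rebuild the array with its original dtype and shape.
    payload = json.loads(text)
    arr = np.array(payload["data"], dtype=payload["dtype"])
    return arr.reshape(payload["shape"])

original = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
json_text = array_to_json(original)
restored = array_from_json(json_text)
print(restored.dtype, restored.shape)
```

The explicit reshape mostly matters for edge cases such as empty arrays, where the nested lists alone do not pin down the shape.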
Upvotes: 0
Reputation: 63
np.tostring and np.fromstring do NOT work anymore in recent NumPy versions. The replacement, tobytes, returns the array as bytes and not as a string. To get a real string, use str() together with ast.literal_eval().
In my experience, ast.literal_eval() has trouble reading back large, deeply nested lists of floats.
Therefore, it is better to flatten the array into a dict and dump that as a string.
When loading the saved dump, ast.literal_eval() handles the dict string better. Convert the string to a dict, and then the dict values back to an array:
import ast
import numpy as np
k = np.array([[[0.09898942, 0.22804536],[0.06109612, 0.19022354],[0.93369348, 0.53521671],[0.64630094, 0.28553219]],[[0.94503154, 0.82639528],[0.07503319, 0.80149062],[0.1234832 , 0.44657691],[0.7781163 , 0.63538195]]])
# tolist() gives plain Python floats, whose repr ast.literal_eval() can always parse
d = dict(enumerate(k.flatten().tolist(), 1))
d = str(d)  # dump as string (pickle and other packages produce bytes)
m = ast.literal_eval(d)  # convert the dict string back to a dict
m = np.fromiter(m.values(), dtype=float).reshape(k.shape)  # back to an array with the original shape
Upvotes: 1
Reputation: 879361
You could use x.tostring() and np.fromstring (newer NumPy versions deprecate these in favor of x.tobytes() and np.frombuffer):
In [138]: x = np.arange(12).reshape(3,4)
In [139]: x.tostring()
Out[139]: '\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00'
In [140]: np.fromstring(x.tostring(), dtype=x.dtype).reshape(x.shape)
Out[140]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Note that the string returned by tostring does not save the dtype nor the shape of the original array. You have to re-supply those yourself.
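As a rough sketch of the same round trip with tobytes and np.frombuffer, including the dtype and shape you have to carry along yourself. The base64 step is my own addition, not part of the original answer, for the case where you need a text str rather than bytes:

```python
import base64
import numpy as np

x = np.arange(12).reshape(3, 4)

raw = x.tobytes()           # raw buffer only: no dtype, no shape
dtype_name = x.dtype.str    # has to be kept separately
shape = x.shape             # has to be kept separately

# Optional (an assumption about what you need): turn the bytes into ASCII text.
text = base64.b64encode(raw).decode("ascii")

# Rebuild the array. frombuffer returns a read-only view, so copy it
# if you want to modify the result.
y = np.frombuffer(base64.b64decode(text), dtype=np.dtype(dtype_name))
y = y.reshape(shape).copy()
print(y)
```

If you leave out the dtype, np.frombuffer assumes float and the values will be wrong for an integer array like this one.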
Another option is to use np.save, np.savez or np.savez_compressed to write to an io.BytesIO object (instead of a file):
import numpy as np
import io
x = np.arange(12).reshape(3,4)
output = io.BytesIO()
np.savez(output, x=x)
The string is given by
content = output.getvalue()
And given the string, you can load it back into an array using np.load:
data = np.load(io.BytesIO(content))
x = data['x']
This method stores the dtype and shape as well.
For large arrays, np.savez_compressed will give you the smallest string.
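To get a feel for the size difference, here is a small sketch comparing the two in memory. The array of zeros is deliberately chosen because it compresses extremely well; with noisy real data the gain will be much smaller:

```python
import io
import numpy as np

x = np.zeros(100_000)  # highly compressible test data

plain = io.BytesIO()
np.savez(plain, x=x)

packed = io.BytesIO()
np.savez_compressed(packed, x=x)

plain_size = len(plain.getvalue())
packed_size = len(packed.getvalue())
print(plain_size, packed_size)

# The compressed content loads back the same way as the uncompressed one.
restored = np.load(io.BytesIO(packed.getvalue()))["x"]
```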
Similarly, you could use np.savetxt and np.loadtxt:
import numpy as np
import io
x = np.arange(12).reshape(3,4)
output = io.BytesIO()
np.savetxt(output, x)
content = output.getvalue()
# '0.000000000000000000e+00 1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00\n4.000000000000000000e+00 5.000000000000000000e+00 6.000000000000000000e+00 7.000000000000000000e+00\n8.000000000000000000e+00 9.000000000000000000e+00 1.000000000000000000e+01 1.100000000000000000e+01\n'
x = np.loadtxt(io.BytesIO(content))
print(x)
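Keep in mind that np.savetxt only accepts 1D and 2D arrays, so for the multi-dimensional arrays in the question you need an extra step. One possible workaround (a sketch, not the only way) is to flatten the leading axes before writing and to restore the shape, which you have to remember yourself, after reading:

```python
import io
import numpy as np

x = np.arange(24, dtype=float).reshape(2, 3, 4)

output = io.BytesIO()
# Collapse all leading axes so that savetxt sees a 2D array of shape (6, 4).
np.savetxt(output, x.reshape(-1, x.shape[-1]))
content = output.getvalue()

# loadtxt returns the 2D float array; the original shape is re-supplied here.
y = np.loadtxt(io.BytesIO(content)).reshape(x.shape)
print(y.shape)
```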
Summary:
tostring gives you the underlying data as a string, with no dtype or shape
save is like tostring except it also saves dtype and shape (.npy format)
savez saves the array in npz format (uncompressed)
savez_compressed saves the array in compressed npz format
savetxt formats the array in a humanly readable format
Upvotes: 16
Reputation: 1735
If you want to save the dtype as well, you can also use the pickle module from Python.
import pickle
import numpy as np
a = np.ones(4)
string = pickle.dumps(a)
pickle.loads(string)
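Note that in Python 3 pickle.dumps returns bytes, not a str. If you really need text, one option (my assumption about the use case) is to base64-encode the pickled bytes, roughly like this:

```python
import base64
import pickle
import numpy as np

a = np.ones(4, dtype=np.float32)

payload = pickle.dumps(a)                           # bytes in Python 3
text = base64.b64encode(payload).decode("ascii")    # plain ASCII str

# Decode the text and unpickle; dtype and shape come back with the array.
b = pickle.loads(base64.b64decode(text))
print(b.dtype, b.shape)
```

Only unpickle strings that come from a source you trust, because loading a pickle can execute arbitrary code.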
Upvotes: 2