Reputation: 3396
Given the following arrays:
name = np.array(['a', 'b', 'c'])
val = np.array([0.4, 0.5, 0.6])
alt = np.array([1.1, 2.1, 3.1])
b = np.array([17.2])
How can I combine them into a recarray (or structured array, same thing) that looks like this: [('a', 'b', 'c'), (0.4, 0.5, 0.6), (1.1, 2.1, 3.1), (17.2)].
And where print(arr["name"]) returns ('a', 'b', 'c').
The actual data has a dozen arrays. There is always one array (b) that has a size of one; the others all share the same size, but that size varies. So I'm looking for a solution that extends to these conditions. Thank you.
Upvotes: 1
Views: 74
Reputation: 231335
Define a dtype:
In [41]: dt = np.dtype([('name','U10'),('val','f'),('alt','f'),('b','f')])
Make a zeros array of the desired shape and dtype:
In [43]: arr = np.zeros(3, dt)
Copy the arrays to their respective fields:
In [44]: arr['name']=name; arr['val']=val; arr['alt']=alt
In [45]: arr['b']=b
And the result:
In [46]: arr
Out[46]:
array([('a', 0.4, 1.1, 17.2), ('b', 0.5, 2.1, 17.2),
('c', 0.6, 3.1, 17.2)],
dtype=[('name', '<U10'), ('val', '<f4'), ('alt', '<f4'), ('b', '<f4')])
That looks different from what you want, but it is a valid structured array. Yours isn't. And access by field name does what you want:
In [47]: arr['name']
Out[47]: array(['a', 'b', 'c'], dtype='<U10')
The single b value has been broadcast to every record, because you can't make a "ragged" structured array:
In [48]: arr['b']
Out[48]: array([17.2, 17.2, 17.2], dtype=float32)
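Since the real data has a dozen arrays, the same define, allocate, and assign steps can be wrapped in a loop. The following is only a sketch: to_structured is a hypothetical helper (not a NumPy function), and it assumes every array is 1-D and that any array shorter than the longest one has length 1.

```python
import numpy as np

name = np.array(['a', 'b', 'c'])
val = np.array([0.4, 0.5, 0.6])
alt = np.array([1.1, 2.1, 3.1])
b = np.array([17.2])


def to_structured(columns):
    """Hypothetical helper: pack a dict of 1-D arrays into a structured array.

    Assumes all arrays share one length, except length-1 arrays,
    which are broadcast to every record.
    """
    n = max(len(a) for a in columns.values())
    # One field per array, reusing each array's own dtype.
    dt = np.dtype([(key, a.dtype) for key, a in columns.items()])
    out = np.zeros(n, dtype=dt)
    for key, a in columns.items():
        out[key] = a  # a length-1 array is broadcast across all rows
    return out


arr = to_structured({'name': name, 'val': val, 'alt': alt, 'b': b})
print(arr['name'])
print(arr['b'])
```

Reusing each array's dtype avoids hard-coding 'U10' or 'f', at the cost of the string field being only as wide as the longest input string.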
The other answer creates a dict, which gives the same access by key but is a distinct structure. It may be what you really want, though.
There are some helper functions that create a recarray from a set of arrays, but their action amounts to the same thing. And they (probably) won't work directly with the single-element b.
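One such helper is np.rec.fromarrays. As a rough sketch, it can be used once b has been expanded to the common shape; the np.broadcast_to step here is my addition to work around the shape mismatch, not something the helper does for you.

```python
import numpy as np

name = np.array(['a', 'b', 'c'])
val = np.array([0.4, 0.5, 0.6])
alt = np.array([1.1, 2.1, 3.1])
b = np.array([17.2])

# fromarrays expects every input array to have the same shape,
# so expand the length-1 b to match the others first.
b_full = np.broadcast_to(b, name.shape)

rec = np.rec.fromarrays([name, val, alt, b_full], names='name,val,alt,b')

print(rec['name'])  # access by field name
print(rec.val)      # a recarray also allows attribute access
```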
You could make the list of tuples with:
In [53]: from itertools import zip_longest
In [54]: [ijk for ijk in zip_longest(name,val,alt,b)]
Out[54]: [('a', 0.4, 1.1, 17.2), ('b', 0.5, 2.1, None), ('c', 0.6, 3.1, None)]
In [55]: np.array(_, dt)
Out[55]:
array([('a', 0.4, 1.1, 17.2), ('b', 0.5, 2.1, nan),
('c', 0.6, 3.1, nan)],
dtype=[('name', '<U10'), ('val', '<f4'), ('alt', '<f4'), ('b', '<f4')])
Though the None/nan fill for b may not be what you want.
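If the replicated value is preferred over nan, zip_longest accepts a fillvalue argument. A small sketch, assuming b is the only array shorter than the rest (the fill value applies to any short input), and using 'f8' fields rather than the 'f' used above so the values round-trip exactly:

```python
from itertools import zip_longest

import numpy as np

name = np.array(['a', 'b', 'c'])
val = np.array([0.4, 0.5, 0.6])
alt = np.array([1.1, 2.1, 3.1])
b = np.array([17.2])

# 'f8' is an assumption made for this sketch; the session above uses 'f'.
dt = np.dtype([('name', 'U10'), ('val', 'f8'), ('alt', 'f8'), ('b', 'f8')])

# Pad the short b column with its single value instead of None.
rows = list(zip_longest(name, val, alt, b, fillvalue=b[0]))
arr = np.array(rows, dtype=dt)

print(arr['b'])
```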
You could combine the arrays into one object-dtype array, but then the elements are not accessible by name (that requires a dict):
In [64]: barr = np.array([name, val, alt, b], dtype=object)
In [65]: barr
Out[65]:
array([array(['a', 'b', 'c'], dtype='<U1'), array([0.4, 0.5, 0.6]),
array([1.1, 2.1, 3.1]), array([17.2])], dtype=object)
Upvotes: 2
Reputation: 1848
The following solution produces output that closely matches what you say you desire (but it's not a NumPy record array):
import numpy as np
name = np.array(['a', 'b', 'c'])
val = np.array([0.4, 0.5, 0.6])
alt = np.array([1.1, 2.1, 3.1])
b = np.array([17.2])
# build the dictionary explicitly; no need for eval()
arr = {'name': name, 'val': val, 'alt': alt, 'b': b}
print(arr["name"])
This prints ['a' 'b' 'c']. Note that arr here is a simple dictionary.
An alternative that builds a NumPy structured array (the data type underlying numpy.recarray) would be the following:
import numpy as np
# initialization
name = np.array(['a', 'b', 'c'])
val = np.array([0.4, 0.5, 0.6])
alt = np.array([1.1, 2.1, 3.1])
b = np.array([17.2])
# processing
b = np.array([b[0]] * len(name)) # make b longer
fields = ['name', 'val', 'alt', 'b']
dt = np.dtype([('name', '<U12')] + list((colname, 'f8') for colname in fields[1:]))
arr = np.array(list(zip(name, val, alt, b)), dt)
print(arr["name"]) # output: ['a' 'b' 'c']
Here, arr evaluates to the following:
array([('a', 0.4, 1.1, 17.2), ('b', 0.5, 2.1, 17.2),
('c', 0.6, 3.1, 17.2)],
dtype=[('name', '<U12'), ('val', '<f8'), ('alt', '<f8'), ('b', '<f8')])
Upvotes: 0