Reputation: 13
I want a NumPy array with mixed data types, basically a combination of float32 and uint32. The catch is that I don't write the array out by hand (unlike every example I've found on other forums). Here is a piece of code showing what I'm trying to do:
import numpy as np

a = np.full((1, 10), 1).astype(np.float32)
b = np.full((1, 10), 2).astype(np.float32)
c = np.full((1, 10), 3).astype(np.float32)
d = np.full((1, 10), 4).astype(np.uint32)
arr = np.dstack([a, b, c, d])  # arr.shape == (1, 10, 4)
I want axis 2 of arr to hold mixed data types. In practice a, b, c, and d are read from files, but for simplicity I show them as constant values.
One important note: I really do need this layout. The last element along axis 2 has to be stored as a uint32 because I'm dealing with hardware components that expect this order of data types (think of it as an API that will throw an error if the data types do not match).
This is what I've tried:
arr.astype("float32, float32, float32, uint32")
but this duplicates each element along axis 2 four times, once per data type (same value each time).
I also tried this (which is basically the same thing):
dt = np.dtype([('floats', np.float32, (3, )), ('ints', np.uint32, (1, ))])
arr = np.dstack((a, b, c, d)).astype(dt)
but I got the same duplication as well.
I know for sure if I construct the array as follows:
arr = np.array([((1, 2, 3), (4)), ((5, 6, 7), (8))], dtype=dt)
where dt is from the code block above, it works reasonably well. But I read a, b, c, and d from files, and I don't know whether building those tuples (or structures) is the best way to do it, because the arrays have a length of about 850k in practice.
Upvotes: 1
Views: 85
Reputation: 231665
Your dtype:
In [83]: dt = np.dtype([('floats', np.float32, (3, )), ('ints', np.uint32, (1, ))])
and a sample uniform array:
In [84]: x= np.arange(1,9).reshape(2,4);x
Out[84]:
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
The wrong way to make a structured array: astype turns every scalar element into a full record, so each value is copied into all fields:
In [85]: x.astype(dt)
Out[85]:
array([[([1., 1., 1.], [1]), ([2., 2., 2.], [2]), ([3., 3., 3.], [3]),
        ([4., 4., 4.], [4])],
       [([5., 5., 5.], [5]), ([6., 6., 6.], [6]), ([7., 7., 7.], [7]),
        ([8., 8., 8.], [8])]],
      dtype=[('floats', '<f4', (3,)), ('ints', '<u4', (1,))])
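A quick shape check makes the problem visible. This is only a small sketch reusing the x and dt from above: astype keeps the original shape and produces one record per scalar, whereas the goal is one record per row.

```python
import numpy as np

dt = np.dtype([('floats', np.float32, (3,)), ('ints', np.uint32, (1,))])
x = np.arange(1, 9).reshape(2, 4)

wrong = x.astype(dt)

# One record per input scalar: the shape is unchanged.
print(wrong.shape)

# The first record holds the value 1 copied into every field.
print(wrong[0, 0])
```

The desired result has shape (2,), with the last axis of x folded into the fields of each record.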
The right way:
In [86]: import numpy.lib.recfunctions as rf
In [87]: rf.unstructured_to_structured(x,dt)
Out[87]:
array([([1., 2., 3.], [4]), ([5., 6., 7.], [8])],
      dtype=[('floats', '<f4', (3,)), ('ints', '<u4', (1,))])
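Applied to the stacked array from the question, the same call might look like the sketch below. The constant arrays stand in for the data read from files. Note that np.dstack promotes float32 and uint32 to a common type (float64), so the stacked intermediate is larger than the inputs.

```python
import numpy as np
import numpy.lib.recfunctions as rf

dt = np.dtype([('floats', np.float32, (3,)), ('ints', np.uint32, (1,))])

# Stand-ins for the arrays read from files, as in the question.
a = np.full((1, 10), 1, dtype=np.float32)
b = np.full((1, 10), 2, dtype=np.float32)
c = np.full((1, 10), 3, dtype=np.float32)
d = np.full((1, 10), 4, dtype=np.uint32)

arr = np.dstack([a, b, c, d])  # shape (1, 10, 4), promoted to float64

# The last axis (length 4) is consumed by the fields: 3 floats + 1 uint32.
res = rf.unstructured_to_structured(arr, dt)

print(res.shape)  # the structured array drops the last axis
print(res[0, 0])
```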
An alternative is to allocate the structured array first and assign each field:
In [88]: res = np.zeros(2,dt)
In [89]: res['floats'] = x[:,:3]
In [90]: res['ints'] = x[:,-1:]
In [91]: res
Out[91]:
array([([1., 2., 3.], [4]), ([5., 6., 7.], [8])],
      dtype=[('floats', '<f4', (3,)), ('ints', '<u4', (1,))])
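For the separate a, b, c, d arrays in the question, field assignment can skip the float64 intermediate that dstack creates. The following is a minimal sketch with constant stand-ins for the file data; whether the 16-byte record layout is what the hardware expects is an assumption to check against its specification.

```python
import numpy as np

dt = np.dtype([('floats', np.float32, (3,)), ('ints', np.uint32, (1,))])

# Stand-ins for the arrays read from files, as in the question.
a = np.full((1, 10), 1, dtype=np.float32)
b = np.full((1, 10), 2, dtype=np.float32)
c = np.full((1, 10), 3, dtype=np.float32)
d = np.full((1, 10), 4, dtype=np.uint32)

# One record per position of the input arrays.
res = np.zeros(a.shape, dtype=dt)

# Fill the three float slots directly; no stacked temporary is needed.
res['floats'][..., 0] = a
res['floats'][..., 1] = b
res['floats'][..., 2] = c

# The 'ints' field has shape (1, 10, 1), so add a trailing axis to d.
res['ints'] = d[..., np.newaxis]

print(res[0, 0])
print(res.itemsize)  # bytes per record: 3 * 4 (float32) + 4 (uint32)
```

Each assignment is a vectorized copy, so this should scale to arrays of 850k elements without building Python tuples.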
https://numpy.org/doc/stable/user/basics.rec.html
Upvotes: 2