Reputation: 1971
I'd like to convert a list of record arrays -- dtype is (uint32, float32) -- into a numpy array of dtype np.object:
X = np.array(instances, dtype = np.object)
where instances is a list of arrays with data type np.dtype([('f0', '<u4'), ('f1', '<f4')]).
However, the above statement results in an array whose elements are also of type np.object:
X[0]
array([(67111L, 1.0), (104242L, 1.0)], dtype=object)
Does anybody know why?
The following statement should be equivalent to the above but gives the desired result:
X = np.empty((len(instances),), dtype = np.object)
X[:] = instances
X[0]
array([(67111L, 1.0), (104242L, 1.0)], dtype=[('f0', '<u4'), ('f1', '<f4')])
thanks & best regards, peter
Upvotes: 0
Views: 326
Reputation: 879143
Stéfan van der Walt (a numpy developer) explains:
The ndarray constructor does its best to guess what kind of data you are feeding it, but sometimes it needs a bit of help....
I prefer to construct arrays explicitly, so there is no doubt what is happening under the hood:
When you say something like
instance1=np.array([(67111L,1.0),(104242L,1.0)],dtype=np.dtype([('f0', '<u4'), ('f1', '<f4')]))
instance2=np.array([(67112L,2.0),(104243L,2.0)],dtype=np.dtype([('f0', '<u4'), ('f1', '<f4')]))
instances=[instance1,instance2]
Y=np.array(instances, dtype = np.object)
np.array is forced to guess the dimensions of the array you want. instances is a list of two objects, each of length 2. So, quite reasonably, np.array guesses that Y should have shape (2,2):
print(Y.shape)
# (2, 2)
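For readers on Python 3 and a recent NumPy, the snippet above needs two small changes: the L suffix on integers no longer exists, and the np.object alias was removed in NumPy 1.24, so the builtin object is used instead. The following is a minimal sketch of the same experiment under those assumptions; the name rec_dtype is introduced here only for brevity.

```python
import numpy as np

# Same record dtype as in the question; rec_dtype is just a local shorthand.
rec_dtype = np.dtype([('f0', '<u4'), ('f1', '<f4')])

instance1 = np.array([(67111, 1.0), (104242, 1.0)], dtype=rec_dtype)
instance2 = np.array([(67112, 2.0), (104243, 2.0)], dtype=rec_dtype)
instances = [instance1, instance2]

# np.array descends into each record array, so the result is 2-D
# rather than a 1-D array holding two record arrays.
Y = np.array(instances, dtype=object)
print(Y.shape)  # (2, 2)
print(Y.dtype)  # object
```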
In most cases, I think that is what would be desired. However, in your case, since this is not what you desire, you must construct the array explicitly:
X=np.empty((len(instances),), dtype = np.object)
print(X.shape)
# (2,)
Now there is no question about X's shape: (2,), and so when you feed in the data
X[:] = instances
numpy is smart enough to regard instances as a sequence of two objects.
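The explicit construction can be wrapped in a small function. This is only a sketch written for Python 3 and a recent NumPy (so object replaces np.object and the L suffixes are dropped); to_object_array and rec_dtype are hypothetical names, not part of NumPy.

```python
import numpy as np

def to_object_array(arrays):
    """Hypothetical helper: put each array into one slot of a 1-D object array."""
    out = np.empty(len(arrays), dtype=object)  # shape is fixed up front
    out[:] = arrays                            # one item of `arrays` per slot
    return out

rec_dtype = np.dtype([('f0', '<u4'), ('f1', '<f4')])
instance1 = np.array([(67111, 1.0), (104242, 1.0)], dtype=rec_dtype)
instance2 = np.array([(67112, 2.0), (104243, 2.0)], dtype=rec_dtype)

X = to_object_array([instance1, instance2])
print(X.shape)     # (2,)
print(X[0].dtype)  # the record dtype, not object
```

If the slice assignment feels too implicit, filling the slots one at a time (out[i] = a inside a loop) expresses the same intent with no shape inference at all.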
Upvotes: 2