storing record arrays in object arrays

Question

I'd like to convert a list of record arrays -- dtype is (uint32, float32) -- into a numpy array of dtype np.object:

X = np.array(instances, dtype = np.object)

where instances is a list of arrays with a record data type (the first field is 'f0'; the field types are uint32 and float32). However, the above statement results in an array whose elements are also of type np.object:



X[0]
array([(67111L, 1.0), (104242L, 1.0)], dtype=object)


Does anybody know why?

The following statement should be equivalent to the above but gives the desired result: 

X = np.empty((len(instances),), dtype = np.object)
X[:] = instances
X[0]
array([(67111L, 1.0), (104242L, 1.0), dtype=[('f0', '


thanks & best regards, 
 peter

unutbu · Accepted Answer

Stéfan van der Walt (a numpy developer) explains:

The ndarray constructor does its best to guess what kind of data you are feeding it, but sometimes it needs a bit of help....

I prefer to construct arrays explicitly, so there is no doubt what is happening under the hood:

When you say something like

instance1=np.array([(67111L,1.0),(104242L,1.0)],dtype=np.dtype([('f0', '



np.array is forced to guess the dimensions of the array you want. instances is a list of two objects, each of length 2. So, quite reasonably, np.array guesses that the result of np.array(instances, dtype = np.object), call it Y, should have shape (2,2):

print(Y.shape)
# (2, 2)
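
The shape guess can be reproduced with a minimal sketch. The dtype string is cut off above, so the field names f0 and f1 and the use of np.uint32 and np.float32 are assumptions based on the question. The builtin object stands in for np.object, which recent NumPy releases no longer provide, and the Python 2 L suffix is dropped:

```python
import numpy as np

# Assumed record dtype: (uint32, float32); the field name 'f1' is a guess.
rec_dtype = np.dtype([('f0', np.uint32), ('f1', np.float32)])

instance1 = np.array([(67111, 1.0), (104242, 1.0)], dtype=rec_dtype)
instance2 = np.array([(67111, 1.0), (104242, 1.0)], dtype=rec_dtype)
instances = [instance1, instance2]

# np.array unpacks both nesting levels: 2 arrays, each holding 2 records.
Y = np.array(instances, dtype=object)
print(Y.shape)     # (2, 2)
print(Y[0].dtype)  # object: the record dtype of the inputs is lost
```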


In most cases, I think that is what you would want. In your case it is not, so you must construct the array explicitly:

X=np.empty((len(instances),), dtype = np.object)
print(X.shape)
# (2,)


Now there is no question about X's shape: it is (2,). So when you feed in the data

X[:] = instances


numpy is smart enough to regard instances as a sequence of two objects.
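
Put together, the explicit construction might look like the sketch below. As before, the record dtype (fields f0 and f1, uint32 and float32) is an assumption, since the original dtype string is truncated, and the builtin object replaces np.object for current NumPy versions:

```python
import numpy as np

# Assumed record dtype: (uint32, float32); the field name 'f1' is a guess.
rec_dtype = np.dtype([('f0', np.uint32), ('f1', np.float32)])
instances = [
    np.array([(67111, 1.0), (104242, 1.0)], dtype=rec_dtype),
    np.array([(67111, 1.0), (104242, 1.0)], dtype=rec_dtype),
]

# Fix the shape first, so NumPy has nothing left to guess.
X = np.empty((len(instances),), dtype=object)

# Each slot now receives one whole record array as a single object.
X[:] = instances

print(X.shape)     # (2,)
print(X[0].dtype)  # the record dtype, preserved
```

Because the target is one-dimensional, each element of X is one of the original record arrays, so field access such as X[0]['f0'] keeps working.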
