vnal

Reputation: 53

Index elements in specific dimension numpy

I know the title is very general but I don't know of a better way to describe my question.

I'm using scipy's io.loadmat to load a Matlab mat file. This mat file originally had some structs in it which I suppose were converted to numpy arrays. The structure of the mat file is as follows. There are 500 structs each with 3 fields.

print(data[0].shape)
(500, )

The first and second fields have elements of shape (300, 300)

print(data[0][0].shape)
(300, 300)
print(data[499][0].shape)
(300, 300)
print(data[0][1].shape)
(300, 300)
print(data[499][1].shape)
(300, 300)

The third field is a scalar

print(data[0][2].shape)
(1, 1)
print(data[499][2].shape)
(1, 1)

I want to split this data up so that I have variables of shape (500, 300, 300), (500, 300, 300) and (500, )

I've tried

field1 = data[:][0]

but it gives the wrong elements: field1[0] is data[0][0], field1[1] is data[0][1], field1[2] is data[0][2], and field1[3] raises an invalid index error. What I want is field1[0] = data[0][0], ..., field1[499] = data[499][0].
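For what it's worth, the behaviour described above is what you would expect if data is a 1-D structured array: data[:] is just a view of the whole array, so data[:][0] is the same thing as data[0], i.e. a single record with 3 fields. A minimal sketch, assuming a small stand-in array (5 records, 3x3 fields) and hypothetical field names "a", "b", "c", none of which come from the real mat file:

```python
import numpy as np

# Hypothetical stand-in: 5 records with 3x3 fields instead of 500 with 300x300.
# The field names "a", "b", "c" are made up for illustration.
dt = np.dtype([("a", float, (3, 3)), ("b", float, (3, 3)), ("c", float, (1, 1))])
data = np.zeros(5, dtype=dt)

whole = data[:]            # a view of all 5 records, not a field
field1 = data[:][0]        # therefore the same as data[0]: one record

print(whole.shape)         # (5,)
print(len(field1))         # 3
print(field1[0].shape)     # (3, 3)

try:
    field1[3]              # a 3-field record only has indices 0..2
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)        # True
```

So the second index never reaches across the 500 records; it only picks a field out of the first record.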

How do I index across the dimension of size 500?

I know I can do

field1 = np.array([data[i][0] for i in range(500)])

but I'm wondering if there's something simpler

Upvotes: 2

Views: 127

Answers (1)

hpaulj

Reputation: 231665

Sounds like you have a structured array with 3 fields. Something along these lines:

two fields:

In [38]: dt = np.dtype([('f0',int,(2,2)),('f1','U3',(1,1))])                                           

four records/items:

In [39]: data = np.zeros((4,), dtype=dt)                                                               
In [40]: data                                                                                          
Out[40]: 
array([([[0, 0], [0, 0]], [['']]), ([[0, 0], [0, 0]], [['']]),
       ([[0, 0], [0, 0]], [['']]), ([[0, 0], [0, 0]], [['']])],
      dtype=[('f0', '<i8', (2, 2)), ('f1', '<U3', (1, 1))])
In [41]: data.shape                                                                                    
Out[41]: (4,)

one record:

In [42]: data[0]                                                                                       
Out[42]: ([[0, 0], [0, 0]], [['']])

within one record, a field may be selected by number, because the record is a tuple (or tuple-like):

In [43]: data[0][0]                                                                                    
Out[43]: 
array([[0, 0],
       [0, 0]])

but to select a field across all records, use its name:

In [45]: data['f0']                                                                                    
Out[45]: 
array([[[0, 0],
        [0, 0]],

       [[0, 0],
        [0, 0]],

       [[0, 0],
        [0, 0]],

       [[0, 0],
        [0, 0]]])
In [46]: data['f0'].shape                                                                              
Out[46]: (4, 2, 2)
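Applied to the layout in the question, this might look like the sketch below. It is only a sketch under assumptions: the sizes are shrunk (5 records, 3x3 fields) to keep it light, and the field names "a", "b", "c" are hypothetical, so check data.dtype.names for the real ones. It also covers the case where the fields turn out to have object dtype (which is common for structs loaded with loadmat); there, indexing by name gives a 1-D object array and a stacking step such as np.stack is still needed.

```python
import numpy as np

# Small stand-in sizes (5 records, 3x3 fields) instead of 500 and 300x300.
# Field names "a", "b", "c" are hypothetical; see data.dtype.names for real ones.
n, side = 5, 3

# Case 1: fields stored as fixed-shape subarrays (as in the example above).
dt_fixed = np.dtype([("a", float, (side, side)),
                     ("b", float, (side, side)),
                     ("c", float, (1, 1))])
data_fixed = np.zeros(n, dtype=dt_fixed)

first_name = data_fixed.dtype.names[0]   # name of the first field, here "a"
field1 = data_fixed[first_name]          # shape (n, side, side)
field3 = data_fixed["c"].reshape(n)      # (n, 1, 1) -> (n,)

# Case 2: fields stored with object dtype, each element holding its own array.
dt_obj = np.dtype([("a", object), ("b", object), ("c", object)])
data_obj = np.empty(n, dtype=dt_obj)
for i in range(n):
    data_obj["a"][i] = np.full((side, side), i, dtype=float)
    data_obj["b"][i] = np.zeros((side, side))
    data_obj["c"][i] = np.array([[float(i)]])

# Indexing by name gives a 1-D object array; np.stack joins its elements.
field1_obj = np.stack(list(data_obj["a"]))              # shape (n, side, side)
field3_obj = np.stack(list(data_obj["c"])).reshape(n)   # shape (n,)

print(field1.shape, field3.shape, field1_obj.shape, field3_obj.shape)
```

In the first case no copy or loop is needed, since data['a'] is a view into the structured array. In the second case np.stack does the same job as the list comprehension in the question.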

Upvotes: 0
