Jordi Ferrer

Reputation: 13

Structured numpy array within a multidimensional array

Imagine a numpy array of shape N x M. Each cell contains a structured array with X elements, each of which has a field x_label.

I would like to access a specific x_label so that it returns an N x M array containing only the values of that field.

Is there a way to do so without having to use a for loop (or a map-style function) and creating a new array?

Example:

import numpy as np
arr = np.array([[[],[]],
                [[],[]]])

# Each cell contains:
np.array([('par1', 'par2', 'par3')], dtype=[('label_1', 'U10'), ('label_2', 'U10'), ('label3', 'U10')])

How can I get a 2x2 np.array returned with the par1 values only? I have tried unsuccessfully:

arr['label_1']
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Thank you!

Upvotes: 0

Views: 173

Answers (1)

Paul Panzer

Reputation: 53029

I'm assuming your outer array is of object dtype; otherwise there should be no problem:

>>> x = np.array([('par1', 'par2', 'par3')], dtype=[('label_1', 'U10'), ('label_2', 'U10'), ('label3', 'U10')])
>>> Y = np.array(4*[x]+[None])[:-1].reshape(2,2)
>>> Y
array([[array([('par1', 'par2', 'par3')],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')]),
        array([('par1', 'par2', 'par3')],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')])],
       [array([('par1', 'par2', 'par3')],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')]),
        array([('par1', 'par2', 'par3')],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')])]],
      dtype=object)

(Note how I have to jump through hoops to even create such a thing.)
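For readers who want to reproduce such an object array without the `[None]` padding trick, here is a minimal sketch that preallocates the outer array and fills it cell by cell. The names `dt` and `idx` are mine, not from the question, and the cell contents simply reuse the example record. As far as I know, newer NumPy releases reject ragged input to `np.array` unless `dtype=object` is passed explicitly, so an explicit fill like this may be the more portable option.

```python
import numpy as np

# Field layout copied from the question (note that 'label3' has no underscore there).
dt = [('label_1', 'U10'), ('label_2', 'U10'), ('label3', 'U10')]
x = np.array([('par1', 'par2', 'par3')], dtype=dt)

# Preallocate an object array, then store one structured array per cell.
Y = np.empty((2, 2), dtype=object)
for idx in np.ndindex(Y.shape):
    Y[idx] = x.copy()

print(Y.dtype)             # object
print(Y[0, 0]['label_1'])  # field access works per cell, not on Y itself
```

Because `Y` holds arbitrary Python objects, NumPy has no field names to look up on the outer array, which is why `Y['label_1']` fails with the IndexError shown in the question.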

Make your life easy by converting to a proper structured array:

>>> Z = np.concatenate(Y.ravel()).reshape(Y.shape)
>>> Z
array([[('par1', 'par2', 'par3'), ('par1', 'par2', 'par3')],
       [('par1', 'par2', 'par3'), ('par1', 'par2', 'par3')]],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')])

Now, you can simply index by label:

>>> Z['label_1']
array([['par1', 'par1'],
       ['par1', 'par1']], dtype='<U10')
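The question mentions X elements per cell, while the example above has a single record in each cell. A possible extension, assuming every cell holds the same number of records with the same dtype, is to keep a trailing axis of length X when reshaping. The data below is made up for illustration and the names `dt` and `label_1` are hypothetical.

```python
import numpy as np

dt = [('label_1', 'U10'), ('label_2', 'U10'), ('label3', 'U10')]

# Hypothetical data: a 2 x 2 object array whose cells each hold X = 3 records.
Y = np.empty((2, 2), dtype=object)
for i, j in np.ndindex(Y.shape):
    Y[i, j] = np.array(
        [(f'a{i}{j}{k}', f'b{i}{j}{k}', f'c{i}{j}{k}') for k in range(3)],
        dtype=dt,
    )

# Concatenate all cells in row-major order, then restore the outer shape
# with a trailing axis for the X records of each cell.
Z = np.concatenate(list(Y.ravel())).reshape(Y.shape + (-1,))

# Indexing by field name needs no loop once Z is a real structured array.
label_1 = Z['label_1']
print(label_1.shape)  # (2, 2, 3)
```

The conversion copies the data once. After that, indexing a single field such as `Z['label_1']` returns a view into `Z`, so repeated field lookups do not create further copies.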

Upvotes: 1
