Given two ndarrays, how do I extract entries that belong to a subset of indices in a numpy-ish way?

Question

I have two ndarrays: the first has shape (10000, 2000) and the second has shape (10000,).

The second array holds 10 different class labels ("class1", "class2", ...) in no particular order. Its indices correspond to the first dimension of the first array.

How do I find the indices of ONLY class1 and class2 and pick the corresponding rows from the first array in a numpy-ish way?

For example, for

["class1", "class3", "class4", "class2", "class1"]

and

[
[1,1,1,1],
[2,2,2,2],
[3,3,3,3],
[4,4,4,4],
[5,5,5,5]
]

I would expect the output

[
[1,1,1,1],
[4,4,4,4],
[5,5,5,5]
]

jpp · Accepted Answer

You can use Boolean indexing via np.isin:

import numpy as np

classes = np.array(["class1", "class3", "class4", "class2", "class1"])
data = np.array([[1,1,1,1], [2,2,2,2], [3,3,3,3],
                 [4,4,4,4], [5,5,5,5]])

bool_idx = np.isin(classes, ['class1', 'class2'])
res = data[bool_idx]

# array([[1, 1, 1, 1],
#        [4, 4, 4, 4],
#        [5, 5, 5, 5]])
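
Since the question also asks for the indices themselves, here is a minimal sketch of how the same mask could be turned into integer row positions. It reuses the example arrays from above; the names wanted, mask, row_idx, selected and mask_or are illustrative only and do not come from the original answer.

```python
import numpy as np

classes = np.array(["class1", "class3", "class4", "class2", "class1"])
data = np.array([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3],
                 [4, 4, 4, 4], [5, 5, 5, 5]])

# Hypothetical name for the subset of labels we want to keep.
wanted = ["class1", "class2"]

# Boolean mask: True where the label is one of the wanted classes.
mask = np.isin(classes, wanted)

# Integer positions of the True entries in the mask.
row_idx = np.flatnonzero(mask)

# Indexing with the integer positions selects the same rows as data[mask].
selected = data[row_idx]

# The labels of the selected rows, in their original order.
selected_classes = classes[mask]

# For only two classes, the same mask can be built with explicit comparisons.
mask_or = (classes == "class1") | (classes == "class2")
```

Indexing with mask or with row_idx picks the same rows; the integer form is mainly useful when the positions are needed later, for example to index other arrays aligned with the first dimension.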
