Extract fixed number of elements per row in numpy array

Question

Suppose I have an array a and a boolean array b. I want to extract a fixed number of elements from the valid elements in each row of a, where the valid elements are the ones marked True in b.

Here is an example:

import numpy as np

a = np.arange(24).reshape(4,6)
b = np.array([[0,0,1,1,0,0],[0,1,0,1,0,1],[0,1,1,1,1,0],[0,0,0,0,1,1]]).astype(bool)
x = []
for i in range(a.shape[0]):
    c = a[i,b[i]]                # valid elements of row i
    d = np.random.choice(c, 2)   # note: samples with replacement by default
    x.append(d)

Here I used a for loop, which will be slow when the arrays are large and high-dimensional. Is there a more efficient way to do this? Thanks.

orlp · Accepted Answer

1. Generate a matrix of uniform random values in [0, 1) with the same shape as a.
2. Multiply this matrix by the mask b to set the invalid elements to zero.
3. Select the indices of the k largest values in each row. This simulates an unbiased random k-sample, without replacement, from only the valid elements in that row, provided every row has at least k valid elements.
4. (Optional) Use these indices to get the elements.

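import numpy as np

a = np.arange(24).reshape(4,6)
b = np.array([[0,0,1,1,0,0],[0,1,0,1,0,1],[0,1,1,1,1,0],[0,0,0,0,1,1]])
k = 2

r = np.random.uniform(size=a.shape)
# Negate so that the k largest values of r * b become the k smallest;
# the first k columns of the partition are then the sampled indices.
indices = np.argpartition(-r * b, k)[:,:k]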

To get the elements from the indices:

>>> indices
array([[3, 2],
       [5, 1],
       [3, 2],
       [4, 5]])
>>> a[np.arange(a.shape[0])[:,None], indices]
array([[ 3,  2],
       [11,  7],
       [15, 14],
       [22, 23]])
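
For reuse, the steps above could be wrapped in a small function. This is only a sketch: the name sample_valid_per_row and the rng parameter are made up for illustration and are not part of the original answer. It also varies the approach slightly by giving invalid positions a key of +inf and taking the k smallest keys rather than multiplying by the mask, and it raises an error when a row has fewer than k valid elements, since in that case invalid elements would otherwise be picked silently.

```python
import numpy as np

def sample_valid_per_row(a, b, k, rng=None):
    """Pick k distinct valid elements from each row of a (valid where b is True).

    Hypothetical helper for illustration; returns (indices, values).
    """
    rng = np.random.default_rng() if rng is None else rng
    b = np.asarray(b, dtype=bool)
    if (b.sum(axis=1) < k).any():
        raise ValueError("every row needs at least k valid elements")
    # One random key per element; invalid positions get +inf so they sort last.
    keys = np.where(b, rng.random(a.shape), np.inf)
    # Column indices of the k smallest keys in each row
    # (the order within those k is arbitrary).
    indices = np.argpartition(keys, k - 1, axis=1)[:, :k]
    values = np.take_along_axis(a, indices, axis=1)
    return indices, values

a = np.arange(24).reshape(4, 6)
b = np.array([[0,0,1,1,0,0],[0,1,0,1,0,1],[0,1,1,1,1,0],[0,0,0,0,1,1]], dtype=bool)
indices, values = sample_valid_per_row(a, b, k=2, rng=np.random.default_rng(0))
```

Passing a seeded generator, as in the last line, makes the draw reproducible. Each row of values holds two distinct elements of a taken from positions where b is True.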
