user32882

Reputation: 5877

Random valid data items in numpy array

Suppose I have a numpy array as follows:

data = np.array([[1, 3, 8, np.nan], [np.nan, 6, 7, 9], [np.nan, 0, 1, 2], [5, np.nan, np.nan, 2]])

I would like to randomly select n valid (non-NaN) items from the array, together with their indices.

Does numpy provide an efficient way of doing this?

Upvotes: 0

Views: 30

Answers (2)

Paul Panzer

Reputation: 53029

Example

import numpy as np

data = np.array([[1, 3, 8, np.nan], [np.nan, 6, 7, 9], [np.nan, 0, 1, 2], [5, np.nan, np.nan, 2]])
n = 5

Get valid indices

y_val, x_val = np.where(~np.isnan(data))
n_val = y_val.size

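Pick n random positions among the valid ones (np.random.choice samples with replacement by default, so the same item can be drawn more than once; pass replace=False for distinct items)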

pick = np.random.choice(n_val, n)

Apply index to valid coordinates

y_pick, x_pick = y_val[pick], x_val[pick]

Get corresponding data

data_pick = data[y_pick, x_pick]

Admire

data_pick
# array([2., 8., 1., 1., 2.])
y_pick
# array([3, 0, 0, 2, 3])
x_pick
# array([3, 2, 0, 2, 3])
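
The steps above can be wrapped in a small helper. This is only a sketch under a few assumptions that are not part of the answer itself: the function name sample_valid and its rng argument are made up for illustration, it uses the newer np.random.default_rng generator instead of np.random.choice, and it samples without replacement so that each valid item is picked at most once.

```python
import numpy as np

def sample_valid(data, n, rng=None):
    """Return (values, rows, cols) for n distinct non-NaN entries of a 2-D array.

    sample_valid and rng are illustrative names, not an existing NumPy API.
    """
    rng = np.random.default_rng(rng)          # accepts None, a seed, or a Generator
    rows, cols = np.nonzero(~np.isnan(data))  # coordinates of all valid entries
    if n > rows.size:
        raise ValueError("n exceeds the number of valid entries")
    # Draw n distinct positions from the list of valid coordinates.
    pick = rng.choice(rows.size, size=n, replace=False)
    return data[rows[pick], cols[pick]], rows[pick], cols[pick]

data = np.array([[1, 3, 8, np.nan],
                 [np.nan, 6, 7, 9],
                 [np.nan, 0, 1, 2],
                 [5, np.nan, np.nan, 2]])

values, rows, cols = sample_valid(data, 5, rng=0)
```

Dropping replace=False would reproduce the behaviour shown above, where duplicates are allowed.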

Upvotes: 1

Sajid

Reputation: 145


Find the indices of the valid (non-NaN) entries; np.argwhere returns them as (row, column) pairs:

In [37]: a = np.argwhere(~np.isnan(data))

In [38]: a
Out[38]:
array([[0, 0],
       [0, 1],
       [0, 2],
       [1, 1],
       [1, 2],
       [1, 3],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 0],
       [3, 3]])

Now pick one of them at random:


In [44]: idx = np.random.choice(np.arange(len(a)))

In [45]: data[a[idx][0],a[idx][1]]
Out[45]: 2.0
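
To draw n items instead of a single one, an array of (row, column) pairs can be indexed with a whole vector of random row numbers. The following is a sketch rather than part of the answer above: it builds the pairs with np.argwhere on the non-NaN mask, and the names valid, coords and values, as well as the choice of np.random.default_rng with replace=False, are assumptions made for illustration.

```python
import numpy as np

data = np.array([[1, 3, 8, np.nan],
                 [np.nan, 6, 7, 9],
                 [np.nan, 0, 1, 2],
                 [5, np.nan, np.nan, 2]])
n = 5

# One (row, column) pair per non-NaN entry; the valid 0 is kept.
valid = np.argwhere(~np.isnan(data))

rng = np.random.default_rng()
# n distinct positions in the list of pairs.
idx = rng.choice(len(valid), size=n, replace=False)

coords = valid[idx]                        # shape (n, 2): the sampled indices
values = data[coords[:, 0], coords[:, 1]]  # the sampled data items
```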

Upvotes: 0
