Reputation: 12808
I have a large number of rows with 2 columns.
From each row I want to choose a value (with weighted probability) from either the first or the second column.
import numpy as np
values = np.array(
[[0.41, 0.31],
[0.73, 0.15],
[0.44, 0.30],
[0.67, 0.18],
])
My idea was to draw a random column index (0 or 1) for each row, with weight 0.6 for the first column and 0.4 for the second column:
probs_chosen = np.random.choice([0,1], size=4, replace=True, p=[0.6, 0.4])
print(probs_chosen)
[0 0 1 0]
But how do I use this index to select the first value from row 1, the first value from row 2, the second value from row 3, and so on?
Any other approach is fine too, including a pandas solution.
Expected result in this case:
[0.41, 0.73, 0.30, 0.67]
Upvotes: 0
Views: 114
Reputation:
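You can use NumPy advanced (integer array) indexing, pairing each row index with its randomly chosen column index:
# One row index per row: [0, 1, 2, 3]
row_idx = np.arange(values.shape[0])
# One random column index per row, sized to match the number of rows
col_idx = np.random.choice([0, 1], size=values.shape[0], replace=True, p=[0.6, 0.4])
out = values[row_idx, col_idx]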
Example output (the result is random; this is what you get when col_idx is [0, 0, 1, 0]):
array([0.41, 0.73, 0.3 , 0.67])
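For reference, here is a minimal self-contained sketch of the same idea. It assumes the newer np.random.default_rng generator API instead of np.random.choice, and the seed value is arbitrary and only there to make runs repeatable. The names rng, n_rows, picked, picked_alt and picked_where are illustrative and not from the question. It also shows two alternatives that should select the same elements: np.take_along_axis, and np.where for the two-column case.

```python
import numpy as np

values = np.array([
    [0.41, 0.31],
    [0.73, 0.15],
    [0.44, 0.30],
    [0.67, 0.18],
])

# Seeded generator so the run is repeatable; the seed value is arbitrary.
rng = np.random.default_rng(0)

n_rows = values.shape[0]

# One column index (0 or 1) per row, drawn with weights 0.6 and 0.4.
col_idx = rng.choice(2, size=n_rows, p=[0.6, 0.4])

# Advanced indexing: pair row i with its sampled column col_idx[i].
picked = values[np.arange(n_rows), col_idx]

# Same selection with take_along_axis, which needs a 2-D index array.
picked_alt = np.take_along_axis(values, col_idx[:, None], axis=1).ravel()

# Same selection with np.where; this form only covers exactly two columns.
picked_where = np.where(col_idx == 0, values[:, 0], values[:, 1])

print(col_idx)
print(picked)
```

The advanced-indexing form generalizes to any number of columns (pass a longer p to rng.choice), while the np.where form is probably the easiest to read when there are only two.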
Upvotes: 1