Reputation: 861
I have two related numpy arrays, X and y. I need to select n random rows from X and store them in an array, together with the corresponding y values, and then append the indices of the randomly selected points to index.
I have another array, index, which stores a list of indices that I don't want to sample.
How can I do this?
Sample data:
index = [2,3]
X = np.array([[0.3,0.7],[0.5,0.5] ,[0.2,0.8], [0.1,0.9]])
y = np.array([[0], [1], [0], [1]])
If these X's were randomly selected (where n=2):
randomlySelected = np.array([[0.3,0.7],[0.5,0.5]])
the desired output would be:
index = [0,1,2,3]
randomlySelectedY = [0,1]
How can I do this?
Upvotes: 51
Views: 89463
Reputation: 152587
You can create random indices with np.random.choice:
n = 2 # for 2 random indices
index = np.random.choice(X.shape[0], n, replace=False)
Then you just need to index your arrays with the result:
x_random = X[index]
y_random = y[index]
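The question also asks to skip the rows already listed in index and to record the new picks there. A minimal sketch of one way to do that, assuming np.setdiff1d is used to build the pool of allowed rows (the helper name sample_excluding and its seed parameter are hypothetical, not part of the answer above):

```python
import numpy as np

def sample_excluding(X, y, excluded, n, seed=None):
    """Pick n random rows of X (and the matching y) whose indices are not in `excluded`."""
    rng = np.random.default_rng(seed)
    # Candidate row indices: every row that is not already excluded.
    candidates = np.setdiff1d(np.arange(X.shape[0]), excluded)
    # Raises ValueError if fewer than n candidates remain.
    chosen = rng.choice(candidates, size=n, replace=False)
    # Record the new picks alongside the previously excluded indices.
    updated = np.concatenate([np.asarray(excluded, dtype=int), chosen])
    return X[chosen], y[chosen], updated

X = np.array([[0.3, 0.7], [0.5, 0.5], [0.2, 0.8], [0.1, 0.9]])
y = np.array([[0], [1], [0], [1]])
index = [2, 3]

x_random, y_random, index = sample_excluding(X, y, index, n=2)
```

With the sample data only rows 0 and 1 are eligible, so both get selected (in random order) and index ends up holding all four row numbers.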
Upvotes: 83
Reputation: 3373
Just to wrap @MSeifert's answer in a function:
import numpy as np
def random_sample(arr: np.ndarray, size: int = 1) -> np.ndarray:
    # Draw `size` distinct row indices, then index the array with them.
    return arr[np.random.choice(len(arr), size=size, replace=False)]
Usage:
randomly_selected_y = random_sample(y)
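Note that calling random_sample separately on X and y draws a fresh set of indices each time, so the selected rows would no longer correspond. A possible variant that draws the indices once and applies them to every array is sketched below; the name random_sample_paired and the seed parameter are assumptions made for illustration:

```python
import numpy as np

def random_sample_paired(*arrays, size=1, seed=None):
    """Sample the same random rows from every array passed in."""
    n_rows = len(arrays[0])
    if any(len(a) != n_rows for a in arrays):
        raise ValueError("all arrays must have the same number of rows")
    rng = np.random.default_rng(seed)
    # One draw of indices, reused for every array so rows stay aligned.
    idx = rng.choice(n_rows, size=size, replace=False)
    return tuple(a[idx] for a in arrays)

X = np.array([[0.3, 0.7], [0.5, 0.5], [0.2, 0.8], [0.1, 0.9]])
y = np.array([[0], [1], [0], [1]])
x_sel, y_sel = random_sample_paired(X, y, size=2, seed=0)
```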
Upvotes: 10