siamii

Reputation: 24114

Balance positives and negatives in NumPy

I have a matrix whose last column contains floats. Around 70% of these numbers are positive and 30% are negative. I'd like to remove some of the rows with a positive number so that the resulting matrix has approximately the same number of positive and negative numbers in the last column. The positive rows to remove should be chosen randomly.

Upvotes: 0

Views: 208

Answers (1)

Akavall

Reputation: 86266

What about this:

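import numpy as np

# Example data: 10 rows; make the last column negative in the first three rows
x = np.arange(30).reshape(10, 3)
x[[0, 1, 2], 2] *= -1

# Indices of the rows whose last-column value is positive
a = np.where(x[:, 2] > 0)[0]

n_pos = np.sum(x[:, 2] > 0)
n_neg = np.sum(x[:, 2] < 0)

# Number of positive rows to drop; clamp at 0 in case negatives outnumber positives
n_to_remove = max(n_pos - n_neg, 0)
np.random.shuffle(a)  # shuffle in place so the dropped rows are chosen at random

new_x = np.delete(x, a[:n_to_remove], axis=0)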

Result (which positive rows get removed varies from run to run):

>>> x
array([[ 0,  1, -2],
       [ 3,  4, -5],
       [ 6,  7, -8],
       [ 9, 10, 11],
       [12, 13, 14],
       [15, 16, 17],
       [18, 19, 20],
       [21, 22, 23],
       [24, 25, 26],
       [27, 28, 29]])
>>> new_x
array([[ 0,  1, -2],
       [ 3,  4, -5],
       [ 6,  7, -8],
       [15, 16, 17],
       [18, 19, 20],
       [27, 28, 29]])

I think this is easier to do with arrays than with matrices. Let me know if you need a solution for matrices.
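
The same idea can be wrapped in a reusable function and tried on float data closer to what the question describes. This is only a sketch: the name `balance_last_column`, the `seed` parameter, and the sample data are assumptions made for illustration, and it uses `np.random.default_rng` in place of `np.random.shuffle`.

```python
import numpy as np

def balance_last_column(x, seed=None):
    """Randomly drop rows with a positive last column until the
    positives no longer outnumber the negatives (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    last = x[:, -1]
    pos_idx = np.flatnonzero(last > 0)       # rows with a positive last value
    n_neg = np.count_nonzero(last < 0)       # rows with a negative last value
    n_to_remove = max(pos_idx.size - n_neg, 0)
    # Permute the positive row indices and drop the first n_to_remove of them
    drop = rng.permutation(pos_idx)[:n_to_remove]
    return np.delete(x, drop, axis=0)

# Illustrative data: the last column is positive roughly 70% of the time
data = np.random.default_rng(0).normal(loc=0.5, size=(1000, 3))
balanced = balance_last_column(data, seed=1)

n_pos = np.count_nonzero(balanced[:, -1] > 0)
n_neg = np.count_nonzero(balanced[:, -1] < 0)
print(n_pos, n_neg)
```

All negative rows are kept, and rows whose last value is exactly zero are left untouched. Passing a fixed `seed` makes the selection reproducible.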

Upvotes: 1
