Reputation: 173
I have a large numpy array (8 by 30000) and I want to delete some rows according to a criterion. The criterion applies to only one column.
Example:
>>> p = np.array([[0, 1, 3], [1 , 5, 6], [4, 3, 56], [1, 34, 4]])
>>> p
array([[ 0,  1,  3],
       [ 1,  5,  6],
       [ 4,  3, 56],
       [ 1, 34,  4]])
Here I would like to remove every row in which the value in the 3rd column is > 30, i.e. row 3 in this example.
As the array is pretty large, I'd like to avoid for loops. I thought of this:
>>> p[~(p > 30).any(axis=1), :]
array([[0, 1, 3],
       [1, 5, 6]])
But that obviously removes the last two rows, since it tests every column rather than just the 3rd. Any ideas on how to do this in an efficient way?
Upvotes: 3
Views: 5471
Reputation: 212855
p = p[~(p[:,2] > 30)]
or (if your condition is easily invertible):
p = p[p[:,2] <= 30]
returns
array([[ 0,  1,  3],
       [ 1,  5,  6],
       [ 1, 34,  4]])
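For reference, here is a minimal self-contained sketch of the same idea, using the example array from the question. The names drop, kept and kept_delete are only illustrative; the np.delete variant is an alternative spelling that should give the same result, not something the answer above relies on.

```python
import numpy as np

p = np.array([[0, 1, 3], [1, 5, 6], [4, 3, 56], [1, 34, 4]])

# Boolean mask with one entry per row: True where the 3rd column
# (index 2) is greater than 30. Only that column is tested.
drop = p[:, 2] > 30

# Boolean indexing keeps the rows where the inverted mask is True.
kept = p[~drop]

# Alternative (illustrative): delete the flagged rows by index.
kept_delete = np.delete(p, np.flatnonzero(drop), axis=0)

print(kept)
```

Both forms return a new array rather than modifying p in place, so assign the result back to p if you want to replace the original.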
Upvotes: 4