Reputation: 1645
I have a pandas dataframe like this
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[21, 1], [32, -4], [-4, 14], [3, 17], [-7, np.nan]], columns=['a', 'b'])
df
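For reference, that displays as (b ends up as float because of the NaN):
    a     b
0  21   1.0
1  32  -4.0
2  -4  14.0
3   3  17.0
4  -7   NaN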
I want to remove all rows that have a negative value in any of a given list of columns, while keeping rows with NaN.
In my example there are only 2 columns, but my real dataset has many more, so I can't do it column by column.
Upvotes: 31
Views: 172428
Reputation: 3491
I was looking for a solution that doesn't change the dtype (which will happen if NaNs are mixed in with ints, as in the answers that use dropna). Since the questioner already had a NaN in their data, that may not be an issue for them. I went with this solution, which preserves the int64 dtype. Here it is with my sample data:
df = pd.DataFrame(data={'a': [0, 1, 2], 'b': [-1, 0, 1], 'c': [-2, -1, 0]})
columns = ['b', 'c']
filter_ = (df[columns] >= 0).all(axis=1)  # True only for rows where every listed column is non-negative
df[filter_]
   a  b  c
2  2  1  0
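To confirm the dtype point, checking dtypes after the boolean row selection gives all int64, as I'd expect, since no NaNs get introduced by this kind of mask:
df[filter_].dtypes
a    int64
b    int64
c    int64
dtype: object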
Upvotes: 1
Reputation: 737
I've found you can simplify the answer by just doing this:
>>> cols = ['b']
>>> df = df[df[cols] > 0]
dropna() is not an in-place method, so you have to store the result.
>>> df = df.dropna()
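Or, if you'd rather not reassign, dropna takes an inplace flag:
>>> df.dropna(inplace=True)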
Upvotes: 15
Reputation: 12108
If you want to apply it to all columns, do df[df > 0] with dropna():
>>> df[df > 0].dropna()
    a   b
0  21   1
3   3  17
If you know which columns to apply it to, then do it for only those columns with df[df[cols] > 0]:
>>> cols = ['b']
>>> df[cols] = df[df[cols] > 0][cols]
>>> df.dropna()
    a   b
0  21   1
2  -4  14
3   3  17
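Since the question mentions a whole list of columns, the same pattern with more than one column looks something like this, starting again from the original df (note the ints come back as floats because of the intermediate NaNs):
>>> cols = ['a', 'b']
>>> df[cols] = df[df[cols] > 0][cols]
>>> df.dropna()
      a     b
0  21.0   1.0
3   3.0  17.0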
Upvotes: 55