dooms

Reputation: 1645

Filter rows of pandas dataframe whose values are lower than 0

I have a pandas dataframe like this

import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[21, 1], [32, -4], [-4, 14], [3, 17], [-7, np.nan]],
                  columns=['a', 'b'])
df
df

I want to be able to remove all rows with negative values in a list of columns, while preserving rows that contain NaN.

In my example there are only 2 columns, but my dataset has more, so I can't do it one by one.

Upvotes: 31

Views: 172428

Answers (3)

Zev

Reputation: 3491

I was looking for a solution that doesn't change the dtype (which happens when NaNs get mixed in with ints, as in the answers that use dropna). Since the questioner already had a NaN in their data, that may not be an issue for them. I went with this solution, which preserves the int64 dtype. Here it is with my sample data:

df = pd.DataFrame(data={'a':[0, 1, 2], 'b': [-1,0,1], 'c': [-2, -1, 0]})
columns = ['b', 'c']
filter_ = (df[columns] >= 0).all(axis=1)
df[filter_]


   a  b  c
2  2  1  0

Upvotes: 1

rgahan

Reputation: 737

I've found you can simplify the answer by building a single boolean filter over the columns:

>>> cols = ['b']
>>> df = df[(df[cols] > 0).all(axis=1)]

(Masking with df[df[cols] > 0] instead would also set every column *not* in cols to NaN.) dropna() is not an in-place method, so if you then want to drop remaining NaN rows you have to store the result:

>>> df = df.dropna()

Upvotes: 15

ComputerFellow

Reputation: 12108

If you want to apply it to all columns, mask with df[df > 0] and then call dropna():

>>> df[df > 0].dropna()
    a   b
0  21   1
3   3  17

If you know which columns to apply it to, mask only those cols with df[df[cols] > 0]:

>>> cols = ['b']
>>> df[cols] = df[df[cols] > 0][cols]
>>> df.dropna()
    a   b
0  21   1
2  -4  14
3   3  17
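Worth noting for the original question: dropna() also throws away the row containing NaN, which the asker wanted to keep. A sketch of one way to drop only the genuinely negative rows (an assumption about the intended semantics) relies on the fact that comparisons with NaN evaluate to False, so `NaN < 0` never flags a row:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[21, 1], [32, -4], [-4, 14], [3, 17], [-7, np.nan]],
                  columns=['a', 'b'])

cols = ['b']
# Drop rows that have a negative value in any of the listed columns;
# NaN < 0 is False, so the row with NaN in 'b' is preserved
result = df[~(df[cols] < 0).any(axis=1)]
print(result)
```

Here only the row with b = -4 is removed; the NaN row survives the filter.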

Upvotes: 55
