Reputation: 91

How to delete rows from a pandas dataframe based on a certain condition

I have the following dataframe:

id       stat_day         x    y
0       2016-03-29        0    3
1       2016-03-29        0    4
2       2016-03-30        0    2

How do I delete the rows where both x and y are equal to zero?

Upvotes: 1

Answers (3)

efajardo

Reputation: 797

You can create a boolean Series that is False when both x and y are zero and True otherwise. This translates into (df.x != 0) | (df.y != 0); the parentheses are required because | binds more tightly than != in Python. Hence something like this should work:

df = df[(df.x != 0) | (df.y != 0)]
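
To see the mask in action, here is a minimal sketch. It assumes a copy of the question's frame in which the first row has been changed so that both x and y are zero (that change is made up for illustration; the original data has no such row). The names keep, keep_alt and filtered are also just illustrative.

```python
import pandas as pd

# Frame from the question, with row 0 altered (assumption) so x == y == 0 there.
df = pd.DataFrame({
    "stat_day": ["2016-03-29", "2016-03-29", "2016-03-30"],
    "x": [0, 0, 0],
    "y": [0, 4, 2],
})

# True where at least one of x, y is non-zero.
keep = (df.x != 0) | (df.y != 0)

# Equivalent form by De Morgan's law: not (x == 0 and y == 0).
keep_alt = ~((df.x == 0) & (df.y == 0))

# Boolean indexing keeps only the rows where the mask is True.
filtered = df[keep]
print(filtered)
```

Written without parentheses, df.x != 0 | df.y != 0 is parsed as a chained comparison around 0 | df.y, which is not the intended test and will normally raise a ValueError about the ambiguous truth value of a Series.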

Upvotes: 0

piRSquared

Reputation: 294536

Consider the dataframe df

import numpy as np
import pandas as pd

np.random.seed([3,1415])
df = pd.DataFrame(np.random.choice([0, 1], (10, 2)), columns=['x', 'y'])

   x  y
0  0  1
1  0  1
2  0  0
3  1  0
4  1  1
5  1  1
6  0  1
7  1  0
8  1  0
9  0  0

option 1
pd.DataFrame.query

df.query('x != 0 or y != 0')

   x  y
0  0  1
1  0  1
3  1  0
4  1  1
5  1  1
6  0  1
7  1  0
8  1  0

option 2
boolean slicing

df[df.x.ne(0) | df.y.ne(0)]

   x  y
0  0  1
1  0  1
3  1  0
4  1  1
5  1  1
6  0  1
7  1  0
8  1  0

option 3
boolean slicing take 2

df[df.astype(bool).any(axis=1)]

   x  y
0  0  1
1  0  1
3  1  0
4  1  1
5  1  1
6  0  1
7  1  0
8  1  0
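
As a quick sanity check, the three options can be compared against each other. The sketch below uses a small hand-written frame (an assumption, not the seeded random data above) so that the result is easy to verify by eye.

```python
import pandas as pd

# Small hand-written frame (assumption); rows 1 and 4 have x == y == 0.
df = pd.DataFrame({"x": [0, 0, 1, 1, 0], "y": [1, 0, 0, 1, 0]})

by_query = df.query("x != 0 or y != 0")       # option 1
by_ne = df[df.x.ne(0) | df.y.ne(0)]           # option 2
by_any = df[df.astype(bool).any(axis=1)]      # option 3

# All three selections keep the same rows.
same = by_query.equals(by_ne) and by_ne.equals(by_any)
print(same, by_query.index.tolist())
```

Note that option 3 looks at every column of the frame, so on data with extra columns such as stat_day it is safer to select ['x', 'y'] before calling astype(bool).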

Upvotes: 2

Miriam Farber

Reputation: 19664

This will do the job:

import pandas as pd
df = pd.DataFrame({'stat_day': ['2016-03-29', '2016-03-29', '2016-03-30'], 'x': [0, 0, 0], 'y': [3, 4, 2]})
# keep the rows where at least one of x, y is non-zero
df = df.loc[df[['x', 'y']].values.any(axis=1)]

In your example no row has both x and y equal to 0, so df stays the same. If you instead define it so that both are 0 in the first row, like this:

import pandas as pd
df = pd.DataFrame({'stat_day': ['2016-03-29', '2016-03-29', '2016-03-30'], 'x': [0, 0, 0], 'y': [0, 4, 2]})
df = df.loc[df[['x', 'y']].values.any(axis=1)]

then df is

    stat_day    x   y
1   2016-03-29  0   4
2   2016-03-30  0   2
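
The same idea can be wrapped in a small helper that takes the columns to check. This is only a sketch: drop_all_zero_rows is a hypothetical name, and it uses ne(0) on the selected columns rather than .values, which should select the same rows for numeric data.

```python
import pandas as pd

def drop_all_zero_rows(frame, cols):
    """Return the rows of frame in which at least one of cols is non-zero."""
    mask = frame[cols].ne(0).any(axis=1)
    return frame.loc[mask]

# The question's data: no row has both x and y equal to zero.
original = pd.DataFrame({
    "stat_day": ["2016-03-29", "2016-03-29", "2016-03-30"],
    "x": [0, 0, 0],
    "y": [3, 4, 2],
})
# Variant with y set to 0 in the first row, as in the second snippet above.
modified = original.assign(y=[0, 4, 2])

unchanged = drop_all_zero_rows(original, ["x", "y"])
reduced = drop_all_zero_rows(modified, ["x", "y"])
print(len(unchanged), reduced.index.tolist())
```

Passing the column list explicitly keeps stat_day out of the test, so only x and y decide whether a row is dropped.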

Upvotes: 0
