Reut

Reputation: 1592

Drop pandas rows if value is not between two other values on the same column

I have the following dataframe:

         mes1  mes2  mes3  mes4  mes5
A1       0.45  0.21  0.53  0.33  0.11
A2       0.44  0.32  0.11  0.38  0.91
A3       0.78  0.31  0.53  0.32  0.14
A4       0.12  0.33  0.56  0.43  0.12
posUp    0.52  0.40  0.62  0.48  0.54
posDown  0.32  0.15  0.45  0.24  0.05

I want to filter my dataframe so that I am left only with the rows whose values are between the "posDown" and "posUp" values in every column, so the result should be:

         mes1  mes2  mes3  mes4  mes5
A1       0.45  0.21  0.53  0.33  0.11
posUp    0.52  0.40  0.62  0.48  0.54
posDown  0.32  0.15  0.45  0.24  0.05

I tried to do it by slicing the dataframe into series and then applying a condition to each one, like this:

for i in df:
    db=df[i]
    vmin=db.loc['posUp']
    vmax=db.loc['posDown']
    db=db[(db>vmin)&(db<vmax)]

I then wanted to drop the rows that are not found in the final db filter, but the filter did not work: when I print db I get an empty series, "Series([], Name: ...". Besides that, I believe there is a more convenient and efficient way to do this than a for loop.

So my end goal is to keep only the rows whose values are between posDown and posUp in every column.

Upvotes: 0

Views: 450

Answers (2)

Quang Hoang

Reputation: 150765

Try with le and ge:

mask = (df.le(df.loc['posUp'])       # compare with `posUp` row-wise
        & df.ge(df.loc['posDown'])   # compare with `posDown` row-wise
       ).all(1)                      # check for all True along the rows
df[mask]

Output:

         mes1  mes2  mes3  mes4  mes5
A1       0.45  0.21  0.53  0.33  0.11
posUp    0.52  0.40  0.62  0.48  0.54
posDown  0.32  0.15  0.45  0.24  0.05
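For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same le/ge idea. The dataframe is rebuilt by hand from the values in the question, and the names upper, lower and result are only illustrative. It relies on the fact that comparing a DataFrame with a Series aligns the Series index with the DataFrame columns by default, so every row is checked against the bound for its own column.

```python
import pandas as pd

# Rebuild the dataframe from the question (values copied by hand).
df = pd.DataFrame(
    {
        "mes1": [0.45, 0.44, 0.78, 0.12, 0.52, 0.32],
        "mes2": [0.21, 0.32, 0.31, 0.33, 0.40, 0.15],
        "mes3": [0.53, 0.11, 0.53, 0.56, 0.62, 0.45],
        "mes4": [0.33, 0.38, 0.32, 0.43, 0.48, 0.24],
        "mes5": [0.11, 0.91, 0.14, 0.12, 0.54, 0.05],
    },
    index=["A1", "A2", "A3", "A4", "posUp", "posDown"],
)

upper = df.loc["posUp"]    # Series indexed by column name
lower = df.loc["posDown"]  # Series indexed by column name

# Each comparison yields a boolean DataFrame; a row is kept only if
# every column is within [lower, upper], bounds included.
mask = (df.le(upper) & df.ge(lower)).all(axis=1)
result = df[mask]
print(result)
```

Because le and ge are inclusive, the posUp and posDown rows pass the test themselves, which matches the expected output in the question. If those two rows should not be part of the result, they could be dropped afterwards with result.drop(index=["posUp", "posDown"]).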

Upvotes: 2

BENY

Reputation: 323306

You can try all after sub. PS: A3 should not be included, since its mes1 value is 0.78.

out = df[(df.sub(df.loc['posUp']).le(0) & df.sub(df.loc['posDown']).ge(0)).all(1)]
Out[107]: 
         mes1  mes2  mes3  mes4  mes5
A1       0.45  0.21  0.53  0.33  0.11
posUp    0.52  0.40  0.62  0.48  0.54
posDown  0.32  0.15  0.45  0.24  0.05
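As a side note on why the loop in the question printed an empty series: it reads vmin from posUp and vmax from posDown, so the condition asks for values that are above the upper bound and below the lower bound at the same time, which no value can satisfy. The sketch below is closer in spirit to that loop but with the bounds the right way round. It swaps in Series.between (inclusive on both ends by default) rather than sub, and the helper name within_bounds is made up for illustration.

```python
import pandas as pd

# Rebuild the dataframe from the question (values copied by hand).
df = pd.DataFrame(
    {
        "mes1": [0.45, 0.44, 0.78, 0.12, 0.52, 0.32],
        "mes2": [0.21, 0.32, 0.31, 0.33, 0.40, 0.15],
        "mes3": [0.53, 0.11, 0.53, 0.56, 0.62, 0.45],
        "mes4": [0.33, 0.38, 0.32, 0.43, 0.48, 0.24],
        "mes5": [0.11, 0.91, 0.14, 0.12, 0.54, 0.05],
    },
    index=["A1", "A2", "A3", "A4", "posUp", "posDown"],
)

# The question's condition for one column: bounds swapped, strict comparison.
col = df["mes1"]
swapped = col[(col > col.loc["posUp"]) & (col < col.loc["posDown"])]
print(swapped.empty)  # nothing can be above posUp and below posDown at once

def within_bounds(column: pd.Series) -> pd.Series:
    """Hypothetical helper: True where posDown <= value <= posUp."""
    return column.between(column.loc["posDown"], column.loc["posUp"])

# apply() runs the helper once per column; all(axis=1) then keeps only
# the rows that are in range for every column.
mask = df.apply(within_bounds).all(axis=1)
filtered = df[mask]
print(filtered)
```

This is still a per-column pass under the hood, so the vectorized comparisons in the answers are likely the better choice for wide dataframes. The sketch is mainly meant to show where the original loop went wrong.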

Upvotes: 2
