William

Reputation: 4028

pandas numpy mask filter not working as expected

I have a DataFrame that you can reproduce by running this code:

import numpy as np
import pandas as pd
from io import StringIO

df4s = """
   LowerAge    age    1       2      3      4 
0  2            3     o.234   o.234  o.234  o.234
1  3            4     o.234   o.234  o.234  o.234
2  4            2     o.234   o.234  o.234  o.234      
3  5            3     o.234   o.234  o.234  o.234         
"""
df4 = pd.read_csv(StringIO(df4s.strip()), sep=r'\s+')

df4

The output is:

  LowerAge  age   1       2       3       4
0   2       3     o.234   o.234   o.234   o.234
1   3       4     o.234   o.234   o.234   o.234
2   4       2     o.234   o.234   o.234   o.234
3   5       3     o.234   o.234   o.234   o.234

The logic is this: for each row, if LowerAge-1 < age, then the cell in the column named str(LowerAge-1) should be set to 1; otherwise the row stays the same. For example:

In the first row, LowerAge-1 equals 1, which is less than age, so the value in column '1' (because LowerAge-1 equals 1) becomes 1.

In the second row, LowerAge-1 equals 2, which is less than age, so the value in column '2' becomes 1.

The ideal output should be:

  LowerAge  age   1      2       3       4
0   2       3     1      o.234   o.234   o.234
1   3       4     o.234  1       o.234   o.234
2   4       2     o.234  o.234   o.234   o.234
3   5       3     o.234  o.234   o.234   o.234

My code is:

index_age = df4['LowerAge']-1

mask=index_age < df4['age']

df4.loc[mask, index_age.astype(str)]=1

My output:

 LowerAge  age  1      2         3      4
0       2   3   1      1         1      1
1       3   4   1      1         1      1
2       4   2   o.234  o.234     o.234  o.234
3       5   3   o.234  o.234     o.234  o.234

If I want to stick with a mask for this, what should I do? Can anyone help?

Upvotes: 0

Views: 514

Answers (1)

BENY

Reputation: 323226

In your case you can slice, then use crosstab and update:

s = (df4.LowerAge-1)
s = s[s<df4.age]
df4.update(pd.crosstab(s.index,s.astype(str)).where(lambda x : x==1))
df4
Out[454]: 
   LowerAge  age      1      2      3      4
0         2    3    1.0  o.234  o.234  o.234
1         3    4  o.234    1.0  o.234  o.234
2         4    2  o.234  o.234  o.234  o.234
3         5    3  o.234  o.234  o.234  o.234
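
As for why the original attempt fills whole rows: df4.loc[mask, index_age.astype(str)] selects every masked row crossed with every listed column label ('1', '2', '3', '4'), so each selected row gets 1 in all four columns. If you want to stay with a mask, one option is a 2-D boolean mask with one cell per (row, column) pair. The following is only a minimal sketch, not a tested drop-in: the names value_cols, col_nums, row_ok and cell_mask are my own, the frame is rebuilt by hand rather than parsed from the string, and it assumes the value columns are named after the integers 1 to 4.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question by hand (assumption: same values as df4).
df4 = pd.DataFrame({
    "LowerAge": [2, 3, 4, 5],
    "age": [3, 4, 2, 3],
    "1": ["o.234"] * 4,
    "2": ["o.234"] * 4,
    "3": ["o.234"] * 4,
    "4": ["o.234"] * 4,
})

value_cols = ["1", "2", "3", "4"]             # columns that may be overwritten
col_nums = np.array(value_cols).astype(int)   # their labels as integers

index_age = (df4["LowerAge"] - 1).to_numpy()  # target column number per row
row_ok = index_age < df4["age"].to_numpy()    # row-level condition from the question

# True only where the column number equals that row's target AND the row qualifies.
cell_mask = (col_nums[None, :] == index_age[:, None]) & row_ok[:, None]

# DataFrame.mask replaces the cells where the mask is True with 1.
df4[value_cols] = df4[value_cols].mask(cell_mask, 1)
print(df4)
```

Broadcasting builds a 4x4 mask here, so this uses memory proportional to rows times value columns; for a very wide frame the crosstab approach above may be the lighter choice.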

Upvotes: 1
