Reputation: 1048
I have the below data in a pd dataframe :
There are cutoffs for Sliver and OThers(Like all Multi's - Multi_A etc) as below:
d_id Sliver Multi_A Multi_B Multi_C Multi_D no_of_prem Flag_Sliver
5 4.80 53.05 19.07 13.59 0 3 No
18 5.17 27.63 26.83 12.15 0 3 No
25 1.28 0.14 0.12 0 0 2 Yes
32 0.90 0.43 0.94 0 0 2 No
33 3.69 3.24 0.77 1.36 0 3 No
34 0.15 11.62 5.92 1.42 0 3 No
55 0.31 11.29 5.95 1.02 0 3 No
64 0.25 0.82 2.77 0 0 2 No
86 0.02 9.65 1.82 0.10 0 3 No
89 0.13 3.19 1.55 1.77 0 3 No
Sliver 5
Multis 0.15
IF any d_id has value in Sliver which is less than 5 and any two of the multi's(there are 4 Multi's in total) for that d_id has values less than 0.15 then the Flag variable is given a Yes or else a No. IN the above example only 25 satisfies the condition and hence flagged as yes.
Can someone please help me with these conditions?
Upvotes: 3
Views: 478
Reputation: 862771
Use numpy.where
by condition:
mask = df.filter(like='Multi').lt(.15).all(1) & df['Sliver'].lt(5)
#alternative
#mask = df.drop(['d_id','no_of_prem', 'Sliver'], axis=1).lt(.15).all(1) & df['Sliver'].lt(5)
df['Flag_Sliver'] = np.where(mask, 'Yes' ,'No')
print (df)
d_id Sliver Multi_A Multi_B Multi_C Multi_D no_of_prem Flag_Sliver
0 5 4.80 53.05 19.07 13.59 0 3 No
1 18 5.17 27.63 26.83 12.15 0 3 No
2 25 1.28 0.14 0.12 0.00 0 2 Yes
3 32 0.90 0.43 0.94 0.00 0 2 No
4 33 3.69 3.24 0.77 1.36 0 3 No
5 34 0.15 11.62 5.92 1.42 0 3 No
6 55 0.31 11.29 5.95 1.02 0 3 No
7 64 0.25 0.82 2.77 0.00 0 2 No
8 86 0.02 9.65 1.82 0.10 0 3 No
9 89 0.13 3.19 1.55 1.77 0 3 No
Explanation:
First select all Multi
columns by filter
or drop
unnecessary columns:
print (df.filter(like='Multi'))
#print (df.drop(['d_id','no_of_prem', 'Sliver'], axis=1))
Multi_A Multi_B Multi_C Multi_D
0 53.05 19.07 13.59 0
1 27.63 26.83 12.15 0
2 0.14 0.12 0.00 0
3 0.43 0.94 0.00 0
4 3.24 0.77 1.36 0
5 11.62 5.92 1.42 0
6 11.29 5.95 1.02 0
7 0.82 2.77 0.00 0
8 9.65 1.82 0.10 0
9 3.19 1.55 1.77 0
Compare by .15
by lt
(<
):
print (df.filter(like='Multi').le(.15))
Multi_A Multi_B Multi_C Multi_D
0 False False False True
1 False False False True
2 True True True True
3 False False True True
4 False False False True
5 False False False True
6 False False False True
7 False False True True
8 False False True True
9 False False False True
Get all only True
s rows by DataFrame.all
:
print (df.filter(like='Multi').le(.15).all(1))
0 False
1 False
2 True
3 False
4 False
5 False
6 False
7 False
8 False
9 False
dtype: bool
Compare also column Sliver
:
print (df['Sliver'].lt(5))
0 True
1 False
2 True
3 True
4 True
5 True
6 True
7 True
8 True
9 True
Name: Sliver, dtype: bool
and chain by &
(AND
):
mask = df.filter(like='Multi').lt(.15).all(1) & df['Sliver'].lt(5)
print (mask)
0 False
1 False
2 True
3 False
4 False
5 False
6 False
7 False
8 False
9 False
dtype: bool
Upvotes: 2