Reputation: 143
Guys,
I have a df like this:
A B C D E Yes
14 12 123 153 178 0
13 1 435 55 87 0
14 12 123 1 435 0
......
15 0 125 66 90 0
Let us say we have two integer variables, x and y. I want to set the 'Yes' column to 1 if any one of the following conditions is fulfilled:
df.D < x and df.E > x
df.D > x and df.E > y
df.D > y and df.E > y
Besides, I am sure df.E is always larger than df.D in the raw data.
How can I do this quickly? I tried to write some expressions for it, but they all had problems. Any help is really appreciated.
Upvotes: 1
Views: 48
Reputation: 13401
In addition to @jpp's answer, you can also use np.where:
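import numpy as np
import pandas as pd

# Sample data (column C from the question is left out here)
df = pd.DataFrame({'A': [14, 13, 14, 15], 'B': [12, 1, 12, 0],
                   'D': [153, 55, 1, 66], 'E': [178, 87, 435, 90],
                   'Yes': [0, 0, 0, 0]})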
x = 100
y = 200
m1 = (df['D'] < x) & (df['E'] > x)
m2 = (df['D'] > x) & (df['E'] > y)
m3 = (df['D'] > y) & (df['E'] > y)
df['Yes'] = np.where(m1|m2|m3, 1, 0)
print(df)
Output:
A B D E Yes
0 14 12 153 178 0
1 13 1 55 87 0
2 14 12 1 435 1
3 15 0 66 90 0
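As a side note, the combined mask is already a Boolean series, so it can also be cast to integers directly instead of going through np.where. This is only a minimal sketch: it reuses the D and E values and the x = 100, y = 200 from the snippet above, and the name mask is just an illustrative choice.

```python
import pandas as pd

# Same D/E sample values and thresholds as in the np.where snippet
df = pd.DataFrame({'D': [153, 55, 1, 66], 'E': [178, 87, 435, 90]})
x, y = 100, 200

# Combine the three conditions element-wise with & and |
mask = (
    ((df['D'] < x) & (df['E'] > x))
    | ((df['D'] > x) & (df['E'] > y))
    | ((df['D'] > y) & (df['E'] > y))
)

# True/False becomes 1/0
df['Yes'] = mask.astype(int)
print(df)
```

Like np.where(mask, 1, 0), this overwrites the whole 'Yes' column, so rows that do not match end up as 0.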
Upvotes: 0
Reputation: 164623
You can create some Boolean series and use them as masks with pd.DataFrame.loc. For example:
x = 10
y = 20
m1 = (df['D'] < x) & (df['E'] > x)
m2 = (df['D'] > x) & (df['E'] > y)
m3 = (df['D'] > y) & (df['E'] > y)
df.loc[m1 | m2 | m3, 'Yes'] = 1
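For a version that runs on its own, here is a rough sketch of the same idea with made-up sample data. The thresholds x = 100 and y = 200 and the starting 'Yes' values are assumptions chosen for illustration (row 0 starts at 1 on purpose). It also shows one behavioural difference worth knowing: .loc only touches the rows where the mask is True, while np.where(mask, 1, 0) rewrites every row.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; row 0 already has Yes == 1
df = pd.DataFrame({'D': [153, 55, 1, 66],
                   'E': [178, 87, 435, 90],
                   'Yes': [1, 0, 0, 0]})
x, y = 100, 200  # assumed thresholds, not from the question

m1 = (df['D'] < x) & (df['E'] > x)
m2 = (df['D'] > x) & (df['E'] > y)
m3 = (df['D'] > y) & (df['E'] > y)
mask = m1 | m2 | m3

# np.where builds a fresh column: non-matching rows become 0
via_where = np.where(mask, 1, 0)

# .loc sets only the matching rows; existing values elsewhere are kept
df.loc[mask, 'Yes'] = 1

print(df['Yes'].tolist())   # values after the .loc assignment
print(via_where.tolist())   # values np.where would have produced
```

If the 'Yes' column starts out as all zeros, as in the question, both approaches give the same result.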
Upvotes: 1