Reputation: 365
Existing DataFrame:
Id created_by
A A
A 123
B X
B B
Expected DataFrame:
Id created_by status
A A category_1
A 123 category_2
B X category_3
B B category_1
I want to create a status column based on these conditions:
if Id == created_by >> category_1
if Id != created_by >> category_2
if Id != created_by & created_by == 'X' >> category_3
I am using the code below:
conditions = [
df['Id'] == df['created_by'],
df['Id'] != df['created_by'],
(df['Id'] != df['created_by']) & (df['created_by'] == 'X')
]
# Creating Labels
result = ['category_1','category_2','category_3']
# Creating status column
df['status'] = np.select(conditions, result , default='REST')
Somehow I am not getting the correct count for the third condition. What am I missing?
Upvotes: 0
Views: 14
Reputation: 862671
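The problem is in the second condition: it also needs to exclude rows where created_by is 'X'. np.select uses the first condition that matches, so rows with 'X' are labelled category_2 before the third condition is ever checked: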
conditions = [
df['Id'] == df['created_by'],
(df['Id'] != df['created_by']) & (df['created_by'] != 'X'),
(df['Id'] != df['created_by']) & (df['created_by'] == 'X')
]
# Creating Labels
result = ['category_1','category_2','category_3']
# Creating status column
df['status'] = np.select(conditions, result , default='REST')
print(df)
  Id created_by      status
0  A          A  category_1
1  A        123  category_2
2  B          X  category_3
3  B          B  category_1
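To see why the original conditions from the question never produce category_3, here is a small self-contained sketch. The DataFrame is rebuilt from the question's sample data, and the names `overlapping` and `original_status` are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({"Id": ["A", "A", "B", "B"],
                   "created_by": ["A", "123", "X", "B"]})

# The question's conditions: the row with 'X' satisfies both the second
# and the third condition
overlapping = [
    df["Id"] == df["created_by"],
    df["Id"] != df["created_by"],
    (df["Id"] != df["created_by"]) & (df["created_by"] == "X"),
]
labels = ["category_1", "category_2", "category_3"]

# np.select picks the first matching condition for each row,
# so the 'X' row is labelled category_2 and category_3 is never assigned
original_status = np.select(overlapping, labels, default="REST")
print(original_status.tolist())
# ['category_1', 'category_2', 'category_2', 'category_1']
```

Making the conditions mutually exclusive, as in the corrected code above, removes the dependence on their order.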
To improve the solution (so each comparison is evaluated only once), you can create helper masks and combine them:
m1 = df['Id'] == df['created_by']
m2 = df['created_by'] == 'X'
conditions = [m1, ~m1 & ~m2, ~m1 & m2]
# Creating Labels
result = ['category_1','category_2','category_3']
# Creating status column
df['status'] = np.select(conditions, result , default='REST')
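For reference, a runnable version of the helper-mask approach might look like the sketch below. The function name `add_status` is hypothetical (it is not part of pandas or NumPy), and the DataFrame is rebuilt from the question's sample data:

```python
import numpy as np
import pandas as pd

def add_status(df):
    """Return a copy of df with a 'status' column (hypothetical helper)."""
    m1 = df["Id"] == df["created_by"]   # Id matches created_by
    m2 = df["created_by"] == "X"        # created_by is the special value 'X'

    # The three masks are mutually exclusive, so their order does not matter
    conditions = [m1, ~m1 & ~m2, ~m1 & m2]
    labels = ["category_1", "category_2", "category_3"]

    out = df.copy()
    out["status"] = np.select(conditions, labels, default="REST")
    return out

# Sample data rebuilt from the question
df = pd.DataFrame({"Id": ["A", "A", "B", "B"],
                   "created_by": ["A", "123", "X", "B"]})

result_df = add_status(df)
print(result_df)
```

Because the three masks together cover every row, the 'REST' default is never used here; it only acts as a safety net if the conditions are changed later.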
Upvotes: 1