Reputation: 65
I have a pandas data frame containing employee information like this:
df=pd.DataFrame({
'Id':[1,2,3,4],
'Name':['Joe','Henry','Sam','Max'],
'Salary':[70000,80000,60000,90000],
'ManagerId':[3,4,np.nan,np.nan]
})
Id Name Salary ManagerId
0 1 Joe 70000 3.0
1 2 Henry 80000 4.0
2 3 Sam 60000 NaN
3 4 Max 90000 NaN
What I need to do is to filter employees having their salary exceed his manager's (in this case Joe since his salary is larger than his manager, Sam).
0 1 Joe 70000 3.0
Because of the relation between Id and Manager Id, I think I can use loops as the last resort, but that seems to be really manual and looks ugly too. I wonder that if I can do this with masking. As a beginner, I can only do simple masking where the condition is static so far, like to filter employees that have salary exceeding 60000. But in this case, the condition for each employee is different from each other.
I have no idea what this technique is called so I just made up the title.
Thanks for any help.
Upvotes: 2
Views: 263
Reputation: 862406
Idea is match ManagerID
by Salary
by Id
, so possible compare for greater and filter:
df = df[df['Salary'].gt(df['ManagerID'].map(df.set_index(['Id'])['Salary']))]
print (df)
Id Name Salary ManagerID
0 1 Joe 70000 3.0
Details:
print (df['ManagerID'].map(df.set_index(['Id'])['Salary']))
0 60000.0
1 90000.0
2 NaN
3 NaN
Name: ManagerID, dtype: float64
Upvotes: 2