Reputation: 327
I have a pandas dataframe of employees that I need to filter based on 2 columns. I need to filter on department and level. So let's say we have department 'Human Resources' and within that it has level 1,2,3,4,5. I'm specifically looking for Human Resources level 2,4 and 5.
I have my desired departments and levels stored in dictionary, for example:
departments = dict({'Human Resources' : ['2','4','5'] ,'IT' : ['1','3','5','6'], etc.... })
My dataframe will list every employee, for all departments and for all levels (plus lots more). I now want to filter that dataframe using the dictionary above. So in the Human Resources example, I just want returned the employees who are in 'Human Resouces' and are in levels 2, 4 and 5.
An example of the df would be:
employee_ID Department Level
001 Human Resources 1
002 Human Resources 1
003 Human Resources 2
004 Human Resources 3
005 Human Resources 4
006 Human Resources 4
007 Human Resources 5
008 IT 1
009 IT 2
010 IT 3
011 IT 4
012 IT 5
013 IT 6
Using the dictionary I've displayed above, my expected result would be
employee_ID Department Level
003 Human Resources 2
005 Human Resources 4
006 Human Resources 4
007 Human Resources 5
008 IT 1
010 IT 3
012 IT 5
013 IT 6
I have no idea how I'd do this?
Upvotes: 3
Views: 158
Reputation: 29635
you can use groupby
on Departement and use isin
on the Level and get the value for the departement concerned with the name of the group.
#example data
departments = dict({'Human Resources' : ['2','4','5'] ,'IT' : ['1','3','5','6']})
df = pd.DataFrame({'Id':range(10),
'Departement': ['Human Resources']*5+['IT']*5,
'Level':list(range(1,6))*2})
#filter
print (df[df.groupby('Departement')['Level']
.apply(lambda x: x.isin(departments[x.name]))])
Id Departement Level
1 1 Human Resources 2
3 3 Human Resources 4
4 4 Human Resources 5
5 5 IT 1
7 7 IT 3
9 9 IT 5
Upvotes: 2