Reputation: 25
From my df showing employees with multiple levels of managers (see prior question here), I want to map rows to a department ID, based on a manager ID that may appear across multiple columns:
eid, mid, l2mid l3mid
111, 112, 116, 115
113, 114, 115, 0
112, 117, 114, 0
110, 115, 0, 0
116, 118, 0, 0
[edit: corrected data set to reflect row for eid=110 will be dropped edit #2: modified row for eid=112 to reflect that i need to search multiple columns to get a match in dictionary.]
The dictionary is
country = {112: 'US', 114: 'Ireland', 118: 'Mexico'}
I'd like write Python that searches the manager columns 'mid':'l3mid' and then writes the country string value into the new column. I'd like to drop rows if they do not have a manager from one of the country codes in the dictionary. So the output I'm looking for is:
eid, mid, l2mid l3mid country
111, 112, 116, 115, US
113, 114, 115, 0, Ireland
112, 117, 114, 0 Ireland
116, 118, 0, 0 Mexico
I have tried building a function to do this but can't quite figure out the syntax. I appreciate your help as I'm new to this work.
Upvotes: 0
Views: 187
Reputation: 862691
I added a solution, if manager columns (mid,l2mid,l3mid) value match the dictionary keys, then the values are joined splitted by ,
:
s = df.drop('eid',1).applymap(country.get)
.dropna(how='all', axis=0)
.apply(lambda x: ', '.join(x.dropna()), 1)
df = df.loc[s.index].assign(country=s)
print (df)
eid mid l2mid l3mid country
0 111 112 114 115 US, Ireland
1 113 114 115 0 Ireland
2 112 114 118 0 Ireland
4 116 118 0 0 Mexico
Upvotes: 2