Robl09
Robl09

Reputation: 41

Mapping key to multiple values in a dictionary pandas

I have two data frames. The first is a list of email addresses

email_ad                   Band 30     Band 40
[email protected]
[email protected]

The second is the following data frame

email_ad                 Name       Manager_Name   Manager_Band_level 
[email protected]. Tom Banks  Boss1          30
[email protected].  Bill Bob   Boss2          40

I want to map each email in the first dataframe to the second, if the email is a match, then check the Manager Band level. Based off the number, fill that corresponding manager's email into the "Band 30" or "Band 40" categories.

So the desired dataframe is the following:

email_ad                   Band 30     Band 40
[email protected]    Boss1
[email protected].                Boss2

if someone could please help me for syntax that would be amazing. I was additionally stuck making a dictionary out of only certain columns in the dataframe. Thanks so much!!

Upvotes: 0

Views: 71

Answers (2)

mcskinner
mcskinner

Reputation: 2748

You can do this with a pivot.

pivoted = df.pivot('email_ad', 'Manager_Band_level', 'Manager_Name')
pivoted
# Manager_Band_level          30     40
# email_ad                             
# [email protected]  Boss1    NaN
# [email protected]     NaN  Boss2

That format is probably usable as is. But if you want to clean it up a bit more, you can fill in the missing values and flatten it back out.

cleaned = pivoted.fillna('').reset_index()
cleaned
# Manager_Band_level                 email_ad     30     40
# 0                   [email protected]  Boss1       
# 1                    [email protected]         Boss2

You can get rid of the awkward, and now incorrect, column index name by resetting the columns.

cleaned.columns = list(cleaned.columns)
cleaned
#                   email_ad     30     40
# 0  [email protected]  Boss1       
# 1   [email protected]         Boss2

If the Band prefix is important, you can add that at the same time.

cleaned.columns = [col if col == 'email_ad' else f'Band {col}' for col in cleaned.columns] 
cleaned
#                   email_ad Band 30 Band 40
# 0  [email protected]   Boss1        
# 1   [email protected]           Boss2

Upvotes: 2

jcaliz
jcaliz

Reputation: 4021

Try using merge and maybe a loop if there are multiple bands:

import pandas as pd
from io import StringIO

s1 = '''
email_ad,Band 30,Band 40
[email protected],,
[email protected],,
'''

s2= '''
email_ad,Name,Manager_Name,Manager_Band_level
[email protected],Tom Banks,Boss1,30
[email protected],Bill Bob,Boss2,40
'''


df1 = pd.read_csv(StringIO(s1))
df2 = pd.read_csv(StringIO(s2))

And then do the merge

df3 = df1.merge(df2, how='left', left_on='email_ad', right_on='email_ad')
for i in (30,40):
    df3.loc[df3.Manager_Band_level == i, f'Band {i}'] = \
        df3.loc[df3.Manager_Band_level == i, 'Manager_Name']

df3[['email_ad', 'Band 30', 'Band 40']]

#       email_ad    Band 30 Band 40
# 0 [email protected] Boss1   NaN
# 1 [email protected]  NaN Boss2

Upvotes: 0

Related Questions