Firdhaus Saleh
Firdhaus Saleh

Reputation: 189

Pandas mapping data from 2 dataframe with different column names

I tried to map these two dataframes but I failed. Maybe because the column names and it's value is a little bit different.

I wanted to create a new dataframe like in the dfNew.

df

Employee ID Employee Name   Activity Month
A0001       John Smith      Apr-19
A0002       Will Cornor     Apr-19
A0001       John Smith      May-19
A0003       David Teo       May-19
A0001       John Smith      May-19
A0002       Will Cornor     Jun-19
A0001       John Smith      Jun-19

df2

Month       Bonus
2019-04-01  5000
2019-05-01  4000
2019-06-01  6000

dfNew

Employee ID Employee Name   Activity Month  Bonus
A0001       John Smith      Apr-19          5000
A0002       Will Cornor     Apr-19          5000
A0001       John Smith      May-19          4000
A0003       David Teo       May-19          4000
A0001       John Smith      May-19          4000
A0002       Will Cornor     Jun-19          6000
A0001       John Smith      Jun-19          6000

Upvotes: 2

Views: 1254

Answers (2)

Firdhaus Saleh
Firdhaus Saleh

Reputation: 189

This is the answer for now based on @jezrael suggestion

df1['Activity Month'] = pd.to_datetime(df1['Activity Month'], format='%b-%y').dt.strftime('%b-%y')

df2['Month'] = pd.to_datetime(df2['Month'], format='%Y-%m-%d').dt.strftime('%b-%y')

df2['Activity Month'] = df2.pop('Month')
df1 = df1.merge(df2, on='Activity Month', how='left')

Upvotes: 0

jezrael
jezrael

Reputation: 863631

Use Series.dt.strftime fr change format of datetimes, so possible Series.map:

s = df2.set_index(df2['Month'].dt.strftime('%b-%y'))['Bonus']
df1['Bonus'] = df1['Activity Month'].map(s)
print (df1)
  Employee     ID Employee Name Activity Month  Bonus
0    A0001   John         Smith         Apr-19   5000
1    A0002   Will        Cornor         Apr-19   5000
2    A0001   John         Smith         May-19   4000
3    A0003  David           Teo         May-19   4000
4    A0001   John         Smith         May-19   4000
5    A0002   Will        Cornor         Jun-19   6000
6    A0001   John         Smith         Jun-19   6000

Or use DataFrame.merge with DataFrame.pop for new column with remove original:

df2['Activity Month'] = df2.pop('Month').dt.strftime('%b-%y')
df1 = df1.merge(df2, on='Activity Month', how='left')

Upvotes: 4

Related Questions