Maccaroni123
Maccaroni123

Reputation: 31

Map one dataframe to another depending on column

I have two dataframes, depending on which column of df1 a value from df2 occurs, a value from df2 has to be copied to another specific column of df1.

I tried looping through the dataframe, but it takes ages due to the amount of data. I'm sure there must be some "pandaic" way of doing this.

import pandas as pd

data1 = {'A': ['X1', 'Y1'],
        'B': ['X2', 'Y2'],
        'A1': ['NaN','NaN'],
        'B1': ['NaN','NaN'],
        }

data2 = {'AB': ['X1', 'Y2', 'X2','Y1'],
        'D': ['D1', 'D4', 'D3', 'D2'],
        }
df1=pd.DataFrame(data1, columns=['A','B','A1','B1'])
df2=pd.DataFrame(data2, columns=['AB','D'])

If a value of df1 column A is found in df2, the corresponding value for D has to be copied to column A1. If a value of df1 column B is found in df2, the value of D has to be copied to B1 etc...

It is possible that either a value in A/B is NaN or that there is no corresponding value in df2.

data1_goal = {'A': ['X1', 'Y1'],
        'B': ['X2', 'Y2'],
        'A1': ['D1', 'D2'],
        'B1': ['D3', 'D4'],
        }
df_goal=pd.DataFrame(data1_goal, columns=['A','B','A1','B1'])

is there an elegant way of doing this beside iterating through df1?

Thank you!

Upvotes: 3

Views: 57

Answers (2)

anky
anky

Reputation: 75100

Here one way:

m=df1.loc[:,['A','B']].melt().merge(df2,left_on='value',right_on='AB').drop('AB',1)
n=m.assign(k=m.groupby('variable').cumcount()).set_index(['variable','k']).unstack(0)
df1=df1.mask(df1.eq('NaN'),n.values) #df1.mask(df1.isna(),n.values) if actual NaN
print(df1)

    A   B  A1  B1
0  X1  X2  D1  D3
1  Y1  Y2  D2  D4

Upvotes: 1

mujjiga
mujjiga

Reputation: 16896

 df1.merge(df2, left_on='A', right_on='AB').merge(
     df1.merge(df2, left_on='B', right_on='AB'), 
     on=['A','B'])[['A', 'B', 'D_x','D_y']].rename(
     columns={'D_x': 'A1','D_y':'B1'})

Upvotes: 1

Related Questions