Reputation: 279
I would like to merge
two DataFrames, but the matching key in the 2nd DataFrame is scattered in two different columns. What is the best way to merge the two DF?
import pandas as pd
data1 = {'key': ['abc','efg', 'xyz', 'sdf']}
data2 = {'key1' : ['abc','sss','ggg','ccc'],
'key2' : ['aaa','efg','xyz', 'sdf'],
'msg' : ['happy','mad','smile','great']}
df1= pd.DataFrame(data1)
df2= pd.DataFrame(data2)
Upvotes: 0
Views: 1042
Reputation: 1284
If both dataframes have a relation where row1 of df1 corresponds to row1 of df2 and so on then you can simply merge on index.
Code:
df = pd.merge(df1, df2, left_index=True, right_index=True)
print(df)
Output:
key key1 key2 msg
0 abc abc aaa happy
1 efg sss efg mad
2 xyz ggg xyz smile
3 sdf ccc sdf great
Upvotes: 0
Reputation: 146
easiest way to fix this would be to melt the dataframe and merge the dataframes
data1 = {'key': ['abc','efg', 'xyz', 'sdf']}
data2 = {'key1' : ['abc','sss','ggg','ccc'],
'key2' : ['aaa','efg','xyz', 'sdf'],
'msg' : ['happy','mad','smile','great']}
data1 = pd.DataFrame(data1)
data2 = pd.DataFrame(data2).melt(id_vars='msg', value_name='key').drop('variable', axis=1)
df = data1.merge(data2, on='key')
df =
key msg
0 abc happy
1 efg mad
2 xyz smile
3 sdf great
Upvotes: 1