TK7372
TK7372

Reputation: 49

Merging two dataframes based on common values using pandas python?

This is my first question on stackoverflow so please bear with me.
I'm new to pandas, I need to merge two dataframes based on values in columns:

df1:

   Col1  Col2
0   A    10
1   B    12
2   C    22
3   D    22
4   E    22
5   F    11
6   G    55
7   H    60

df2:

   Col3 Col4
0   J    11
1   K    22
2   L    22
3   M    60
4   N    80
5   O    45
6   P    12
7   Q    65

The resulting dataframe should be like:

  Col1  Col2    Col3    Col4
0   B    12       P      12
1   C    22       K      22
2   D    22       L      22
3   H    60       M      60

It should compare Col2 and Col4 and only keep common rows and merge two data frames into one. I tried following code but result is not what I desire.

both_DFS=df1.merge(df2,left_on='Col2',right_on='Col4')

For this result is:

  Col1  Col2    Col3    Col4
0   B    12       P      12
1   C    22       K      22
2   C    22       L      22
3   D    22       K      22
4   D    22       L      22
5   E    22       K      22
6   E    22       L      22
7   H    60       M      60
 

Upvotes: 2

Views: 376

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150735

You need to enumerate the duplicate rows with groupby().cumcount then merge:

(df1.assign(idx=df1.groupby('Col2').cumcount())
    .merge(df2.assign(idx=df2.groupby('Col4').cumcount()),
           left_on=['Col2','idx'],
           right_on=['Col4','idx'])
)

Output:

  Col1  Col2  idx Col3  Col4
0    B    12    0    P    12
1    C    22    0    K    22
2    D    22    1    L    22
3    F    11    0    J    11
4    H    60    0    M    60

Upvotes: 1

Related Questions