Reputation: 2080
Say I have two data frames,
df1.head()
AH1
A
B
C
D
F
The second data frame df2.head()
AH1 BC
A,B SAP
C,D XY
A,B,C BMZ
A,F Zoom
What I need is to merge the columns AH1
from two data frames df1
and df2
based on the same values. In the end, I am aiming to have something like this,
output
AH1 BC
A SAP, BMZ,Zoom
B SAP,BMZ
C XY,BMZ
D XY
F Zoom
Any help or suggestions are appreciated. I know pandas.merge
would go however not between column vector and column of the list.
Upvotes: 0
Views: 63
Reputation: 150735
You can do an explode
:
(df2.assign(AH1=df2.AH1.str.split(','))
.explode('AH1')
.groupby('AH1')
['BC'].agg(list)
)
Output:
AH1
A [SAP, BMZ, Zoom]
B [SAP, BMZ]
C [XY, BMZ]
D [XY]
F [Zoom]
Name: BC, dtype: object
Or replacing agg(list)
with agg(', '.join)
:
(df2.assign(AH1=df2.AH1.str.split(','))
.explode('AH1')
.groupby('AH1')
['BC'].agg(', '.join)
)
Output:
AH1
A SAP, BMZ, Zoom
B SAP, BMZ
C XY, BMZ
D XY
F Zoom
Name: BC, dtype: object
Upvotes: 1