Pandas merging rows with same values based on multiple columns

Question

I have a sample dataset like this:

Col1 Col2 Col3
A 1,2,3 A123
A 4,5 A456
A 1,2,3 A456
A 4,5 A123

I want to merge Col2 and Col3 into a single row for each unique value of Col1.

Expected Result:

Col1 Col2 Col3
A 1,2,3,4,5 A123,A456

I looked at some existing solutions and tried the following, but it only combines a single column.

df.groupby(df.columns.difference(['Col3']).tolist())\
                 .Col3.apply(pd.Series.unique).reset_index()

sammywemmy · Accepted Answer

Drop duplicates on the subset Col1 and Col3.
Group by Col1.
Then aggregate each column with the string concatenation method, str.cat.

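(
    # keep the first row for each (Col1, Col3) pair
    df.drop_duplicates(['Col1', 'Col3'])
      .groupby('Col1')
      # join the remaining values of each column with commas
      .agg(
          Col2=('Col2', lambda x: x.str.cat(sep=',')),
          Col3=('Col3', lambda x: x.str.cat(sep=',')),
      )
      .reset_index()
)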

   Col1       Col2       Col3
0     A  1,2,3,4,5  A123,A456
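
Note that the drop_duplicates step above only looks at Col1 and Col3. It gives the expected result here because the rows it keeps happen to hold each distinct Col2 value exactly once. A variant that does not depend on that is to de-duplicate each column separately inside the aggregation. The following is a minimal sketch of that idea, not part of the original answer: the DataFrame is rebuilt from the sample in the question, and join_unique is a hypothetical helper name introduced for illustration. It treats each cell as one value, so it removes repeated cells such as "1,2,3" but does not split cells into individual numbers.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["A", "A", "A", "A"],
    "Col2": ["1,2,3", "4,5", "1,2,3", "4,5"],
    "Col3": ["A123", "A456", "A456", "A123"],
})


def join_unique(values: pd.Series) -> str:
    """Join the distinct cells of one group with commas, in first-seen order.

    Hypothetical helper, not a pandas function.
    """
    return ",".join(values.drop_duplicates())


# De-duplicate Col2 and Col3 independently within each Col1 group
result = df.groupby("Col1", as_index=False).agg(
    Col2=("Col2", join_unique),
    Col3=("Col3", join_unique),
)
print(result)
```

For the sample data this should give the same single row as the answer above. The difference only shows up when the combinations of Col2 and Col3 are less regular than in the sample.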
