Michieldo
Michieldo

Reputation: 189

How to merge duplicates as new columns

I am attempting to merge to dataframes on 1 column for which I would like the output of duplicates to be an extra column instead of a new row.

What happens now:

df1 = pd.DataFrame({'A': ['A0'],
                     'B': ['B0']})

df2 = pd.DataFrame({'A': ['A0', 'A0'],
                     'C': ['C4', 'C5']})

df1.merge(df2, on = 'A', how = 'left')

Gives the output:

    A   B   C
0  A0  B0  C4
1  A0  B0  C5

What I would like the output to be:

    A   B   C_1   C_2
0  A0  B0  C4    C5

Thanks!

Upvotes: 2

Views: 236

Answers (2)

Billy Bonaros
Billy Bonaros

Reputation: 1721

In one line of code:

df1.merge(df2.assign(Cs=range(0,len(df2))).pivot(index='A',columns='Cs'),on='A')

    A   B (C, 0) (C, 1)
0  A0  B0     C4     C5

Upvotes: 0

jezrael
jezrael

Reputation: 862511

Create unique values of column A in df2 by MultiIndex by DataFrame.set_index with counter column by GroupBy.cumcount, reshape by Series.unstack and flatten Multiindex by map with join::

df2 = df2.set_index(['A', df2.groupby('A').cumcount().add(1).astype(str)]).unstack()
df2.columns = df2.columns.map('_'.join)
df2 = df2.reset_index()
print (df2)
    A C_1 C_2
0  A0  C4  C5

df = df1.merge(df2, on = 'A', how = 'left')
print (df)
    A   B C_1 C_2
0  A0  B0  C4  C5

Upvotes: 4

Related Questions