Reputation: 189
I am attempting to merge to dataframes on 1 column for which I would like the output of duplicates to be an extra column instead of a new row.
What happens now:
df1 = pd.DataFrame({'A': ['A0'],
'B': ['B0']})
df2 = pd.DataFrame({'A': ['A0', 'A0'],
'C': ['C4', 'C5']})
df1.merge(df2, on = 'A', how = 'left')
Gives the output:
A B C
0 A0 B0 C4
1 A0 B0 C5
What I would like the output to be:
A B C_1 C_2
0 A0 B0 C4 C5
Thanks!
Upvotes: 2
Views: 236
Reputation: 1721
In one line of code:
df1.merge(df2.assign(Cs=range(0,len(df2))).pivot(index='A',columns='Cs'),on='A')
A B (C, 0) (C, 1)
0 A0 B0 C4 C5
Upvotes: 0
Reputation: 862511
Create unique values of column A
in df2
by MultiIndex
by DataFrame.set_index
with counter column by GroupBy.cumcount
, reshape by Series.unstack
and flatten Multiindex
by map
with join
::
df2 = df2.set_index(['A', df2.groupby('A').cumcount().add(1).astype(str)]).unstack()
df2.columns = df2.columns.map('_'.join)
df2 = df2.reset_index()
print (df2)
A C_1 C_2
0 A0 C4 C5
df = df1.merge(df2, on = 'A', how = 'left')
print (df)
A B C_1 C_2
0 A0 B0 C4 C5
Upvotes: 4