Reputation: 459
I can best explain what I need to achieve with an example:
Though both DataFrames contain duplicates, the values in the 'first_name' column are different. Now I want to merge the two, with output something like this:
df_a.merge(df_b, on='subject_id', how='left')
A plain pandas merge will not give this output because of the duplicates. How can I get my desired output? Any other suggestions are welcome.
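To make the problem concrete, here is a minimal sketch. The frames and names below are hypothetical stand-ins, not the original data. It shows how a repeated `subject_id` on both sides makes every left row pair with every matching right row:

```python
import pandas as pd

# Hypothetical data: subject_id 1 appears twice in each frame.
df_a = pd.DataFrame({"subject_id": [1, 1, 2],
                     "first_name": ["Alex", "Amy", "Allen"]})
df_b = pd.DataFrame({"subject_id": [1, 1, 2],
                     "first_name": ["Billy", "Brian", "Bran"]})

merged = df_a.merge(df_b, on="subject_id", how="left")

# Each of the two subject_id == 1 rows in df_a matches both such rows in df_b,
# so that key alone contributes 2 * 2 = 4 rows instead of 2.
print(merged)
print(len(merged))
```

The overlapping 'first_name' column gets the default `_x` and `_y` suffixes in the merged result.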
Upvotes: 1
Views: 50
Reputation: 863166
I believe you need a helper column created by GroupBy.cumcount in each DataFrame. Use it together with 'subject_id' as the merge keys, then remove it at the end:
df_a['g'] = df_a.groupby('subject_id').cumcount()
df_b['g'] = df_b.groupby('subject_id').cumcount()
df_a.merge(df_b, on=['subject_id', 'g'], how='left').drop('g', axis=1)
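As an end-to-end sketch of the same idea, assuming made-up sample data (the names below are hypothetical, not from the question), `cumcount` numbers the rows within each `subject_id` group as 0, 1, 2, and so on, so the n-th duplicate on the left pairs only with the n-th duplicate on the right. This variant uses `assign` so the input frames are left unmodified, which is a small departure from the snippet above:

```python
import pandas as pd

# Hypothetical data: duplicated subject_id values on both sides.
df_a = pd.DataFrame({"subject_id": [1, 1, 2],
                     "first_name": ["Alex", "Amy", "Allen"]})
df_b = pd.DataFrame({"subject_id": [1, 1, 2, 2],
                     "first_name": ["Billy", "Brian", "Bran", "Bryce"]})

# Helper column 'g': position of each row within its subject_id group.
a = df_a.assign(g=df_a.groupby("subject_id").cumcount())
b = df_b.assign(g=df_b.groupby("subject_id").cumcount())

# Merge on the key plus the per-group counter, then drop the helper column.
result = a.merge(b, on=["subject_id", "g"], how="left").drop(columns="g")
print(result)
```

Because this is a left join, a right-hand row with no counterpart at the same position (the second `subject_id == 2` row in `df_b` here) is dropped; `how='outer'` would keep it.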
Upvotes: 2