Chau362
Chau362

Reputation: 37

Merge dataframes with duplicate keys

I have a dataframe df1 in the form of:

df1:

      a   b  
1     x   bb
2     y   ba

I want to merge that with another dataframe df2 that has multiple rows where the column 'a' is matching the index of df1:

df2:

      a   c   d   e
1     1   cc  dd  ee
2     1   cd  de  ef
3     1   dd  ef  ff
4     2   ff  fg  fh
5     2   fg  fh  ff

I tried merging but I got lots of duplicates for df1 looking like this:

      a   b   c   d   e
1     x   bb  cc  dd  ee
1     x   bb  cd  de  ef
1     x   bb  dd  ef  ff

2     y   ba  ff  fg  fh
2     y   ba  fg  fh  ff

How can I merge them to achieve something like this:

      a   b   c   d   e
1     x   bb  cc  dd  ee
              cd  de  ef
              dd  ef  ff

2     y   ba  ff  fg  fh
              fg  fh  ff

I read about stacking but I dont really want to have multiple indices. Any help would be appreciated!

Upvotes: 0

Views: 517

Answers (1)

JE_Muc
JE_Muc

Reputation: 5774

You can merge these two dataframes using the pd.merge method. I assume you want to broadcast the values of b. Else please provide a fill-value for b.

If you want to have the columns a and b as part of a MultiIndex and retain the index numbering of df1, merge the dataframes like this:

df_new = pd.merge(df1.reset_index(), df2).set_index(['index', 'a', 'b'])

If you want to keep a and b as columns, merge the dataframes like this:

df_new = pd.merge(df1.reset_index(), df2).set_index('index')

Upvotes: 1

Related Questions