N08
N08

Reputation: 1315

Performing outer join that merges joined columns

I am performing an outer join on two DataFrames:

df1 = pd.DataFrame({'id':      [1, 2, 3, 4, 5], 
                    'date':    [4, 5, 6, 7, 8],
                    'str':     ['a', 'b', 'c', 'd', 'e']})

df2 = pd.DataFrame({'id':      [1, 2, 3, 4, 6], 
                    'date':    [4, 5, 6, 7, 8],
                    'str':     ['A', 'B', 'C', 'D', 'Q']})

pd.merge(df1, df2, on=["id","date"], how="outer")

This gives the result

   date  id str_x str_y
0     4   1     a     A
1     5   2     b     B
2     6   3     c     C
3     7   4     d     D
4     8   5     e   NaN
5     8   6   NaN     Q

Is it possible to perform the outer join such that the str-columns are concatenated? In other words, how to perform the join such that we get the DataFrame

   date  id     str
0     4   1     aA
1     5   2     bB
2     6   3     cC
3     7   4     dD
4     8   5     e
5     8   6     Q

where all NaN have been set to None.

Upvotes: 2

Views: 50

Answers (2)

piRSquared
piRSquared

Reputation: 294586

If 'id', 'date' is unique in each data frame, then you can set the index and add the dataframes.

icols = ['date', 'id']
df1.set_index(icols).add(df2.set_index(icols), fill_value='').reset_index()

   date  id str
0     4   1  aA
1     5   2  bB
2     6   3  cC
3     7   4  dD
4     8   5   e
5     8   6   Q

Upvotes: 2

jezrael
jezrael

Reputation: 863801

I think not, possible solution is replace NaNs and join together:

df = (pd.merge(df1, df2, on=["id","date"], how="outer", suffixes=('','_'))
        .assign(str=lambda x: x['str'].fillna('') + x['str_'].fillna(''))
        .drop('str_', 1))

Similar alternative:

df = (pd.merge(df1, df2, on=["id","date"], how="outer", suffixes=('','_'))
        .assign(str=lambda x: x.filter(like='str').fillna('').values.sum(axis=1))
        .drop('str_', 1))

print (df)
   date  id str
0     4   1  aA
1     5   2  bB
2     6   3  cC
3     7   4  dD
4     8   5   e
5     8   6   Q

Upvotes: 2

Related Questions