SR5
SR5

Reputation: 33

Combine two pandas DataFrames with one unique column each and keep the row index (Python)

I tried to combine two dataframes into one. Each dataframe has one unique column name. The row indices may overlap between both dataframes, but there are also unique ones.

Here is an example:

df1  = pd.DataFrame({'ColumnA': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])
df2  = pd.DataFrame({'ColumnB': [5, 6, 7, 8]}, index=['a', 'b', 'e', 'f'])  

I expect as a result the following:

 ColumnA   ColumnB
a 1        5
b 2        6
c 3        NaN
d 4        NaN
e NaN      7
f NaN      8

I tried the following:

df1 = df1.merge(df2, left_index=True, right_index=True)

I also tried this:

df1 = pd.concat([df1, df2], axis=1, sort=False)

Both with the results, that unique rows are gone.

Can anyone help me here? Thanks.

Upvotes: 3

Views: 720

Answers (3)

Ch3steR
Ch3steR

Reputation: 20669

df.merge has a parameter how using 'outer' yields desired results.

df1.merge(df2, how='outer', left_index=True, right_index=True)

Output:

   ColumnA  ColumnB
a      1.0      5.0
b      2.0      6.0
c      3.0      NaN
d      4.0      NaN
e      NaN      7.0
f      NaN      8.0

Upvotes: 1

Valdi_Bo
Valdi_Bo

Reputation: 30971

You tried merge and concat, but there is yet another method.

Run:

df1.join(df2, how='outer')

Upvotes: 2

Umar.H
Umar.H

Reputation: 23099

IIUC combine_first or join

df1.combine_first(df2)

   ColumnA  ColumnB
a      1.0      5.0
b      2.0      6.0
c      3.0      NaN
d      4.0      NaN
e      NaN      7.0
f      NaN      8.0

or

df1.join(df2,how='outer')

   ColumnA  ColumnB
a      1.0      5.0
b      2.0      6.0
c      3.0      NaN
d      4.0      NaN
e      NaN      7.0
f      NaN      8.0

Upvotes: 3

Related Questions