Reputation: 33
I tried to combine two dataframes into one. Each dataframe has one unique column name. The row indices may overlap between both dataframes, but there are also unique ones.
Here is an example:
df1 = pd.DataFrame({'ColumnA': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame({'ColumnB': [5, 6, 7, 8]}, index=['a', 'b', 'e', 'f'])
I expect as a result the following:
ColumnA ColumnB
a 1 5
b 2 6
c 3 NaN
d 4 NaN
e NaN 7
f NaN 8
I tried the following:
df1 = df1.merge(df2, left_index=True, right_index=True)
I also tried this:
df1 = pd.concat([df1, df2], axis=1, sort=False)
Both with the results, that unique rows are gone.
Can anyone help me here? Thanks.
Upvotes: 3
Views: 720
Reputation: 20669
df.merge
has a parameter how
using 'outer'
yields desired results.
df1.merge(df2, how='outer', left_index=True, right_index=True)
Output:
ColumnA ColumnB
a 1.0 5.0
b 2.0 6.0
c 3.0 NaN
d 4.0 NaN
e NaN 7.0
f NaN 8.0
Upvotes: 1
Reputation: 30971
You tried merge and concat, but there is yet another method.
Run:
df1.join(df2, how='outer')
Upvotes: 2
Reputation: 23099
IIUC combine_first
or join
df1.combine_first(df2)
ColumnA ColumnB
a 1.0 5.0
b 2.0 6.0
c 3.0 NaN
d 4.0 NaN
e NaN 7.0
f NaN 8.0
or
df1.join(df2,how='outer')
ColumnA ColumnB
a 1.0 5.0
b 2.0 6.0
c 3.0 NaN
d 4.0 NaN
e NaN 7.0
f NaN 8.0
Upvotes: 3