Reputation: 783
How to overlay/consolidate two data frames in python, such that overlapping cells (index,column) get added and uncommon cells preserve the values from original data frame?
This is possible in Excel as explained here.
For Eg: I have two data frames
I want the output to be like
Upvotes: 1
Views: 200
Reputation: 862581
You need add
with fillna
, but in df1
and df2
have to be NO NaN
values, because fillna
removed them:
print (df1.add(df2, fill_value=0).fillna(0).astype(int))
a b c
a1 1 2 0
a2 3 514 123
a3 5 539 134
Another solution with union
of both indexes
and reindex
- works if NaN
in DataFrames
:
idx = df1.index.union(df2.index)
print (df1.reindex(idx, fill_value=0))
a b
a1 1 2
a2 3 4
a3 5 6
print (df2.reindex(idx, fill_value=0))
b c
a1 0 0
a2 510 123
a3 533 134
idx = df1.index.union(df2.index)
print (df1.reindex(idx, fill_value=0)
.add(df2.reindex(idx, fill_value=0), fill_value=0)
.astype(int))
a b c
a1 1 2 0
a2 3 514 123
a3 5 539 134
df1 = pd.DataFrame({'a':[1,3,5],
'b':[2,4,6]}, index=['a1','a2','a3'])
print (df1)
a b
a1 1 2
a2 3 4
a3 5 6
df2 = pd.DataFrame({'b':[510,533],
'c':[123,np.nan]}, index=['a2','a3'])
print (df2)
b c
a2 510 123.0
a3 533 NaN
print (df1.reindex(idx, fill_value=0).add(df2.reindex(idx, fill_value=0), fill_value=0))
a b c
a1 1.0 2 0.0
a2 3.0 514 123.0
a3 5.0 539 NaN
print (df1.add(df2, fill_value=0).fillna(0).astype(int))
a b c
a1 1 2 0
a2 3 514 123
a3 5 539 0 <- replace NaN to 0
Upvotes: 1