Reputation: 968
I have two dataframes with two shared columns (Symbol
, Date
) that I am trying to merge on:
df1
Company Symbol ID Date Price
0 A Inc AA 123 2019-03-31 1.0
1 A Inc AA 123 2019-06-30 NaN
2 A Inc AA 123 2019-09-30 3.0
3 B Inc BB 456 2019-03-31 5.0
4 B Inc BB 456 2019-06-30 6.0
5 B Inc BB 456 2019-09-30 7.0
6 X Inc XX 999 2019-03-31 9.0
df2
Symbol Date Price Income
0 AA 2019-03-31 1.1 10
1 AA 2019-06-30 2.1 11
2 AA 2019-09-30 3.1 12
3 BB 2019-03-31 5.1 14
4 BB 2019-06-30 6.1 15
5 BB 2019-09-30 7.1 16
6 ZZ 2019-03-31 8.0 20
If Price
exists in df1
and df2
, then I want to use the one from df1
; however, in cases where it doesn't exist in df1
, I want to use the one from df2
, so that the output looks like:
Company Symbol ID Date Price Income
0 A Inc AA 123.0 2019-03-31 1.0 10.0
1 A Inc AA 123.0 2019-06-30 2.1 11.0
2 A Inc AA 123.0 2019-09-30 3.0 12.0
3 B Inc BB 456.0 2019-03-31 5.0 14.0
4 B Inc BB 456.0 2019-06-30 6.0 15.0
5 B Inc BB 456.0 2019-09-30 7.0 16.0
6 X Inc XX 999.0 2019-03-31 9.0 NaN
7 NaN ZZ NaN 2019-03-31 8.0 20.0
df3 = pd.merge(df1, df2, on=['Date', 'Symbol'], how='outer')
gets me close, but how do I replace the NaN
values in Price_x
with the value in the Price_y
column before dropping Price_y
?
Upvotes: 0
Views: 104
Reputation: 323226
You can fix it after the merge
df3 = pd.merge(df1, df2, on=['Date', 'Symbol'], how='outer')
df3['Price']=df3.Price_x.fillna(df3.Price_y)
df3=df3.drop(['Price_x','Price_y'],axis=1)
Upvotes: 1