Reputation: 424
I have two dataframes with different indexes, and I want to sum the same column across them. I tried the following, but it gives NaN values:
result['Anomaly'] = df['Anomaly'] + tmp['Anomaly']
df
date Anomaly
0 2018-12-06 0
1 2019-01-07 0
2 2019-02-06 1
3 2019-03-06 0
4 2019-04-06 0
tmp
date Anomaly
0 2018-12-06 0
1 2019-01-07 1
4 2019-04-06 0
result
date Anomaly
0 2018-12-06 0.0
1 2019-01-07 NaN
2 2019-02-06 1.0
3 2019-03-06 0.0
4 2019-04-06 0.0
What I want is actually:
result
date Anomaly
0 2018-12-06 0
1 2019-01-07 1
2 2019-02-06 1
3 2019-03-06 0
4 2019-04-06 0
Upvotes: 1
Views: 1828
Reputation: 76
You can concatenate the two dataframes and then sum Anomaly per date:
pd.concat([df, tmp]).groupby('date', as_index=False)["Anomaly"].sum()
date Anomaly
0 2018-12-06 0
1 2019-01-07 1
2 2019-02-06 1
3 2019-03-06 0
4 2019-04-06 0
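For reference, a self-contained sketch of this approach, with the two frames rebuilt from the question. The dates are kept as plain strings here, which is an assumption, since the question does not show the column dtype.

```python
import pandas as pd

# Frames rebuilt from the question; dates as plain strings is an assumption.
df = pd.DataFrame({
    "date": ["2018-12-06", "2019-01-07", "2019-02-06", "2019-03-06", "2019-04-06"],
    "Anomaly": [0, 0, 1, 0, 0],
})
tmp = pd.DataFrame(
    {"date": ["2018-12-06", "2019-01-07", "2019-04-06"], "Anomaly": [0, 1, 0]},
    index=[0, 1, 4],
)

# Stack the rows of both frames, then sum Anomaly over rows sharing a date.
# The original integer indexes play no role; only the date column is used.
result = pd.concat([df, tmp]).groupby("date", as_index=False)["Anomaly"].sum()
print(result)
```

Because the sum runs over whatever rows exist for each date, a date present in only one frame is kept as-is and no NaN is introduced.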
Upvotes: 3
Reputation: 863291
You need to align on the dates here, so first use DataFrame.set_index to move date into the index (a DatetimeIndex, if the column holds datetimes), and then use Series.add with fill_value=0:
df = df.set_index('date')
tmp = tmp.set_index('date')
result = df['Anomaly'].add(tmp['Anomaly'], fill_value=0).reset_index()
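A runnable sketch of the same idea, which also shows where the NaN values come from. The frames are rebuilt from the question, the conversion with pd.to_datetime is an assumption about the column dtype, and the final astype(int) is an addition that is not part of the answer above.

```python
import pandas as pd

# Frames rebuilt from the question; converting to datetimes is an assumption.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2018-12-06", "2019-01-07", "2019-02-06", "2019-03-06", "2019-04-06"]
    ),
    "Anomaly": [0, 0, 1, 0, 0],
})
tmp = pd.DataFrame({
    "date": pd.to_datetime(["2018-12-06", "2019-01-07", "2019-04-06"]),
    "Anomaly": [0, 1, 0],
})

a = df.set_index("date")["Anomaly"]
b = tmp.set_index("date")["Anomaly"]

# Plain "+" aligns on the index and yields NaN for dates missing from one side.
naive = a + b

# fill_value=0 treats a date missing from one side as 0 before adding.
# astype(int) restores the integer dtype lost during alignment (an addition).
result = a.add(b, fill_value=0).astype(int).reset_index()
print(result)
```

Here naive contains NaN for the two dates that tmp lacks, while result has a value for every date.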
Upvotes: 3
Reputation: 149125
You must first set the correct indices on your dataframes, and then add using the date indices:
tmp1 = tmp.set_index('date')
result = df.set_index('date')
result.loc[tmp1.index] += tmp1
result.reset_index(inplace=True)
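A self-contained sketch of this in-place update, with the frames rebuilt from the question and dates kept as strings (an assumption). It is a small variation on the code above: the Anomaly column is named explicitly in the .loc call so that the update operates on a single Series.

```python
import pandas as pd

# Frames rebuilt from the question; dates as plain strings is an assumption.
df = pd.DataFrame({
    "date": ["2018-12-06", "2019-01-07", "2019-02-06", "2019-03-06", "2019-04-06"],
    "Anomaly": [0, 0, 1, 0, 0],
})
tmp = pd.DataFrame(
    {"date": ["2018-12-06", "2019-01-07", "2019-04-06"], "Anomaly": [0, 1, 0]},
    index=[0, 1, 4],
)

result = df.set_index("date")
tmp1 = tmp.set_index("date")

# Update only the rows whose date appears in tmp1; other rows are untouched.
result.loc[tmp1.index, "Anomaly"] += tmp1["Anomaly"]
result = result.reset_index()
print(result)
```

This relies on every date in tmp also being present in df. A date that exists only in tmp makes the .loc lookup raise a KeyError, so this variant suits the case where tmp is a subset of df.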
Upvotes: 0
Reputation: 75120
Since the two frames already share an integer index, combine_first also reproduces the desired output: it takes the value from tmp where one exists and falls back to df otherwise. This picks a value rather than adding, so it equals the sum only when df is 0 at every index that tmp covers:
res = pd.DataFrame({'date':df.date,'Anomaly':tmp.Anomaly.combine_first(df.Anomaly)})
print(res)
date Anomaly
0 2018-12-06 0.0
1 2019-01-07 1.0
2 2019-02-06 1.0
3 2019-03-06 0.0
4 2019-04-06 0.0
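To make the difference between combine_first and a real sum concrete, here is a small sketch. The values are hypothetical and differ from the question's data: both Series hold a 1 at index 1.

```python
import pandas as pd

# Hypothetical values (not the question's data): both sides flag index 1.
df_anomaly = pd.Series([0, 1, 1, 0, 0], name="Anomaly")
tmp_anomaly = pd.Series([0, 1, 0], index=[0, 1, 4], name="Anomaly")

# combine_first keeps tmp's value where present, otherwise df's value.
picked = tmp_anomaly.combine_first(df_anomaly)

# Series.add with fill_value=0 performs an actual element-wise sum.
summed = df_anomaly.add(tmp_anomaly, fill_value=0)

print(picked.tolist())
print(summed.tolist())
```

At index 1, picked holds 1 while summed holds 2, so the two approaches agree only when the frames never overlap on a nonzero value.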
Upvotes: 2