Reputation: 201
I am trying to sum panda data frames with different start dates.
If there is no value for df2 I want to use 0, but I cant figure out how df1.fillna(0)
also didn't work. Any ideas?
What I have
df1
COL
2020-06-03 50
2020-06-04 51
2020-06-05 52
2020-06-06 49
2020-06-07 46
df2
COL
2020-06-03
2020-06-04
2020-06-05
2020-06-06 10
2020-06-07 10
What I do
portfolio = df1 + df2
or
portfolio = df1.fillna(0) + df2fillna(0)
What I get
df3
COL
2020-06-03 NaN
2020-06-04 NaN
2020-06-05 NaN
2020-06-06 59
2020-06-07 56
What I want
df3
COL
2020-06-03 50
2020-06-04 51
2020-06-05 52
2020-06-06 59
2020-06-07 56
Upvotes: 1
Views: 41
Reputation: 3898
How about this.
# setup
import pandas as pd
import numpy as np
df1 = pd.DataFrame({
"COL": [50, 51, 52, 49, 46]
},
index=['2020-06-03', '2020-06-04', '2020-06-05', '2020-06-06', '2020-06-07'])
df2 = pd.DataFrame({
"COL": [np.nan, np.nan, np.nan, 10, 10]
},
index=['2020-06-03', '2020-06-04', '2020-06-05', '2020-06-06', '2020-06-07'])
# logic
pd.concat([df1, df2]).reset_index().groupby('index').sum()
Output:
COL
index
2020-06-03 50.0
2020-06-04 51.0
2020-06-05 52.0
2020-06-06 59.0
2020-06-07 56.0
Upvotes: 2