Reputation: 424

NaN values when adding two columns

I have two dataframes with different indexes, and I want to sum the same column from both of them. I tried the following, but it gives NaN values:

result['Anomaly'] = df['Anomaly'] + tmp['Anomaly']

df
    date           Anomaly
0 2018-12-06         0
1 2019-01-07         0
2 2019-02-06         1
3 2019-03-06         0
4 2019-04-06         0

tmp
    date           Anomaly
0 2018-12-06         0
1 2019-01-07         1
4 2019-04-06         0

result
    date           Anomaly
0 2018-12-06        0.0
1 2019-01-07        NaN
2 2019-02-06        1.0
3 2019-03-06        0.0
4 2019-04-06        0.0

What I want is actually:

result
    date           Anomaly
0 2018-12-06         0
1 2019-01-07         1
2 2019-02-06         1
3 2019-03-06         0
4 2019-04-06         0

Upvotes: 1

Answers (4)

Greeser

Reputation: 76

You can concatenate the two dataframes and sum by date:

pd.concat([df, tmp]).groupby('date', as_index=False)["Anomaly"].sum()

         date  Anomaly
0  2018-12-06        0
1  2019-01-07        1
2  2019-02-06        1
3  2019-03-06        0
4  2019-04-06        0
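
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The construction of df and tmp is my reconstruction from the tables printed in the question (including tmp's index of 0, 1, 4), so treat it as an assumption about the real data:

```python
import pandas as pd

# Frames rebuilt from the printed tables in the question (assumed layout).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2018-12-06", "2019-01-07", "2019-02-06", "2019-03-06", "2019-04-06"]
    ),
    "Anomaly": [0, 0, 1, 0, 0],
})
tmp = pd.DataFrame(
    {
        "date": pd.to_datetime(["2018-12-06", "2019-01-07", "2019-04-06"]),
        "Anomaly": [0, 1, 0],
    },
    index=[0, 1, 4],  # non-contiguous index, as shown in the question
)

# Stack the rows of both frames, then sum Anomaly within each date.
result = pd.concat([df, tmp]).groupby("date", as_index=False)["Anomaly"].sum()
print(result)
```

Because the grouping key is the date column, the original integer indexes of df and tmp play no role here.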

Upvotes: 3

jezrael

Reputation: 863291

You need to align by datetimes here, so first use DataFrame.set_index to get a DatetimeIndex, and then use Series.add with fill_value=0:

df = df.set_index('date')
tmp = tmp.set_index('date')
result = df['Anomaly'].add(tmp['Anomaly'], fill_value=0).reset_index()
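
A self-contained sketch of the same idea, assuming the frames look like the tables in the question (the sample construction and the names left and right are mine). A plain + aligns on index labels and yields NaN for any label present on only one side; add with fill_value=0 treats the missing side as 0 instead. The sum may come back as float, so the sketch casts it back to int at the end:

```python
import pandas as pd

# Frames rebuilt from the printed tables in the question (assumed layout).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2018-12-06", "2019-01-07", "2019-02-06", "2019-03-06", "2019-04-06"]
    ),
    "Anomaly": [0, 0, 1, 0, 0],
})
tmp = pd.DataFrame(
    {
        "date": pd.to_datetime(["2018-12-06", "2019-01-07", "2019-04-06"]),
        "Anomaly": [0, 1, 0],
    },
    index=[0, 1, 4],
)

# Index both columns by date so alignment happens on dates, not row numbers.
left = df.set_index("date")["Anomaly"]
right = tmp.set_index("date")["Anomaly"]

# Dates missing from one side are treated as 0 rather than producing NaN.
result = left.add(right, fill_value=0).astype(int).reset_index()
print(result)
```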

Upvotes: 3

Serge Ballesta

Reputation: 149125

You must first set correct indices on your dataframes, and then add using the date indices:

tmp1 = tmp.set_index('date')
result = df.set_index('date')
result.loc[tmp1.index] += tmp1
result.reset_index(inplace=True)

Upvotes: 0

anky

Reputation: 75120

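Use Series.combine_first(). Note that it takes the value from tmp where one exists and falls back to df otherwise, rather than adding the two, so it matches the sum only for data like this example: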

res = pd.DataFrame({'date':df.date,'Anomaly':tmp.Anomaly.combine_first(df.Anomaly)})
print(res)

         date  Anomaly
0  2018-12-06      0.0
1  2019-01-07      1.0
2  2019-02-06      1.0
3  2019-03-06      0.0
4  2019-04-06      0.0
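
A small sketch of how this differs from a true sum. The two series below are hypothetical values chosen for illustration, not data from the question; they overlap with a 1 on both sides at label 0:

```python
import pandas as pd

# Hypothetical data: both series hold 1 at label 0.
a = pd.Series([1, 0, 1], index=[0, 1, 2])
b = pd.Series([1, 1], index=[0, 1])

# combine_first keeps b's value where b has one, otherwise falls back to a.
picked = b.combine_first(a)

# add with fill_value=0 actually sums, treating missing labels as 0.
summed = a.add(b, fill_value=0)

print(picked.tolist())
print(summed.tolist())
```

With 0/1 flags where at most one side is 1 for any row, the two approaches agree, which is why the output above matches the desired result.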

Upvotes: 2
