Pandas - when nan, add value from another dataframe

Question

I have some code like:

import pandas as pd
import numpy as np

df = pd.DataFrame({'COL1': [15, np.nan, np.nan, 24],
                   'COL2': [7.2, 3.5, np.nan, 0.1]})

df2 = pd.DataFrame({'COL1_add': [2, 3, 5, 2],
                    'COL2_add': [-2.3, -5.3, -3.8, -4.5]})

I'd like to use df2 to fill the NaNs in df by adding its values to the value in the previous row. There can be tens, if not hundreds, of consecutive NaNs (df2 will have no NaNs), so I can't just do a simple shift and add.

For this example, I'd like the result to be:

df
   COL1  COL2
0    15   7.2
1    18   3.5
2    23  -0.3
3    24   0.1

Any suggestions?

akuiper · Accepted Answer

Assuming df and df2 have the same shape, you can use combine with a lambda function to do the custom calculation. Fill the missing values in df (x in the lambda) with the values from df2 (y), then use groupby.cumsum to accumulate the values within each NaN chunk, where a chunk starts at a non-missing value of df and runs through the NaNs that follow it:

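# combine pairs columns by name, so give df2 the same column names as df
df2.columns = df.columns

# x is a column of df, y is the matching column of df2
df.combine(df2, lambda x, y: x.fillna(y).groupby(x.notnull().cumsum()).cumsum())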

#   COL1    COL2
#0  15.0     7.2
#1  18.0     3.5
#2  23.0    -0.3
#3  24.0     0.1
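
To see why this works, it may help to unpack the lambda for a single column. The following is only a sketch using COL1 from the question; the names col, add, chunk_id, filled and result are made up for illustration and are not part of the answer above.

```python
import numpy as np
import pandas as pd

# COL1 from df and the matching increments from df2 (illustrative names)
col = pd.Series([15, np.nan, np.nan, 24])
add = pd.Series([2, 3, 5, 2])

# Every non-NaN value starts a new chunk; the NaNs after it share its label
chunk_id = col.notna().cumsum()

# Drop the increments from df2 into the NaN slots
filled = col.fillna(add)

# A running sum within each chunk stacks the increments on the last known value
result = filled.groupby(chunk_id).cumsum()

print(chunk_id.tolist())  # [1, 1, 1, 2]
print(filled.tolist())    # [15.0, 3.0, 5.0, 24.0]
print(result.tolist())    # [15.0, 18.0, 23.0, 24.0]
```

Because the chunk label only changes at a non-NaN value, the running sum restarts at every known value, so a run of any length is handled without shifting.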
