How can I average values from two dataframes (same index) into one df, with conditions on missing values (NaN)?

Question

I work with two dataframes (one is derived from rainfall data from 1981 until now, the other from Vegetation Index data from 2002 until now).

pR:

MonthDekad            01d1        01d2       01d3       02d1       02d2  \
AdminCode Year                                                            
2688      1981    2.702703    2.702703   2.702703   2.702703   2.702703   
          1982   16.216216   21.621622  18.918919  32.432432  54.054054   
          ...........   
          2016   0.166331     0.318759   0.431364   0.492916   0.632023   
          2017  -0.492916    -0.431364        NaN        NaN        NaN

and pV:

MonthDekad          01d1      01d2      01d3      02d1      02d2  \
AdminCode Year                                                               
2688      2002       NaN       NaN       NaN       NaN       NaN        
          2003  0.477121  0.477121  0.477121  0.477121  0.477121       
          ............ 
          2016       NaN  0.636822  0.000000  0.000000  0.000000 
          2017 -0.636822 -0.636822       NaN       NaN       NaN

Both are indexed the same way (multi-indexed: level 0 is the admin code identifying the location, level 1 is the year), and the columns are the dekads of the year.

I need to combine them into one dataframe, following these rules:

If both values at the same index position are numbers, the final value is their average.
Otherwise (if one of the two is missing / NaN), the final value is the one that is not missing (e.g. for 1981 to 2002, only rainfall values).
If both are NaN, the final value is NaN too.

I am stuck on the second condition. So far, I have only thought of

pRV = pR.add(pV, fill_value=0)

which I then divide by 2, but that is a problem when only one of the two values is present, because that single value gets halved too. Any idea how to solve this?
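
To make the problem concrete, here is a minimal sketch with made-up values. The frames `a` and `b` and their numbers are hypothetical stand-ins, not taken from pR or pV.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for pR and pV: one column, three rows.
a = pd.DataFrame({"01d1": [4.0, 6.0, np.nan]})
b = pd.DataFrame({"01d1": [2.0, np.nan, np.nan]})

# Add with missing values treated as 0, then divide everything by 2.
halved = a.add(b, fill_value=0) / 2

print(halved)
# Row 0: (4 + 2) / 2 = 3.0  -> correct average
# Row 1: (6 + 0) / 2 = 3.0  -> wrong, the desired value is 6.0
# Row 2: both missing       -> NaN, as desired
```

Only the middle case goes wrong: the single available value is divided by 2 even though there is nothing to average it with.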

Allen Qin · Accepted Answer

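First concatenate the two DataFrames, then group by both index levels, and finally take the mean of each group. The group mean skips NaN by default, so a group with only one valid value returns that value, and a group with no valid values returns NaN.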

pd.concat([pR, pV]).groupby(level=[0, 1]).mean()
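
As a small check of how this behaves in the three cases from the question, here is a sketch on toy data. The values and the years 2001 to 2003 are made up for illustration; only the index layout (AdminCode, Year) and the dekad column names follow the question.

```python
import numpy as np
import pandas as pd

# Toy frames shaped like the ones in the question (values are invented).
idx = pd.MultiIndex.from_tuples(
    [(2688, 2001), (2688, 2002), (2688, 2003)],
    names=["AdminCode", "Year"],
)
cols = pd.Index(["01d1", "01d2"], name="MonthDekad")

pR = pd.DataFrame(
    [[2.0, 4.0], [6.0, np.nan], [1.0, np.nan]], index=idx, columns=cols
)
# pV starts one year later than pR, like the vegetation data.
pV = pd.DataFrame(
    [[np.nan, 8.0], [3.0, np.nan]], index=idx[1:], columns=cols
)

# Stack the rows of both frames, then average rows sharing the same index key.
pRV = pd.concat([pR, pV]).groupby(level=[0, 1]).mean()

print(pRV)
# (2688, 2001): 2.0, 4.0  -> only pR has this year, values kept as-is
# (2688, 2002): 6.0, 8.0  -> each column has one valid value, which is kept
# (2688, 2003): 2.0, NaN  -> (1 + 3) / 2, and NaN where both are missing
```

Because the frames are stacked row-wise before grouping, years that exist in only one frame (here 2001) pass through unchanged, with no need for a separate count of valid values.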
