user16490995

Reputation: 1

Mean including nan values

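import numpy as np
import pandas as pd

GFG_dict = {
    '2019': [10, 20, 30, 40],
    '2020': [20, 30, 40, 50],
    '2021': [30, np.nan, np.nan, 60],  # np.nan: the np.NaN alias was removed in NumPy 2.0
    '2022': [40, 50, 60, 70],
}

gfg = pd.DataFrame(GFG_dict)

# Row-wise mean; NaN values are skipped by default
gfg['mean1'] = gfg.mean(axis=1)

# Row-wise sum divided by the number of year columns, NaN included
gfg['mean2'] = gfg.loc[:, '2019':'2022'].sum(axis=1) / 4

gfg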

I want the average as computed in column mean2, where NaN values still count toward the divisor.

How do I get the same result using .mean(axis=1)?

Upvotes: 0

Views: 481

Answers (1)

Matteo Zanoni

Reputation: 4152

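For mean2 you take the sum (which skips NaN values) and then always divide by 4 (the number of columns, NaN included). By contrast, mean(axis=1) skips NaN values by default and divides only by the count of non-NaN values, which is why mean1 differs.

You can get the mean2 result from mean by calling fillna(0) first: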

gfg['mean1'] = gfg.fillna(0).mean(axis = 1)

NOTE: fillna(0) returns a new DataFrame, so the NaN values are replaced only for the mean calculation; your original DataFrame is not modified.
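To make the difference concrete, here is a minimal sketch that compares the three calculations on the data from the question. The names year_cols, skip_nan_mean, nan_as_zero_mean and sum_over_count are only illustrative and not part of the original post. Selecting the year columns explicitly is an assumption on my side: it keeps any previously added mean1/mean2 columns out of the average if you run this after the question's code.

```python
import numpy as np
import pandas as pd

gfg = pd.DataFrame({
    "2019": [10, 20, 30, 40],
    "2020": [20, 30, 40, 50],
    "2021": [30, np.nan, np.nan, 60],
    "2022": [40, 50, 60, 70],
})

# Illustrative helper: the columns that should take part in the average
year_cols = ["2019", "2020", "2021", "2022"]

# Default behaviour: NaN is skipped, so the divisor is the non-NaN count
skip_nan_mean = gfg[year_cols].mean(axis=1)

# NaN treated as 0, so the divisor is always the number of columns
nan_as_zero_mean = gfg[year_cols].fillna(0).mean(axis=1)

# The question's mean2: sum skips NaN, then divide by the column count
sum_over_count = gfg[year_cols].sum(axis=1) / len(year_cols)

print(pd.DataFrame({
    "skip_nan_mean": skip_nan_mean,
    "nan_as_zero_mean": nan_as_zero_mean,
    "sum_over_count": sum_over_count,
}))
```

The last two columns should match row for row, and gfg itself still contains its NaN values afterwards.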

Upvotes: 1
