Rehan Azher

Reputation: 1340

Python 3.x pandas mean and std returning NaN

I am hitting an issue that I am unable to fix. I have a large DataFrame (sample posted below) with 5 columns, and I want to calculate the std and mean for each row. Somehow it keeps returning NaN.

   CellName                Apr-2018    Feb-2018    Jan-2018    Mar-2018  mean
0  BDG652ML_KPBENDULML1    9.450841   24.119474   27.091426   17.527006   NaN
1  BDG652ML_KPBENDULML2   15.917555   10.548731   11.019208   14.592388   NaN 
2  BDG652ML_KPBENDULML3   24.957360   21.122519   21.197216   24.950549   NaN

I have checked that all my month columns are float64; df.dtypes gives:

CellName     object
Apr-2018    float64
Feb-2018    float64
Jan-2018    float64
Mar-2018    float64
dtype: object

I know that I don't need to exclude the CellName column, and that I should be able to get the mean easily by using:

df['mean'] = df.mean(numeric_only =True)

I have also tried:

df['mean'] = df.iloc[:,1:].mean(numeric_only =True)

but it still remains NaN. The same happens for std as well.

Any hints on what I may be doing wrong?

Upvotes: 2

Views: 2904

Answers (1)

jezrael

Reputation: 862611

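Use the parameter axis=1 to get the mean per row. Without it, mean works down the columns and returns one value per column, labelled by column name; assigning that to a new column aligns on the row labels 0, 1, 2, finds no match, and fills in NaN. The numeric_only parameter could be omitted in the pandas version used here; newer pandas versions do not silently skip non-numeric columns, so there you will likely need the commented variant with numeric_only=True: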

df['mean'] = df.mean(axis=1)
#df['mean'] = df.mean(numeric_only=True, axis=1)
print (df)

              CellName   Apr-2018   Feb-2018   Jan-2018   Mar-2018       mean
0  BDG652ML_KPBENDULML1   9.450841  24.119474  27.091426  17.527006  19.547187
1  BDG652ML_KPBENDULML2  15.917555  10.548731  11.019208  14.592388  13.019470
2  BDG652ML_KPBENDULML3  24.957360  21.122519  21.197216  24.950549  23.056911
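
To see where the NaN in the question came from, here is a minimal sketch that rebuilds the sample frame by hand (the values are copied from the question; the names col_means, broken and row_means are only for illustration). It passes numeric_only=True explicitly so that it should run regardless of the pandas version:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "CellName": ["BDG652ML_KPBENDULML1", "BDG652ML_KPBENDULML2", "BDG652ML_KPBENDULML3"],
    "Apr-2018": [9.450841, 15.917555, 24.957360],
    "Feb-2018": [24.119474, 10.548731, 21.122519],
    "Jan-2018": [27.091426, 11.019208, 21.197216],
    "Mar-2018": [17.527006, 14.592388, 24.950549],
})

# Default axis=0: one mean per COLUMN, labelled by column name.
col_means = df.mean(numeric_only=True)
print(col_means.index.tolist())  # ['Apr-2018', 'Feb-2018', 'Jan-2018', 'Mar-2018']

# Column assignment aligns on index labels. The row labels 0, 1, 2 do not
# appear among the column names, so every row gets NaN.
broken = df.copy()
broken["mean"] = col_means
print(broken["mean"].isna().all())  # True

# axis=1: one mean per ROW, labelled 0, 1, 2, so the assignment lines up.
row_means = df.mean(numeric_only=True, axis=1)
print(row_means)
```

So the calculation itself was fine; the result was just indexed by column name rather than by row label.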

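# run on the original df, before the mean column was added,
# otherwise the mean column is included in the std
df['std'] = df.std(axis=1)
print (df)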

              CellName   Apr-2018   Feb-2018   Jan-2018   Mar-2018       std
0  BDG652ML_KPBENDULML1   9.450841  24.119474  27.091426  17.527006  7.828126
1  BDG652ML_KPBENDULML2  15.917555  10.548731  11.019208  14.592388  2.644401
2  BDG652ML_KPBENDULML3  24.957360  21.122519  21.197216  24.950549  2.190731
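
A side note on the std values, in case they get compared against NumPy: pandas std uses ddof=1 (sample standard deviation) by default, while numpy.std defaults to ddof=0. A small sketch using the first row of the sample data (the variable names are only for illustration):

```python
import numpy as np
import pandas as pd

# First row of the sample data (the four month values).
row = pd.Series([9.450841, 24.119474, 27.091426, 17.527006])

sample_std = row.std()              # pandas default: ddof=1
population_std = row.std(ddof=0)    # divide by N instead of N - 1
numpy_std = np.std(row.to_numpy())  # NumPy default: ddof=0

print(sample_std)
print(population_std, numpy_std)
```

The first value should match the 7.828126 shown above; the other two agree with each other and are slightly smaller.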

If you want to add both columns, assign is your friend, because mean and std must both be computed only from the original numeric columns, and assign evaluates both expressions before either new column is added:

df = df.assign(std=df.std(axis=1), mean=df.mean(axis=1))
print (df)

               CellName   Apr-2018   Feb-2018   Jan-2018   Mar-2018       std  \
0  BDG652ML_KPBENDULML1   9.450841  24.119474  27.091426  17.527006  7.828126   
1  BDG652ML_KPBENDULML2  15.917555  10.548731  11.019208  14.592388  2.644401   
2  BDG652ML_KPBENDULML3  24.957360  21.122519  21.197216  24.950549  2.190731   

        mean  
0  19.547187  
1  13.019470  
2  23.056911  
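
If you prefer not to rely on how non-numeric columns are handled, one possible variant is to pick the numeric columns up front and compute both statistics from that snapshot. This is only a sketch: the frame is rebuilt by hand from the question's sample, and the name months is hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "CellName": ["BDG652ML_KPBENDULML1", "BDG652ML_KPBENDULML2", "BDG652ML_KPBENDULML3"],
    "Apr-2018": [9.450841, 15.917555, 24.957360],
    "Feb-2018": [24.119474, 10.548731, 21.122519],
    "Jan-2018": [27.091426, 11.019208, 21.197216],
    "Mar-2018": [17.527006, 14.592388, 24.950549],
})

# Snapshot of the numeric (month) columns only; CellName is left out.
months = df.select_dtypes(include="number")

# Both statistics come from the snapshot, so neither new column
# can leak into the other calculation.
df = df.assign(mean=months.mean(axis=1), std=months.std(axis=1))
print(df)
```

The order of the two keyword arguments only affects the column order in the result, not the values.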

Upvotes: 4
