data_person

Reputation: 4470

Pandas Row mean with NaN

I have a sample pandas DF:

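import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 10, size=(6, 2)), columns=["A", "B"])
df.loc[2, "B"] = np.nan  # np.NaN was removed in NumPy 2.0; np.nan works everywhere
df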

Sample DF:

    A    B
0   5   0.0
1   4   8.0
2   6   NaN
3   8   9.0
4   4   7.0
5   5   4.0

I am trying to get the mean of each column, returned as a one-row DataFrame:

pd.DataFrame(df.mean(axis=0)).transpose()

Output:

      A          B
0   5.333333    5.6

The output is correct for column A, but for column B it is 5.6, i.e. 28/5. The frame has 6 rows, but mean() ignores the row with the NaN value. Is there any way to include that row in the count as well?

Expected output, where column A is 32/6 and column B is 28/6:

      A          B
0   5.333333    4.66666666
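To make the arithmetic above concrete, here is a minimal sketch that rebuilds the sample frame with fixed values (copied from the printed table, since the original uses random data) and compares the two denominators. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Fixed values copied from the sample frame above, so the numbers are reproducible.
df = pd.DataFrame({"A": [5, 4, 6, 8, 4, 5],
                   "B": [0, 8, np.nan, 9, 7, 4]})

col_sums = df.sum()                # NaN is skipped when summing: A -> 32, B -> 28
non_null = df.count()              # non-NaN values per column: A -> 6, B -> 5
default_mean = df.mean()           # same as col_sums / non_null, so B -> 28/5
wanted_mean = col_sums / len(df)   # divides by all 6 rows, so B -> 28/6
```

The only difference between the two results is the denominator: the number of non-NaN values versus the total number of rows.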

Upvotes: 2

Views: 210

Answers (1)

jezrael

Reputation: 862611

A simpler approach is to replace the missing values with 0, then use Series.to_frame to get a one-column DataFrame and transpose it:

# assign to a new name so the original df is not overwritten
out = df.fillna(0).mean().to_frame().T
print(out)
          A         B
0  5.333333  4.666667

Or, following yatu's solution from the comments, divide the column sums by the number of rows:

# assign to a new name so the original df is not overwritten
out = df.sum().div(len(df)).to_frame().T
print(out)
          A         B
0  5.333333  4.666667
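As a quick check, the sketch below runs both approaches on a fixed copy of the sample data and also shows what skipna=False does. The helper name mean_counting_nan_rows is hypothetical, introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Fixed values copied from the question's sample frame.
df = pd.DataFrame({"A": [5, 4, 6, 8, 4, 5],
                   "B": [0, 8, np.nan, 9, 7, 4]})

def mean_counting_nan_rows(frame):
    """Hypothetical helper: column means with NaN rows counted in the
    denominator, returned as a one-row DataFrame."""
    return frame.sum().div(len(frame)).to_frame().T

via_fillna = df.fillna(0).mean().to_frame().T   # first approach
via_sum = mean_counting_nan_rows(df)            # second approach

# skipna=False is not a substitute: a single NaN makes the column mean NaN.
strict = df.mean(skipna=False)
```

For numeric columns the two approaches should agree, since filling with 0 leaves the sum unchanged and makes mean() divide by the full row count. Note that both treat a missing value as a real 0, which is what the question asks for but may not suit every dataset.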

Upvotes: 3
