Reputation: 4470
I have a sample pandas DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 10, size=(6, 2)), columns=["A", "B"])
df.loc[2, "B"] = np.nan
df
Sample DataFrame:
   A    B
0  5  0.0
1  4  8.0
2  6  NaN
3  8  9.0
4  4  7.0
5  5  4.0
I am trying to get the mean of the entire DataFrame, in DataFrame format:
pd.DataFrame(df.mean(axis=0)).transpose()
Output:
          A    B
0  5.333333  5.6
The output is correct for column A, but for column B the result is 5.6 (28/5), because the row with the NaN value is skipped even though the DataFrame has 6 rows. Is there any way to include that row in the count as well?
Expected output, where column A is 32/6 and column B is 28/6:
          A         B
0  5.333333  4.666667
Upvotes: 2
Views: 210
Reputation: 862611
A simpler approach is to replace the missing values with 0, then use Series.to_frame to get a one-column DataFrame and transpose it:
df = df.fillna(0).mean().to_frame().T
print (df)
          A         B
0  5.333333  4.666667
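To make the difference visible, here is a minimal sketch using the fixed values from the question's sample output instead of random ones. It relies on `mean` skipping NaN by default (`skipna=True`), so column B is divided by its 5 non-missing values rather than by the 6 rows. The variable names `default_mean`, `non_null` and `filled_mean` are only illustrative.

```python
import numpy as np
import pandas as pd

# Fixed copy of the sample data shown in the question
df = pd.DataFrame({"A": [5, 4, 6, 8, 4, 5],
                   "B": [0, 8, np.nan, 9, 7, 4]})

default_mean = df.mean()           # NaN is skipped by default: B is 28/5
non_null = df.count()              # non-missing values per column: A has 6, B has 5
filled_mean = df.fillna(0).mean()  # NaN treated as 0: B is 28/6

print(default_mean)
print(non_null)
print(filled_mean)
```

Filling with 0 works here because a zero adds nothing to the column sum but still counts as a row in the denominator.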
Or use yatu's solution from the comments and divide the column sums by the number of rows:
df = df.sum().div(len(df)).to_frame().T
print (df)
          A         B
0  5.333333  4.666667
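As a quick check that both approaches agree, the sketch below runs them on the same fixed data from the question and compares the results. The names `via_fillna` and `via_sum` are hypothetical and not part of either solution; they are used so that `df` itself is not overwritten.

```python
import numpy as np
import pandas as pd

# Fixed copy of the sample data shown in the question
df = pd.DataFrame({"A": [5, 4, 6, 8, 4, 5],
                   "B": [0, 8, np.nan, 9, 7, 4]})

# Approach 1: treat NaN as 0, then take the mean
via_fillna = df.fillna(0).mean().to_frame().T

# Approach 2: sum (NaN skipped by default) divided by the total row count
via_sum = df.sum().div(len(df)).to_frame().T

# Raises an AssertionError if the two one-row frames differ
pd.testing.assert_frame_equal(via_fillna, via_sum)
print(via_sum)
```

Note that both snippets above assign the result back to `df`, so they are alternatives: running the second one after the first would operate on the already reduced one-row frame.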
Upvotes: 3