Reputation: 9102
I can't get the average or mean of a column in pandas. I have a dataframe. Neither of the things I tried below gives me the average of the column weight:
>>> allDF
ID birthyear weight
0 619040 1962 0.1231231
1 600161 1963 0.981742
2 25602033 1963 1.3123124
3 624870 1987 0.94212
The following returns several values, not one:
allDF[['weight']].mean(axis=1)
So does this:
allDF.groupby('weight').mean()
Upvotes: 291
Views: 1082839
Reputation: 353019
If you only want the mean of the weight column, select the column (which is a Series) and call .mean():
In [479]: df
Out[479]:
ID birthyear weight
0 619040 1962 0.123123
1 600161 1963 0.981742
2 25602033 1963 1.312312
3 624870 1987 0.942120
In [480]: df.loc[:, 'weight'].mean()
Out[480]: 0.83982437500000007
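To see why the attempt in the question returned several values, here is a minimal sketch built from the question's data. The name all_df is only illustrative. It compares the double-bracket selection with axis=1 against the single-bracket selection.

```python
import pandas as pd

# Data copied from the question; the frame name all_df is only illustrative.
all_df = pd.DataFrame({
    "ID": [619040, 600161, 25602033, 624870],
    "birthyear": [1962, 1963, 1963, 1987],
    "weight": [0.1231231, 0.981742, 1.3123124, 0.94212],
})

# Double brackets keep a one-column DataFrame, so axis=1 averages across
# each row and returns one value per row (here simply the weight itself).
per_row = all_df[["weight"]].mean(axis=1)

# Single brackets give a Series, so .mean() returns one scalar.
column_mean = all_df["weight"].mean()

print(len(per_row))
print(column_mean)
```

Selecting with all_df["weight"] is equivalent to the .loc[:, 'weight'] form used above.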
Upvotes: 437
Reputation: 17794
You can use the method agg (aggregate):
df.agg('mean')
It's possible to apply multiple statistics:
df.agg(['mean', 'max', 'min'])
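As a rough illustration, assuming a frame with only numeric columns (the data below is borrowed from the question), agg can be applied to a single column or to the whole frame:

```python
import pandas as pd

# Numeric columns only; values borrowed from the question for illustration.
df = pd.DataFrame({
    "birthyear": [1962, 1963, 1963, 1987],
    "weight": [0.1231231, 0.981742, 1.3123124, 0.94212],
})

# One statistic on one column returns a scalar.
weight_mean = df["weight"].agg("mean")

# Several statistics on the whole frame return a DataFrame
# with one row per statistic and one column per input column.
summary = df.agg(["mean", "max", "min"])

print(weight_mean)
print(summary)
```

If the frame also holds text columns, selecting the numeric ones first (for example with df.select_dtypes("number")) is probably the safer route.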
Upvotes: 3
Reputation: 7
You can follow the code below:
import pandas as pd
import numpy as np
classxii = {'Name':['Karan','Ishan','Aditya','Anant','Ronit'],
'Subject':['Accounts','Economics','Accounts','Economics','Accounts'],
'Score':[87,64,58,74,87],
'Grade':['A1','B2','C1','B1','A2']}
df = pd.DataFrame(classxii,index = ['a','b','c','d','e'],columns=['Name','Subject','Score','Grade'])
print(df)
#use the below for mean if you already have a dataframe
print('mean of score is:')
print(df['Score'].mean())  # single brackets select a Series, so this prints one number
Upvotes: -2
Reputation: 888
Note that the column needs to have a numeric data type in the first place.
import pandas as pd
df['column'] = pd.to_numeric(df['column'], errors='coerce')
Next, find the mean of one column with mean(), or summary statistics for all numeric columns with describe().
df['column'].mean()
df.describe()
Example of result from describe:
column
count 62.000000
mean 84.678548
std 216.694615
min 13.100000
25% 27.012500
50% 41.220000
75% 70.817500
max 1666.860000
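A small sketch of the conversion step, using a made-up column of strings with one entry that cannot be parsed (the values are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical column that was read in as strings, with one bad entry.
df = pd.DataFrame({"column": ["13.1", "27.5", "n/a", "41.4"]})

# errors="coerce" turns anything unparseable into NaN instead of raising.
df["column"] = pd.to_numeric(df["column"], errors="coerce")

coerced_count = df["column"].isna().sum()  # how many values became NaN
column_mean = df["column"].mean()          # NaN is skipped by default

print(df["column"].dtype)
print(coerced_count)
print(column_mean)
```

Checking the NaN count after coercion is worth doing, since silently dropped values change the mean.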
Upvotes: 4
Reputation: 1688
You can simply use df.describe(), which gives you the main summary statistics. To find the min, max or average value of a particular column (say 'weight' in your case), use:
df['weight'].mean(): for the average value
df['weight'].max(): for the maximum value
df['weight'].min(): for the minimum value
Upvotes: 2
Reputation: 2222
You can use either of the two statements below:
numpy.mean(df['col_name'])
# or
df['col_name'].mean()
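One difference that may matter is missing data. A minimal sketch, assuming a column with one NaN (the values are made up): the pandas method skips NaN by default, while NumPy applied to the raw array does not.

```python
import numpy as np
import pandas as pd

# Made-up column containing one missing value.
df = pd.DataFrame({"col_name": [1.0, 2.0, np.nan, 3.0]})

# pandas skips NaN by default (skipna=True).
series_mean = df["col_name"].mean()

# On the underlying ndarray, NaN propagates into the result.
array_mean = np.mean(df["col_name"].to_numpy())

# np.nanmean ignores NaN and matches the pandas result here.
nan_safe_mean = np.nanmean(df["col_name"].to_numpy())

print(series_mean, array_mean, nan_safe_mean)
```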
Upvotes: 6
Reputation: 4285
Additionally, if you want to round the value after finding the mean:
import pandas as pd
# Create a DataFrame
df1 = {
'Subject':['semester1','semester2','semester3','semester4','semester1',
'semester2','semester3'],
'Score':[62.73,47.76,55.61,74.67,31.55,77.31,85.47]}
df1 = pd.DataFrame(df1,columns=['Subject','Score'])
rounded_mean = round(df1['Score'].mean()) # specified nothing as decimal place
print(rounded_mean) # 62
rounded_mean_decimal_0 = round(df1['Score'].mean(), 0) # specified decimal place as 0
print(rounded_mean_decimal_0) # 62.0
rounded_mean_decimal_1 = round(df1['Score'].mean(), 1) # specified decimal place as 1
print(rounded_mean_decimal_1) # 62.2
Upvotes: 3
Reputation: 15152
Mean for each column in df:
A B C
0 5 3 8
1 5 3 9
2 8 4 9
df.mean()
A 6.000000
B 3.333333
C 8.666667
dtype: float64
And if you want the average over all values in all columns:
df.stack().mean()
6.0
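The example above can be reproduced with the sketch below. The to_numpy() variant is an alternative I am adding for comparison, not part of the original answer.

```python
import pandas as pd

# Same values as the A/B/C table above.
df = pd.DataFrame({"A": [5, 5, 8], "B": [3, 3, 4], "C": [8, 9, 9]})

# One mean per column.
per_column = df.mean()

# Flatten all cells into one Series, then average.
overall_stack = df.stack().mean()

# Alternative: average the underlying NumPy array directly.
# Unlike the pandas version, this does not skip NaN.
overall_numpy = df.to_numpy().mean()

print(per_column)
print(overall_stack, overall_numpy)
```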
Upvotes: 18
Reputation: 2913
You can also access a column using the dot notation (also called attribute access) and then calculate its mean:
df.your_column_name.mean()
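A caveat worth keeping in mind: attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute. A small sketch with made-up column names:

```python
import pandas as pd

# Made-up columns: one plain name, one that clashes with a DataFrame
# method, and one containing a space.
df = pd.DataFrame({
    "weight": [1.0, 2.0, 3.0],
    "count": [10, 20, 30],
    "body mass": [60.0, 70.0, 80.0],
})

# Plain identifier: dot notation works.
weight_mean = df.weight.mean()

# df.count is the DataFrame.count method, not the column.
count_is_method = callable(df.count)

# Bracket notation always reaches the column.
count_mean = df["count"].mean()
body_mass_mean = df["body mass"].mean()

print(weight_mean, count_is_method, count_mean, body_mass_mean)
```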
Upvotes: 11
Reputation: 491
Try df.mean(axis=0). The axis=0 argument calculates the column-wise mean of the dataframe, so the result is one value per column. axis=1 is the row-wise mean, which is why you are getting multiple values.
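A tiny sketch of the two axes, using a made-up two-row frame so the numbers are easy to check by hand:

```python
import pandas as pd

# Made-up two-row frame for illustration.
df = pd.DataFrame({"birthyear": [1962, 1963], "weight": [0.5, 1.5]})

# axis=0: collapse the rows, one mean per column.
col_means = df.mean(axis=0)

# axis=1: collapse the columns, one mean per row.
row_means = df.mean(axis=1)

print(col_means)
print(row_means)
```

Here col_means has two entries (one per column) and row_means also has two, but only because the frame has two rows; with the four-row frame from the question, axis=1 returns four values.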
Upvotes: 46
Reputation: 209
You can use
df.describe()
to get basic statistics of the dataframe. To get the mean of a specific column, you can use
df["columnname"].mean()
Upvotes: 18
Reputation: 454
Do try giving print(df.describe()) a shot. I hope it will be helpful for getting an overall description of your dataframe.
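If you only need the mean out of that summary, it can be picked from the describe() result by label. A minimal sketch, with data borrowed from the question:

```python
import pandas as pd

# Values borrowed from the question for illustration.
df = pd.DataFrame({
    "birthyear": [1962, 1963, 1963, 1987],
    "weight": [0.1231231, 0.981742, 1.3123124, 0.94212],
})

stats = df.describe()

# describe() returns a DataFrame indexed by statistic name,
# so a single statistic can be selected with .loc.
weight_mean = stats.loc["mean", "weight"]

print(stats)
print(weight_mean)
```

For a single number, df["weight"].mean() is more direct; describe() is mainly useful when you want the whole overview.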
Upvotes: 27