Reputation: 1688
import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])
df['C']=[5,2,np.nan,4,np.nan]
df['index']=df.index
The DataFrame:
A B C index
0 0 1 5 0
1 2 3 2 1
2 4 5 NaN 2
3 6 7 4 3
4 8 9 NaN 4
I want to use groupby funciton to calculate the average of "A""B""C" for every index.
df1 = df.groupby('index')
average = df1.apply(lambda df,x,y :np.average(df.iloc[:,x:y]),0,3)
The average:
index
0 2.00
1 2.33
2 NaN
3 5.67
4 NaN
dtype:float64
what I want to do is that :
when A B C are digits,the average = (A+B+C)/3
when one of ABC is np.nan, the average = sum/2
Upvotes: 1
Views: 173
Reputation: 17506
Why use numpy instead of pandas.mean()
?
df
Out[207]:
A B C
0 0 1 5.0
1 2 3 2.0
2 4 5 NaN
3 6 7 4.0
4 8 9 NaN
In [210]: df.mean(axis=1)
Out[210]:
0 2.000000
1 2.333333
2 4.500000
3 5.666667
4 8.500000
dtype: float64
It does just what you need, and also works on groupby
objects.
Upvotes: 0
Reputation: 862681
You can use custom function:
You can use GroupBy.mean
- NaN
are omited:
#groupby by index
df1 = df.groupby(level=0)
#groupby by column index
#df1 = df.groupby('index')
#filter columns for apply first, item() is for return scalar
average = df1['A','B','C'].apply(lambda x: x.mean(axis=1).item())
print (average)
0 2.000000
1 2.333333
2 4.500000
3 5.666667
4 8.500000
dtype: float64
Or with idea from Paul Panzer comment:
df1 = df.groupby(level=0)
average = df1['A','B','C'].apply(np.nanmean)
print (average)
0 2.000000
1 2.333333
2 4.500000
3 5.666667
4 8.500000
dtype: float64
Also groupby
is not necessary, only select desired columns and use DataFrame.mean
with axis=1
:
df['new'] = df[['A','B','C']].mean(axis=1)
print (df)
A B C index new
0 0 1 5.0 0 2.000000
1 2 3 2.0 1 2.333333
2 4 5 NaN 2 4.500000
3 6 7 4.0 3 5.666667
4 8 9 NaN 4 8.500000
print (df)
A B C
0 0 1 5.0
1 2 3 2.0
2 4 5 NaN
3 6 7 4.0
4 8 9 NaN
#if need mean of all columns
df['new'] = df.mean(axis=1)
print (df)
A B C new
0 0 1 5.0 2.000000
1 2 3 2.0 2.333333
2 4 5 NaN 4.500000
3 6 7 4.0 5.666667
4 8 9 NaN 8.500000
Upvotes: 1