Reputation: 3086
How can I compute mean() or another function on particular rows using GroupBy? Consider the following dataframe:
In[239]: df.groupby(['id'])['summary']
Out[239]:
summary
id
11 2.0
11 3.0
11 3.0
11 3.0
11 3.0
11 3.0
14 NaN
14 NaN
14 NaN
14 NaN
14 NaN
14 2.0
17 NaN
17 NaN
17 NaN
17 NaN
17 5.0
17 5.0
18 4.0
18 5.0
18 4.0
18 3.0
18 3.0
18 4.0
23 2.0
23 1.0
23 2.0
23 1.0
23 3.0
23 1.0
...
81 10.0
81 9.0
81 8.0
81 8.0
81 9.0
81 9.0
82 0.0
82 0.0
82 0.0
82 0.0
82 0.0
82 0.0
83 1.0
83 0.0
83 1.0
83 2.0
83 2.0
83 1.0
84 2.0
84 0.0
84 0.0
84 0.0
84 1.0
84 NaN
85 5.0
85 4.0
85 4.0
85 5.0
85 5.0
85 4.0
How can I compute:
mean() of only the first three rows of each id?
mean() of masked rows (rows selected by some condition) within each id?
For example:
df.groupby(['id'])['summary'].mean()
will compute mean() of each group (defined by id), but it takes all rows.
Upvotes: 0
Views: 55
Reputation: 61967
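The following computes, for each id, both the mean of the first three rows and the mean of the rows selected by a mask (here mask is assumed to be a boolean Series aligned with df's index).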
df.groupby('id')['summary'].agg([lambda x: x.iloc[:3].mean(), lambda x: x[mask].mean()])
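As a minimal, self-contained sketch of the same idea: the frame below reuses only the values shown for ids 11 and 14 in the question, and it assumes id is an ordinary column. The condition summary >= 3 and the output names first3 and masked are hypothetical, chosen only for illustration. Named aggregation (available since pandas 0.25) is used so that the two lambdas get readable column names.

```python
import numpy as np
import pandas as pd

# Illustrative frame: values copied from ids 11 and 14 in the question.
# Assumption: "id" is a regular column rather than the index.
df = pd.DataFrame({
    "id": [11] * 6 + [14] * 6,
    "summary": [2.0, 3.0, 3.0, 3.0, 3.0, 3.0,
                np.nan, np.nan, np.nan, np.nan, np.nan, 2.0],
})

# Hypothetical condition: keep rows whose summary is at least 3.
# The mask is a boolean Series sharing df's index.
mask = df["summary"] >= 3

result = df.groupby("id")["summary"].agg(
    # Mean of the first three rows of each group (NaN values are skipped).
    first3=lambda x: x.iloc[:3].mean(),
    # Mean of the rows of each group where the mask is True;
    # the mask is restricted to the group's own index labels first.
    masked=lambda x: x[mask.loc[x.index]].mean(),
)

print(result)
```

A group whose first three rows are all NaN, or that has no rows passing the mask, gets NaN in the corresponding column. Filtering first and then grouping, as in df[mask].groupby("id")["summary"].mean(), is an alternative for the masked case, but it drops groups that have no matching rows instead of reporting NaN for them.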
Upvotes: 2