Applying function on particular rows with GroupBy

Question

How can I compute mean() or another function on particular rows using GroupBy? Consider the following dataframe:

 In[239]: df.groupby(['id'])['summary']
Out[239]: 
                summary
id         
11                  2.0
11                  3.0
11                  3.0
11                  3.0
11                  3.0
11                  3.0
14                  NaN
14                  NaN
14                  NaN
14                  NaN
14                  NaN
14                  2.0
17                  NaN
17                  NaN
17                  NaN
17                  NaN
17                  5.0
17                  5.0
18                  4.0
18                  5.0
18                  4.0
18                  3.0
18                  3.0
18                  4.0
23                  2.0
23                  1.0
23                  2.0
23                  1.0
23                  3.0
23                  1.0
                ...
81                 10.0
81                  9.0
81                  8.0
81                  8.0
81                  9.0
81                  9.0
82                  0.0
82                  0.0
82                  0.0
82                  0.0
82                  0.0
82                  0.0
83                  1.0
83                  0.0
83                  1.0
83                  2.0
83                  2.0
83                  1.0
84                  2.0
84                  0.0
84                  0.0
84                  0.0
84                  1.0
84                  NaN
85                  5.0
85                  4.0
85                  4.0
85                  5.0
85                  5.0
85                  4.0

How can I compute the mean() of only the first three rows of each id?
How can I compute the mean() of masked rows (rows selected by some condition) within each id?

For example:

df.groupby(['id'])['summary'].mean()

computes the mean() of each group (defined by id), but it uses all rows in the group.

Ted Petrou · Accepted Answer

The following returns, for each id, both the mean of the first three rows and the mean of the rows selected by a boolean mask (mask is a boolean Series that you define yourself):

df.groupby('id')['summary'].agg([lambda x: x.iloc[:3].mean(), lambda x: x[mask].mean()])
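For reference, here is a minimal runnable sketch of the same idea. The sample data is a hypothetical subset modeled on the question (ids 11, 14 and 17 only), and the condition `summary >= 3` is an assumed example of a mask, not something from the original post. Named functions are used instead of two lambdas so the result columns get readable names, and the full-length mask is aligned to each group's index before it is applied.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modeled on the question (ids 11, 14, 17 only).
df = pd.DataFrame({
    "id": [11] * 6 + [14] * 6 + [17] * 6,
    "summary": [2.0, 3.0, 3.0, 3.0, 3.0, 3.0,
                np.nan, np.nan, np.nan, np.nan, np.nan, 2.0,
                np.nan, np.nan, np.nan, np.nan, 5.0, 5.0],
})

# Assumed condition, for illustration only: keep rows where summary >= 3.
mask = df["summary"] >= 3


def first3_mean(x):
    # Mean of the first three rows of the group (NaN values are skipped).
    return x.iloc[:3].mean()


def masked_mean(x):
    # Restrict the full-length mask to this group's rows before indexing.
    return x[mask.loc[x.index]].mean()


# One row per id, one column per function (named after the function).
result = df.groupby("id")["summary"].agg([first3_mean, masked_mean])
print(result)

# Alternative: filter the rows first, then group.
alt_first3 = df.groupby("id").head(3).groupby("id")["summary"].mean()
alt_masked = df[mask].groupby("id")["summary"].mean()
```

Note that the filter-first variant with `df[mask]` drops any id that has no rows matching the mask (id 14 in this sample), whereas the `agg` version keeps every id and reports NaN for it.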
