spazznolo
spazznolo

Reputation: 769

Take Smallest Group by Average

I'm looking to filter a set for the group with the smallest average value. In the example below, we'd group by ID, get the average year, and then only keep the group with the smallest average year. This feels basic, but I haven't seen an answer for it yet.

df = pd.DataFrame({'ID':[1,1,2,2,3,3], 
                   'YEAR' : [2011,2012,2012,2013,2013,2014], 
                   'V': [0,1,1,0,1,0],
                   'C':[00,11,22,33,44,55]})

Upvotes: 0

Views: 36

Answers (2)

Fardin Ahsan
Fardin Ahsan

Reputation: 53

groups = df.groupby(['ID']).mean()

Then

groups.loc[groups['YEAR'] == groups['YEAR'].min()]

Upvotes: 1

Mustafa Aydın
Mustafa Aydın

Reputation: 18306

In [180]: min_id = df.groupby("ID")["YEAR"].mean().idxmin()

In [181]: df.query("ID == @min_id")
Out[181]:
   ID  YEAR  V   C
0   1  2011  0   0
1   1  2012  1  11
  • get the index of the minimum mean year having group
  • filter the frame where ID is equal to that

this is how the aggregated mean looks like:

In [182]: df.groupby("ID")["YEAR"].mean()
Out[182]:
ID
1    2011.5
2    2012.5
3    2013.5
Name: YEAR, dtype: float64

so the minimum is 2011.5, index of it is 1 (the ID).

Upvotes: 4

Related Questions