Performing functions on multiindex in groupby

Question

I have a dataframe with a MultiIndex. Here's a minimal working example:

df = pd.DataFrame({'note':[1,1,1,2,2,2,2],'t': [0.5,0.7,1.2,0.3,0.9,1.3,1.7],'val':[1,-1,0,0,1,0,0]})
dfs = df.set_index(['note','t'])

which gives

>>> dfs
          val
note t
1    0.5    1
     0.7   -1
     1.2    0
2    0.3    0
     0.9    1
     1.3    0
     1.7    0

what I want is to get (a) the minimum value and (b) the first value in the t index per group:

note min  first
1    0.5  0.5
2    0.3  0.3

I could do a groupby on the original dataframe df where note and t are columns and not indices:

df.groupby('note').agg({'t': [min, lambda x: list(x)[0]]})

but I'd rather not do a reset_index() followed by another set_index() to restore the dataframe to the MultiIndex version. How do I do this? The agg function only works on columns and not the indices.

jezrael · Accepted Answer

It is possible, but not very clean:

df = (dfs.index.get_level_values(1).to_series()
         .groupby(dfs.index.get_level_values(0))
         .agg(['min', 'first']))
print (df)
      min  first
note            
1     0.5    0.5
2     0.3    0.3

df = dfs.reset_index('t').groupby(level=0)['t'].agg(['min', 'first'])
print (df)
      min  first
note            
1     0.5    0.5
2     0.3    0.3

Performing functions on multiindex in groupby

Answers (2)

Related Questions