Get all max and min values from df.groupby().size() result

Question

I have a dataframe like this:

import pandas as pd

df = pd.DataFrame({
    'col1' : ['MM1', 'MM1', 'MM1', 'MM2', 'MM2', 'MM2', 'MM4', 'MM4', 'MM4'],
    'col2' : ['S1', 'S1', 'S3', 'S3', 'S4', 'S4', 'S2', 'S2', 'S2'],
    'col3' : [1,1,2,3,4,3,2,2,4]
    })

This gives me the grouped result with the count for each group:

df.groupby(["col1", "col3"]).size()

This gives me only the first index at which the maximum occurs:

df.groupby(["col1", "col3"]).size().idxmax()

However, I want all occurrences of the maximum. Everything I find on Stack Overflow covers the case where a column such as the count already exists and that column contains multiple maximums. My case is different: I group by two columns and then want the maximum/minimum values of the result of that grouping. How can I do this in pandas?

Here I expect

(MM1, 2)
(MM2, 4)
(MM4, 4)

Edit:

df.groupby(["col1", "col3"]).size()

gives me

col1  col3       <- let's name this column count
MM1   1       2
      2       1
MM2   3       2
      4       1
MM4   2       2
      4       1

Let's name the unnamed column count.

How do I get all the maximums in the count column? That is, this:

2, 2, 2

And how do I get the corresponding values from col1 and col3? Something like this (column names only for clarity):

col1  col3  count
MM1   1     2
MM2   3     2
MM4   2     2

jezrael · Accepted Answer

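You can convert the first level (col1) to a column with reset_index(level=0), which leaves col3 as the index. It is then possible to aggregate per col1 with GroupBy.agg, using DataFrameGroupBy.idxmax to get the col3 value of the maximum and GroupBy.max to get the count itself: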

df1 = (df.groupby(["col1", "col3"])
         .size()
         .reset_index(level=0)
         .groupby('col1')
         .agg(col3 = (0,'idxmax'), count=(0, 'max')))

print (df1)
      col3  count
col1             
MM1      1      2
MM2      3      2
MM4      2      2
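
Note that idxmax returns only the first label when several rows tie for the maximum. If every tied row is needed, one possible approach is to compare the counts against their maximum (or minimum) with a boolean mask. The following is only a minimal sketch under that assumption; the names counts, all_max, all_min and group_max are illustrative and not part of the original question or answer:

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['MM1', 'MM1', 'MM1', 'MM2', 'MM2', 'MM2', 'MM4', 'MM4', 'MM4'],
    'col2': ['S1', 'S1', 'S3', 'S3', 'S4', 'S4', 'S2', 'S2', 'S2'],
    'col3': [1, 1, 2, 3, 4, 3, 2, 2, 4],
})

# Count rows per (col1, col3) pair and name the result column "count".
counts = df.groupby(["col1", "col3"]).size().reset_index(name="count")

# Keep every row whose count equals the overall maximum / minimum.
all_max = counts[counts["count"] == counts["count"].max()]
all_min = counts[counts["count"] == counts["count"].min()]

# Per-col1 variant: keep every row tied for the maximum within its col1 group.
per_group_max = counts.groupby("col1")["count"].transform("max")
group_max = counts[counts["count"] == per_group_max]

print(all_max)
print(all_min)
```

With the sample data, all_max and group_max contain the same three rows, because each col1 group has a single maximum and all three groups share the same maximum count.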
