moys
moys

Reputation: 8033

Number of occurances of Highest & 2nd Highest values after groupby in Pandas

I have a dataframe as below

Itr	Type	Start	Values
2	1	101	20.402
2	1	102	20.402
2	1	103	20.399
2	1	104	20.399
2	1	105	20.399
2	1	106	20.383
2	1	107	20.383
2	1	108	20.383
2	1	109	20.383
2	1	110	20.383
2	1	111	20.36
2	1	112	20.36
2	1	113	20.36
2	1	114	20.36
2	1	115	20.36
2	1	116	20.36
2	1	117	20.36
2	1	118	20.36
2	1	119	20.36
2	1	120	20.36
3	1	121	20.348
3	1	122	20.348
3	1	123	20.348
3	1	124	20.348
3	1	125	20.348
3	1	126	20.34
3	1	127	20.34
3	1	128	20.34
3	1	129	20.34
3	1	130	20.34
3	1	131	20.337
3	1	132	20.337
3	1	133	20.337
3	1	134	20.337
3	1	135	20.337
3	1	136	20.342
3	2	121	20.058
3	2	122	20.058
3	2	123	20.058
3	2	124	20.058
3	2	125	20.043
3	2	126	20.043
3	2	127	20.043
3	2	128	20.043
3	2	129	20.043
3	2	130	20.035
3	2	131	20.035
3	2	132	20.035
3	2	133	20.035
3	2	134	20.035
3	2	135	20.021

I want to do a groupby of iterations (Itr) & Type & then find the number of occurrences of the maximum value & number of occurrences of the 2nd maximum value. For example for a groupby of Irr 2 & cycle 1, the result should be 2 & 3 because 20.402 (the maximum value) occurs twice & 20.399 (2nd max value) occurs thrice. I have tried but not been successful .Is there a way to do it? Thanks

Upvotes: 1

Views: 586

Answers (1)

ALollz
ALollz

Reputation: 59579

First get a mask of the top values within each group. Then get the size of the masked DataFrame.

m = df.groupby(['Itr', 'Type'])['Values'].transform(lambda x: x.isin(x.drop_duplicates().nlargest(2)))
df.where(m).groupby(['Itr', 'Type', 'Values']).size()

Itr  Type  Values
2.0  1.0   20.399    3
           20.402    2
3.0  1.0   20.342    1
           20.348    5
     2.0   20.043    5
           20.058    4
dtype: int64

Alternatively with groupby + value_counts, then sort and groupby + tail to keep the 2 largest values within groups.

(df.groupby(['Itr', 'Type'])['Values'].value_counts()
   .sort_index(level=2)
   .groupby(level=[0, 1]).tail(2))

Itr  Type  Values
3    2     20.043    5
           20.058    4
     1     20.342    1
           20.348    5
2    1     20.399    3
           20.402    2
Name: Values, dtype: int64

Upvotes: 1

Related Questions