Reputation: 8033
I have a dataframe as below
Itr Type Start Values
2 1 101 20.402
2 1 102 20.402
2 1 103 20.399
2 1 104 20.399
2 1 105 20.399
2 1 106 20.383
2 1 107 20.383
2 1 108 20.383
2 1 109 20.383
2 1 110 20.383
2 1 111 20.36
2 1 112 20.36
2 1 113 20.36
2 1 114 20.36
2 1 115 20.36
2 1 116 20.36
2 1 117 20.36
2 1 118 20.36
2 1 119 20.36
2 1 120 20.36
3 1 121 20.348
3 1 122 20.348
3 1 123 20.348
3 1 124 20.348
3 1 125 20.348
3 1 126 20.34
3 1 127 20.34
3 1 128 20.34
3 1 129 20.34
3 1 130 20.34
3 1 131 20.337
3 1 132 20.337
3 1 133 20.337
3 1 134 20.337
3 1 135 20.337
3 1 136 20.342
3 2 121 20.058
3 2 122 20.058
3 2 123 20.058
3 2 124 20.058
3 2 125 20.043
3 2 126 20.043
3 2 127 20.043
3 2 128 20.043
3 2 129 20.043
3 2 130 20.035
3 2 131 20.035
3 2 132 20.035
3 2 133 20.035
3 2 134 20.035
3 2 135 20.021
I want to do a groupby of iterations (Itr) & Type & then find the number of occurrences of the maximum value & number of occurrences of the 2nd maximum value. For example for a groupby of Irr 2 & cycle 1, the result should be 2 & 3 because 20.402 (the maximum value) occurs twice & 20.399 (2nd max value) occurs thrice. I have tried but not been successful .Is there a way to do it? Thanks
Upvotes: 1
Views: 586
Reputation: 59579
First get a mask of the top values within each group. Then get the size of the masked DataFrame.
m = df.groupby(['Itr', 'Type'])['Values'].transform(lambda x: x.isin(x.drop_duplicates().nlargest(2)))
df.where(m).groupby(['Itr', 'Type', 'Values']).size()
Itr Type Values
2.0 1.0 20.399 3
20.402 2
3.0 1.0 20.342 1
20.348 5
2.0 20.043 5
20.058 4
dtype: int64
Alternatively with groupby
+ value_counts
, then sort
and groupby
+ tail
to keep the 2 largest values within groups.
(df.groupby(['Itr', 'Type'])['Values'].value_counts()
.sort_index(level=2)
.groupby(level=[0, 1]).tail(2))
Itr Type Values
3 2 20.043 5
20.058 4
1 20.342 1
20.348 5
2 1 20.399 3
20.402 2
Name: Values, dtype: int64
Upvotes: 1