Reputation: 95
I want to group by two columns, find the nlargest(2) of a third column, and split those two values found from nlargest into two new columns. The code I have thus far:
dfworking.groupby(["STATE", "D"])['GENERAL PERCENT'].nlargest(2)
This finds me the two largest values, they print just find. How would I go from here and separate those 2 nlargest values into two separate columns?
Upvotes: 0
Views: 166
Reputation: 402553
You can use groupby
, agg
, and .apply
.
df
col1 col2 col3
0 A 1 1.1
1 A 3 1.1
2 A 3 1.1
3 B 2 2.6
4 B 4 2.5
5 B 2 3.4
6 B 5 2.6
7 A 3 2.6
8 B 6 3.4
9 C 3 3.4
10 B 5 2.6
11 D 1 1.1
12 D 1 1.1
13 D 1 3.3
df.groupby(['col1', 'col2'])['col3']\
.agg(lambda x: x.nlargest(2)).apply(pd.Series)
0 1
col1 col2
A 1 1.1 NaN
3 2.6 1.1
B 2 3.4 2.6
4 2.5 NaN
5 2.6 2.6
6 3.4 NaN
C 3 3.4 NaN
D 1 3.3 1.1
Upvotes: 1