Reputation: 2286
Below is an example of the data set I am working with. I am trying to do a group by on Location for only the top 3 locations based on KGs.
Location KG Dollars
BKK 7 2
BKK 5 3
BKK 4 2
BKK 3 3
BKK 2 2
HKG 8 3
HKG 6 2
HKG 4 3
HKG 3 2
HKG 2 3
The output would look like below. Grouped on the location, summing both KG and Dollars for the top 3 KG records for each location.
Location KG Dollars
BKK 16 7
HKG 18 8
I've tried different types of groupbys, just having a problem only specifying the top n KG records for the groupby.
Upvotes: 1
Views: 57
Reputation: 77027
You could do
In [610]: df.groupby('Location').apply(lambda x:
x.nlargest(3, 'KG')[['KG', 'Dollars']].sum())
Out[610]:
KG Dollars
Location
BKK 16 7
HKG 18 8
Upvotes: 3