Reputation: 9
The program is:
import numpy as np
import pandas as pd
p = {
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
}
df = pd.DataFrame(p)
print(df)
x = df.groupby('item')
print(x.max())
But the output is:
The max sales day for guns was Sat, so why does pandas show Sun?
Upvotes: 0
Views: 493
max, when called on a groupby, computes the max of each column independently, so the values in one result row need not come from the same original row. So 10 is the largest of [5, 10, 5], and Sun is the largest (alphabetically) of ['Fri', 'Sat', 'Sun'].
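A minimal sketch of that per-column behaviour, rebuilding the DataFrame from the question. The names guns, days_max, sales_max and grouped_max are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
})

# Look at the 'guns' rows only and take the max of each column separately.
guns = df[df['item'] == 'guns']
days_max = guns['Days'].max()    # strings compare alphabetically
sales_max = guns['sales'].max()  # numbers compare numerically

# groupby(...).max() does the same thing for every group and column.
grouped_max = df.groupby('item').max()

print(days_max, sales_max)
print(grouped_max.loc['guns'])
```

The Days and sales values in the guns row of grouped_max come from different original rows (index 6 and index 5 respectively).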
I think you want to use idxmax and .loc:
filtered = df.loc[df.groupby('item')['sales'].idxmax()]
Output:
     item Days  sales
0   apple  Mon    100
5    guns  Sat     10
2  orange  Wed    200
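To see why this works, the one-liner can be split into two steps. This is just a sketch using the question's data; idx is an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
})

# Step 1: for each item, get the row label (index) where 'sales' is largest.
idx = df.groupby('item')['sales'].idxmax()

# Step 2: use those labels to pull the complete original rows,
# so 'Days' and 'sales' stay paired as they were in df.
filtered = df.loc[idx]

print(idx)
print(filtered)
```

If several rows in a group tie for the largest sales, idxmax returns the label of the first one, so only one row per item is kept.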
Upvotes: 2