Bahram Shakiba

Reputation: 9

Pandas groupby max not returning max value for some columns

The program is:

import numpy as np
import pandas as pd
p = {
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
}

df = pd.DataFrame(p)

print(df)

x = df.groupby('item')

print(x.max())

But the output is:

[image: pandas groupby output]

The maximum sales for guns occurred on Sat, so why does pandas show Sun?

Upvotes: 0

Views: 493

Answers (1)

user17242583

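max, when called on a groupby, computes the maximum of each column independently within each group. So 10 is the largest of [5, 10, 5], and Sun is the largest (alphabetically) of ['Fri', 'Sat', 'Sun']. The values in one row of the result therefore do not have to come from the same row of the original DataFrame.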
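
To make the per-column behaviour concrete, here is a small sketch that reuses the question's data and reduces the guns rows one column at a time. The variable names (guns, max_day, max_sales, grouped) are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
})

# Look at the 'guns' group on its own.
guns = df[df['item'] == 'guns']

# Each column is reduced separately, so the two results
# do not have to come from the same row.
max_day = guns['Days'].max()      # string comparison: 'Sun' > 'Sat' > 'Fri'
max_sales = guns['sales'].max()   # numeric comparison

print(max_day, max_sales)

# groupby('item').max() applies the same column-wise reduction to every group.
grouped = df.groupby('item').max()
print(grouped.loc['guns'])
```

Here the largest sales value (10) belongs to the Sat row, but the largest string in Days is Sun, which is why the two end up side by side in the grouped result.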

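I think you want to use idxmax together with .loc. idxmax returns the index label of the row with the highest sales in each group, and .loc then selects those whole rows: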

filtered = df.loc[df.groupby('item')['sales'].idxmax()]

Output:

     item Days  sales
0   apple  Mon    100
5    guns  Sat     10
2  orange  Wed    200
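
If it helps to see the intermediate step, the one-liner can be split in two. This is just a sketch on the question's data; the name idx is illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
})

# Step 1: for each item, find the index label of the row with the highest sales.
idx = df.groupby('item')['sales'].idxmax()
print(idx)

# Step 2: use those labels to pull the complete rows out of the original frame,
# so Days and sales stay paired as they were recorded.
filtered = df.loc[idx]
print(filtered)
```

Note that if two rows in a group tie for the highest sales, idxmax returns the first of them, so only one row per item is kept.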

Upvotes: 2
