tarski

Reputation: 249

Python Pandas groupby-apply strange behavior

Can anyone help me understand why there is different behavior between the two calls to apply below? Thank you.

In [34]: df
Out[34]: 
   A  B  C
0  1  0  0
1  1  7  4
2  2  9  8
3  2  2  4
4  2  2  1
5  3  3  3
6  3  3  2
7  3  5  7

In [35]: g = df.groupby('A')

In [36]: g.apply(max)
Out[36]: 
   A  B  C
A         
1  1  7  4
2  2  9  8
3  3  5  7

In [37]: g.apply(lambda x: max(x))
Out[37]: 
A
1    C
2    C
3    C
dtype: object

Upvotes: 2

Views: 174

Answers (1)

chrisb

Reputation: 52286

Short answer - you probably just want

df.groupby('A').max()

Longer answer - max is a generic Python function that returns the largest item of any iterable. Iterating over a DataFrame yields its column labels, so calling Python's max on one just returns the "largest" label (here 'C'), which is what happens in your second case.
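To make that concrete, here is a minimal sketch that rebuilds the frame from the question and applies the built-in max to each group by hand. The variable names (labels, largest_label, per_group, group_max) are only for illustration, and the loop stands in for what the lambda sees rather than reproducing pandas' internals.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({
    "A": [1, 1, 2, 2, 2, 3, 3, 3],
    "B": [0, 7, 9, 2, 2, 3, 3, 5],
    "C": [0, 4, 8, 4, 1, 3, 2, 7],
})

# Iterating over a DataFrame yields its column labels, not its rows.
labels = list(df)          # ['A', 'B', 'C']

# So the built-in max compares the labels as strings.
largest_label = max(df)    # 'C'

# Iterating over the groupby yields (key, sub-DataFrame) pairs; calling the
# built-in max on each sub-frame again just compares column labels.
per_group = {key: max(sub) for key, sub in df.groupby("A")}

# The groupby method computes the column-wise maximum within each group.
group_max = df.groupby("A").max()

print(labels, largest_label)
print(per_group)
print(group_max)
```

Every group maps to 'C' in per_group, matching Out[37], while group_max holds the per-group maxima of B and C.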

In the first case - pandas has intercept logic that turns calls like g.apply(sum) into g.sum(), so g.apply(max) is evaluated as g.max() and you get the column-wise maximum of each group.

Upvotes: 3
