Pandas dataframe group by doesn't remove grouped key

Question

I'm trying to follow an example of groupby from the documentation here. As per the example, I first create a data frame:

df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})

Now, let's group by the column labeled "A" and sum the other two by its values:

df.groupby('A').sum()

This does the reasonable thing, grouping by "A" and producing:

   B   C
A
a  3  10
b  3   5

Now, let's try the same thing, but explicitly define the sum() function:

df.groupby('A', group_keys=False).apply(lambda x: np.sum(x))

This, for some inexplicable reason, also applies the function to the entries of the "A" column:

    A  B   C
A
a  aa  3  10
b   b  3   5

Other numeric functions (like square) then throw errors, since they are applied to the strings. In fact, this causes the examples provided in the link above to fail.

I tried Python 2.7 and 3.6 with the same results. How can I make it do the intelligent thing and not apply the function to the column I am grouping by?

akuiper · Accepted Answer

There's probably no intelligent way for groupby.apply to do that, other than dropping the group variable inside apply:

df.groupby('A').apply(lambda g: g.drop('A', axis=1).sum())

#   B   C
#A
#a  3  10
#b  3   5
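
As a possible alternative to dropping the key inside apply, the value columns can be selected on the groupby object first, so the function never sees "A". The sketch below is not part of the original answer; it assumes the same df as in the question, and the names by_selection and by_agg are only illustrative. It also shows agg, which calls the function on each non-key column separately.

```python
import pandas as pd

df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})

# Select the value columns before apply, so each group passed to the
# function contains only B and C (the grouping key stays in the index).
by_selection = df.groupby('A')[['B', 'C']].apply(lambda g: g.sum())

# agg calls the function once per non-key column of each group,
# so the grouping column is never passed to it.
by_agg = df.groupby('A').agg(lambda s: s.sum())

print(by_selection)
print(by_agg)
```

Both results should match df.groupby('A').sum() for this data. Selecting columns explicitly also makes it clear which columns the function is meant to operate on.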
