Reputation: 136
I'm trying to group by a column in pandas and then add up the per-group sums.
Here is an example DataFrame and my expected output:
import pandas as pd

d = {'a': [1, 1, 1, 1, 2, 2, 2], 'b': [3, 4, 5, 6, 7, 8, 9]}
data = pd.DataFrame(data=d)
# should return the sum of the per-group sums of 'b'
# correct output would be 42
I know that I can get each group's sum, repeated on every row of that group, by using:
data.groupby('a')['b'].transform('sum')
# which returns
0 18
1 18
2 18
3 18
4 24
5 24
6 24
Name: b, dtype: int64
However, I'm not sure how to get the sum of the group sums, i.e.:
# sum of groupby
# group 1: 18
# group 2: 24
# sum of sum of groupby
# 18 + 24 = 42
Upvotes: 1
Views: 651
Reputation: 13349
You need to use agg in place of transform. transform returns one value per original row, while agg returns one value per group.
res = data.groupby('a')['b'].agg('sum').sum()
res:
42
data.groupby('a')['b'].agg('sum')
will give you one sum per group:
a
1 18
2 24
Name: b, dtype: int64
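To make the difference concrete, here is a minimal sketch that runs both calls side by side on the example data from the question. The variable names per_row, per_group and total are just illustrative, and it assumes a reasonably recent pandas where the string alias 'sum' is accepted by both methods.

```python
import pandas as pd

# Example data from the question.
data = pd.DataFrame({'a': [1, 1, 1, 1, 2, 2, 2],
                     'b': [3, 4, 5, 6, 7, 8, 9]})

# transform broadcasts each group's sum back onto every row of that group,
# so the result has the same length as the original frame (7 values).
per_row = data.groupby('a')['b'].transform('sum')

# agg collapses each group to a single value, indexed by the group key
# (2 values here: 18 for a == 1 and 24 for a == 2).
per_group = data.groupby('a')['b'].agg('sum')

# Summing the per-group sums gives the total asked for in the question.
total = per_group.sum()
print(total)
```

Note that summing the transform output instead would count each group's sum once per row, which is why it does not give 42. Also, when every row belongs to exactly one group and there are no missing keys in 'a', the sum of the group sums is the same as data['b'].sum(), so the groupby is only needed if you also want the intermediate per-group values.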
Upvotes: 2