Reputation: 5299
Assume I have df like below
bin=[0,5,10]
sex age num
1 4 11
1 3 12
2 2 13
2 10 14
I would like to sum in num and count age. First I tried
df.groupby([df.sex,pd.cut(df.age,bin)]).size()
or
df.groupby([df.sex,pd.cut(df.age,bin)]).sum()
But I couldnt figure out how to aggregate and transform them. I want the following result.like
sex age num
1 [0,5) 2 23
1 [5,10) 0 0
2 [0,5) 1 13
2 [5,10) 1 14
If I can get this result in simple way,please someone tell me how to transform original dataframe.
Upvotes: 2
Views: 54
Reputation: 862701
Use agg
and then add missing values by unstack
+ stack
:
df = df.groupby([df.sex,pd.cut(df.age,bin)])
.agg({'num':'sum', 'age':'size'})
.unstack(fill_value=0)
.stack()
print (df)
num age
sex age
1 (0, 5] 23 2
(5, 10] 0 0
2 (0, 5] 13 1
(5, 10] 14 1
Or create MultiIndex
and reindex
:
bin=[0,5,10]
cats = pd.cut(df.age,bin)
mux = pd.MultiIndex.from_product([df.sex.unique(), cats.cat.categories])
df = df.groupby([df.sex,cats]).agg({'num':'sum', 'age':'size'}).reindex(mux, fill_value=0)
print (df)
num age
1 (0, 5] 23 2
(5, 10] 0 0
2 (0, 5] 13 1
(5, 10] 14 1
Upvotes: 1