Heisenberg
Heisenberg

Reputation: 5299

How to aggregate with grouping in python

Assume I have df like below

bin=[0,5,10]

sex age num
1    4   11     
1    3   12
2    2   13
2    10  14

I would like to sum in num and count age. First I tried

df.groupby([df.sex,pd.cut(df.age,bin)]).size()

or

df.groupby([df.sex,pd.cut(df.age,bin)]).sum()

But I couldnt figure out how to aggregate and transform them. I want the following result.like

sex        age   num
 1 [0,5)    2     23   
 1 [5,10)   0      0
 2 [0,5)    1     13
 2 [5,10)   1     14

If I can get this result in simple way,please someone tell me how to transform original dataframe.

Upvotes: 2

Views: 54

Answers (1)

jezrael
jezrael

Reputation: 862701

Use agg and then add missing values by unstack + stack:

df = df.groupby([df.sex,pd.cut(df.age,bin)])
       .agg({'num':'sum', 'age':'size'})
       .unstack(fill_value=0)
       .stack()
print (df)
             num  age
sex age              
1   (0, 5]    23    2
    (5, 10]    0    0
2   (0, 5]    13    1
    (5, 10]   14    1

Or create MultiIndex and reindex:

bin=[0,5,10]
cats = pd.cut(df.age,bin)
mux = pd.MultiIndex.from_product([df.sex.unique(), cats.cat.categories])
df = df.groupby([df.sex,cats]).agg({'num':'sum', 'age':'size'}).reindex(mux, fill_value=0)
print (df)
           num  age
1 (0, 5]    23    2
  (5, 10]    0    0
2 (0, 5]    13    1
  (5, 10]   14    1

Upvotes: 1

Related Questions