cmal

Reputation: 2222

How to aggregate some of the levels in a deeply nested `groupby` in pandas?

I am doing the exercise at https://repl.it/@freeCodeCamp/fcc-medical-data-visualizer, and I am stuck on a groupby problem.

I now have a tree-like, nested groupby result, and I want the total count for some of the sublevels:

>>> m.groupby(['cardio', 'cholesterol', 'gluc', 'smoke', 'alco', 'active', 'overweight']).size()
cardio  cholesterol  gluc  smoke  alco  active  overweight
0       0            0     0      0     0       0              36400
                                        1       0             158936
                                  1     0       0                752
                                        1       0               4056
                           1      0     0       0               1960
                                        1       0              11640
                                  1     0       0                744
                                        1       0               5544
                     1     0      0     0       0               2440
                                        1       0              10104
                                  1     0       0                 88
                                        1       0                456
                           1      0     0       0                152
                                        1       0                896
                                  1     0       0                 40
                                        1       0                432
        1            0     0      0     0       0               4056
                                        1       0              18792
                                  1     0       0                184
                                        1       0               1144
                           1      0     0       0                320
                                        1       0               1584
                                  1     0       0                152
                                        1       0                888
                     1     0      0     0       0               3400
                                        1       0              12832
                                  1     0       0                112
                                        1       0                496
                           1      0     0       0                152
                                        1       0                976
                                                               ...
1       0            0     0      1     0       0                552
                                        1       0               2968
                           1      0     0       0               1792
                                        1       0               7536
                                  1     0       0                704
                                        1       0               2792
                     1     0      0     0       0               2840
                                        1       0              10200
                                  1     0       0                 96
                                        1       0                488
                           1      0     0       0                216
                                        1       0                824
                                  1     0       0                 72
                                        1       0                360
        1            0     0      0     0       0               9680
                                        1       0              41152
                                  1     0       0                352
                                        1       0               1992
                           1      0     0       0                792
                                        1       0               3536
                                  1     0       0                256
                                        1       0               1576
                     1     0      0     0       0               6688
                                        1       0              24848
                                  1     0       0                240
                                        1       0               1304
                           1      0     0       0                416
                                        1       0               1728
                                  1     0       0                216
                                        1       0                616
dtype: int64

For example, I want to sum the counts over cholesterol, gluc, smoke, alco, and active, so that only the cardio and overweight levels remain and I get the total count for each combination of 0 and 1 in those two columns. That would give me something like:

cardio  overweight
0       0          88888
        1          99999
1       0          77777
        1          66666

Upvotes: 0

Views: 35

Answers (1)

BENY

Reputation: 323326

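Sum over the index levels you want to keep; here those are levels 0 and 6 (cardio and overweight), with df being the Series returned by size():

df = df.groupby(level=[0, 6]).sum()  # pandas < 2.0 also accepted df.sum(level=[0, 6])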
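To see the level-based aggregation in action, here is a minimal sketch on a made-up frame. The toy data, the reduced column set, and the names `counts`, `by_level`, and `direct` are assumptions for illustration, not the real dataset. It uses `groupby(level=...)` because the `level` argument of `sum` was removed in pandas 2.0, and it refers to levels by name instead of position, which should be equivalent here.

```python
import pandas as pd

# Hypothetical toy data standing in for the medical dataset `m`
m = pd.DataFrame({
    "cardio":     [0, 0, 0, 1, 1, 1, 1, 0],
    "smoke":      [0, 1, 0, 0, 1, 1, 0, 0],
    "overweight": [0, 0, 1, 1, 1, 0, 1, 1],
})

# Nested counts, like the Series in the question (fewer levels)
counts = m.groupby(["cardio", "smoke", "overweight"]).size()

# Collapse the levels that are not wanted by grouping on the ones to keep
by_level = counts.groupby(level=["cardio", "overweight"]).sum()

# Alternative: group the original frame on just the two columns
direct = m.groupby(["cardio", "overweight"]).size()

print(by_level)
```

If the intermediate nested counts are not needed for anything else, grouping the original frame on just `cardio` and `overweight` gives the same totals in one step.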

Upvotes: 1
