get_dummies and count together

Question

I have a dataframe with different "cases" as rows, in which there's an id and a category:

import pandas as pd

df = pd.DataFrame({'id': [1122, 3344, 5566, 5566, 3344, 5566, 1122, 3344],
                   'category': ['health', 'transport', 'energy', 'energy', 'transport', 'transport', 'transport', 'energy']})

    category    id
0   health      1122
1   transport   3344
2   energy      5566
3   energy      5566
4   transport   3344
5   transport   5566
6   transport   1122
7   energy      3344

I'm trying to find a good way to get dummies of the categories and count them per id, so with the above example I would get this:

      health  transport  energy
1122       1          1       0
3344       0          2       1
5566       0          1       2

Any ideas?

MaxU - stand with Ukraine · Accepted Answer

You can use the pivot_table() method:

In [71]: df.pivot_table(index='id', columns='category', aggfunc='size', fill_value=0)
Out[71]:
category  energy  health  transport
id
1122           0       1          1
3344           1       0          2
5566           2       0          1

Or, to drop the "category" name from the column axis:

In [76]: df.pivot_table(index='id', columns='category', aggfunc='size', fill_value=0).rename_axis(None, axis=1)
Out[76]:
      energy  health  transport
id
1122       0       1          1
3344       1       0          2
5566       2       0          1
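
For comparison, here is a minimal self-contained sketch of two other ways to get the same counts. Neither is part of the answer above: pd.crosstab is swapped in as an alternative to pivot_table(), and the get_dummies() plus groupby() variant follows the wording of the question title. The names counts, via_dummies and cols are only illustrative, and the column reordering assumes you want the order shown in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1122, 3344, 5566, 5566, 3344, 5566, 1122, 3344],
    'category': ['health', 'transport', 'energy', 'energy',
                 'transport', 'transport', 'transport', 'energy'],
})

# Column order requested in the question (illustrative, not required).
cols = ['health', 'transport', 'energy']

# Alternative 1: cross-tabulate id against category.
# Each cell holds the number of rows with that (id, category) pair.
counts = pd.crosstab(df['id'], df['category'])[cols]

# Alternative 2: one indicator column per category, then sum per id.
via_dummies = pd.get_dummies(df['category']).groupby(df['id']).sum()[cols]

print(counts)
print(via_dummies)
```

Both results are indexed by id with one column per category, so they should match the pivot_table() output apart from column order.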
