Count all categories in pandas data frame and add their values

Question

I've got a data frame that looks something along these lines:

           Dog_breed       Dog_name             Points 
      ============================================================
Monday        Pug            George               12
Tuesday       Poodle         Fido                -15
Wednesday     Pug            Buddy                 8
Thursday      Dachshund      Lyla                -10
Friday        Poodle         Remi                  9
Saturday      Pug            Kermit               -5
Sunday        Beagle         Sara                  3

For every breed of dog (i.e., category), I need to get the sum of the points — for some rows, it's positive, and for some, it's negative — and the number of dogs of that breed.

The goal is to add up all the points for each breed, count the number of times each breed occurs in the original data set, and end up with a data frame that looks like this:

             Quantity     Overall_point_change
          ============================================
Pug             3                  15
Poodle          2                  -6
Dachshund       1                 -10
Beagle          1                   3

Can pandas do this, or does it require a loop? Really struggling to wrap my head around this — can anyone recommend a way to set this up?

Def_Os · Accepted Answer

You're looking for pandas' groupby.

df.groupby('Dog_breed').agg(['count', 'sum'])

Read the following for a firmer understanding: http://pandas.pydata.org/pandas-docs/stable/groupby.html

Count all categories in pandas data frame and add their values

Answers (2)

Related Questions