scrollex
scrollex

Reputation: 2713

Count all categories in pandas data frame and add their values

I've got a data frame that looks something along these lines:

           Dog_breed       Dog_name             Points 
      ============================================================
Monday        Pug            George               12
Tuesday       Poodle         Fido                -15
Wednesday     Pug            Buddy                 8
Thursday      Dachshund      Lyla                -10
Friday        Poodle         Remi                  9
Saturday      Pug            Kermit               -5
Sunday        Beagle         Sara                  3

For every breed of dog (i.e., category), I need to get the sum of the points — for some rows, it's positive, and for some, it's negative — and the number of dogs of that breed.

The goal is to add up all the points for each breed, count the number of times each breed occurs in the original data set, and end up with a data frame that looks like this:

             Quantity     Overall_point_change
          ============================================
Pug             3                  15
Poodle          2                  -6
Dachshund       1                 -10
Beagle          1                   3

Can pandas do this, or does it require a loop? Really struggling to wrap my head around this — can anyone recommend a way to set this up?

Upvotes: 2

Views: 3696

Answers (2)

Woody Pride
Woody Pride

Reputation: 13955

DF.groupby('Dog_breed').agg({'Points': sum, 'Dog_name' : 'count'})

gives perhaps a slightly better outcome? Only because it specifies the aggregation function applied to each column.

Upvotes: 2

Def_Os
Def_Os

Reputation: 5467

You're looking for pandas' groupby.

df.groupby('Dog_breed').agg(['count', 'sum'])

Read the following for a firmer understanding: http://pandas.pydata.org/pandas-docs/stable/groupby.html

Upvotes: 7

Related Questions