Anu

Reputation: 197

groupby .sum() returns wrong value in pandas

I have a data frame as follows,

Category  Feature  valueCount
A         color    153
A         color    7
A         color    48
A         color    16
B         length   5
C         height   1
C         height   16

I want to get the sum of valueCount by Category and Feature. I am using the following code:

DF['valueSum'] = DF.groupby(['Category','Feature'])['valueCount'].transform('sum')

I am getting this output:

Category  Feature  valueCount  valueSum
A         color    153         26018
A         color    7           26018
A         color    48          26018
A         color    16          26018
B         length   5           25
C         height   1           257
C         height   16          257

This is really weird: valueSum looks like the sum of the squares of valueCount within each group (for example, 5 becomes 25, and 1 and 16 give 257) rather than the plain sum. Does anyone know what is going wrong here?
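The observation above can be checked numerically. The sketch below rebuilds the sample frame by hand (an assumption: the real DF may hold different dtypes or extra rows) and compares the plain per-group sum with a per-group sum of squares. It does not reproduce the problem; it only confirms that the reported numbers match the sum of squares and shows what the same call returns on a clean integer column.

```python
import pandas as pd

# Sample data copied from the question; the real DF may differ.
df = pd.DataFrame({
    "Category": ["A", "A", "A", "A", "B", "C", "C"],
    "Feature": ["color", "color", "color", "color", "length", "height", "height"],
    "valueCount": [153, 7, 48, 16, 5, 1, 16],
})

keys = ["Category", "Feature"]

# Plain per-group sum, broadcast back to every row of its group.
expected = df.groupby(keys)["valueCount"].transform("sum")

# Per-group sum of squares, computed only to compare with the reported output.
squares = (df["valueCount"] ** 2).groupby(
    [df["Category"], df["Feature"]]
).transform("sum")

print(expected.tolist())  # [224, 224, 224, 224, 5, 17, 17]
print(squares.tolist())   # [26018, 26018, 26018, 26018, 25, 257, 257]

# Inspecting the dtype of the real column is a cheap first diagnostic.
print(df["valueCount"].dtype)
```

Since the call behaves as expected on this frame, the cause is more likely in how the real DF was built than in transform('sum') itself.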

Upvotes: 2

Views: 1852

Answers (2)

bikuser

Reputation: 2093

The ideal way is to call sum() on the grouped frame directly:

In [4]: df

Out[4]: 
  Category Feature  valueCount
0        A   color         153
1        A   color           7
2        A   color          48
3        A   color          16
4        B  length           5
5        C  height           1
6        C  height          16

In [5]: df.groupby(df['Category']).sum()
Out[5]: 
          valueCount
Category            
A                224
B                  5
C                 17
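The output above groups by Category only, while the question asks for totals by Category and Feature. A minimal sketch of that variant follows, assuming the same sample frame. Selecting valueCount explicitly keeps the result independent of how a given pandas version treats non-numeric columns in sum().

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Category": ["A", "A", "A", "A", "B", "C", "C"],
    "Feature": ["color", "color", "color", "color", "length", "height", "height"],
    "valueCount": [153, 7, 48, 16, 5, 1, 16],
})

# Group by both keys and sum only the numeric column.
# as_index=False keeps Category and Feature as ordinary columns.
totals = df.groupby(["Category", "Feature"], as_index=False)["valueCount"].sum()

print(totals)
```

This returns one row per (Category, Feature) pair, unlike transform('sum'), which returns one value per original row.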

Upvotes: 1

aluriak

Reputation: 5847

According to the docs, GroupBy objects provide a sum method that does what you need:

In [12]: grouped.sum()
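The snippet above does not show how grouped is defined. One possible reading, sketched here as an assumption, is a GroupBy object over the question's two keys. The per-group totals can then be joined back onto the rows to get the valueSum column the question asks for. The names totals and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Category": ["A", "A", "A", "A", "B", "C", "C"],
    "Feature": ["color", "color", "color", "color", "length", "height", "height"],
    "valueCount": [153, 7, 48, 16, 5, 1, 16],
})

keys = ["Category", "Feature"]

# Assumed definition of `grouped`; the answer does not show it.
grouped = df.groupby(keys)["valueCount"]

# One total per group, indexed by (Category, Feature).
totals = grouped.sum().rename("valueSum")

# Attach each group's total to every row of that group.
result = df.join(totals, on=keys)

print(result)
```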

Upvotes: 1
