rpb
rpb

Reputation: 3299

How to groupby two columns and calculate the summation of rows using Pandas?

I have a pandas data frame df like:

Name  Hour Activity
    A   4   TT
    A   3   TT
    A   5   UU
    B   1   TT
    C   1   TT
    D   1   TT
    D   2   TT
    D   3   UU
    D   4   UU

The next step is to get the summation if the rows have identical value of the column Name and Activity.

For example, for the case Name: A and Activity: TT will give the summation of 7

The result is the presented as below

    TT  UU
A   7   5
B   1   0
C   1   0
D   3   7

Is it possible to do something like this using pandas groupby?

Upvotes: 2

Views: 107

Answers (2)

David Erickson
David Erickson

Reputation: 16683

A pivot_table would also work here:

df = pd.pivot_table(df, index='Name', columns='Activity', values='Hour', aggfunc='sum', fill_value=0)
df
Out[1]: 
Activity  TT  UU
Name            
A          7   5
B          1   0
C          1   0
D          3   7

From there, you could do some further clean up to get to:

df.columns = [''.join(col) for col in df.columns]
df = df.reset_index()
df
Out[2]: 
  Name  TT  UU
0    A   7   5
1    B   1   0
2    C   1   0
3    D   3   7

Upvotes: 1

Andy L.
Andy L.

Reputation: 25239

Try groupby.sum and unstack

df_final = df.groupby(['Name', 'Activity']).Hour.sum().unstack(fill_value=0)

Out[177]:
Activity  TT  UU
Name
A          7   5
B          1   0
C          1   0
D          3   7

Upvotes: 2

Related Questions