Mr. Engineer

Reputation: 375

pandas: How to pivot multiple columns and calculate their sum?

I have a DataFrame like this:

Team      Player      Goals   YellowCards   RedCards
Team1     Player1     2       1             1
Team1     Player2     3       1             0
Team2     Player3     2       2             1

I'm trying to calculate the sum of Goals, YellowCards and RedCards for each team and put the result in a new DataFrame. I have tried:

pd.crosstab(df['Team'],[df['Goals'],df['YellowCards'],df['RedCards']], aggfunc='sum')

But it's not working. Preferably I would like to do this with either the crosstab or the pivot_table function. Any advice is highly appreciated.

Upvotes: 2

Views: 8301

Answers (2)

I added a Totals column (the sum across each row) and a Grand Total row (the sum of each column):

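import pandas as pd

data=[('Team1','Player1',       2,             1,                    1),
('Team1','Player2',       3,             1,                    0),
('Team2','Player3',       2,             2,                    1)]

df=pd.DataFrame(data=data,columns=['Team','Player','Goals', 'YellowCards','RedCards'])

# Sum only the numeric stat columns; 'Player' is text and should stay out of the totals
fp=df.pivot_table(index='Team',values=['Goals','RedCards','YellowCards'],aggfunc='sum')
# Row-wise total across the three stats
fp['Totals'] = fp.sum(axis='columns')
# Append a row holding the sum of each column
fp.loc['Grand Total', :] = fp.sum()
print(fp)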

Output:

             Goals  RedCards  YellowCards  Totals
Team                                             
Team1          5.0       1.0          2.0     8.0
Team2          2.0       1.0          2.0     5.0
Grand Total    7.0       2.0          4.0    13.0
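A possible variation, offered only as a sketch and not as part of the original answer: pivot_table has a margins option that can append the column-sum row itself. The label 'Grand Total' and the helper list `stats` are choices made here for illustration; the row totals are still added by hand.

```python
import pandas as pd

df = pd.DataFrame(
    [("Team1", "Player1", 2, 1, 1),
     ("Team1", "Player2", 3, 1, 0),
     ("Team2", "Player3", 2, 2, 1)],
    columns=["Team", "Player", "Goals", "YellowCards", "RedCards"],
)

# Hypothetical helper list: the numeric columns to aggregate.
stats = ["Goals", "RedCards", "YellowCards"]

# margins=True appends one extra row holding the per-column sums;
# margins_name sets its label (the default label is "All").
fp = df.pivot_table(index="Team", values=stats, aggfunc="sum",
                    margins=True, margins_name="Grand Total")

# Row totals across the three stat columns, added manually.
fp["Totals"] = fp[stats].sum(axis="columns")
print(fp)
```

Because the margin row is produced by pivot_table rather than assigned afterwards, this avoids enlarging the frame with .loc, though whether that matters depends on the use case.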

Upvotes: 0

jezrael

Reputation: 862791

Since you want to use DataFrame.pivot_table, the simplest solution is:

df = df.pivot_table(index='Team',aggfunc='sum')
print (df)
       Goals  RedCards  YellowCards
Team                               
Team1      5         1            2
Team2      2         1            2

This works the same way as a groupby sum aggregation:

df = df.groupby('Team').sum()

EDIT: If you need to specify the columns:

df = df.pivot_table(index='Team',aggfunc='sum',values=['Goals','RedCards','YellowCards'])
print (df)
       Goals  RedCards  YellowCards
Team                               
Team1      5         1            2
Team2      2         1            2

This works the same way as:

df = df.groupby('Team')[['Goals','RedCards','YellowCards']].sum()
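To check the equivalence on the sample data from the question, here is a minimal self-contained sketch. It keeps the original df untouched and stores each result under its own name (`by_pivot`, `by_groupby` and `stats` are names chosen here for illustration). Selecting the numeric columns explicitly also avoids depending on how a given pandas version treats the text column Player.

```python
import pandas as pd

df = pd.DataFrame(
    [("Team1", "Player1", 2, 1, 1),
     ("Team1", "Player2", 3, 1, 0),
     ("Team2", "Player3", 2, 2, 1)],
    columns=["Team", "Player", "Goals", "YellowCards", "RedCards"],
)

# Hypothetical helper list: the numeric columns to sum per team.
stats = ["Goals", "RedCards", "YellowCards"]

# Approach 1: pivot_table with an explicit list of value columns.
by_pivot = df.pivot_table(index="Team", values=stats, aggfunc="sum")

# Approach 2: groupby on the same columns.
by_groupby = df.groupby("Team")[stats].sum()

print(by_pivot)
print(by_groupby)

# Compare the two results cell by cell.
same = by_pivot.to_dict() == by_groupby.to_dict()
print(same)
```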

Upvotes: 3
