extraymond
extraymond

Reputation: 626

Pandas pivot_table multiple aggfunc with margins

I've noticed that I can't set margins=True when having multiple aggfunc such as ("count","mean","sum").

It will vomit KeyError: 'Level None not found'

This is the example code.

df.pivot_table(values=A,index=[B,C,D],columns=E,aggfunc=("count","mean","sum"),
margins=True,margins_name="Total",fill_value=0)

Update:

This is the sample df:

[{'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 1, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 1, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 1, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'}]

And the code throwing errors.

df.pivot_table(values="Results",index="Game_ID",
columns="Team",aggfunc=("count","mean","sum"),margins=True)

Upvotes: 1

Views: 8333

Answers (1)

Dan Fiorino
Dan Fiorino

Reputation: 418

I see the error you are talking about. I got around it by using the function calls instead of the string names "count","mean", and "sum."

First, we start with your dataframe:

import pandas as pd

df=pd.DataFrame([{'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 1, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'},
 {'Game_ID': 'no.1', 'Results': 1, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 1, 'Team': 'A'},
 {'Game_ID': 'no.1', 'Results': 0, 'Team': 'B'}])

Then just replace the aggregate functions with standard library call to len and the numpy aggregate functions.

  • "count" becomes len
  • "mean" becomes np.mean
  • "sum" becomes np.sum

The pivot table is made with the following lines:

import numpy as np
df.pivot_table(values="Results",
               index="Game_ID",
               columns="Team",
               aggfunc=[len,np.mean,np.sum],
               margins=True)

Output pivot table

Note, len might not be what you want, but in this example it gives the same answer as "count" would on its own. Look at numpy.count_nonzero, for example.

Upvotes: 4

Related Questions