Reputation: 325
I would like to be able to show the sub-total column from a multi-index pivot table in different ways for example, I would like to show the sum for a selected row and the max for another, is this possible?
I managed to get half code correct but I am stuck in replicating the code without offsetting the previous one and I am not able to loop this code over.
In my example I want to get the max value from Toyota and the sum value from Honda shown in the newly created Total column.
cars = {'Brand': ['Honda','Toyota', 'Honda','Toyota'],
'Target': ['A','B', 'A','B'],
'Speed': [20, 80, 30 , 10],
'Date' : ['13/02/2019', '18/02/2019', '18/02/2019', '13/02/2019']
}
df = pd.DataFrame(cars)
table = pd.pivot_table(df, values=['Speed'],
index=['Target', 'Brand'],
columns=['Date'],
fill_value=0, aggfunc=np.sum, dropna=True)
table
the code craeted: (which works only for the last line as it overwrites the first one)
table['Total'] = table.loc(axis=0)[:, ['Toyota']].max(axis=1)
table['Total'] = table.loc(axis=0)[:, ['Honda']].sum(axis=1)
Current output:
Disired Output:
I would like to be able to see also the max value for Toyota which would be 80.
Upvotes: 1
Views: 118
Reputation: 863226
Use slicers
for set new values in both sides, here :
means all values for levels:
idx = pd.IndexSlice
table.loc[idx[:, 'Toyota'], 'Total'] = table.max(axis=1)
table.loc[idx[:, 'Honda'], 'Total'] = table.sum(axis=1)
print (table)
Speed Total
Date 13/02/2019 18/02/2019
Target Brand
A Honda 20 30 50.0
B Toyota 10 80 80.0
You can set and select in both sides:
idx = pd.IndexSlice
table.loc[idx[:, 'Toyota'], 'Total'] = table.loc[idx[:, 'Toyota'], :].max(axis=1)
table.loc[idx[:, 'Honda'], 'Total'] = table.loc[idx[:, 'Honda'], :].sum(axis=1)
Upvotes: 2