Reputation: 3
I have a data frame:
product = DataFrame({'_product': ['shoes','dress','cap','shoes','purse','t-shirt','t-shirt','dress','t-shirt'],
'city': ['A','A','A','B','A','A','B','C','A'],
'color':['red','black','black','white','black','green','white','yellow','blue'],
'size':['36','S','M','40','-','L','L','M','S'],
'param1':['x0001','x0008','x0006','x0002','x0001','x0009','x0011','x0003','x0001'],
'param2':[23,1,367,689,35,97,100,44,15],
'param3':['f1','t1','u7','f1','r4','f2','f2','t2','f4'],
'counter':[1,1,1,1,1,1,1,1,1]})
table=product[['_product','city','color','size','param1','param2','param3','counter']]
Applying
pivot_product=pivot_table(table,values=['counter'],rows=['_product','city','color','size','param1','param2','param3'],aggfunc=[np.sum],fill_value=0,margins=True)
I get a pivot table with Grand Total row only ("All").
This is a hypothetical sample, in reality I import a table with 100 000 rows and 20 columns.
!! It is absolutely necessary for me to have subtotals on the product level.
Is there any efficient way to insert rows with subtotals into this table just as Excel pivot table with Field Setting>Layout & Print>"Show item labels in tabular form" allows to do?
Upvotes: 0
Views: 1281
Reputation: 35255
I'm not familiar with that operation in Excel, but here're a one-liner to compute subtotals by product.
In [43]: pivot_product['subtotals'] = pivot_product[('sum', 'counter')].groupby(level=0).transform(np.sum)
In [44]: pivot_product
Out[44]:
sum subtotals
counter
_product city color size param1 param2 param3
cap A black M x0006 367 u7 1 1
dress A black S x0008 1 t1 1 2
C yellow M x0003 44 t2 1 2
purse A black - x0001 35 r4 1 1
shoes A red 36 x0001 23 f1 1 2
B white 40 x0002 689 f1 1 2
t-shirt A blue S x0001 15 f4 1 3
green L x0009 97 f2 1 3
B white L x0011 100 f2 1 3
All 9 9
It could be that you want np.size
where I use np.count
, depending on what the 'counter' column means.
Upvotes: 1