user2233035
user2233035

Reputation: 3

pivot table subtotals

I have a data frame:

product = DataFrame({'_product': ['shoes','dress','cap','shoes','purse','t-shirt','t-shirt','dress','t-shirt'], 
             'city': ['A','A','A','B','A','A','B','C','A'],
             'color':['red','black','black','white','black','green','white','yellow','blue'],
             'size':['36','S','M','40','-','L','L','M','S'],
             'param1':['x0001','x0008','x0006','x0002','x0001','x0009','x0011','x0003','x0001'],
             'param2':[23,1,367,689,35,97,100,44,15],
             'param3':['f1','t1','u7','f1','r4','f2','f2','t2','f4'],
             'counter':[1,1,1,1,1,1,1,1,1]})

table=product[['_product','city','color','size','param1','param2','param3','counter']]

Applying

pivot_product=pivot_table(table,values=['counter'],rows=['_product','city','color','size','param1','param2','param3'],aggfunc=[np.sum],fill_value=0,margins=True)

I get a pivot table with Grand Total row only ("All").

This is a hypothetical sample, in reality I import a table with 100 000 rows and 20 columns.

!! It is absolutely necessary for me to have subtotals on the product level.

Is there any efficient way to insert rows with subtotals into this table just as Excel pivot table with Field Setting>Layout & Print>"Show item labels in tabular form" allows to do?

Upvotes: 0

Views: 1281

Answers (1)

Dan Allan
Dan Allan

Reputation: 35255

I'm not familiar with that operation in Excel, but here're a one-liner to compute subtotals by product.

In [43]: pivot_product['subtotals'] = pivot_product[('sum', 'counter')].groupby(level=0).transform(np.sum)

In [44]: pivot_product
Out[44]: 
                                                    sum  subtotals
                                                counter           
_product city color  size param1 param2 param3                    
cap      A    black  M    x0006  367    u7            1          1
dress    A    black  S    x0008  1      t1            1          2
         C    yellow M    x0003  44     t2            1          2
purse    A    black  -    x0001  35     r4            1          1
shoes    A    red    36   x0001  23     f1            1          2
         B    white  40   x0002  689    f1            1          2
t-shirt  A    blue   S    x0001  15     f4            1          3
              green  L    x0009  97     f2            1          3
         B    white  L    x0011  100    f2            1          3
All                                                   9          9

It could be that you want np.size where I use np.count, depending on what the 'counter' column means.

Upvotes: 1

Related Questions