Reputation: 3719

Pandas sub- and total of pivot tables with multiindex

I am trying to add a new column with subtotals and a final column with totals. For example,

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
               "B": ["one", "one", "one", "two", "two","one", "one", "two", "two"],
               "C": ["small", "large", "large", "small","small", "large", "small", "small", "large"],
               "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
               "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})

i.e:

     A    B      C  D  E
0  foo  one  small  1  2
1  foo  one  large  2  4
2  foo  one  large  2  5
3  foo  two  small  3  5
4  foo  two  small  3  6
5  bar  one  large  4  6
6  bar  one  small  5  8
7  bar  two  small  6  9
8  bar  two  large  7  9

Now I pivot:

table = pd.pivot_table(df, values=['D',"E"], index=['A'],columns=['C'])

and add the totals:

table['total'] = table.sum(axis=1)
for t in ["D", "E"]:
   table[t, "partial_total"]  = table[t].sum(axis=1)

While this numerically works, visually it's annoying. I would like to have all the data for D (including the partial_total), then E, then the total. Here's my resulting df:

        D               E                total             D             E
C   large     small large     small            partial_total partial_total
A                                                                         
bar   5.5  5.500000   7.5  8.500000  27.000000     11.000000     16.000000
foo   2.0  2.333333   4.5  4.333333  13.166667      4.333333      8.833333

how do I group together the values for the same (top level) columns?

Upvotes: 1

Answers (3)

Scott Boston

Reputation: 153500

Try this using pd.concat:

table = pd.pivot_table(df, values=['D',"E"], index=['A'],columns=['C'])
table.columns = [f'{i}_{j}' for i, j in table.columns]
pd.concat([table,
           table.sum(axis=1, level=0).add_suffix('_partial_total'),
           table.sum(axis=1).to_frame(name='total')], axis=1)

Output:

     D_large   D_small  E_large   E_small  D_large_partial_total  D_small_partial_total  E_large_partial_total  E_small_partial_total      total
A                                                                                                                                               
bar      5.5  5.500000      7.5  8.500000                    5.5               5.500000                    7.5               8.500000  27.000000
foo      2.0  2.333333      4.5  4.333333                    2.0               2.333333                    4.5               4.333333  13.166667

Upvotes: 2

rafaelc

Reputation: 59274

Trying to perform operations before the pivot_table

g = df.groupby(['A', 'C'])[['D', 'E']]

d = (g.sum()/g.count()).reset_index()
m = d.groupby('A', as_index=False).sum().assign(C='partial')

final = pd.concat([m, d]).pivot_table(index='A', columns='C')

        D                          E                     
C   large     small    partial large     small    partial
A                                                        
bar   5.5  5.500000  11.000000   7.5  8.500000  16.000000
foo   2.0  2.333333   4.333333   4.5  4.333333   8.833333

To answer specifically your last question

how do I group together the values for the same (top level) columns?

You may just sort_index

table.sort_index(axis=1)

        D                             E                              total
C   large partial_total     small large partial_total     small           
A                                                                         
bar   5.5     11.000000  5.500000   7.5     16.000000  8.500000  27.000000
foo   2.0      4.333333  2.333333   4.5      8.833333  4.333333  13.166667

Upvotes: 1

Quang Hoang

Reputation: 150785

You can pivot with margin:

new_df = (df.pivot_table(index='A', columns='C', 
                         values=['D','E'], aggfunc='sum',
                         margins=True, margins_name='partial_total')
   .assign(total=lambda x: x.loc[:, (slice(None),'partial_total')].sum(1))
)

Output:

                D                               E                               total
C               large   small   partial_total   large   small   partial_total   
A                           
bar             11      11      22              15      17      32              54
foo             4       7       11              9       13      22              33
partial_total   15      18      33              24      30      54              87

Upvotes: 2

Pandas sub- and total of pivot tables with multiindex

Answers (3)

Related Questions