nwaldo
nwaldo

Reputation: 187

How to add Multilevel Columns and create new column?

I am trying to create a "total" column in my dataframe

idx = pd.MultiIndex.from_product([['Room 1','Room 2', 'Room 3'],['on','off']])
df = pd.DataFrame([[1,4,3,6,5,15], [3,2,1,5,1,7]], columns=idx)

My dataframe

    Room 1  Room 2  Room 3
    on  off on  off on  off
0   1   4   3   6   5   15
1   3   2   1   5   1   7

For each room, I want to create a total column and then a on% column.

I have tried the following, however, it does not work.

df.loc[:, slice(None), "total" ] = df.xs('on', axis=1,level=1) + df.xs('off', axis=1,level=1) 

Upvotes: 2

Views: 543

Answers (3)

Shubham Sharma
Shubham Sharma

Reputation: 71689

Let us try something fancy ~

df.stack(0).eval('total=on + off \n on_pct=on / total').stack().unstack([1, 2])

     Room 1                   Room 2                      Room 3                  
     off   on total on_pct    off   on total    on_pct    off   on total on_pct
0    4.0  1.0   5.0    0.2    6.0  3.0   9.0  0.333333   15.0  5.0  20.0  0.250
1    2.0  3.0   5.0    0.6    5.0  1.0   6.0  0.166667    7.0  1.0   8.0  0.125

Upvotes: 2

Zach Flanders
Zach Flanders

Reputation: 1304

This groups by the column's first index (rooms) and then loops through each group to add the total and percent on. The final step is to reindex using the unique rooms:

import pandas as pd

idx = pd.MultiIndex.from_product([['Room 1','Room 2', 'Room 3'],['on','off']])
df = pd.DataFrame([[1,4,3,6,5,15], [3,2,1,5,1,7]], columns=idx)

for room, group in df.groupby(level=0, axis=1):
    df[(room, 'total')] = group.sum(axis=1)
    df[(room, 'pct_on')] = group[(room, 'on')] / df[(room, 'total')]

result = df.reindex(columns=df.columns.get_level_values(0).unique(), level=0)

Output:

      Room 1                  Room 2                     Room 3                 
      on off total pct_on     on off total    pct_on     on off total pct_on
0      1   4     5    0.2      3   6     9  0.333333      5  15    20  0.250
1      3   2     5    0.6      1   5     6  0.166667      1   7     8  0.125

Upvotes: 0

INGl0R1AM0R1
INGl0R1AM0R1

Reputation: 1628

Oof this was a roughie, but you can do it like this if you want to avoid loops. Worth noting it redefines your df twice because i need the total columns. Sorry about that, but is the best i could do. Also if you have any questions just comment.

df = pd.concat([y.assign(**{'Total {0}'.format(x+1): y.iloc[:,0] + y.iloc[:,1]})for x , y in df.groupby(np.arange(df.shape[1])//2,axis=1)],axis=1)


df = pd.concat([y.assign(**{'Percentage_Total{0}'.format(x+1): (y.iloc[:,0] / y.iloc[:,2])*100})for x , y in df.groupby(np.arange(df.shape[1])//3,axis=1)],axis=1)
print(df)

Upvotes: 1

Related Questions