sds

Reputation: 60054

Normalizing pandas DataFrame with multiindex

I need to normalize data within each group of the first (outer) level of a multi-index. Given

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(12).reshape(4,3), index=[["a","a","b","b"],[1,2,1,2]],
                  columns=["x","y","z"])

so that df is

     x   y   z
a 1  0   1   2
  2  3   4   5
b 1  6   7   8
  2  9  10  11

I need to divide every value by the sum of its column within the same 1st-level group of the index, to get

     x       y      z
a 1  0      1/5    2/7
  2  1      4/5    5/7
b 1  6/15   7/17   8/19
  2  9/15  10/17  11/19

(although, obviously, with floats instead of ratios)

I suppose I could do something by iterating over columns and values of the 1st level of multi-index, but I am sure there is a one-liner...

Upvotes: 2

Views: 103

Answers (1)

sammywemmy

Reputation: 28709

One option is a groupby on the first index level, using transform('sum') so that the group sums keep the same shape as df:

df/df.groupby(level=0).transform('sum')
Out[87]: 
       x         y         z
a 1  0.0  0.200000  0.285714
  2  1.0  0.800000  0.714286
b 1  0.4  0.411765  0.421053
  2  0.6  0.588235  0.578947
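
To show why the one-liner works, here is a minimal self-contained sketch that splits it into steps. The names group_sums, normalized, and check are just illustrative, not part of the original answer. The idea is that transform('sum') returns a frame indexed like df, so a plain division lines up row by row.

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=[["a", "a", "b", "b"], [1, 2, 1, 2]],
    columns=["x", "y", "z"],
)

# Column sums per outer-level group, broadcast back to df's shape and index.
group_sums = df.groupby(level=0).transform("sum")

# Element-wise division; both frames share the same index and columns.
normalized = df / group_sums

# Sanity check: within each outer-level group, every column should sum to 1.
check = normalized.groupby(level=0).sum()
```

Note that a group whose column sums to zero would give NaN or inf after the division; that does not happen with this data, but it may need handling on real data.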

Upvotes: 2
