Richard
Richard

Reputation: 65610

Insert new column values in dataframe with MultiIndex?

I have a pandas dataframe that looks like this:

          rank     num      rank    num
          2015     2015     2014    2014   
France    8        1200     9       1216   
Italy     11       789      10      788    

I want to add a new column in the multi-index, called corrected_num, and I want the value of this column to be the value of num for the corresponding row, divided by a number obtained from another dictionary, which looks like this:

b = {2015: 10, 2014: 12}

In other words, I'd like to end up with something like this:

          rank     num      num_corrected    rank    num    num_corrected
          2015     2015     2015             2014    2014   2014
France    8        1200     120              9       1216   101.3
Italy     11       789      78.9             10      788    65.6

So far my approach is to iterate over each row in the dataframe, then over each column in the row, something like this:

for i, row in df.iterrows():
    for year in df.num.columns:
        df.set_value(i, 'num_corrected, ' + year, row.frequency[year] / b[year])

But when I try this my Jupyter notebook freezes, so I'm hoping there's a better way!

Upvotes: 2

Views: 767

Answers (1)

piRSquared
piRSquared

Reputation: 294586

setup

df = pd.DataFrame(
    [
        [8, 1200, 9, 1216],
        [11, 789, 10, 788]
    ],
    ['France', 'Italy'],
    pd.MultiIndex.from_product([['rank', 'num'], [2015, 2014]])
).sort_index(axis=1, level=1)

Make b a series

b = pd.Series({2015: 10, 2014: 12})

method 1

num_c = df.num / b

cols = num_c.columns
num_c.columns = [['num_corrected'] * len(cols), cols]

pd.concat([df, num_c], axis=1)

enter image description here

method 2

d1 = df.stack()
d1['num_corrected'] = d1.num / d1.index.get_level_values(1).to_series().map(b).values
d1.unstack().sort_index(axis=1, level=1)

enter image description here

Upvotes: 1

Related Questions