Reputation: 65610
I have a pandas dataframe that looks like this:
rank num rank num
2015 2015 2014 2014
France 8 1200 9 1216
Italy 11 789 10 788
I want to add a new column in the multi-index, called corrected_num
, and I want the value of this column to be the value of num
for the corresponding row, divided by a number obtained from another dictionary, which looks like this:
b = {2015: 10, 2014: 12}
In other words, I'd like to end up with something like this:
rank num num_corrected rank num num_corrected
2015 2015 2015 2014 2014 2014
France 8 1200 120 9 1216 101.3
Italy 11 789 78.9 10 788 65.6
So far my approach is to iterate over each row in the dataframe, then over each column in the row, something like this:
for i, row in df.iterrows():
for year in df.num.columns:
df.set_value(i, 'num_corrected, ' + year, row.frequency[year] / b[year])
But when I try this my Jupyter notebook freezes, so I'm hoping there's a better way!
Upvotes: 2
Views: 767
Reputation: 294586
setup
df = pd.DataFrame(
[
[8, 1200, 9, 1216],
[11, 789, 10, 788]
],
['France', 'Italy'],
pd.MultiIndex.from_product([['rank', 'num'], [2015, 2014]])
).sort_index(axis=1, level=1)
Make b
a series
b = pd.Series({2015: 10, 2014: 12})
method 1
num_c = df.num / b
cols = num_c.columns
num_c.columns = [['num_corrected'] * len(cols), cols]
pd.concat([df, num_c], axis=1)
method 2
d1 = df.stack()
d1['num_corrected'] = d1.num / d1.index.get_level_values(1).to_series().map(b).values
d1.unstack().sort_index(axis=1, level=1)
Upvotes: 1