Reputation: 401
How do I add a column to a Pandas DataFrame, by multiplying an existing column by a factor from an external dictionary looked up using values from a second column in the same DataFrame as keys?
I have a pd.DataFrame df roughly of the form
code blah... year nominal
0 T.rrr blah... 2012-13 2.08
1 J.rrr blah... 2008-09 1.09
2 1320 blah... 2008-09 1.38
3 1310 blah... 2010-11 1.20
4 1130 blah... 2010-11 1.22
I also have these factors as a dictionary:
factors = {'2008-09': 1.11075, '2010-11': 1.02947, '2012-13': 1.}
I want to add a column by multiplying each nominal value by the appropriate factor, which I want to look up in the external factors dictionary using a second column, df['year'], in the dataframe df as the key. I tried
df['real'] = df['nominal'] * factors[df['year']]
or
df['real'] = df.nominal * factors[df.year]
but both of those produce errors. I also tried things like
def nominal_to_real(df, nom_col='nominal', year_col='year', factors=factors):
    return df[nom_col] * factors[df[year_col]]
df['real'] = df.apply(nominal_to_real, axis=1)
which produces the following warning
/Users/.../anaconda/lib/python2.7/site-packages/ipykernel/main.py:5: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
What am I doing wrong? Surely this should be simple...
Thanks in advance
Upvotes: 1
Views: 3178
Reputation: 401
After some puzzlement, I can confirm that this is how you do it with a single line. The data here contains an extra column of units; I'm filtering out the 'Ratio' records, which I don't want to use. The key is to use .loc together with .map:
df.loc[df.loc[:,'unit']!='Ratio','real'] = df.loc[df.loc[:,'unit']!='Ratio','nominal'] * df.loc[df.loc[:,'unit']!='Ratio','year'].map(factors)
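The same idea may be easier to read with the row filter stored in a variable. This is only a sketch: the three-row frame and the 'Count' unit values are made up for illustration (only 'Ratio' comes from my data), and `mask` is just a name I picked.

```python
import pandas as pd

# Hypothetical sample data; only the 'Ratio' unit value comes from the answer above
df = pd.DataFrame({
    "year": ["2012-13", "2008-09", "2010-11"],
    "nominal": [2.08, 1.09, 1.20],
    "unit": ["Count", "Ratio", "Count"],
})
factors = {"2008-09": 1.11075, "2010-11": 1.02947, "2012-13": 1.0}

# Boolean mask selecting the rows to convert (everything except 'Ratio')
mask = df["unit"] != "Ratio"

# Look up the factor per row with map(), then assign only to the masked rows
df.loc[mask, "real"] = df.loc[mask, "nominal"] * df.loc[mask, "year"].map(factors)
```

Rows excluded by the mask are left as NaN in the new 'real' column.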
Upvotes: 0
Reputation: 251578
Use the map method of Series to look up in the dictionary:
df['nominal'] * df['year'].map(factors)
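For completeness, here is a minimal sketch that assigns the result to a new column, assuming a frame rebuilt from just the year and nominal columns shown in the question (the other columns are left out). Indexing the dict with a whole Series, as in factors[df['year']], fails because a Series cannot be used as a dictionary key, whereas map does the lookup one element at a time.

```python
import pandas as pd

# Rebuilt from the rows in the question; other columns omitted
df = pd.DataFrame({
    "year": ["2012-13", "2008-09", "2008-09", "2010-11", "2010-11"],
    "nominal": [2.08, 1.09, 1.38, 1.20, 1.22],
})
factors = {"2008-09": 1.11075, "2010-11": 1.02947, "2012-13": 1.0}

# map() replaces each year with its factor, producing a Series aligned with df
df["real"] = df["nominal"] * df["year"].map(factors)
```

Any year that is missing from the dictionary would map to NaN, so the corresponding 'real' value would be NaN as well.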
Upvotes: 4