curlew77

Reputation: 401

Add a new column to a Pandas DataFrame by using values in another column to look up values in a dictionary

How do I add a column to a Pandas DataFrame, by multiplying an existing column by a factor from an external dictionary looked up using values from a second column in the same DataFrame as keys?

I have a pandas DataFrame df roughly of the form:

    code    blah...     year    nominal
0   T.rrr   blah...     2012-13     2.08
1   J.rrr   blah...     2008-09     1.09
2   1320    blah...     2008-09     1.38
3   1310    blah...     2010-11     1.20
4   1130    blah...     2010-11     1.22

I also have these factors as a dictionary:

factors = {'2008-09': 1.11075, '2010-11': 1.02947, '2012-13': 1.}

I want to add a column by multiplying each nominal value by the appropriate factor, which I want to look up in the external factors dictionary using a second column, df['year'], as the key. I tried

df['real'] = df['nominal'] * factors[df['year']]

or

df['real'] = df.nominal * factors[df.year]

but both of those produce errors. I also tried things like

def nominal_to_real(df, nom_col='nominal', year_col='year', factors=factors):
    return df[nom_col] * factors[df[year_col]]

df['real'] = df.apply(nominal_to_real, axis=1)

which produces the following warning:

/Users/.../anaconda/lib/python2.7/site-packages/ipykernel/main.py:5: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

What am I doing wrong? Surely this should be simple...

Thanks in advance

Upvotes: 1

Views: 3178

Answers (2)

curlew77

Reputation: 401

After some puzzlement, I can confirm that this is how you do it in a single line. The data here contains an extra column of units; I'm filtering out the 'Ratio' records, which I don't want to use. The key is to use the .loc indexer (described here) together with .map:

df.loc[df.loc[:,'unit']!='Ratio','real'] = df.loc[df.loc[:,'unit']!='Ratio','nominal'] * df.loc[df.loc[:,'unit']!='Ratio','year'].map(factors)
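The same assignment may be easier to read with the filter stored in a variable. This is only a sketch: the sample rows and the 'GBP' unit values are made up for illustration, and the name keep is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the 'unit' values are assumptions for illustration.
df = pd.DataFrame({
    'year': ['2012-13', '2008-09', '2010-11'],
    'unit': ['GBP', 'Ratio', 'GBP'],
    'nominal': [2.08, 1.09, 1.20],
})
factors = {'2008-09': 1.11075, '2010-11': 1.02947, '2012-13': 1.}

# Boolean mask selecting the rows to convert (everything except 'Ratio').
keep = df['unit'] != 'Ratio'

# Assign only into the selected rows; 'real' is created as a new column.
df.loc[keep, 'real'] = df.loc[keep, 'nominal'] * df.loc[keep, 'year'].map(factors)
```

Rows excluded by the mask are left as NaN in the new column.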

Upvotes: 0

BrenBarn

Reputation: 251578

Use the map method of Series to look up in the dictionary:

df['nominal'] * df['year'].map(factors)
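As a minimal sketch, here is that expression applied to the sample rows from the question, assuming the result is assigned back as the new column real (the 'blah...' column is left out):

```python
import pandas as pd

# Sample rows from the question (the 'blah...' column is omitted).
df = pd.DataFrame({
    'code': ['T.rrr', 'J.rrr', '1320', '1310', '1130'],
    'year': ['2012-13', '2008-09', '2008-09', '2010-11', '2010-11'],
    'nominal': [2.08, 1.09, 1.38, 1.20, 1.22],
})
factors = {'2008-09': 1.11075, '2010-11': 1.02947, '2012-13': 1.}

# map() looks up each year in the dict and returns a Series of factors
# aligned with df's index; the multiplication is then element-wise.
df['real'] = df['nominal'] * df['year'].map(factors)
```

Any year that is not a key in factors maps to NaN, so real is NaN for that row. The original attempts fail because factors[df['year']] tries to use a whole Series as a single dictionary key. The SettingWithCopyWarning in the question usually means df was itself created by slicing another DataFrame; if so, calling .copy() when creating df should avoid it.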

Upvotes: 4
