Generate a new column based on values from another data frame

Question

I have a data frame with some personal information:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'person':range(5), 'birth_year':range(1980, 1985)})
    df

It looks like this:

        birth_year  person
    0       1980         0
    1       1981         1
    2       1982         2
    3       1983         3
    4       1984         4

I also have another data frame with yearly growth data:

    growth = pd.DataFrame({'year':range(1980,2000),'growth_rate':np.random.randn(20)})
    growth

It looks like this:

        growth_rate year
    0   -0.474861   1980
    1   -0.898530   1981
    2   -0.730102   1982
    3   -0.231560   1983
    4   -0.023014   1984
    ...

Now I want to add a new column to df holding the growth rate in the year each person turns ten: for person 0 that is the year 1990, for person 1 the year 1991, and so on. The growth rate values come from the data frame growth. The resulting data frame should look like this:

        birth_year  person         growth_10
    0       1980         0          value_1990
    1       1981         1          value_1991
    2       1982         2          value_1992
    3       1983         3          value_1993
    4       1984         4          value_1994

How can I manage this?

PS: The columns seem to be ordered alphabetically (birth_year before person, and growth_rate before year). I'm not sure how to fix this.

EdChum · Accepted Answer

You can call map on a temporary Series (birth_year + 10) and pass it a lookup Series built from your other df, growth, by setting its index to the column 'year' and selecting 'growth_rate'. map then looks up each year in that index:

In [3]:
df['growth_10'] = (df['birth_year'] + 10).map(growth.set_index('year')['growth_rate'])
df

Out[3]:
   birth_year  person  growth_10
0        1980       0   0.477596
1        1981       1   2.383193
2        1982       2  -1.121759
3        1983       3   0.573546
4        1984       4   1.195171
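For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same lookup. The seed, the variable names rng, rate_by_year and merged, and the explicit columns= ordering are my own choices for illustration, not part of the original snippets. The merge variant is an alternative technique shown only as a cross-check; it is not what the answer above uses.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)

# Passing columns= fixes the column order explicitly instead of relying on
# how the dict keys happen to be ordered.
df = pd.DataFrame(
    {"person": range(5), "birth_year": range(1980, 1985)},
    columns=["person", "birth_year"],
)
growth = pd.DataFrame(
    {"year": range(1980, 2000), "growth_rate": rng.standard_normal(20)},
    columns=["year", "growth_rate"],
)

# Series indexed by year: each year points at its growth rate.
rate_by_year = growth.set_index("year")["growth_rate"]

# Look up the growth rate for the year each person turns ten.
df["growth_10"] = (df["birth_year"] + 10).map(rate_by_year)

# Alternative (assumption: a left merge is acceptable here) for comparison.
merged = (
    df[["person", "birth_year"]]
    .assign(year=df["birth_year"] + 10)
    .merge(growth, on="year", how="left")
    .rename(columns={"growth_rate": "growth_10"})
    .drop(columns="year")
)

print(df)
```

If a target year is missing from growth, map leaves NaN in that row rather than raising, so it is worth checking df['growth_10'].isna() when the year ranges might not overlap.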
