Reputation: 3223
I am trying to generate annual data for a certain product when I have data for the base year and growth rate.
In the toy example, each product has different annual growth rate in efficiency by its 'color', and I want to generate yearly data until 2030.
Therefore, I have base year data (base_year) as follows:
year color shape efficiency
0 2018 red circle 50
1 2018 red square 30
2 2018 blue circle 100
3 2018 blue square 60
And each type of product's growth rate (growthrate) as:
color rate
0 red 30
1 blue 20
Results I desire is:
year color shape efficiency
0 2018 red circle 50
1 2018 red square 30
2 2018 blue circle 100
3 2018 blue square 60
4 2019 red circle 65
5 2019 red square 39
6 2019 blue circle 120
7 2019 blue square 72
8 2020 red circle 84.5
... (until 2030)
The data used in the toy code is..
base_year = pd.DataFrame(data = {'year': [2018,2018,2018,2018],
'color': ['red', 'red', 'blue', 'blue'],
'shape' : ['circle', 'square', 'circle', 'square'],
'efficiency' : [50, 30, 100, 60]}, columns = ['year', 'color', 'shape', 'efficiency'])
growthrate = pd.DataFrame(data = {'color': ['red', 'blue'],
'rate' : [30, 20]}, columns = ['color', 'rate'])
I've been trying some approach using .loc, but it seems such approach is quite inefficient.
Any suggestions or hints would be appreciated. Thank you in advance!
Upvotes: 1
Views: 98
Reputation: 18647
Here is one way to do this:
years = 2031 - 2018
df = (pd.concat([df.assign(year=df['year']+i,
efficiency=df['efficiency']*((df['rate']/100+1)**i))
for i, df in enumerate([base_year.merge(growthrate, on='color')] * years)])
.drop('rate', axis=1))
Upvotes: 2