Moritz Schwarz
Moritz Schwarz

Reputation: 2489

Efficiently apply a growth rate to an initial value for varying growth rates over many years

I have a large data set of future growth rates and an initial value for each individual.

I want to multiply the growth rate at t=0 with the initial value (t=0) to get the value at t=1. This t=1 value, I want to multiply with the growth rate at t=1 and so on.

I currently do this with a loop, but considering that I have a lot of individuals over a long time period, this is very slow and I feel there must be a smarter (and more elegant) way of doing this.

I would especially like to do this using dplyr functions.

#Create a sample df
df <- data.frame(id=rep(c("A","B"),each=5),
           year=rep(1:5,2),
           value = NA)

set.seed(123)
growth_rates <- data.frame(id=rep(c("A","B"),each=5),
                           year=rep(1:5,2),
                           value = runif(10,0.95,1.1))

# pick an initial value (in reality this is given)
df[df$year==1&df$id=="A","value"] <- 5
df[df$year==1&df$id=="B","value"] <- 7

for (i in 2:5){
  df[df$year == i,"value"] <- df[df$year == i-1,"value"]*growth_rates[growth_rates$year == i-1,"value"]
}

Thanks for your help!

Upvotes: 1

Views: 573

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388862

We can use accumulate from purrr after assigning the initial values in each group.

library(dplyr)
library(purrr)

initial_value <- c(5, 7)
growth_rates$ans[!duplicated(df$id)] <- initial_value


growth_rates %>%
   group_by(id) %>%
   mutate(ans = accumulate(value[-n()], `*`, .init = first(ans)))


#   id     year value   ans
#   <fct> <int> <dbl> <dbl>
# 1 A         1 0.993  5   
# 2 A         2 1.07   4.97
# 3 A         3 1.01   5.30
# 4 A         4 1.08   5.36
# 5 A         5 1.09   5.81
# 6 B         1 0.957  7   
# 7 B         2 1.03   6.70
# 8 B         3 1.08   6.89
# 9 B         4 1.03   7.47
#10 B         5 1.02   7.72

Upvotes: 2

Allan Cameron
Allan Cameron

Reputation: 173793

Here's a neat dplyr solution using a lag on the cumprod of growth_rates:

initial_vals <- c(5, 7)

growth_rates %>%
  group_by(id) %>% 
  mutate(cumulative = lag(cumprod(value), default = 1)) %>%
  mutate(value = cumulative * initial_vals[match(id, c("A", "B"))]) %>%
  select(-cumulative)
#> # A tibble: 10 x 3
#> # Groups:   id [2]
#>    id     year value
#>    <fct> <int> <dbl>
#>  1 A         1  5   
#>  2 A         2  4.97
#>  3 A         3  5.30
#>  4 A         4  5.36
#>  5 A         5  5.81
#>  6 B         1  7   
#>  7 B         2  6.70
#>  8 B         3  6.89
#>  9 B         4  7.47
#> 10 B         5  7.72

Upvotes: 3

Related Questions