Reputation: 33
I need to create a column in which each observation equals the value of the previous observation multiplied by an observation in another column plus 1. I'm trying to create the indx column in the example below. indx[1] is hard coded as 1.000, but indx[2] = indx[1]*(1+chng[2]).
I've been using mutate in the dplyr library to make new columns, but I don't see how I can reference the previous value of a column as I'm creating it.
Edit: Updated the below example to include to reflect that the data values for i and chng reset to 0 and 0.000 respectively after every 5 observations and that indx would need to reset to 1.000 when this happens as well and begin it's accumulation again.
Example data.table:
test <- data.frame(i = c(0,1,2,3,4,0,1,2,3,4)
,chng = c(.000,.031,.005,-.005,.017,.000,.012,.003,-.013,-.005,)
,indx = c(1,1.031,1.037,1.031,1.048,1,1.012,1.015,1.002,.997))
i chng indx
1: 0 0.000 1.000
2: 1 0.031 1.031
3: 2 0.005 1.037
4: 3 -0.005 1.031
5: 4 0.017 1.048
6: 0 0.000 1.000
7: 1 0.012 1.012
8: 2 0.003 1.015
9: 3 -0.013 1.002
10: 4 -0.005 0.997
Upvotes: 1
Views: 276
Reputation: 16121
As you mentioned, you need the previous value from a column/variable you're creating. This is a sequential process and one option is to use Reduce
(instead of a for loop):
test <- data.frame(i = c(0:4)
,chng = c(.000,.031,.005,-.005,.017))
test$indx = Reduce(function(x,y) x*(1+y), test$chng, accumulate = T, init = 1)[-1]
test
# i chng indx
# 1 0 0.000 1.000000
# 2 1 0.031 1.031000
# 3 2 0.005 1.036155
# 4 3 -0.005 1.030974
# 5 4 0.017 1.048501
For the case where i
resets you can use this:
test <- data.frame(i = c(0,1,2,3,4,0,1,2,3,4)
,chng = c(.000,.031,.005,-.005,.017,.000,.012,.003,-.013,-.005))
library(tidyverse)
test %>%
group_by(group = cumsum(i == 0)) %>% # create a group based on i column
mutate(indx = Reduce(function(x,y) x*(1+y), chng, accumulate = T, init = 1)[-1]) %>% # apply the Reduce function to each group
ungroup() %>% # forget the grouping
select(-group) %>% # remove group column
data.frame() # only for visualisation purposes (see the decimals)
# i chng indx
# 1 0 0.000 1.0000000
# 2 1 0.031 1.0310000
# 3 2 0.005 1.0361550
# 4 3 -0.005 1.0309742
# 5 4 0.017 1.0485008
# 6 0 0.000 1.0000000
# 7 1 0.012 1.0120000
# 8 2 0.003 1.0150360
# 9 3 -0.013 1.0018405
# 10 4 -0.005 0.9968313
Upvotes: 0
Reputation: 269396
Mathematically this is the same as cumprod(test$chng + 1)
:
test %>% mutate(indx = cumprod(chng + 1))
giving:
i chng indx
1 0 0.000 1.000000
2 1 0.031 1.031000
3 2 0.005 1.036155
4 3 -0.005 1.030974
5 4 0.017 1.048501
Regarding the updated question, create a grouping variable g
and apply the above by group:
test %>%
group_by(g = cumsum(i == 0)) %>%
mutate(indx = cumprod(chng + 1)) %>%
ungroup %>%
select(-g)
Upvotes: 2