Reputation: 1284
Say I have the following dataset dt
and a constant constant
.
dt <- structure(list(var1 = c(-92186.7470607738, -19163.5035325072,
-18178.8396858014, -9844.67882723287, -16494.7802822178, -17088.0576319257
), var2 = c(-3.12, NA, NA, NA, NA, NA)), class = c("data.table",
"data.frame"), row.names = c(NA, -6L))
constant <- 608383
print(dt)
var1 var2
1: -92186.747 -3.12
2: -19163.504 NA
3: -18178.840 NA
4: -9844.679 NA
5: -16494.780 NA
6: -17088.058 NA
The value of var2
depends on the value of the previous row, like so
for(i in 2:nrow(dt)){
prev <- dt[(i-1),]
dt[i, var2 := prev$var2 - var1/constant]
}
print(dt)
var1 var2
1: -92186.747 -3.120000
2: -19163.504 -3.088501
3: -18178.840 -3.058620
4: -9844.679 -3.042439
5: -16494.780 -3.015326
6: -17088.058 -2.987238
Though this for loop provides the desired output, it is very slow on a big dataset. Is there a faster way to achieve this output? Preferably using data.table
.
Upvotes: 4
Views: 524
Reputation: 26218
BaseR variant of above answer proposed by dear friend (which you can use easily in your data.table
syntax)
dt <- structure(list(var1 = c(-92186.7470607738, -19163.5035325072,
-18178.8396858014, -9844.67882723287, -16494.7802822178, -17088.0576319257
), var2 = c(-3.12, NA, NA, NA, NA, NA)), class = c("data.table",
"data.frame"), row.names = c(NA, -6L))
dt
#> var1 var2
#> 1 -92186.747 -3.12
#> 2 -19163.504 NA
#> 3 -18178.840 NA
#> 4 -9844.679 NA
#> 5 -16494.780 NA
#> 6 -17088.058 NA
constant <- 608383
Reduce(function(.x, .y) .x - .y /constant , dt$var1[-1], init = dt$var2[1], accumulate = T)
#> [1] -3.120000 -3.088501 -3.058620 -3.042439 -3.015326 -2.987238
Created on 2021-05-26 by the reprex package (v2.0.0)
Upvotes: 2
Reputation: 21908
Here is a solution using accumulate
function from purrr
package in case you were interested. In this solution .y
represents the current value of var1
that we would like to iterate over and .x
represents the accumulated value that we calculate and put in var2
column. As you might have noticed I excluded the first value of var1
as it we don't apply our formula on it.
library(dplyr)
library(purrr)
dt %>%
mutate(var2 = accumulate(var1[-1], .init = var2[1], ~ .x - .y /constant))
var1 var2
1: -92186.747 -3.120000
2: -19163.504 -3.088501
3: -18178.840 -3.058620
4: -9844.679 -3.042439
5: -16494.780 -3.015326
6: -17088.058 -2.987238
Upvotes: 4