Reputation: 965
I'm trying to calculate a new column in my data where each row uses the lag of the row before it with the first row using a value of 0. I'm trying to do this in tidyverse
but can't seem to get it sorted it.
Here is what the data looks like:
library(tidyverse)
sample <- 1:10
var <- c(10, 5, 16, 12, 13, 4, 17, 20, 10, 7)
df <- data.frame(sample, var)
My new column is called new_var
and is calculated as lag(new_var) + sample_i - target - k)
.
I've tried using the the accumulate()
function but I can't seem to get it right. Here's where I'm at:
target <- 11
k <- 2
df %>%
mutate(new_var = accumulate(.y = var,
.f = ~.x + .y - target - k,
.x = 0))
The expected output would be:
(-3, 3, 5,-1, 0, -9, 4, 7, -3, 6)
I worked it up in excel to show what it would look like:
To get to that conclusion, the equation in the 'actual new_var' column for sample 1 is as follows (using 0 as the first number in the initial equation):
O2 + K7 - $O$3 - $O$4
And each subsequent sample lags the 'new_var' column from there (keeping Target and K as constants and using the lag value as the first number in the equation):
O3 + K8 - $O$3 - $O$4
Upvotes: 0
Views: 468
Reputation: 388907
Probably, you were looking for :
library(dplyr)
library(purrr)
df %>%
mutate(new_var = accumulate(.x = var,
.f = ~.x + .y - target - k,
.init = 0)[-1])
# sample var new_var
#1 1 10 -3
#2 2 5 -11
#3 3 16 -8
#4 4 12 -9
#5 5 13 -9
#6 6 4 -18
#7 7 17 -14
#8 8 20 -7
#9 9 10 -10
#10 10 7 -16
which in base R, can be written as :
Reduce(function(x, y) x + y - target - k, df$var, accumulate = TRUE, init = 0)[-1]
#[1] -3 -11 -8 -9 -9 -18 -14 -7 -10 -16
Upvotes: 3