Reputation: 149
I have a quite big dataframe and I'm trying to add a new variable which is the sum of the three previous rows on a running basis, also it should be grouped by ID. The first three rows per ID should be 0. Here's what it should look like.
ID Var1 VarNew
1 2 0
1 2 0
1 3 0
1 0 7
1 4 5
1 1 7
Here's an example dataframe
ID <- c(1, 1, 1, 1, 1, 1)
Var1 <- c(2, 2, 3, 0, 4, 1)
df <- data.frame(ID, Var1)
Upvotes: 2
Views: 2196
Reputation: 26218
Library runner
also helps
library(runner)
df %>% mutate(var_new = sum_run(Var1, k =3, na_pad = T, lag = 1))
ID Var1 var_new
1 1 2 NA
2 1 2 NA
3 1 3 NA
4 1 0 7
5 1 4 5
6 1 1 7
NA
s can be mutated to 0
if desired so, easily.
Upvotes: 0
Reputation: 39657
You can use filter
in ave
.
df$VarNew <- ave(df$Var1, df$ID, FUN=function(x) c(0, 0, 0,
filter(head(df$Var1, -1), c(1,1,1), side=1)[-1:-2]))
df
# ID Var1 VarNew
#1 1 2 0
#2 1 2 0
#3 1 3 0
#4 1 0 7
#5 1 4 5
#6 1 1 7
or using cumsum
in combination with head
and tail
.
df$VarNew <- ave(df$Var1, df$ID, FUN=function(x) {y <- cumsum(x)
c(0, 0, 0, tail(y, -3) - head(y, -3))})
Upvotes: 0
Reputation: 388862
You can use any of the package that has rolling calculation function with a window size of 3 and lag
the result. For example with zoo::rollsumr
.
library(dplyr)
df %>%
group_by(ID) %>%
mutate(VarNew = lag(zoo::rollsumr(Var1, 3, fill = 0), default = 0)) %>%
ungroup
# ID Var1 VarNew
# <dbl> <dbl> <dbl>
#1 1 2 0
#2 1 2 0
33 1 3 0
#4 1 0 7
#5 1 4 5
#6 1 1 7
Upvotes: 4