Reputation: 860
I would like to do some calculations using frollaply() or rollapplyr() with a conditional factor.
I have the following data
df <- tibble(w = c(NA, NA, "c1", NA, NA, "c2", NA, NA, "c3", NA, NA, "c4"),
x = 1:12, y = x * 2) %>%
as.data.table()
Using data.table I generate the following result.
df[, sumx := frollapply(x, 3, FUN = sum)]
w | x | y | sumx |
---|---|---|---|
1 | 2 | NA | |
2 | 4 | NA | |
c1 | 3 | 6 | 6 |
4 | 8 | 9 | |
5 | 10 | 12 | |
c2 | 6 | 12 | 15 |
7 | 14 | 18 | |
8 | 16 | 21 | |
c3 | 9 | 18 | 24 |
10 | 20 | 27 | |
11 | 22 | 30 | |
c4 | 12 | 24 | 33 |
I like this result. Although I would to do something more complicated.
First: I would like let this output more clean, like this:
w | x | y | sumx |
---|---|---|---|
1 | 2 | NA | |
2 | 4 | NA | |
c1 | 3 | 6 | 6 |
4 | 8 | NA | |
5 | 10 | NA | |
c2 | 6 | 12 | 15 |
7 | 14 | NA | |
8 | 16 | NA | |
c3 | 9 | 18 | 24 |
10 | 20 | NA | |
11 | 22 | NA | |
c4 | 12 | 24 | 33 |
Second: I would like create an another variable, for example "sumx2", where the values of the line "c1" is the sum (OBS: not just sum, could be mean or count of a specific value) of all 4 or 5 or n values of variable "x" above (OBS: If not have 4 or 5 or n values above, this absent values has to be understand as NA). The correspondent lines "c2" and "c3" following the same idea. In this way the output expected would be:
w | x | y | sumx | sumx2 |
---|---|---|---|---|
1 | 2 | NA | NA | |
2 | 4 | NA | NA | |
c1 | 3 | 6 | 6 | 6 |
4 | 8 | NA | NA | |
5 | 10 | NA | NA | |
c2 | 6 | 12 | 15 | 18 |
7 | 14 | NA | NA | |
8 | 16 | NA | NA | |
c3 | 9 | 18 | 24 | 30 |
10 | 20 | NA | NA | |
11 | 22 | NA | NA | |
c4 | 12 | 24 | 33 | 42 |
Your help is appreciated!
Upvotes: 0
Views: 292
Reputation: 2419
Check this
library(data.table)
dt <- data.table(w = c(NA, NA, "c1", NA, NA, "c2", NA, NA, "c3", NA, NA, "c4"),
x = 1:12)
dt[,id:=rleidv(x)]
#dt[,sumx := ifelse(is.na(w),NA,frollapply(x,3,sum))]
dt[,sumx := fcase(!is.na(w),frollapply(x,3,sum))]
dt[,sumx2 := fcase(!is.na(w) & id == 3, frollapply(x, n = 3, sum),
!is.na(w) & id >= 4, frollapply(x, n = 4, sum))
]
dt[,id:=NULL]
Result:
dt
w x sumx sumx2
1: <NA> 1 NA NA
2: <NA> 2 NA NA
3: c1 3 6 6
4: <NA> 4 NA NA
5: <NA> 5 NA NA
6: c2 6 15 18
7: <NA> 7 NA NA
8: <NA> 8 NA NA
9: c3 9 24 30
10: <NA> 10 NA NA
11: <NA> 11 NA NA
12: c4 12 33 42
Upvotes: 1
Reputation: 8880
if I understood everything correctly
library(tibble)
df <- tibble(w = c(NA, NA, "c1", NA, NA, "c2", NA, NA, "c3", NA, NA, "c4"),
x = 1:12, y = x * 2)
library(data.table)
setDT(df)
nm_cols <- c("sumX", "sumx2")
df[, (nm_cols) := list(
ifelse(is.na(w), NA, zoo::rollapplyr(x, width = 3, FUN = function(x) sum(x), partial = T)),
ifelse(is.na(w), NA, zoo::rollapplyr(x, width = 4, FUN = function(x) sum(x), partial = T))
)]
df
#> w x y sumX sumx2
#> 1: <NA> 1 2 NA NA
#> 2: <NA> 2 4 NA NA
#> 3: c1 3 6 6 6
#> 4: <NA> 4 8 NA NA
#> 5: <NA> 5 10 NA NA
#> 6: c2 6 12 15 18
#> 7: <NA> 7 14 NA NA
#> 8: <NA> 8 16 NA NA
#> 9: c3 9 18 24 30
#> 10: <NA> 10 20 NA NA
#> 11: <NA> 11 22 NA NA
#> 12: c4 12 24 33 42
Created on 2021-03-21 by the reprex package (v1.0.0)
Upvotes: 1