Reputation: 2043
In the below reproducible code, I would like to add a column for CountIfs
as detailed in the below image, whereby the Excel countifs()
formula in the image has multiple conditions with the tops of the specified ranges "anchored". Any recommendations for how to do the same in dplyr? I'm sure it requires grouping but unsure of how to handle the multiple conditions. The blue below shows the current reproducible code output, the yellow shows what I would like to add, and the non-highlighted shows the underlying countifs()
formula.
Reproducible code:
library(dplyr)
myData <-
data.frame(
Name = c("R","R","T","R","N","N","T"),
Group = c(0,0,0,0,1,1,0),
Code = c(0,2,2,0,3,0,4)
)
myDataRender <- myData %>%
group_by(Group) %>%
mutate(CodeGrp = if_else(Group == 0, 0, max(Code)))
print.data.frame(myDataRender)
Posting alternative dataframe and Excel countifs() example:
> myData1
Name Group Code1 Code2
1 B 0 0 0
2 R 1 1 0
3 R 1 1 1
4 R 2 3 0
5 R 2 3 1
6 B 0 0 0
7 A 3 1 0
8 A 3 1 1
9 A 0 0 0
10 A 0 0 0
11 A 0 0 0
myData1 <-
data.frame(
Name = c("B","R","R","R","R","B","A","A","A","A","A"),
Group = c(0,1,1,2,2,0,3,3,0,0,0),
Code1 = c(0,1,1,3,3,0,1,1,0,0,0),
Code2 = c(0,0,1,0,1,0,0,1,0,0,0)
)
Desired output rendered in XLS:
Small change to Rui Barrada's solution to make his for-loop code exactly reflect the COUNTIFS()
formula in the XLS example:
fun_aux <- function(x,y) {
out <- integer(length(x))
for(i in seq_along(x)) {
cond1 <- y[1:i] > 0
cond2 <- x[1:i] == x[i] # the y in this line in Rui's code is changed to x (for x[i]) to correspond with XLS countifs() example
out[i] <- sum(cond1*cond2)
}
out
}
Equivalent of above for-loop function using base sapply():
mutate(CountIfs = sapply(1:n(), function(x) sum(Code2[1:x] > 0 & Code1[1:x] == Code1[x])))
Upvotes: 1
Views: 107
Reputation: 76402
Try the following.
Vectorized code like what dplyr
or the tidyverse
uses is not suited for rolling functions, where the current result depends on the previous ones. So I have written an auxiliary function fun_aux
to take care of computing the expected result.
suppressPackageStartupMessages(
library(dplyr)
)
myData1 <-
data.frame(
Name = c("B","R","R","R","R","B","A","A","A","A","A"),
Group = c(0,1,1,2,2,0,3,3,0,0,0),
Code1 = c(0,1,1,3,3,0,1,1,0,0,0),
Code2 = c(0,0,1,0,1,0,0,1,0,0,0)
)
fun_aux <- function(x, y) {
out <- integer(length(x))
for(i in seq_along(x)) {
cond1 <- y[1:i] > 0
cond2 <- x[1:i] == y[i]
out[i] <- sum(cond1*cond2)
}
out
}
myDataRender <- myData1 %>%
mutate(CountIfs = fun_aux(Code1, Code2))
print.data.frame(myDataRender)
#> Name Group Code1 Code2 CountIfs
#> 1 B 0 0 0 0
#> 2 R 1 1 0 0
#> 3 R 1 1 1 1
#> 4 R 2 3 0 0
#> 5 R 2 3 1 1
#> 6 B 0 0 0 0
#> 7 A 3 1 0 0
#> 8 A 3 1 1 2
#> 9 A 0 0 0 0
#> 10 A 0 0 0 0
#> 11 A 0 0 0 0
Created on 2022-08-21 by the reprex package (v2.0.1)
Upvotes: 2