Reputation: 435
What is the correct way to perform an inline conditional check for a filter which ignores a NULL input argument?
I've recently been taught about the clean method for inline conditional filtering with dplyr. I'm now interested in applying that to a function where one or more inputs may be NULL. If an argument is provided, the function should filter on it; if it is NULL, it should not. In this case, data will just be iris %>% tibble(). In the past, I would do this in an unwieldy manner:
# Note: %<>% is the assignment pipe from magrittr, so library(magrittr) is needed
testfun <- function(data, range = NULL, spec = NULL){
  if(!is.null(range)) {
    data %<>% filter(between(Petal.Length, range[1], range[2]))
  }
  if(!is.null(spec)) {
    data %<>% filter(Species %in% spec)
  }
  return(data)
}
My attempt at inline conditional checks looks like this:
testfun <- function(data, range = NULL, spec = NULL){
  data %>%
    filter(
      if(!is.null(range)) {between(Petal.Length, range[1], range[2])},
      if(!is.null(spec)) {Species %in% spec},
    )
}
This works as long as I provide inputs for both range and spec. However, if I leave either of them NULL, I get an error message such as:
Error in 'filter()':
ℹ In argument: 'if (...) NULL'.
Caused by error:
! '..2' must be of size 150 or 1, not size 0.
Upvotes: 1
Views: 84
Reputation: 435
User lroha commented on the post with what I believe to be the correct answer, but they have not posted it as an answer.
You can't pass NULL to filter(), so just add an ... else TRUE to your condition statements.
So, instead of:
testfun <- function(data, range = NULL, spec = NULL){
  data %>%
    filter(
      if(!is.null(range)) {between(Petal.Length, range[1], range[2])},
      if(!is.null(spec)) {Species %in% spec},
    )
}
It should be:
testfun <- function(data, range = NULL, spec = NULL){
  data %>%
    filter(
      if(!is.null(range)) {between(Petal.Length, range[1], range[2])} else TRUE,
      if(!is.null(spec)) {Species %in% spec} else TRUE,
    )
}
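For context on why the original version fails: in R, an if with no else branch evaluates to NULL when its condition is FALSE, and that NULL is what ends up being passed to filter(). A quick check of both points, assuming dplyr is installed and using the built-in iris data, might look like this:

```r
library(dplyr)

# An `if` without `else` evaluates to NULL when the condition is FALSE
no_else <- if (FALSE) 1
is.null(no_else)

# With `else TRUE`, a skipped condition becomes a length-1 TRUE,
# which filter() recycles to every row
testfun <- function(data, range = NULL, spec = NULL) {
  data %>%
    filter(
      if (!is.null(range)) between(Petal.Length, range[1], range[2]) else TRUE,
      if (!is.null(spec)) Species %in% spec else TRUE
    )
}

dat <- as_tibble(iris)
n_all    <- nrow(testfun(dat))                    # no filtering applied
n_setosa <- nrow(testfun(dat, spec = "setosa"))   # only the spec filter
n_range  <- nrow(testfun(dat, range = c(1, 2)))   # only the range filter
```

Calling the function with no optional arguments should return all rows of iris, since both conditions collapse to TRUE.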
Upvotes: 0
Reputation: 269421
Set the defaults in the argument list to values that would cause the filter expressions to evaluate to TRUE:
testfun2 <- function(data, range = c(-Inf, Inf), spec = data$Species) {
  data %>%
    filter(
      between(Petal.Length, range[1], range[2]),
      Species %in% spec
    )
}
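The spec = data$Species default relies on R evaluating default arguments lazily, inside the function's own environment, so a default can refer to another argument. A small base-R sketch of that behaviour, using a hypothetical function show_default that is not part of the answer:

```r
# Hypothetical helper: the default for `keep` refers to another argument, `x`.
# Defaults are evaluated lazily inside the function, so `x` is already
# bound by the time `keep` is first used.
show_default <- function(x, keep = unique(x)) {
  x[x %in% keep]
}

all_kept  <- show_default(c("a", "b", "a"))              # default keeps everything
some_kept <- show_default(c("a", "b", "a"), keep = "a")  # explicit value filters
```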
or keep them as NULL in the argument list but then reset them in the code:
testfun3 <- function(data, range = NULL, spec = NULL) {
  # %||% returns its right-hand side when the left-hand side is NULL
  # (provided by rlang, and by base R from version 4.4.0)
  range <- range %||% c(-Inf, Inf)
  spec <- spec %||% data$Species
  data %>%
    filter(
      between(Petal.Length, range[1], range[2]),
      Species %in% spec
    )
}
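The %||% operator used here comes from rlang (base R also provides it from version 4.4.0). To illustrate what it does without depending on either, the sketch below defines an equivalent operator locally; the local definition is an assumption made for the example, not the package source:

```r
# Local stand-in for the null-default operator, defined here only so the
# sketch runs without extra packages
`%||%` <- function(x, y) if (is.null(x)) y else x

range_default <- NULL %||% c(-Inf, Inf)     # NULL on the left: fall back
range_given   <- c(1, 2) %||% c(-Inf, Inf)  # value on the left: keep it

# A range of c(-Inf, Inf) matches every finite value, so it filters nothing
all_inside <- all(c(1.4, 4.7, 6.9) >= range_default[1] &
                  c(1.4, 4.7, 6.9) <= range_default[2])
```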
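Another possibility is to incorporate the check in the conditions themselves, using NA rather than NULL as the "not supplied" default:
testfun4 <- function(data, range = NA, spec = NA) {
  data %>%
    filter(
      # all() collapses the check to a single TRUE/FALSE, which avoids
      # recycling a multi-element is.na() result against the rows
      all(is.na(range)) | between(Petal.Length, range[1], range[2]),
      all(is.na(spec)) | Species %in% spec
    )
}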
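This approach works because | is vectorised: a single TRUE on the left is recycled across the right-hand side, and TRUE | NA is TRUE. A minimal base-R illustration with made-up logical vectors:

```r
# When the argument is left as NA, the check is a length-1 TRUE,
# so every element of the result is TRUE regardless of the right side
skip     <- is.na(NA)
kept_all <- skip | c(TRUE, FALSE, NA)

# When a value is supplied, the left side is FALSE and the right side decides
kept_some <- FALSE | c(TRUE, FALSE, TRUE)
```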
Upvotes: 2