Reputation: 12112
When using the pipe operator %>% with packages such as dplyr, ggvis, dycharts, etc., how do I do a step conditionally? For example:
step_1 %>%
step_2 %>%
if(condition)
step_3
These approaches don't seem to work:
step_1 %>%
step_2
if(condition) %>% step_3
step_1 %>%
step_2 %>%
if(condition) step_3
There is a long way:
if(condition)
{
step_1 %>%
step_2 %>%
step_3
}else{
step_1 %>%
step_2
}
Is there a better way without all the redundancy?
Upvotes: 148
Views: 65700
Reputation: 45
I know this is an old thread, but none of the suggested solutions quite satisfied me, so I wrote a helper function that reads as if it were a native language construct.
Examples:
global_var = FALSE
data %>% ifelse_(is.numeric(voter), select(voter), select(sex)) %>% head() # voter column
data %>% ifelse_(global_var, select(voter), select(sex)) %>% head() # sex column
data %>% ifelse_(is.data.frame(data), select(voter)) %>% head() # voter column
data %>% ifelse_(FALSE, select(voter)) %>% head() # complete data frame
data %>% if_(TRUE, select(voter)) %>% head() # voter column
data %>% if_(FALSE, select(voter)) %>% head() # complete data frame
Usage:
pipe_left %>% ifelse_(condition, if_true, if_false) %>% pipe_right
Short form and alias:
pipe_left %>% ifelse_(condition, if_true) %>% pipe_right
pipe_left %>% if_(condition, if_true) %>% pipe_right
if_true and if_false can be any expression that would naturally appear at the current pipe position. If the condition is false and if_false is not provided, pipe_left is passed on instead.
The condition can include names of the current pipe data.
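To illustrate how a condition can refer to column names, here is a minimal sketch of the data-mask evaluation that the helper relies on. The toy data frame `df` is made up for the example; `eval_tidy()` is the rlang function used in the source below.

```r
library(rlang)

# Toy data; the column name `voter` mirrors the examples above.
df <- data.frame(voter = c(1, 0, 1), sex = c("f", "m", "f"))

# Capture the condition unevaluated, then evaluate it with the data
# frame as a data mask, so `voter` resolves to the column.
cond <- quote(is.numeric(voter))
cond_value <- eval_tidy(cond, data = df)
cond_value
#> [1] TRUE
```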
Source (requires rlang, plus dplyr or magrittr):
ifelse_ = function(data,c,a,b){
ce = enexpr(c)
if(eval_tidy(ce,data = data)){
e = enexpr(a)
} else {
if(missing(b)){
return(data)
} else {
e = enexpr(b)
}
}
u = expr(`%>%`((.),!!e))
data %>% {eval_tidy(u)}
}
Upvotes: 1
Reputation: 4639
Edit: purrr::when() is deprecated as of {purrr} version 1.0.0.
I think that's a case for purrr::when(). Let's sum up a few numbers if their sum is below 25, otherwise return 0.
library("magrittr")
1:3 %>%
purrr::when(sum(.) < 25 ~ sum(.), ~0)
#> [1] 6
when returns the value resulting from the action of the first valid condition. Put the condition to the left of ~, and the action to the right of it. Above, we only used one condition (and then an else case), but you can have many conditions.
You can easily integrate that into a longer pipe.
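As a sketch of the longer-pipe case that does not depend on the deprecated purrr::when(), the same condition can be written as a braced if ... else inside a magrittr chain. The follow-on steps (sqrt, round) are only illustrative and are not part of the original answer.

```r
library(magrittr)

# Sum the numbers if the sum is below 25, otherwise return 0,
# then keep piping the result into further steps.
result <- 1:3 %>%
  { if (sum(.) < 25) sum(.) else 0 } %>%
  sqrt() %>%
  round(2)

result
#> [1] 2.45
```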
Upvotes: 48
Reputation: 1695
A possible solution is to use an anonymous function:
library(magrittr)
1 %>%
(\(.) if (T) . + 1 else .) %>%
multiply_by(2)
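The same idea should also work with the base R pipe (|>, R 4.1 or later) with no package loaded. This is a sketch rather than part of the answer above: the anonymous function has to be called explicitly with (), and the names `condition` and `x` are chosen here only for readability.

```r
# Conditional step with the native pipe (requires R >= 4.1).
condition <- TRUE

result <- 1 |>
  (\(x) if (condition) x + 1 else x)() |>
  prod(2)

result
#> [1] 4
```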
Upvotes: 3
Reputation: 47300
I like purrr::when, and the other base solutions provided here are all great, but I wanted something more compact and flexible, so I designed the function pif (pipe if); see code and doc at the end of the answer.
Arguments can be either expressions or functions (formula notation is supported), and the input is returned unchanged by default if the condition is FALSE.
Used on examples from other answers:
## from Ben Bolker
data.frame(a=1:2) %>%
mutate(b=a^2) %>%
pif(~b[1]>1, ~mutate(.,b=b^2)) %>%
mutate(b=b^2)
# a b
# 1 1 1
# 2 2 16
## from Lorenz Walthert
1:3 %>% pif(sum(.) < 25,sum,0)
# [1] 6
## from clbieganek
1 %>% pif(TRUE,~. + 1) %>% `*`(2)
# [1] 4
# from theforestecologist
1 %>% `+`(1) %>% pif(TRUE ,~ .+1)
# [1] 3
Other examples :
## using functions
iris %>% pif(is.data.frame, dim, nrow)
# [1] 150 5
## using formulas
iris %>% pif(~is.numeric(Species),
~"numeric :)",
~paste(class(Species)[1],":("))
# [1] "factor :("
## using expressions
iris %>% pif(nrow(.) > 2, head(.,2))
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1 5.1 3.5 1.4 0.2 setosa
# 2 4.9 3.0 1.4 0.2 setosa
## careful with expressions
iris %>% pif(TRUE, dim, warning("this will be evaluated"))
# [1] 150 5
# Warning message:
# In inherits(false, "formula") : this will be evaluated
iris %>% pif(TRUE, dim, ~warning("this won't be evaluated"))
# [1] 150 5
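The warning in the "careful with expressions" case comes from R's lazy evaluation: an argument is evaluated the first time the function body inspects it, even if its value is then discarded. A base-R sketch of that mechanism follows; `check_then_ignore` is a hypothetical stand-in for the `inherits()` check inside pif, not the real function.

```r
# Hypothetical stand-in for the check that pif() performs on `false`.
check_then_ignore <- function(x, false) {
  inherits(false, "formula")  # inspecting the argument forces it
  x
}

# A plain expression is evaluated as soon as it is inspected.
forced <- FALSE
check_then_ignore(1, { forced <- TRUE; 0 })
forced
#> [1] TRUE

# Wrapped in a formula, the body stays unevaluated.
wrapped <- FALSE
check_then_ignore(1, ~{ wrapped <- TRUE; 0 })
wrapped
#> [1] FALSE
```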
Function
#' Pipe friendly conditional operation
#'
#' Apply a transformation on the data only if a condition is met,
#' by default if condition is not met the input is returned unchanged.
#'
#' The use of formula or functions is recommended over the use of expressions
#' for the following reasons :
#'
#' \itemize{
#' \item If \code{true} and/or \code{false} are provided as expressions they
#' will be evaluated whether the condition is \code{TRUE} or \code{FALSE}.
#' Functions or formulas on the other hand will be applied on the data only if
#' the relevant condition is met
#' \item Formulas support calling directly a column of the data by its name
#' without \code{x$foo} notation.
#' \item Dot notation will work in expressions only if `pif` is used in a pipe
#' chain
#' }
#'
#' @param x An object
#' @param p A predicate function, a formula describing such a predicate function, or an expression.
#' @param true,false Functions to apply to the data, formulas describing such functions, or expressions.
#'
#' @return The output of \code{true} or \code{false}, either as expressions or applied on data as functions
#' @export
#'
#' @examples
#'# using functions
#'pif(iris, is.data.frame, dim, nrow)
#'# using formulas
#'pif(iris, ~is.numeric(Species), ~"numeric :)",~paste(class(Species)[1],":("))
#'# using expressions
#'pif(iris, nrow(iris) > 2, head(iris,2))
#'# careful with expressions
#'pif(iris, TRUE, dim, warning("this will be evaluated"))
#'pif(iris, TRUE, dim, ~warning("this won't be evaluated"))
pif <- function(x, p, true, false = identity){
if(!requireNamespace("purrr"))
stop("Package 'purrr' needs to be installed to use function 'pif'")
if(inherits(p, "formula"))
p <- purrr::as_mapper(
if(!is.list(x)) p else update(p,~with(...,.)))
if(inherits(true, "formula"))
true <- purrr::as_mapper(
if(!is.list(x)) true else update(true,~with(...,.)))
if(inherits(false, "formula"))
false <- purrr::as_mapper(
if(!is.list(x)) false else update(false,~with(...,.)))
if ( (is.function(p) && p(x)) || (!is.function(p) && p)){
if(is.function(true)) true(x) else true
} else {
if(is.function(false)) false(x) else false
}
}
Upvotes: 13
Reputation: 12664
Here is a quick example that takes advantage of the . and ifelse:
library(magrittr)
X <- 1
Y <- TRUE
X %>% add(1) %>% { ifelse(Y, add(., 1), .) }
In the ifelse, if Y is TRUE it will add 1; otherwise it will just return the last value of X. The . is a stand-in which tells the function where the output from the previous step of the chain goes, so I can use it on both branches.
Edit
As @BenBolker pointed out, you might not want ifelse, so here is an if version.
X %>%
add(1) %>%
{if(Y) add(.,1) else .}
Thanks to @Frank for pointing out that I should use { braces around my if and ifelse statements to continue the chain.
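One reason to prefer if over ifelse in a pipe: ifelse returns a value with the same length as its test, so a length-one condition truncates a longer object. A small base-R illustration (the vector `v` and the name `flag` are made up for the example):

```r
v <- c(10, 20, 30)
flag <- TRUE

# ifelse() is vectorised over the test, which has length 1,
# so only the first element of the "yes" value comes back.
from_ifelse <- ifelse(flag, v + 1, v)
from_ifelse
#> [1] 11

# if ... else returns the chosen branch whole.
from_if <- if (flag) v + 1 else v
from_if
#> [1] 11 21 31
```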
Upvotes: 173
Reputation: 7664
Here is a variation on the answer provided by @JohnPaul. This variation uses the `if` function instead of a compound if ... else ... statement.
library(magrittr)
X <- 1
Y <- TRUE
X %>% `if`(Y, . + 1, .) %>% multiply_by(2)
# [1] 4
Note that in this case the curly braces are not needed around the `if` function, nor around an ifelse function; they are needed only around the if ... else ... statement. However, if the dot placeholder appears only in a nested function call, then magrittr will by default pipe the left hand side into the first argument of the right hand side. This behavior is overridden by enclosing the expression in curly braces. Note the difference between these two chains:
X %>% `if`(Y, . + 1, . + 2)
# [1] TRUE
X %>% {`if`(Y, . + 1, . + 2)}
# [1] 4
The dot placeholder is nested within a function call both times it appears in the `if` function, since . + 1 and . + 2 are interpreted as `+`(., 1) and `+`(., 2), respectively. So, the first expression is returning the result of `if`(1, TRUE, 1 + 1, 1 + 2) (oddly enough, `if` doesn't complain about extra unused arguments), and the second expression is returning the result of `if`(TRUE, 1 + 1, 1 + 2), which is the desired behavior in this case.
For more information on how the magrittr pipe operator treats the dot placeholder, see the help file for %>%, in particular the section on "Using the dot for secondary purposes".
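The first-argument rule can be seen in isolation with a simpler call than `if`. This is a minimal sketch using c(), chosen only because its output makes the extra argument visible:

```r
library(magrittr)

# The dot appears only inside a nested call (`. + 1`), so magrittr
# still passes the left-hand side as the first argument: c(1, 1 + 1).
nested <- 1 %>% c(. + 1)
nested
#> [1] 1 2

# Braces switch off first-argument injection: c(1 + 1).
braced <- 1 %>% { c(. + 1) }
braced
#> [1] 2
```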
Upvotes: 24
Reputation: 226162
It would seem easiest to me to back off from the pipes a little tiny bit (although I would be interested in seeing other solutions), e.g.:
library("dplyr")
z <- data.frame(a=1:2)
z %>% mutate(b=a^2) -> z2
if (z2$b[1]>1) {
z2 %>% mutate(b=b^2) -> z2
}
z2 %>% mutate(b=b^2) -> z3
This is a slight modification of @JohnPaul's answer (you might not really want ifelse, which is vectorized and may evaluate both of its arguments). It would be nice to modify this to return . automatically if the condition is false ...
(caution: I think this works but haven't really tested/thought about it too much ...)
iff <- function(cond,x,y) {
if(cond) return(x) else return(y)
}
z %>% mutate(b=a^2) %>%
iff(cond=z2$b[1]>1,mutate(.,b=b^2),.) %>%
mutate(b=b^2) -> z4
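One way to get the "return . automatically" behaviour is to pass the step as a function instead of an already-evaluated value. The sketch below is hypothetical: the name `step_if` and its arguments do not come from any package, and base R's transform() stands in for mutate() so that only magrittr is needed. The condition is still evaluated outside the pipe, as in iff above.

```r
library(magrittr)

# Hypothetical helper: apply `f` to `x` only when `cond` is TRUE,
# otherwise return `x` unchanged.
step_if <- function(x, cond, f, ...) {
  if (isTRUE(cond)) f(x, ...) else x
}

z <- data.frame(a = 1:2)

z4 <- z %>%
  transform(b = a^2) %>%
  step_if(nrow(z) > 1, function(d) transform(d, b = b^2)) %>%
  transform(b = b^2)

z4
```

Because the step is a function, it is only run when the condition holds, and no else branch has to be spelled out.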
Upvotes: 17