Reputation: 1761
In what practical programming situations or R "idioms" would you only want to check the first element of each of two vectors for logical comparison? (I.e., disregarding the rest of each vector, as && and || do.)
I can see the use of & and | in R, where they do element-wise logical comparison of two vectors. But I cannot see a real-life practical use of their sibling operators && and ||. Can anyone provide a clear example of their use?
The documentation, help("&&"), says:
The longer form evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in if clauses.
The issue for me is the following: I interpret the documentation of && and || to say that for logical vectors x and y, the && and || operators only use x[1] and y[1] to provide a result.
> c(TRUE, FALSE, FALSE) && c(TRUE, FALSE)
[1] TRUE
> c(TRUE, FALSE, FALSE) && c(FALSE, FALSE)
[1] FALSE
> c(FALSE, FALSE, FALSE) && c(TRUE, FALSE)
[1] FALSE
> c(FALSE, FALSE, FALSE) && c(FALSE, FALSE)
[1] FALSE
I don't see any "programming control-flow" situations where I would have two logical vectors and I would disregard any values past the first element of each.
It seems that x && y acts like x[1] & y[1], and x || y acts like x[1] | y[1].
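A caveat for anyone re-running these comparisons today: as far as I know, R 4.2.0 began warning when either operand of && or || has length greater than one, and R 4.3.0 made it an error, so the "first element only" behavior applies to older versions. Below is a minimal sketch for checking what your own version does, followed by scalar forms that state the intent explicitly. The helper name try_and is made up for illustration.

```r
# Hypothetical helper: report what `&&` does with operands of length > 1
# on the running R version, instead of assuming a particular result.
try_and <- function(x, y) {
  tryCatch(
    x && y,
    warning = function(w) paste("warning:", conditionMessage(w)),
    error   = function(e) paste("error:", conditionMessage(e))
  )
}

x <- c(TRUE, FALSE, FALSE)
y <- c(TRUE, FALSE)

print(try_and(x, y))  # outcome depends on the R version

# Explicit scalar forms that do not depend on the version:
first_both <- x[1] && y[1]      # only the first elements
any_both   <- any(x) && any(y)  # at least one TRUE in each vector
all_both   <- all(x) && all(y)  # every element TRUE in both vectors
print(c(first_both, any_both, all_both))
```

Reducing each vector with x[1], any() or all() before applying && keeps the operands at length one and makes it obvious which elements the condition actually depends on.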
Here's a test function that evaluates how often these formulations return the same result using randomly generated logical vectors of different lengths. This suggests that they are doing the same thing.
test <- function( n, maxl=10 ) {
  foo <- lapply( X=seq_len( n ), FUN=function(i) {
    x <- runif( n=sample( size=1, maxl ) ) > 0.5
    y <- runif( n=sample( size=1, maxl ) ) > 0.5
    sameres <- all.equal( (x||y), (x[1]|y[1]) )
    sameres
  } )
  table( unlist( foo ) )
}
test( 10000 )
Yields:
TRUE
10000
Here's a benchmark of which is faster. It starts by creating a list of lists, where each of the N items in dat is a list containing two randomly generated logical vectors. Then we apply each of the variants to the same data to see which is faster.
library(rbenchmark)
N <- 100
maxl <- 10
dat <- lapply( X=seq_len(N), FUN=function(i) {
  list( runif( n=sample( size=1, maxl ) ) > 0.5,
        runif( n=sample( size=1, maxl ) ) > 0.5) } )
benchmark(
  columns=c("test","replications","relative"),
  lapply(dat, function(L){ L[[1]] || L[[2]] } ),
  lapply(dat, function(L){ L[[1]][1] | L[[2]][1] } )
)
Yields the following output (removed the \n characters and extra whitespace):
test replications relative
2 lapply(dat, function(L) { L[[1]][1] | L[[2]][1] }) 100 1.727
1 lapply(dat, function(L) { L[[1]] || L[[2]] }) 100 1.000
Clearly, the || formulation is faster than cherry-picking the first element of each argument. But I'm still curious as to why one would need such an operator.
Upvotes: 0
Views: 787
Reputation: 10825
I guess that there are a couple of reasons, but probably the most important one is the short-circuit behavior. If a evaluates to FALSE in a && b, then b is not evaluated. Similarly, if a evaluates to TRUE in a || b, then b is not evaluated. This allows writing code like:
v <- list(1, 2, 3, 4, 5)
idx <- 6
# foo and bar stand in for arbitrary code
if (idx <= length(v) && v[[idx]] == 5) {
  foo
} else {
  bar
}
Otherwise one needs to write this (maybe) as
if (idx <= length(v)) {
  if (v[[idx]] == 5) {
    foo
  } else {
    bar
  }
} else {
  bar
}
which is 1) much less readable, and 2) repeats bar, which is bad if bar is a bigger piece of code.
You cannot use & in the if condition, because & always evaluates both operands: v[[idx]] is evaluated even when idx is out of bounds, and that is an error for lists in R:
if (idx <= length(v) & v[[idx]] == 5) {
  foo
} else {
  bar
}
# Error in v[[idx]] : subscript out of bounds
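To compare the two operators side by side without stopping the script, here is a small self-contained sketch. It reuses v and idx from above; the variable names safe and unsafe are only for illustration, and tryCatch is used so that the error from the & version is captured as a message instead of aborting.

```r
v <- list(1, 2, 3, 4, 5)
idx <- 6

# `&&` stops after the failed bounds check, so v[[idx]] is never evaluated.
safe <- idx <= length(v) && v[[idx]] == 5

# `&` evaluates both sides, so the out-of-bounds access raises an error,
# which is captured here as a character string.
unsafe <- tryCatch(
  idx <= length(v) & v[[idx]] == 5,
  error = function(e) conditionMessage(e)
)

print(safe)    # a plain logical value
print(unsafe)  # the captured error message
```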
Here is a small illustration of the short-circuit behavior:
t <- function() { print("t called"); TRUE }
f <- function() { print("f called"); FALSE }
f() && t()
# [1] "f called"
# [1] FALSE
f() & t()
# [1] "f called"
# [1] "t called"
# [1] FALSE
t() || f()
# [1] "t called"
# [1] TRUE
t() | f()
# [1] "t called"
# [1] "f called"
# [1] TRUE
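The same short-circuiting is what makes guard chains in if conditions safe. Here is a small sketch (the function name is_positive_scalar is made up for illustration): each test runs only if all earlier ones passed, so the final comparison never sees NULL, a character string, NA, or a longer vector.

```r
# Hypothetical validator: TRUE only for a single, non-missing, positive number.
is_positive_scalar <- function(x) {
  !is.null(x) &&     # stop here for NULL
    is.numeric(x) && # stop here for non-numeric input
    length(x) == 1 &&# stop here for vectors of other lengths
    !is.na(x) &&     # stop here for NA
    x > 0            # reached only for a single non-missing number
}

print(is_positive_scalar(3))
print(is_positive_scalar(NULL))
print(is_positive_scalar("a"))
print(is_positive_scalar(NA_real_))
print(is_positive_scalar(c(1, 2)))
```

Written with & instead, every test would be evaluated regardless of the earlier ones, and for NULL the condition would collapse to a zero-length logical vector, which if() refuses to accept.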
Upvotes: 3