Suraj

Reputation: 36597

Difference between Boolean operators && and & and between || and | in R

According to the R language definition, the difference between & and && (correspondingly | and ||) is that the former is vectorized while the latter is not.

According to the help text, I read the difference as akin to the difference between an "And" and an "AndAlso" (correspondingly "Or" and "OrElse"). That is, not all operands are evaluated if they don't have to be (e.g. A or B or C is always true if A is true, so evaluation stops as soon as A is found to be true).

Could someone shed light here? Also, is there an AndAlso and OrElse in R?

Upvotes: 340

Views: 385378

Answers (4)

Konrad Rudolph

Reputation: 545995

There are three relevant differences between the operators &&/|| and &/|, which are explained in the official documentation. Here’s a summary:

1. & and | are vectorised

This means that if you want to perform element-wise logical operations on vectors you should use & and |:

a = c(TRUE, TRUE, FALSE, FALSE)
b = c(TRUE, FALSE, TRUE, FALSE)

a | b
# [1]  TRUE  TRUE  TRUE FALSE

a || b
# Error in a || b : 'length = 4' in coercion to 'logical(1)'

In versions of R before 4.3.0, a || b (and a && b) did not cause an error. Instead, the operation silently used only the first element of each operand and returned a single value. Before this became an error, it briefly caused a warning in R 4.2.

2. && and || are short-circuited

Short-circuiting means that the right-hand side of the expression is only evaluated if the left-hand side does not already determine the outcome. Pretty much every programming language does this for conditional operations, since it leads to handy idioms when writing if conditions, e.g.:

if (length(x) > 0L && x[1L] == 42) …

This code relies on short-circuiting: without it, the code would fail if x is empty, since the right-hand side attempts to access a non-existent element. Without short-circuiting, we would have to use nested if blocks, leading to more verbose code:

if (length(x) > 0L) {
    if (x[1L] == 42) …
}

As a general rule, inside a conditional expression (if, while) you should always use && and ||, even if short-circuiting isn’t required: it’s more idiomatic, and leads to more uniform code.
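As a minimal sketch of the guard idiom above: the helper name first_is_42 is hypothetical, and the sketch uses [[ rather than [ so that accessing a missing element signals an error instead of returning NA.

```r
# Hypothetical helper: TRUE only when x is non-empty and its first element is 42.
first_is_42 <- function(x) {
  # x[[1L]] is evaluated only if the length check on the left is TRUE.
  length(x) > 0L && x[[1L]] == 42
}

first_is_42(c(42, 7))    # TRUE
first_is_42(numeric(0))  # FALSE; x[[1L]] is never evaluated

# The same test written with `&` evaluates both sides, so the out-of-bounds
# access runs and signals an error, which is caught here for illustration.
unguarded <- tryCatch(
  length(numeric(0)) > 0L & numeric(0)[[1L]] == 42,
  error = function(e) "error"
)
unguarded  # "error"
```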

3. & and | can perform bitwise arithmetic

In many (most?) programming languages, & and | actually perform bitwise arithmetic instead of Boolean arithmetic. That is, for two integers a and b, a & b calculates the bitwise and, and a | b calculates the bitwise or. For Boolean values there’s no difference between bitwise and logical operations; but for arbitrary integers, the result differs. For instance, 1 | 2 evaluates to 3 in most programming languages.

However, this is not true for R: R coerces numeric arguments of & and | to logical values and performs Boolean arithmetic.

… except when both arguments are of type raw:

c(1, 3) | c(2, 4)
# [1] TRUE TRUE

as.raw(c(1, 3)) | as.raw(c(2, 4))
# [1] 03 07

It is worth noting that the operations ! (logical negation) and xor also perform bitwise arithmetic when called with raw arguments.
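A short sketch of the raw-vector behaviour described above. The last two lines go slightly beyond the original answer: bitwAnd() and bitwOr() are base R functions for bitwise arithmetic on integers, shown here only as a pointer for when the operands are not raw.

```r
x <- as.raw(c(1, 3))
y <- as.raw(c(2, 4))

x & y       # 00 00  (1 and 2 share no bits, nor do 3 and 4)
xor(x, y)   # 03 07
!as.raw(1)  # fe     (all eight bits flipped)

# For integers, base R provides dedicated bitwise functions.
bitwOr(1L, 2L)   # 3
bitwAnd(6L, 3L)  # 2
```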

Upvotes: 7

Aaron - mostly inactive

Reputation: 37804

The shorter ones are vectorized, meaning they can return a vector, like this:

((-2:2) >= 0) & ((-2:2) <= 0)
# [1] FALSE FALSE  TRUE FALSE FALSE

The longer form is not, and so (as of 4.3.0) must be given inputs of length 1. (Hooray! Less checking necessary, see below.)

Until R 4.3.0, giving && inputs of length > 1 did not throw an error, but instead evaluated left to right examining only the first element of each vector, so the comparison above, written with &&, gave:

((-2:2) >= 0) && ((-2:2) <= 0)
# [1] FALSE

As the help page says, this makes the longer form "appropriate for programming control-flow and [is] typically preferred in if clauses."

So you want to use the long forms only when you are certain the vectors are length one, and as of 4.3.0, R enforces this.

If you're using a previous version, you should be absolutely certain your vectors have length 1, for example because they come from functions that only return length-1 logical values. Use the short forms if the vectors might have length > 1. So if you're not absolutely sure, either check the length first, or use the short form and then reduce the result to length one with all or any before using it in a control flow statement such as if.

The functions all and any are often used on the result of a vectorized comparison to see if all or any of the comparisons are true, respectively. The results from these functions are sure to be length 1 so they are appropriate for use in if clauses, while the results from the vectorized comparison are not. (Though those results would be appropriate for use in ifelse.)
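For illustration, a small sketch of reducing a vectorized comparison with all and any before handing it to if. The variable names and the range 0 to 10 are made up for the example.

```r
x <- c(3, 8, 12)

# Element-wise test with the vectorized `&`; the result has length 3.
in_range <- x > 0 & x < 10   # TRUE TRUE FALSE

# all() and any() each return a single logical, which is what `if` needs.
if (all(in_range)) {
  status <- "all in range"
} else if (any(in_range)) {
  status <- "some in range"
} else {
  status <- "none in range"
}
status  # "some in range"

# ifelse() accepts the full-length logical vector directly.
labels <- ifelse(in_range, "ok", "out")
labels  # "ok" "ok" "out"
```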

One final difference: the && and || only evaluate as many terms as they need to (which is often called short-circuiting). For example, here's a comparison using an undefined value a; if it didn't short-circuit, as & and | don't, it would give an error.

a
# Error: object 'a' not found
TRUE || a
# [1] TRUE
FALSE && a
# [1] FALSE
TRUE | a
# Error: object 'a' not found
FALSE & a
# Error: object 'a' not found

Finally, see section 8.2.17 in The R Inferno, titled "and and andand".

Upvotes: 457

IRTFM

Reputation: 263451

The answer about "short-circuiting" is potentially misleading, but has some truth (see below). In the R/S language, && and || only evaluate the first element of each argument. All other elements in a vector or list are ignored regardless of the first one's value. Those operators are designed to work with the if (cond) {} else {} construction and to direct program control rather than construct new vectors. The & and | operators are designed to work on vectors, so they are applied "in parallel", so to speak, along the length of the longest argument. Both vectors need to be evaluated before the comparisons are made. If the vectors are not the same length, then the shorter argument is recycled.
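A brief sketch of the recycling behaviour of & and | mentioned above; the vector names are made up for the example.

```r
long  <- c(TRUE, TRUE, FALSE, FALSE)
short <- c(TRUE, FALSE)

# `short` is recycled to TRUE FALSE TRUE FALSE before the element-wise operation.
long & short  # TRUE FALSE FALSE FALSE
long | short  # TRUE  TRUE  TRUE FALSE
```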

When the arguments to && or || are evaluated, there is "short-circuiting" in that if any of the values in succession from left to right are determinative, then evaluations cease and the final value is returned.

> if( print(1) ) {print(2)} else {print(3)}
[1] 1
[1] 2
> if(FALSE && print(1) ) {print(2)} else {print(3)} # `print(1)` not evaluated
[1] 3
> if(TRUE && print(1) ) {print(2)} else {print(3)}
[1] 1
[1] 2
> if(TRUE && !print(1) ) {print(2)} else {print(3)}
[1] 1
[1] 3
> if(FALSE && !print(1) ) {print(2)} else {print(3)}
[1] 3

The advantage of short-circuiting will only appear when the arguments take a long time to evaluate. That will typically occur when the arguments are functions that either process larger objects or have mathematical operations that are more complex.
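To make the skipped evaluation visible, here is a rough sketch with a hypothetical slow_check function that stands in for an expensive computation and simply counts how often it is called.

```r
calls <- 0L

# Hypothetical stand-in for an expensive check; it records each invocation.
slow_check <- function() {
  calls <<- calls + 1L
  TRUE
}

r1 <- FALSE && slow_check()  # left side decides the result; slow_check() is skipped
r2 <- TRUE  || slow_check()  # left side decides the result; slow_check() is skipped
r3 <- TRUE  && slow_check()  # right side is needed, so slow_check() runs once

calls  # 1
```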

Update: The most recent edition of news(“R”) says that supplying vectors of length greater than 1 to && or || is deprecated with a warning, and that R Core intends to make it an error in a subsequent version of R.

Upvotes: 48

Theo

Reputation: 132942

&& and || are what is called "short circuiting". That means that they will not evaluate the second operand if the first operand is enough to determine the value of the expression.

For example if the first operand to && is false then there is no point in evaluating the second operand, since it can't change the value of the expression (false && true and false && false are both false). The same goes for || when the first operand is true.

You can read more about this here: http://en.wikipedia.org/wiki/Short-circuit_evaluation. From the table on that page you can see that && is equivalent to AndAlso in VB.NET, which I assume is what you are referring to.

Upvotes: 32
