Mark Eichenlaub

Reputation: 123

How does R dplyr decide precedence between comparison and "or" operators?

The book R for Data Science gives this example:

library(nycflights13)
library(tidyverse)

filter(flights, month == 11 | 12)

The result is that dplyr filters for all flights where month == 1, because "it finds all months that equal 11 | 12, an expression that evaluates to TRUE".

Why does it do "11 | 12" first, then do the comparison? Shouldn't it first do "month == 11", then "or" that with 12, return a vector of all TRUE, and therefore return everything?

The following shows R giving precedence to == over |:

> 3 == 5 | 7
[1] TRUE
> 3 == (5 | 7)
[1] FALSE

Upvotes: 0

Views: 320

Answers (1)

akuiper

Reputation: 215047

The result is that dplyr filters for all flights where month == 1

That's not true: the filter removes nothing and returns every row of the original data frame.

filter(flights, month == 11 | 12) %>% dim()
# [1] 336776     19
dim(flights)
# [1] 336776     19

Why does it do "11 | 12" first, then do the comparison?

That's not true either. It evaluates month == 11 first, which gives a logical vector. Applying | 12 then turns every element into TRUE, because 12 is coerced to TRUE.
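
The evaluation order can be traced step by step. This is only a minimal sketch in base R. The short vector month is a made-up stand-in for flights$month, and step1 and step2 are names chosen just for illustration.

```r
# Hypothetical stand-in for flights$month
month <- c(1, 11, 12)

# Step 1: == binds tighter than |, so the comparison runs first
step1 <- month == 11         # FALSE  TRUE FALSE

# Step 2: | coerces 12 to logical, and any non-zero number is TRUE,
# so every element becomes TRUE
step2 <- step1 | 12          # TRUE TRUE TRUE

# The unparenthesised expression gives exactly the same result
identical(month == 11 | 12, step2)   # TRUE

# For contrast, forcing | first: (11 | 12) is TRUE, which == treats as 1
forced <- month == (11 | 12) # TRUE FALSE FALSE
```

Only the parenthesised form behaves like month == 1, which is the behaviour described in the quoted passage from the question.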

R giving precedence to == over |

This is the correct statement.
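
One way to check this without evaluating anything is to look at how R parses the expression. The sketch below uses base R's quote(). The variable names e, outer_op and left_operand are hypothetical, chosen only for this example.

```r
# Capture the expression unevaluated (month does not need to exist)
e <- quote(month == 11 | 12)

# The outermost call is `|`, so it is the last operator applied
outer_op <- as.character(e[[1]])   # "|"

# Its left operand is the entire comparison, its right operand is 12
left_operand <- deparse(e[[2]])    # "month == 11"
right_operand <- e[[3]]            # 12
```

Because month == 11 sits inside the call to |, the comparison is evaluated before the "or".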


You can refer to the operator precedence table in R's documentation (see ?Syntax), listed from highest to lowest:

:: :::             access variables in a namespace
$ @                component / slot extraction
[ [[               indexing
^                  exponentiation (right to left)
- +                unary minus and plus
:                  sequence operator
%any%              special operators (including %% and %/%)
* /                multiply, divide
+ -                (binary) add, subtract
< > <= >= == !=    ordering and comparison
!                  negation
& &&               and
| ||               or
~                  as in formulae
-> ->>             rightwards assignment
<- <<-             assignment (right to left)
=                  assignment (right to left)
?                  help (unary and binary)
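
Given this precedence, a condition meaning "November or December" has to spell out both comparisons, or use %in%. This is a minimal sketch in base R. The vector month is again a made-up stand-in for flights$month, and both and via_in are illustrative names.

```r
# Hypothetical stand-in for flights$month
month <- c(1, 11, 12)

# Each == is evaluated before |, so this ORs two logical vectors
both <- month == 11 | month == 12    # FALSE TRUE TRUE

# Shorter equivalent: test membership in a set of values
via_in <- month %in% c(11, 12)       # FALSE TRUE TRUE

# With dplyr loaded, either condition could be passed to filter(), e.g.
# filter(flights, month %in% c(11, 12))
```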

Upvotes: 3
