Reputation: 5779
I have a working filter statement in dplyr that I can't translate to base R:
library(dplyr)
x <- data.frame(
v1 = c("USA", "Canada", "Mexico"),
v2 = c(NA, 1:5)
)
x %>% filter(v1=="Canada",v2 %in% 3:5)
x[x$v1=="Canada" && x$v2 %in% 3:5,]
Any help would be appreciated.
Upvotes: 2
Views: 210
Reputation: 11514
To illustrate:
library(dplyr)
x <- data.frame(
v1 = c("USA", "Canada", "Mexico"),
v2 = c(NA, 1:5)
)
# filter
x %>% filter(v1=="Canada",v2 %in% 3:5)
v1 v2
1 Canada 4
# your approach
x[x$v1=="Canada" && x$v2 %in% 3:5,]
v1 v2
<0 rows> (or 0-length row.names)
# second & removed
x[x$v1=="Canada" & x$v2 %in% 3:5,]
v1 v2
5 Canada 4
Apart from the rowname, it gives the same result.
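Look at this example (taken from here) to understand what was happening before. & compares element by element and returns a vector, whereas && returns a single value. Older versions of R silently used only the first element of each side; since R 4.3.0, passing a vector longer than one to && is an error.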
-2:2 >= 0
[1] FALSE FALSE TRUE TRUE TRUE
-2:2 >= 0 & -2:2 <= 0
[1] FALSE FALSE TRUE FALSE FALSE
-2:2 >= 0 && -2:2 <= 0
[1] FALSE
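To connect this back to the data frame in the question, here is a minimal sketch that builds the row mask step by step. The names is_canada, in_range, keep and result are only illustrative and do not appear in the original post.

```r
# Same data as in the question; v1 is recycled to match the length of v2.
x <- data.frame(
  v1 = c("USA", "Canada", "Mexico"),
  v2 = c(NA, 1:5)
)

# Each condition yields one logical value per row.
is_canada <- x$v1 == "Canada"   # FALSE TRUE FALSE FALSE TRUE FALSE
in_range  <- x$v2 %in% 3:5      # FALSE FALSE FALSE TRUE TRUE TRUE (NA %in% 3:5 is FALSE)

# & combines the two vectors element by element, giving one mask entry per row.
keep <- is_canada & in_range

# Only row 5 (Canada, 4) satisfies both conditions.
result <- x[keep, ]
print(result)
```

Because the mask has one entry per row, the bracket notation can select rows individually, which a single TRUE or FALSE cannot do.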
In some situations, you may encounter issues with NAs. Then it is advisable to wrap logical statements in which(). filter drops rows for which the condition evaluates to NA by default. E.g.
# will include NA:
x[x$v2 > 3,]
v1 v2
NA <NA> NA
5 Canada 4
6 Mexico 5
# will exclude NA
x[which(x$v2 > 3),]
v1 v2
5 Canada 4
6 Mexico 5
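As a rough sketch of why this happens: the comparison itself returns NA for the missing value, and indexing with NA produces an all-NA row. which() keeps only the positions that are TRUE; an explicit !is.na() check is an alternative that should give the same rows. The names cond, with_na, via_which and via_mask are only illustrative.

```r
x <- data.frame(
  v1 = c("USA", "Canada", "Mexico"),
  v2 = c(NA, 1:5)
)

# The comparison propagates the missing value: NA FALSE FALSE FALSE TRUE TRUE
cond <- x$v2 > 3

# Indexing with an NA in the mask yields an extra all-NA row.
with_na <- x[cond, ]

# which() returns only the positions where cond is TRUE (here 5 and 6).
via_which <- x[which(cond), ]

# Alternative: turn NA into FALSE explicitly and keep a logical mask.
via_mask <- x[!is.na(cond) & cond, ]

print(via_which)
```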
Upvotes: 2
Reputation: 131
subset is in base R, and functions similarly to filter in dplyr. Is subset sufficient for you, or do you need the bracket notation for some reason?
> x <- data.frame(
+ v1 = c("USA", "Canada", "Mexico"),
+ v2 = c(NA, 1:5)
+ )
Via dplyr:
> x %>% filter(v1=="Canada",v2 %in% 3:5)
v1 v2
1 Canada 4
Via base R/subset:
> subset(x, v1 == 'Canada' & v2 %in% 3:5)
v1 v2
5 Canada 4
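If the differing row name matters, a small sketch like the following should bring the base R result in line with the dplyr output. It also shows that subset, like filter, treats NA in the condition as FALSE. The names res and gt3 are only illustrative.

```r
x <- data.frame(
  v1 = c("USA", "Canada", "Mexico"),
  v2 = c(NA, 1:5)
)

# subset keeps the original row name (5) of the matching row.
res <- subset(x, v1 == "Canada" & v2 %in% 3:5)

# Resetting the row names renumbers the rows from 1, as in the dplyr output.
rownames(res) <- NULL
print(res)

# Rows where the condition is NA are dropped, so no all-NA row appears.
gt3 <- subset(x, v2 > 3)
print(gt3)
```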
Upvotes: 1