Reputation: 2133
I would like to know whether there is a way to filter rows based on a combination of values in more than one column. To be more specific:
library(dplyr)
library(plyr)
library(data.table)
# length.out keeps the group vector the same length as iris (150 rows)
data <- iris %>% cbind(group = rep(c("a", "b", "c"), length.out = nrow(iris))) %>% as.data.table()
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species group
1:          5.1         3.5          1.4         0.2  setosa     a
2:          4.9         3.0          1.4         0.2  setosa     b
3:          4.7         3.2          1.3         0.2  setosa     c
4:          4.6         3.1          1.5         0.2  setosa     a
5:          5.0         3.6          1.4         0.2  setosa     b
6:          5.4         3.9          1.7         0.4  setosa     c
and I want to filter it based on the following data.table:
filter <- data.table(Species = c("setosa", "versicolor", 'setosa'), group = c('a', "b", 'c'))
      Species group      filter1
1:     setosa     a     setosa a
2: versicolor     b versicolor b
3:     setosa     c     setosa c
I can do it this way:
data[paste(Species, group) %in% filter[, filter1 := paste(Species, group)]$filter1]
However, I would like to know if there is a more efficient, faster, or simpler way to do it, perhaps something like:
data[.(Species, group) %in% filter] # does not work
Upvotes: 2
Views: 808
Reputation: 66819
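In this case, as long as filter contains only the Species and group columns (that is, before filter1 has been added to it by reference with :=), you can do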
data[filter, on=names(filter), nomatch=0]
See "Perform a semi-join with data.table" for similar filtering joins.
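As a minimal, self-contained sketch of that filtering join, the code below rebuilds the example and compares the result with the paste-based approach from the question. The names `keep`, `idx`, `joined` and `pasted` are hypothetical, chosen for illustration; `keep` stands in for the question's `filter` table so that it does not mask `dplyr::filter`. Using `which = TRUE` to get row numbers and then sorting them is an assumption about what is wanted here (the original row order of `data`), not part of the answer above.

```r
library(data.table)

# Rebuild the example data: iris plus a repeating a/b/c group label.
data <- as.data.table(iris)
data[, group := rep(c("a", "b", "c"), length.out = .N)]

# Lookup table of the (Species, group) pairs to keep.
# Hypothetical name `keep`, playing the role of `filter` in the question.
keep <- data.table(Species = c("setosa", "versicolor", "setosa"),
                   group   = c("a", "b", "c"))

# Filtering join: which = TRUE returns the row numbers of `data` that match
# a row of `keep`; nomatch = 0L drops rows of `keep` that have no match.
idx <- data[keep, on = .(Species, group), which = TRUE, nomatch = 0L]

# Sorting the row numbers restores the original order of `data`.
joined <- data[sort(idx)]

# The paste-based approach from the question, for comparison.
pasted <- data[paste(Species, group) %in% keep[, paste(Species, group)]]

# Both approaches should select the same number of rows.
nrow(joined) == nrow(pasted)
```

Without `which = TRUE`, `data[keep, on = .(Species, group), nomatch = 0L]` returns the matching rows directly, ordered by the rows of `keep` rather than by their position in `data`. If `keep` contained duplicate pairs, the join would return duplicated rows, so it may be worth calling `unique(keep)` first.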
Upvotes: 4