Reputation: 1699
I'd like to subset a data frame, keeping the rows where any column holds a value contained in a vector. My working code for a simple example is below. However, if I have many columns (>10) and don't want to list each one joined with "|" (OR), is there a better way to do this? The example uses LETTERS, but my real data are factors (people's names).
set.seed(37)
df <- data.frame(id1=sample(LETTERS, 20),id2=sample(LETTERS, 20))
L <- c("A","B","E")
subset(df, id1 %in% L | id2 %in% L )
id1 id2
2 B V
10 C B
11 F A
14 A F
19 E S
Upvotes: 1
Views: 122
Reputation: 32548
df[sort(unique(unlist(lapply(df, function(x) which(x %in% L))))),]
# id1 id2
#2 B V
#10 C B
#11 F A
#14 A F
#19 E S
Upvotes: 1
Reputation: 214917
You can use Reduce to construct the OR condition:
subset(df, Reduce("|", lapply(df, `%in%`, L)))
# id1 id2
#2 B V
#10 C B
#11 F A
#14 A F
#19 E S
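To see what the Reduce call is doing, it may help to split it into steps. This is a minimal sketch on a small hand-built data frame; toy and its values are made up for illustration and are not the question's sampled data.

```r
# Hypothetical toy data, stored as factors like the asker's real columns
toy <- data.frame(id1 = c("A", "C", "D"),
                  id2 = c("X", "B", "Y"),
                  id3 = c("Z", "Q", "W"),
                  stringsAsFactors = TRUE)
L <- c("A", "B", "E")

# Step 1: one logical vector per column, TRUE where the value is in L
hits <- lapply(toy, `%in%`, L)

# Step 2: fold the per-column vectors together element-wise with OR
keep <- Reduce(`|`, hits)

# Step 3: keep the rows where at least one column matched
toy[keep, ]
```

Because the list is folded pairwise, the same call works unchanged for 3 columns or 30.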
Or use rowSums to check if there is any letter matching in each row:
subset(df, rowSums(sapply(df, `%in%`, L)) != 0)
# id1 id2
#2 B V
#10 C B
#11 F A
#14 A F
#19 E S
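If the real data frame also has columns that should not be searched, the same idea can be limited to a chosen set of columns. The sketch below is one possible wrapper, not part of the original answer: filter_any_in, people, and the column names are all hypothetical. It uses the Reduce form because that form also behaves on a one-row data frame, where sapply would return a plain vector rather than a matrix.

```r
# Hypothetical helper: keep rows where any of the columns named in `cols`
# holds a value found in `values`
filter_any_in <- function(data, values, cols = names(data)) {
  hits <- lapply(data[cols], `%in%`, values)
  data[Reduce(`|`, hits), , drop = FALSE]
}

# Made-up example with factor columns of names plus an unrelated column
people <- data.frame(visit = 1:4,
                     id1 = factor(c("Ann", "Bob", "Cy", "Dee")),
                     id2 = factor(c("Eve", "Ann", "Flo", "Gus")))

# Search only id1 and id2, leaving the visit column out of the match
res <- filter_any_in(people, c("Ann", "Gus"), cols = c("id1", "id2"))
res
```

%in% compares factors by their labels, so no conversion to character is needed beforehand.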
Upvotes: 3