val

Reputation: 1699

In R, how do you subset rows of a data frame based on values in a vector?

I'd like to subset a data frame, keeping every row in which any column contains one of the factor values listed in a vector. My working code for a simple example is below. However, if I have many columns (more than 10), I'd rather not list each column joined by "|" (OR). Is there a better way to do this? My example uses LETTERS, but my real data are factors (people's names).

set.seed(37)
df <- data.frame(id1=sample(LETTERS, 20),id2=sample(LETTERS, 20))
L <- c("A","B","E")
subset(df, id1 %in% L | id2 %in% L )
   id1 id2
2    B   V
10   C   B
11   F   A
14   A   F
19   E   S

Upvotes: 1

Views: 122

Answers (2)

d.b

Reputation: 32548

df[sort(unique(unlist(lapply(df, function(x) which(x %in% L))))),]
#   id1 id2
#2    B   V
#10   C   B
#11   F   A
#14   A   F
#19   E   S

Upvotes: 1

akuiper

Reputation: 214917

You can use Reduce to construct the OR condition:

subset(df, Reduce("|", lapply(df, `%in%`, L)))

#   id1 id2
#2    B   V
#10   C   B
#11   F   A
#14   A   F
#19   E   S

Or use rowSums to check whether each row contains at least one matching value:

subset(df, rowSums(sapply(df, `%in%`, L)) != 0)

#   id1 id2
#2    B   V
#10   C   B
#11   F   A
#14   A   F
#19   E   S
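
For the many-column case in the question, the same Reduce pattern can be limited to just the ID columns so that other columns are not searched. This is only a sketch: df2, id_cols, and the score column are made-up names for illustration and are not part of the original example.

```r
# Hypothetical data: three factor ID columns plus a numeric column
# that should not be searched for matches.
df2 <- data.frame(
  id1   = c("A", "C", "D", "F"),
  id2   = c("G", "B", "H", "I"),
  id3   = c("J", "K", "L", "E"),
  score = c(1, 2, 3, 4),
  stringsAsFactors = TRUE
)
L <- c("A", "B", "E")

# Assumed: the columns to search are known by name.
id_cols <- c("id1", "id2", "id3")

# One logical vector per ID column, combined element-wise with OR.
keep <- Reduce(`|`, lapply(df2[id_cols], `%in%`, L))

# Rows where at least one ID column matched a value in L.
df2[keep, ]
```

Because `%in%` compares factors by their labels, this should work the same way whether the ID columns are factors or character vectors.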

Upvotes: 3
