How to find rows of a data.frame that match a vector in R

Question

pattern <- c("apple", "banana")
dat <- data.frame(fruit1 = c("melon", "apple", "mango", "apple"),
                  fruit2 = c("banana", "melon", "papaya", "banana"))

> dat
  fruit1 fruit2
1  melon banana
2  apple  melon
3  mango papaya
4  apple banana

I want to find out if there's a match between pattern and the rows in dat. In the example above, there is a match in the 4th row of dat.

I tried using match, but that does not seem to work on data.frames. An alternative is to loop over each row of dat:

output <- logical(nrow(dat))        # preallocate one result per row
for (i in seq_len(nrow(dat))) {     # seq_len() is safe when dat has 0 rows
  output[i] <- all(dat[i, ] %in% pattern)
}
> which(output)
[1] 4

This is inefficient if there are many rows in dat. Is there a faster way?

Darren Tsai · Accepted Answer

You could filter the data like this:

dat |>
  subset(fruit1 == pattern[1] & fruit2 == pattern[2])

#   fruit1 fruit2
# 4  apple banana
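
The filter above names each column explicitly. As a minimal sketch of the same comparison for any number of columns, assuming pattern has one element per column of dat and in the same order (the name keep is only illustrative):

```r
pattern <- c("apple", "banana")
dat <- data.frame(fruit1 = c("melon", "apple", "mango", "apple"),
                  fruit2 = c("banana", "melon", "papaya", "banana"))

# Compare column k with pattern[k], then AND the per-column results together.
keep <- Reduce(`&`, Map(`==`, dat, pattern))

dat[keep, ]
#   fruit1 fruit2
# 4  apple banana
```

Here Map pairs each column with the corresponding element of pattern, and Reduce collapses the resulting logical vectors into a single row filter.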

If you just want the index:

which(colSums(t(dat) == pattern) == 2)
# [1] 4

Or, more concisely:

which(!colSums(t(dat) != pattern))
# [1] 4
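
Both index expressions rely on recycling: t(dat) is a character matrix with one column per row of dat, so == recycles pattern down each column and compares every row of dat against pattern element by element. A minimal sketch that makes this explicit and replaces the hard-coded 2 with length(pattern); the helper name match_rows is hypothetical, and the sketch assumes length(pattern) equals ncol(dat):

```r
pattern <- c("apple", "banana")
dat <- data.frame(fruit1 = c("melon", "apple", "mango", "apple"),
                  fruit2 = c("banana", "melon", "papaya", "banana"))

# Hypothetical helper: indices of rows whose values equal `pattern`, in order.
match_rows <- function(df, pattern) {
  stopifnot(length(pattern) == ncol(df))
  m <- t(as.matrix(df))  # ncol(df) x nrow(df): one column per original row
  # `==` recycles `pattern` down each column of m
  unname(which(colSums(m == pattern) == length(pattern)))
}

idx <- match_rows(dat, pattern)
idx
# [1] 4
```

Note that as.matrix coerces mixed column types to character, so with non-character columns the comparison is done on strings.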

Answers (2)

Approach 1: "manual" approach with indexing

Approach 2: create a unique key across both data sources, then match with %in%.
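
Approach 2 can be sketched as follows. This is only an illustration: the separator and the names sep, row_key and pattern_key are assumptions, and the separator must be a string that never occurs in the data, otherwise different rows could produce the same key.

```r
pattern <- c("apple", "banana")
dat <- data.frame(fruit1 = c("melon", "apple", "mango", "apple"),
                  fruit2 = c("banana", "melon", "papaya", "banana"))

sep <- "\r"  # assumed not to appear in any value

# One key per row of dat: the row's values pasted together in column order.
row_key <- do.call(paste, c(dat, sep = sep))

# The same key built from the pattern.
pattern_key <- paste(pattern, collapse = sep)

idx_key <- which(row_key %in% pattern_key)
idx_key
# [1] 4
```

Because %in% accepts a vector on the right-hand side, the same code can check several patterns at once if pattern_key holds one key per pattern.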
