Ferrick4

Reputation: 23

Subset a data frame with multiple match conditions in R

With the sample data

> df1 <- data.frame(x=c(1,1,2,3), y=c("a","b","a","b"))
> df1
  x y
1 1 a
2 1 b
3 2 a
4 3 b
> df2 <- data.frame(x=c(1,3), y=c("a","b"))
> df2
  x y
1 1 a
2 3 b

I want to remove all the value pairs (x,y) of df2 from df1. I can do it using a for loop over each row in df2 but I'm sure there is a better and simpler way that I just can't think of at the moment. I've been trying to do something starting with the following:

> df1$x %in% df2$x & df1$y %in% df2$y
[1]  TRUE  TRUE FALSE  TRUE

But this isn't what I want, because x and y are tested independently: df1[2,] = (1,b) is flagged for removal even though the pair (1,b) does not occur in df2. Thank you very much in advance for your help.

Upvotes: 2

Views: 3543

Answers (2)

Pierre Lapointe

Reputation: 16277

You could go the other way around: rbind everything, then keep only the rows that occur exactly once:

out <-rbind(df1,df2)
out[!duplicated(out, fromLast=TRUE) & !duplicated(out),]

  x y
2 1 b
3 2 a
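One thing worth checking with this trick is what happens when the second data frame contains a pair that is absent from df1. The sketch below uses a hypothetical df3 (not part of the question) to show that such a pair ends up in the result, and one possible guard: restrict the logical mask to the first nrow(df1) rows. The guard assumes df1 itself has no repeated rows, since those would also be dropped.

```r
df1 <- data.frame(x = c(1, 1, 2, 3), y = c("a", "b", "a", "b"))
df3 <- data.frame(x = c(1, 9), y = c("a", "z"))  # hypothetical: (9, z) is not in df1

out <- rbind(df1, df3)

# TRUE for rows of the stacked frame that occur exactly once
once <- !duplicated(out, fromLast = TRUE) & !duplicated(out)

res <- out[once, ]                            # still contains the stray (9, z) row
fixed <- df1[once[seq_len(nrow(df1))], ]      # mask limited to the rows of df1
fixed
```

Here res has four rows, including (9, z), while fixed keeps only the three rows of df1 that have no match in df3.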

Upvotes: 1

IRTFM

Reputation: 263332

Build a set of pairs from df2:

prs <- with(df2, paste(x,y,sep="."))

Test each row of df1, processed the same way, for membership in the pair set, and negate the test so that the matching rows are removed:

df1[ !(paste(df1$x, df1$y, sep=".") %in% prs) , ]
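For reference, here is a self-contained sketch of the pasted-key idea run on the sample data from the question. The names key1, key2 and result are illustrative, and the "\r" separator is an assumption: any string that cannot occur inside x or y would do, since a separator that also appears in the data could make two different pairs produce the same key.

```r
df1 <- data.frame(x = c(1, 1, 2, 3), y = c("a", "b", "a", "b"))
df2 <- data.frame(x = c(1, 3), y = c("a", "b"))

# One key per row; "\r" is an assumed separator that does not occur in the data
key1 <- paste(df1$x, df1$y, sep = "\r")
key2 <- paste(df2$x, df2$y, sep = "\r")

# Keep the rows of df1 whose (x, y) pair does not appear in df2
result <- df1[!(key1 %in% key2), ]
result
```

On the sample data this leaves rows 2 and 3 of df1, that is (1, b) and (2, a).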

Upvotes: 4
