user6380453

Reputation: 23

Remove rows from a dataframe that match two columns in another dataframe R

I am struggling to remove rows from a data frame in R where the values in two of its columns match the values in two columns of a second data frame.

For example, given the following pseudo-data:

ID1 <- c(5,10,6)
ID2 <- c(3,5,4)
Value <- rnorm(3)
DF1 <- data.frame(ID1, ID2, Value)

x <- c()
y <- c()
z <- c()

for (i in 1:10) {
  a <- rep(i, 10)
  b <- c(1:10)
  c <- rnorm(10)
  x <- c(x, a)
  y <- c(y, b)
  z <- c(z, c)
}

DF2 <- data.frame(x, y, z)

I would like to remove the rows from DF2 where the combination of x and y matches ID1 and ID2 from DF1 in either order (i.e. x = 5 and y = 3, x = 10 and y = 5, x = 6 and y = 4, but also x = 3 and y = 5, x = 5 and y = 10, x = 4 and y = 6).

Upvotes: 1

Views: 2855

Answers (2)

user2100721

Reputation: 3587

Another option is to use @zx8754's excl data frame together with the match_df function from the plyr package:

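library(plyr)
# match_df() returns the rows of DF2 that also occur in excl, keeping DF2's row names
matched <- match_df(DF2, excl, on = c("x", "y"))
# Filter by row name; a negative index would return zero rows if nothing matched
DF2[!rownames(DF2) %in% rownames(matched), ]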

Upvotes: 0

zx8754

Reputation: 56004

Build an exclusion list that contains each ID pair in both orders:

excl <- data.frame(
  x = c(DF1$ID1, DF1$ID2),
  y = c(DF1$ID2, DF1$ID1))

Then use an anti join, which keeps only the rows of DF2 that have no match in excl:

library(dplyr)
anti_join(DF2, excl, by = c("x", "y"))
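If dplyr is not available, roughly the same result can probably be had in base R with merge. The following is only a sketch: the drop flag column and the kept name are made up for illustration, the data is rebuilt so the snippet runs on its own, and note that merge sorts the result by the key columns rather than keeping DF2's original row order.

```r
# Rebuild the example data so this snippet is self-contained
set.seed(1)
DF1 <- data.frame(ID1 = c(5, 10, 6), ID2 = c(3, 5, 4), Value = rnorm(3))
DF2 <- data.frame(x = rep(1:10, each = 10),
                  y = rep(1:10, times = 10),
                  z = rnorm(100))

# Exclusion list with each pair in both orders, plus a marker column
# ("drop" is a hypothetical helper column, not part of the original answer)
excl <- data.frame(x = c(DF1$ID1, DF1$ID2),
                   y = c(DF1$ID2, DF1$ID1),
                   drop = TRUE)

# Left join: rows of DF2 without a partner in excl get NA in "drop"
merged <- merge(DF2, excl, by = c("x", "y"), all.x = TRUE)

# Keep only the unmatched rows and discard the marker column
kept <- merged[is.na(merged$drop), c("x", "y", "z")]
nrow(kept)
```

With the example data this should leave 94 of the 100 rows, since six (x, y) pairs are excluded.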

Or using paste as suggested in the comments:

DF2[! paste(DF2$x, DF2$y) %in% 
      c(paste(DF1$ID1, DF1$ID2),
        paste(DF1$ID2, DF1$ID1)), ]
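To check the paste approach, here is a small self-contained sketch that rebuilds the example data and counts how many rows get removed. The names drop_keys and res are mine, and the explicit sep = "|" is an assumption on my part: with numeric IDs the default space works just as well, but a separator that cannot occur in the IDs is safer if they are ever strings.

```r
# Rebuild the example data (only the ID columns matter for the filter)
set.seed(1)
DF1 <- data.frame(ID1 = c(5, 10, 6), ID2 = c(3, 5, 4), Value = rnorm(3))
DF2 <- data.frame(x = rep(1:10, each = 10),
                  y = rep(1:10, times = 10),
                  z = rnorm(100))

# One key per pair to drop, in both orders
drop_keys <- c(paste(DF1$ID1, DF1$ID2, sep = "|"),
               paste(DF1$ID2, DF1$ID1, sep = "|"))

# Keep the rows of DF2 whose (x, y) key is not in the drop list
res <- DF2[!paste(DF2$x, DF2$y, sep = "|") %in% drop_keys, ]

nrow(DF2) - nrow(res)  # number of rows removed
```

For this data the difference should be 6, one row for each of the six pairs listed in the question.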

Upvotes: 3
