Reputation: 23
I am struggling to remove rows from a data frame in R where the values in two columns match a pair of values from two columns in a second data frame.
For example, given the following pseudo-data:
ID1 <- c(5,10,6)
ID2 <- c(3,5,4)
Value <- rnorm(3)
DF1 <- data.frame(ID1, ID2, Value)
x <- c()
y <- c()
z <- c()
for (i in 1:10) {
  a <- rep(i, 10)
  b <- c(1:10)
  c <- rnorm(10)
  x <- c(x, a)
  y <- c(y, b)
  z <- c(z, c)
}
DF2 <- data.frame(x, y, z)
I would like to remove the rows from DF2 where the combination of x and y matches ID1 and ID2 from DF1 (i.e. x = 5 and y = 3, x = 10 and y = 5, x = 6 and y = 4, but also x = 3 and y = 5, x = 5 and y = 10, x = 4 and y = 6).
Upvotes: 1
Views: 2855
Reputation: 3587
Another option is to use @zx8754's excl data frame with the match_df function from the plyr package:
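library(plyr)
# match_df() keeps DF2's row names, so drop the rows whose names it returns.
# A logical mask (rather than negative indices) also works when nothing matches.
DF2[!rownames(DF2) %in% rownames(match_df(DF2, excl, on = c("x", "y"))), ]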
Upvotes: 0
Reputation: 56004
Make an exclusion list that contains each pair in both orders:
excl <- data.frame(
x = c(DF1$ID1, DF1$ID2),
y = c(DF1$ID2, DF1$ID1))
Then use an anti-join:
library(dplyr)
anti_join(DF2, excl, by = c("x", "y"))
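If dplyr is not available, roughly the same anti-join can be sketched in base R with merge. This is only an illustration and not part of dplyr: the marker column `drop` is a made-up name, and z is filled with placeholder values instead of rnorm so the example is reproducible. Note that merge sorts the result by the key columns, so the row order may differ from DF2.

```r
# Sketch only: a base-R stand-in for an anti-join; `drop` is a hypothetical marker column.
DF1 <- data.frame(ID1 = c(5, 10, 6), ID2 = c(3, 5, 4))
DF2 <- data.frame(x = rep(1:10, each = 10),
                  y = rep(1:10, times = 10),
                  z = seq_len(100))  # placeholder values instead of rnorm()

# Exclusion list: every (ID1, ID2) pair plus its reverse, without duplicates.
excl <- unique(data.frame(x = c(DF1$ID1, DF1$ID2),
                          y = c(DF1$ID2, DF1$ID1)))
excl$drop <- TRUE

# Left join on x and y; rows of DF2 with no partner in excl get drop = NA.
joined <- merge(DF2, excl, by = c("x", "y"), all.x = TRUE)
kept <- joined[is.na(joined$drop), c("x", "y", "z")]
nrow(kept)
```

With the data above, six of the 100 rows have a partner in excl, so kept should have 94 rows.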
Or use paste, as suggested in the comments:
DF2[! paste(DF2$x, DF2$y) %in%
c(paste(DF1$ID1, DF1$ID2),
paste(DF1$ID2, DF1$ID1)), ]
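To check the paste approach end to end, here is a minimal self-contained sketch. The helper name `drop_pairs`, the seed, and the explicit "_" separator are my own choices for illustration and are not from the question. An explicit separator does not matter for these integer-valued IDs, but it may make accidental key collisions less likely if the columns ever hold character values containing spaces.

```r
# Sketch only: drop_pairs() is a hypothetical helper, not part of any package.
set.seed(1)  # arbitrary seed so the random values are reproducible

DF1 <- data.frame(ID1 = c(5, 10, 6), ID2 = c(3, 5, 4), Value = rnorm(3))

# Same 100 (x, y) combinations as the loop in the question, x varying slowest.
DF2 <- data.frame(x = rep(1:10, each = 10),
                  y = rep(1:10, times = 10),
                  z = rnorm(100))

drop_pairs <- function(df, a, b) {
  # Keys for the unwanted pairs, in both orders.
  bad <- c(paste(a, b, sep = "_"), paste(b, a, sep = "_"))
  # Keep only the rows whose (x, y) key is not in the unwanted set.
  df[!paste(df$x, df$y, sep = "_") %in% bad, ]
}

res <- drop_pairs(DF2, DF1$ID1, DF1$ID2)
nrow(res)  # 94: the six listed combinations are gone
```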
Upvotes: 3