Reputation: 47
I am new to R and need a bit of guidance. I have two data frames, df1 and df2. I have already performed a series of operations on both, and I need to perform one last operation on them:
df1 <- data.frame(name = c("A","B","C","D","E","F","F","G","s","x"))  # (1)
# (2) names extracted from another column; kept as a standalone vector,
# because 7 values cannot be assigned as a column of the 10-row df1
newname <- c("A","V","C","D","c","v","x")
df2 <- data.frame(Id_name = c("A","B","C","s","s", "x","G", "g"))  # (3)
Step 1: match (2) against (3) and extract the common names; let's call the result (4).
Step 2: find the names in (4) that have a duplicate value in (1).
Step 3: delete those values from (1) and (3).
I tried using anti_join and semi_join, but I guess those work for numeric values only. Is there a specific library available for this, and how can I solve it?
Upvotes: 0
Views: 81
Reputation: 76402
The strategy followed below relies on intersect and extraction: intersect finds the names that newname and df2$Id_name have in common, and extraction then removes those names from df1$name and df2$Id_name. It is fully vectorized, so there is no need for joins.
Note also the argument drop = FALSE. The examples posted in the question have only one column, and with the default drop = TRUE the results would lose the dim attribute, becoming vectors.
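A small sketch of that difference, using a hypothetical one-column data frame d that is not part of the question:

```r
# Hypothetical one-column data frame, for illustration only
d <- data.frame(name = c("A", "B", "C"))

keep <- d$name != "B"          # logical index of the rows to keep

v <- d[keep, ]                 # default drop = TRUE: collapses to a plain vector
k <- d[keep, , drop = FALSE]   # stays a one-column data frame

is.data.frame(v)  # FALSE
is.data.frame(k)  # TRUE
dim(k)            # 2 1
```

The same drop = FALSE is used in both subsetting calls of the solution.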
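# vector (2) from the question, used as a standalone vector
newname <- c("A","V","C","D","c","v","x")
# names present in both newname and df2$Id_name
common <- intersect(newname, df2$Id_name)
# keep only the rows whose name is not in common
df1 <- df1[!df1$name %in% common, , drop = FALSE]
df2 <- df2[!df2$Id_name %in% common, , drop = FALSE]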
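To check the result, here is a self-contained run on the question's data. It is only a sketch: it assumes newname is a standalone vector rather than a column of df1, and the names df1_out and df2_out are introduced here for illustration so that the inputs are not overwritten.

```r
# Data from the question; newname as a standalone vector (assumption)
df1 <- data.frame(name = c("A","B","C","D","E","F","F","G","s","x"))
newname <- c("A","V","C","D","c","v","x")
df2 <- data.frame(Id_name = c("A","B","C","s","s","x","G","g"))

# Names shared by newname and df2$Id_name (the comparison is case-sensitive)
common <- intersect(newname, df2$Id_name)
common  # "A" "C" "x"

# Remove the shared names from both data frames, keeping them as data frames
df1_out <- df1[!df1$name %in% common, , drop = FALSE]
df2_out <- df2[!df2$Id_name %in% common, , drop = FALSE]

nrow(df1_out)  # 7
nrow(df2_out)  # 5
```

Both intersect and %in% compare the names as strings, so nothing here depends on the columns being numeric.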
Upvotes: 1