Apricot
Apricot

Reputation: 3011

Validating values in column between two data frames in R based on conditions

I have two data frames. I have to match the first two columns between nndf and tndf and if there is a match i have to check if the values in the third column is the same or not and update the third dataframe. The problem is nndf is longer than tndf.

nndf <- data.frame("var1" = c("ABC","ABC","DEF", "FED","DGS"), "var2" = c("xyz","abc","def","dsf","dsf"), "var3" = c(1234.21,3432.12,0.12,1232.44,873.00))

tndf <- data.frame("var1" = c("ABC","ABC","DEF"), "var2" = c("xyz","abc","def"), "var3" = c(1234.21,3432.12,0.11))

ndf <- data.frame("var1" = c("ABC","ABC"), "var2" = c("xyz","abc"))

I want to populate the results in the third data frame. This data frame takes the common values from the first two columns of nndf and tndf and wherever they are common check the third column which is 1234.21 and 3432.12 and if the values are same, it returns TRUE and fill the column. The desired output is

var1   var2    var3
ABC    xyz     TRUE (indicating 1234.21 and 1234.21 in first two df are same)
ABC    abc     TRUE
DEF    def     FALSE (indicating 0.12 is not equal to 0.11)

I tried using forloop + if condition. However it iterates through each line multiple times and fills the results.

Upvotes: 1

Views: 118

Answers (2)

akrun
akrun

Reputation: 887223

We can use %in% in base R to create the logical vector

tndf$var3 <- do.call(paste, tndf) %in% do.call(paste, nndf)
tndf
#  var1 var2  var3
#1  ABC  xyz  TRUE
#2  ABC  abc  TRUE
#3  DEF  def FALSE

Or using a join

library(data.table)
setDT(tndf)[nndf, var3n := var3 == i.var3, on = .(var1, var2)]
tndf[, .(var1, var2, var3 = var3n)]
#   var1 var2  var3
#1:  ABC  xyz  TRUE
#2:  ABC  abc  TRUE
#3:  DEF  def FALSE

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 389047

We could do an inner_join and then compare the values in two columns

library(dplyr)

inner_join(nndf, tndf, by = c("var1", "var2")) %>%
   mutate(var3 = var3.x == var3.y) %>%
   dplyr::select(var1, var2, var3)


#  var1 var2  var3
#1  ABC  xyz  TRUE
#2  ABC  abc  TRUE
#3  DEF  def FALSE

Or similarly in base R

df1 <- merge(nndf, tndf, by = c("var1", "var2"))
df1$var3 <- df1$var3.x == df1$var3.y

Upvotes: 4

Related Questions