user14176250

Reputation: 55

Validate a column across two data frames based on a unique ID column

I want to compare the Name column in two data frames and find the cases where the same ID has different names. For a given ID, how do I check whether the names in df and df1 differ? If they do, I want to add (mutate) a new column that flags the differing names.

A follow-up, since I want to learn while working on this: how can the comparison ignore the case of the names?

df <- data.frame(ID =c("DEV2962","KTN2252","ANA2719","ITI2624","DEV2698","HRT2921","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
                 city=c("DEL","mum","DEL","MUM","DEL","del","MUM","DEL","del","MUM","mum","mum","mum","mum","DEL","DEL"),
                 Name= c("dev,akash","singh,Ajay","abbas,salman","lal,ram","singh,nkunj","garg,prabal","ali,sanu","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))


df1 <- data.frame(ID =c("DEV2962","KTN2251","ANA2719","ITI2624","DEV2698","HRT2923","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
                  city=c("DEL","mum","DEL","MUM","DEL","del","MUM","DEL","del","MUM","mum","mum","mum","mum","DEL","DEL"),
                  Name= c("dev,akash","singh,rahul","abbas,salman","lal","singh,nkunj","garg","ali,sanu","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))



df[[colname1]] <- factor(as.integer(!df[[colname1l]] %in% df[[colname1]]))

Expected output:

(image of the expected output is not available)

Upvotes: 0

Views: 49

Answers (2)

Ronak Shah

Reputation: 389047

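You can join the two data frames on ID and use if_else to compare the two Name columns. The join adds the suffixes .x and .y to the columns that share a name, and missing = '' leaves the flag empty for IDs that appear in only one data frame.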

library(dplyr)
full_join(df, df1, by = 'ID') %>%
  mutate(diff_name = if_else(Name.x != Name.y,'Different Name', '', missing = ''))

In base R -

transform(merge(df, df1, by = 'ID', all = TRUE), 
          diff_name = ifelse(Name.x != Name.y, 'Different Name', ''))

Another approach is to use match -

df$diff_name <- ''
df$diff_name[df$Name != df1$Name[match(df$ID, df1$ID)]] <- 'Different Name'
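The question also asks how to ignore case, which the snippets above do not cover. One possible sketch, built on the match approach, is to lower-case both sides with tolower before comparing. The data frames a and b and the helper variables other_name and is_diff are small made-up examples for illustration, not the data from the question.

```r
# Small illustrative data frames (hypothetical values, not the question's data)
a <- data.frame(ID   = c("A1", "A2", "A3", "A4"),
                Name = c("dev,akash", "Singh,Ajay", "lal,ram", "garg,prabal"))
b <- data.frame(ID   = c("A1", "A2", "A3", "B9"),
                Name = c("dev,akash", "singh,ajay", "lal", "garg"))

# For each ID in a, look up the name in b (NA when the ID is absent from b)
other_name <- b$Name[match(a$ID, b$ID)]

# Lower-case both sides so that differences in case alone are ignored
is_diff <- tolower(a$Name) != tolower(other_name)

# %in% TRUE maps NA (ID missing from b) to FALSE, so those rows stay unflagged
a$diff_name <- ifelse(is_diff %in% TRUE, "Different Name", "")
```

Here "Singh,Ajay" and "singh,ajay" are treated as the same name, while "lal,ram" versus "lal" is flagged. Whether an ID that is missing from the second data frame should stay unflagged is a design choice; this sketch assumes it should.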

Upvotes: 1

akrun

Reputation: 887223

We could do this in base R

out <- merge(df, df1, by = 'ID', all = TRUE)
out$diff_name <- with(out, c('', 'Different Name')[1 + (Name.x != Name.y)])
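The indexing trick above relies on a logical being coerced to 0 or 1, so 1 + (Name.x != Name.y) picks the first or second label. A minimal sketch of how that behaves, including the NA that results when an ID exists in only one data frame, might look like this. The vectors x and y are made-up stand-ins for Name.x and Name.y, and replacing NA with an empty string is an assumption about the desired output.

```r
# Hypothetical stand-ins for Name.x and Name.y after merge(..., all = TRUE)
x <- c("a", "b", NA, "d")
y <- c("a", "B", "c", NA)

# FALSE -> 1, TRUE -> 2, NA stays NA
idx <- 1 + (x != y)

# Index into the label vector; an NA index yields an NA label
labels <- c("", "Different Name")[idx]

# Assumption: rows without a counterpart should show an empty flag
labels[is.na(labels)] <- ""
```

Note that "b" and "B" count as different here, because != is case-sensitive.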

Upvotes: 0
