Reputation: 55
I want to validate the Name column across both data frames: for the same ID, how do I check whether the names in df and df1 are different? If they are, I want to mutate a new column that flags those rows.
A follow-up, since I want to learn while handling this: how can we ignore the case of the names when comparing?
df <- data.frame(ID =c("DEV2962","KTN2252","ANA2719","ITI2624","DEV2698","HRT2921","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
city=c("DEL","mum","DEL","MUM","DEL","del","MUM","DEL","del","MUM","mum","mum","mum","mum","DEL","DEL"),
Name= c("dev,akash","singh,Ajay","abbas,salman","lal,ram","singh,nkunj","garg,prabal","ali,sanu","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))
df1 <- data.frame(ID =c("DEV2962","KTN2251","ANA2719","ITI2624","DEV2698","HRT2923","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
city=c("DEL","mum","DEL","MUM","DEL","del","MUM","DEL","del","MUM","mum","mum","mum","mum","DEL","DEL"),
Name= c("dev,akash","singh,rahul","abbas,salman","lal","singh,nkunj","garg","ali,sanu","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))
df[[colname1]] <- factor(as.integer(!df[[colname1l]] %in% df[[colname1]]))
expected output
Upvotes: 0
Views: 49
Reputation: 389047
You can join the two data frames on ID and use if_else (ifelse in base R) to compare the two Name columns.
library(dplyr)
full_join(df, df1, by = 'ID') %>%
  mutate(diff_name = if_else(Name.x != Name.y, 'Different Name', '', missing = ''))
In base R -
transform(merge(df, df1, by = 'ID', all = TRUE),
          diff_name = ifelse(Name.x != Name.y, 'Different Name', ''))
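On the follow-up about ignoring case: one option is to lower-case both Name columns with tolower before comparing. This is only a minimal sketch in base R. The data frames a and b are made-up toy data, not the df and df1 from the question, and treating an ID that exists on only one side as "not different" is an assumption that mirrors missing = '' in the dplyr version.

```r
# Toy data (hypothetical), much smaller than df/df1 in the question
a <- data.frame(ID = c("X1", "X2", "X3"),
                Name = c("singh,Ajay", "lal,ram", "garg,prabal"),
                stringsAsFactors = FALSE)
b <- data.frame(ID = c("X1", "X2", "X4"),
                Name = c("SINGH,ajay", "lal", "garg,prabal"),
                stringsAsFactors = FALSE)

# Full outer join on ID; the Name columns become Name.x and Name.y
out <- merge(a, b, by = "ID", all = TRUE)

# Only compare rows where the ID was found on both sides
both <- !is.na(out$Name.x) & !is.na(out$Name.y)

# Case-insensitive comparison: lower-case both sides first
differs <- both & (tolower(out$Name.x) != tolower(out$Name.y))

out$diff_name <- ifelse(differs, "Different Name", "")
out
```

Here X1 is not flagged because the two names differ only in case, X2 is flagged, and X3 and X4 stay blank because each exists in only one data frame.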
Another approach is to use match -
df$diff_name <- ''
df$diff_name[df$Name != df1$Name[match(df$ID, df1$ID)]] <- 'Different Name'
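To see what match contributes here: for each ID on the left it returns the position of that ID on the right (NA when absent), so indexing the right-hand names by that result lines them up with the left-hand rows. A small sketch on made-up vectors (all names below are illustrative only):

```r
# Hypothetical IDs and names, standing in for df$ID / df1$ID etc.
ids_left    <- c("A", "B", "C")
ids_right   <- c("C", "A", "D")
names_left  <- c("x", "y", "z")
names_right <- c("z", "X", "w")

# Position of each left-hand ID within the right-hand IDs (NA if not found)
pos <- match(ids_left, ids_right)

# Right-hand names reordered to follow the left-hand rows
aligned <- names_right[pos]

# TRUE/FALSE where both sides exist, NA where the ID had no match
differs <- names_left != aligned

# which() drops the NAs, so unmatched IDs keep the empty default
flag <- rep("", length(ids_left))
flag[which(differs)] <- "Different Name"
flag
```

Wrapping the condition in which() is optional here, but it makes explicit that unmatched IDs are left unflagged.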
Upvotes: 1
Reputation: 887223
We could do this in base R
out <- merge(df, df1, by = 'ID', all = TRUE)
out$diff_name <- with(out, c('', 'Different Name')[1 + (Name.x != Name.y)])
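The indexing works because a logical coerces to 0 or 1 in arithmetic, so 1 + (condition) selects the first or second element of the label vector. A tiny sketch with made-up values, which also shows that an NA comparison (an ID present in only one data frame) yields NA rather than an empty string:

```r
# Hypothetical comparison results: different, same, and missing on one side
cond   <- c(TRUE, FALSE, NA)
labels <- c("", "Different Name")

# 1 + cond is c(2, 1, NA), which is then used as a position index
res <- labels[1 + cond]
res
```

If blanks are preferred over NA for unmatched IDs, they can be replaced afterwards, for example with res[is.na(res)] <- "".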
Upvotes: 0