Reputation: 129
I have two data frames and want to check whether each ID appears in both of them. Where the ID matches, I then want to check whether the email for that ID is the same in both data frames. If the ID or the email differs, I want to create a new data frame that marks the data as consistent or not consistent. In the end I want to display only the inconsistent rows.
Note: sometimes the email can be in lower or upper case, and the ID can also be in lower or upper case.
df1 <- data.frame(ID =c("DEV2962","KTN2252","ANA2719","ITI2624","DEV2698","HRT2921","KTN2633","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
city=c("del","mum","nav","pun","bang","chen","triv","vish","del","mum","bang","vish","bhop","kol","noi","gurg"),
email = c("[email protected]","[email protected]",NA,NA,NA,NA,"[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]"),
Name= c("dev,akash","singh,rahul","abbas,salman","lal,ram","singh,nkunj","garg,prabal","ali,sanu","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))
df2 <- data.frame(ID =c("DEV2962","KTN2152","ANA2719","ITs2624","DEV2698","HRT2921","KTN2633","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
city=c("del","mum","nav","pun","bang","chen","ddgy","vish","del","mum","bang","vish","bhol","nhus","huay","gurg"),
email = c("[email protected]","[email protected]",NA,NA,"shoayahau",NA,"[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]"),
Name= c("dev","singh,rahul","abbas,salman","lal,ram","singh,nkunj","huna,ghalak","khan,fhalt","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))
The output should look like this:
Upvotes: 0
Views: 32
Reputation: 78927
Something like this?
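library(dplyr)
library(tidyr)

df1 %>%
  # keep only the IDs that are present in both data frames
  inner_join(df2, by = "ID") %>%
  # email.x comes from df1, email.y from df2
  select(ID, contains("email")) %>%
  # == returns NA when either email is NA, so those rows get NA here
  mutate(consistent = ifelse(email.x == email.y, "consistent", "Inconsistent")) %>%
  # stack the two email columns: one row per ID and source
  pivot_longer(
    cols = contains("email"), values_to = "email"
  ) %>%
  select(ID, email, consistent) %>%
  data.frame()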
        ID             email   consistent
1  DEV2962 [email protected]   consistent
2  DEV2962 [email protected]   consistent
3  ANA2719              <NA>         <NA>
4  ANA2719              <NA>         <NA>
5  DEV2698              <NA>         <NA>
6  DEV2698         shoayahau         <NA>
7  HRT2921              <NA>         <NA>
8  HRT2921              <NA>         <NA>
9  KTN2633 [email protected]   consistent
10 KTN2633 [email protected]   consistent
11 KTN2624 [email protected] Inconsistent
12 KTN2624 [email protected] Inconsistent
13 ANA2548 [email protected]   consistent
14 ANA2548 [email protected]   consistent
15 ITI2535 [email protected]   consistent
16 ITI2535 [email protected]   consistent
17 DEV2732 [email protected]   consistent
18 DEV2732 [email protected]   consistent
19 HRT2837 [email protected]   consistent
20 HRT2837 [email protected]   consistent
21 ERV2951 [email protected]   consistent
22 ERV2951 [email protected]   consistent
23 KTN2542 [email protected]   consistent
24 KTN2542 [email protected]   consistent
25 ANA2813 [email protected]   consistent
26 ANA2813 [email protected]   consistent
27 ITI2210 [email protected]   consistent
28 ITI2210 [email protected]   consistent
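The pipeline above compares the values exactly as typed, leaves rows with a missing email as NA, and drops IDs that exist in only one data frame. If you also need case-insensitive matching and only the inconsistent rows, a rough sketch could look like the one below. It uses small made-up data frames `a` and `b` in place of `df1` and `df2`, and the column names `email_1`, `email_2`, `in_1`, `in_2` and `status` are just names picked for illustration. It also assumes that two missing emails count as consistent, which may not be what you want.

```r
library(dplyr)

# Hypothetical toy data standing in for df1 and df2
a <- data.frame(
  ID    = c("DEV1", "ktn2", "ANA3", "ITI4"),
  email = c("Dev@Example.com", "k@example.com", NA, "i@example.com")
)
b <- data.frame(
  ID    = c("dev1", "KTN2", "ANA3", "HRT5"),
  email = c("dev@example.com", "other@example.com", NA, "h@example.com")
)

# Lower-case ID and email so case differences are ignored,
# and add a flag recording which data frame each ID came from
x <- a %>% transmute(ID = tolower(ID), email_1 = tolower(email), in_1 = TRUE)
y <- b %>% transmute(ID = tolower(ID), email_2 = tolower(email), in_2 = TRUE)

# full_join keeps IDs that appear in only one of the two data frames
result <- full_join(x, y, by = "ID") %>%
  mutate(status = case_when(
    is.na(in_1) | is.na(in_2)       ~ "ID missing in one data frame",
    is.na(email_1) & is.na(email_2) ~ "consistent",  # assumption: both missing is fine
    is.na(email_1) | is.na(email_2) ~ "inconsistent email",
    email_1 == email_2              ~ "consistent",
    TRUE                            ~ "inconsistent email"
  ))

# Keep only the rows that are not consistent
inconsistent <- filter(result, status != "consistent")
print(inconsistent)
```

With the toy data this keeps three rows: `ktn2` (different emails) plus `iti4` and `hrt5` (each present in only one data frame). Swapping `a` and `b` for `df1` and `df2` should work the same way, since only the `ID` and `email` columns are used.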
Upvotes: 1