Reputation: 2597
Suppose I have two Data Frames:
Data frame 1 (let's call this Data1):
V1 V2
1 "AB"
3 "XY"
5 "DH"
8 "ST"
7 "RE"
code for Data1:
V1 <- c(1,3,5,8,7)
V2 <- c("AB","XY", "DH", "ST","RE")
Data1 <- data.frame(V1,V2)
Data frame 2 (let's call this Data2):
V1 V2
1 "AB"
2 "ZZ"
3 "XY"
5 "DH"
8 "ST"
code for Data2:
V1 <- c(1,2,3,5,8)
V2 <- c("AB","ZZ","XY","DH","ST")
Data2 <- data.frame(V1,V2)
Notice that Data2's second row (where V2's value is "ZZ") is not present in Data1, and the last row in Data1 (where V2's value is "RE") is not present in Data2.
A) I would like to make a list of all V2 values that are present in only one of the two data frames (that is, missing from the other one).
For this example that would be "ZZ" and "RE".
B) I would like to make a list of all V2 values that ARE present in both data frames.
For this example, the result would be "AB", "XY", "DH", "ST".
Upvotes: 1
Views: 1022
Reputation: 55420
You are looking for ?setdiff and ?intersect:
inters <- intersect(Data2$V2, Data1$V2)
[1] "AB" "XY" "DH" "ST"
setdf <- c(setdiff(Data2$V2, Data1$V2), setdiff(Data1$V2, Data2$V2))
[1] "ZZ" "RE"
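For completeness, here is a minimal self-contained sketch of the same idea using the question's data. It assumes V2 is stored as character (hence `stringsAsFactors = FALSE`, which matters on R versions before 4.0), and the names `in_both` and `in_one_only` are just illustrative. It gets the "only in one" set as the union minus the intersection, which is equivalent to combining the two setdiff calls:

```r
# Rebuild the question's data frames; keep V2 as character, not factor
Data1 <- data.frame(V1 = c(1, 3, 5, 8, 7),
                    V2 = c("AB", "XY", "DH", "ST", "RE"),
                    stringsAsFactors = FALSE)
Data2 <- data.frame(V1 = c(1, 2, 3, 5, 8),
                    V2 = c("AB", "ZZ", "XY", "DH", "ST"),
                    stringsAsFactors = FALSE)

# B) values present in both data frames
in_both <- intersect(Data1$V2, Data2$V2)

# A) values present in exactly one data frame:
#    everything that occurs anywhere, minus what occurs in both
in_one_only <- setdiff(union(Data1$V2, Data2$V2), in_both)

print(in_both)
print(in_one_only)
```

Note that the set functions drop duplicates, so each value appears at most once in either result.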
Upvotes: 2
Reputation: 2393
You can use the %in% operator to find whether values of V2 exist in both data frames. Use the negation operator (!) to find those that do not exist in both data frames, and then bind the results from both together.
> rbind(Data1[!Data1$V2 %in% Data2$V2,], Data2[!Data2$V2 %in% Data1$V2,])
V1 V2
5 7 RE
2 2 ZZ
> unique(rbind(Data1[Data1$V2 %in% Data2$V2,], Data2[Data2$V2 %in% Data1$V2,]))
V1 V2
1 1 AB
2 3 XY
3 5 DH
4 8 ST
On this last piece: if each V2 value is always paired with the same V1 value in both data frames, you can simply write
Data1[Data1$V2 %in% Data2$V2,]
and save yourself some lines of code.
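Since the question asks for lists of V2 values rather than whole rows, the same %in% tests can be applied to the column itself. This is only a sketch under the assumption that V2 is a character column; the variable names (`only_in_Data1`, `not_shared`, `shared`) are made up for illustration:

```r
# Rebuild the question's data frames; keep V2 as character, not factor
Data1 <- data.frame(V1 = c(1, 3, 5, 8, 7),
                    V2 = c("AB", "XY", "DH", "ST", "RE"),
                    stringsAsFactors = FALSE)
Data2 <- data.frame(V1 = c(1, 2, 3, 5, 8),
                    V2 = c("AB", "ZZ", "XY", "DH", "ST"),
                    stringsAsFactors = FALSE)

# A) V2 values found in one data frame but not the other
only_in_Data1 <- Data1$V2[!Data1$V2 %in% Data2$V2]
only_in_Data2 <- Data2$V2[!Data2$V2 %in% Data1$V2]
not_shared <- c(only_in_Data1, only_in_Data2)

# B) V2 values found in both; unique() guards against repeated values
shared <- unique(Data1$V2[Data1$V2 %in% Data2$V2])

print(not_shared)
print(shared)
```

Indexing the vector directly avoids the row-name bookkeeping that comes with rbind on data frames.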
Upvotes: 2