Unique and non-unique lists of values in a data frame in R

Question

Suppose I have two data frames:

Data frame 1 (let's call this Data1):

V1     V2     
1     "AB"    
3     "XY"
5     "DH"
8     "ST"
7     "RE"

code for Data1:

V1 <- c(1,3,5,8,7)
V2 <- c("AB","XY", "DH", "ST","RE")
Data1 <- data.frame(V1,V2)

Data frame 2 (let's call this Data2):

V1     V2     
1     "AB"    
2     "ZZ"
3     "XY"
5     "DH"
8     "ST"

code for Data2:

V1 <- c(1,2,3,5,8)
V2 <- c("AB","ZZ","XY","DH","ST")
Data2 <- data.frame(V1,V2)

Note that Data2's second row (where V2 is "ZZ") is not present in Data1, and the last row of Data1 (where V2 is "RE") is not present in Data2.

A) I would like to make a list of all V2 values that are present in only one of the two data frames (that is, NOT present in both).
For this example that would be "ZZ" and "RE".

B) I would like to make a list of all V2 values that ARE present in both data frames.
For this example, the result would be "AB", "XY", "DH", "ST".

Ricardo Saporta · Accepted Answer

You are looking for setdiff() and intersect() (see ?setdiff and ?intersect).

inters <- intersect(Data2$V2, Data1$V2)
[1] "AB" "XY" "DH" "ST"

setdf <- c(setdiff(Data2$V2, Data1$V2), setdiff(Data1$V2, Data2$V2))
[1] "ZZ" "RE"
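
For reference, here is a minimal self-contained sketch of the same idea using the data from the question. The names in_both and in_one_only are made up for illustration, and stringsAsFactors = FALSE is added on the assumption that you may be running an R version older than 4.0, where strings would otherwise become factors. Part A is written as "union minus intersection", which should give the same set as the two setdiff() calls above, though possibly in a different order.

```r
# Rebuild the two data frames from the question.
# stringsAsFactors = FALSE keeps V2 as character on R versions before 4.0.
Data1 <- data.frame(V1 = c(1, 3, 5, 8, 7),
                    V2 = c("AB", "XY", "DH", "ST", "RE"),
                    stringsAsFactors = FALSE)
Data2 <- data.frame(V1 = c(1, 2, 3, 5, 8),
                    V2 = c("AB", "ZZ", "XY", "DH", "ST"),
                    stringsAsFactors = FALSE)

# B) V2 values present in both data frames
in_both <- intersect(Data1$V2, Data2$V2)

# A) V2 values present in exactly one data frame (symmetric difference):
# everything in the union that is not in the intersection
in_one_only <- setdiff(union(Data1$V2, Data2$V2), in_both)

print(in_both)
print(in_one_only)
```

If you need the full rows rather than just the V2 values, something like Data1[Data1$V2 %in% in_both, ] should work as a starting point.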
