curious

Reputation: 710

Make table of unique and intersecting elements of many vectors


For example, this code:

library(VennDiagram)
 
# Generate 3 sets of 200 words
set1 <- paste(rep("word_" , 200) , sample(c(1:1000) , 200 , replace=FALSE) , sep="")
set2 <- paste(rep("word_" , 200) , sample(c(1:1000) , 200 , replace=FALSE) , sep="")
set3 <- paste(rep("word_" , 200) , sample(c(1:1000) , 200 , replace=FALSE) , sep="")
 
# Chart
venn.diagram(
  x = list(set1, set2, set3),
  category.names = c("Set 1" , "Set 2" , "Set 3"),
  filename = '#14_venn_diagramm.png',
  output=TRUE
)

Gives this image:

[Image: Venn diagram of the three sets]

The image is just an example. What I am hoping to get is a data frame with the results, given any number of vectors of elements:

df <- data.frame('Set1_only'=c(118),'Set2_only'=c(130),'Set3_only'=c(117), 'All_sets'=c(5))

I can code this up myself, but it is taking me a long time, and I am wondering whether a simple solution or function already exists.

Upvotes: 0

Views: 38

Answers (2)

Onyambu

Reputation: 79208

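Another method, using calculate.overlap() from the VennDiagram package, could be:

library(VennDiagram)  # provides calculate.overlap()
sets <- list(set1, set2, set3)
# Label every combination of set indices: "1", "2", "3", "12", "13", "23", "123"
nms <- lapply(seq_along(sets), combn, x = seq_along(sets), paste0, collapse = "")
# rev(nms) matches the a5, a2, a4, a6, a1, a3, a7 order returned for three sets
data.frame(set = unlist(rev(nms)), val = lengths(calculate.overlap(sets)))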

   set val
a5 123  10
a2  12  39
a4  13  32
a6  23  36
a1   1 119
a3   2 115
a7   3 122

DATA:

set.seed(22)
set1 <- paste(rep("word_" , 200) , sample(c(1:1000) , 200 , replace=FALSE) , sep="")
set2 <- paste(rep("word_" , 200) , sample(c(1:1000) , 200 , replace=FALSE) , sep="")
set3 <- paste(rep("word_" , 200) , sample(c(1:1000) , 200 , replace=FALSE) , sep="")
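
Since the question asks about any number of vectors, here is a rough base R sketch that counts every exclusive region without relying on the region order of a plotting package. The function name region_counts is made up for illustration, and the sketch assumes a non-empty list of non-empty vectors:

```r
# Hypothetical helper (not part of VennDiagram): count the elements that fall
# into each exclusive region, for any number of sets.
region_counts <- function(sets) {
  universe <- unique(unlist(sets))
  # Logical matrix: one row per distinct element, one column per set
  member <- do.call(cbind, lapply(sets, function(s) universe %in% s))
  # Label each element by the indices of the sets that contain it, e.g. "13".
  # With more than nine sets these labels become ambiguous, so a separator
  # would be needed in that case.
  pattern <- apply(member, 1, function(r) paste(which(r), collapse = ""))
  as.data.frame(table(set = pattern), responseName = "val",
                stringsAsFactors = FALSE)
}

region_counts(list(c("x", "y", "z"), c("y", "z", "w"), c("z", "q")))
```

Regions that contain no elements simply do not appear in the result, so the number of rows can be smaller than the number of possible combinations.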

Upvotes: 0

MrFlick

Reputation: 206197

You can use the built-in set operators:

interesting <- function (a,b,c) {
  # Count the elements of x that appear in neither y nor z
  only <- function(x,y,z) length(setdiff(setdiff(x, y), z))
  list(
    set1only=only(a,b,c),
    set2only=only(b,a,c),
    set3only=only(c,a,b),
    # Count the elements shared by all three sets
    allsets=length(intersect(intersect(a,b), c)))
}
interesting(set1, set2, set3)
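
The same idea can probably be stretched to any number of vectors, which is what the question asks for. The sketch below is only one way to do it: set_summary is a made-up name, and the column names are chosen to mirror the data frame shown in the question.

```r
# Hypothetical generalization (not from the answer above): a one-row data
# frame with the "only" count for each set plus the count shared by all sets.
set_summary <- function(...) {
  sets <- list(...)
  # Number of elements of sets[[i]] that appear in none of the other sets
  only <- vapply(seq_along(sets), function(i) {
    others <- unlist(sets[-i])
    length(setdiff(sets[[i]], others))
  }, integer(1))
  names(only) <- paste0("Set", seq_along(sets), "_only")
  # Number of elements common to every set
  all_sets <- length(Reduce(intersect, sets))
  data.frame(as.list(only), All_sets = all_sets)
}

set_summary(c("a", "b", "c", "d"), c("c", "d", "e"), c("d", "f", "g", "h"))
```

Calling set_summary(set1, set2, set3) on the question's data should give the columns Set1_only, Set2_only, Set3_only and All_sets. Pairwise-only overlaps are deliberately left out here, as in the requested output.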

Upvotes: 1
