How to make an overlap table from multiple lists/vectors in R?

Question

I have multiple lists of genes, for example:

listA <- c("geneA", "geneB", "geneC")

listB <- c("geneA", "geneB", "geneD", "geneE")

listC <- c("geneB", "geneF")

...

I'd like to get a table to show the # of overlapping elements between the lists, like:

       listA   listB  listC  ...
listA   3       2      1
listB   2       4      1
listC   1       1      2
...

I know how to get the # of overlaps for a single pair, e.g. length(intersect(listA, listB)). But is there an easier way to generate the whole overlap table?

markus · Accepted Answer

Here is a way in base R:

crossprod(table(stack(mget(ls(pattern = "^list")))))
#       ind
#ind     listA listB listC
#  listA     3     2     1
#  listB     2     4     1
#  listC     1     1     2

mget(ls(pattern = "^list")) will give you a named list of all objects in your global environment whose names begin with "list".
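
Note that the pattern matches every object whose name starts with "list", so unrelated objects in the workspace would be picked up as well. As a minimal sketch of one way around that, assuming the three vectors from the question, the list can be built explicitly (the name gene_lists is only for illustration and is not part of the original answer):

```r
listA <- c("geneA", "geneB", "geneC")
listB <- c("geneA", "geneB", "geneD", "geneE")
listC <- c("geneB", "geneF")

# Explicit named list: only these three vectors are included,
# regardless of what else lives in the global environment.
gene_lists <- list(listA = listA, listB = listB, listC = listC)

# Same pipeline as above, applied to the explicit list
overlap <- crossprod(table(stack(gene_lists)))
overlap
```

This also works inside a function, where ls() would not see the global environment.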

stack will turn this list into the following data frame:

stack(mget(ls(pattern = "^list")))
#  values   ind
#1  geneA listA
#2  geneB listA
#3  geneC listA
#4  geneA listB
#5  geneB listB
#6  geneD listB
#7  geneE listB
#8  geneB listC
#9  geneF listC

Calling table on this data frame returns a contingency table with one row per gene and one column per list:

out <- table(stack(mget(ls(pattern = "^list"))))
out
#       ind
#values  listA listB listC
#  geneA     1     1     0
#  geneB     1     1     1
#  geneC     1     0     0
#  geneD     0     1     0
#  geneE     0     1     0
#  geneF     0     0     1
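
The entries here are all 0 or 1 because no vector repeats a gene. One caveat the answer does not mention: if a vector contained duplicates, its counts would exceed 1 and a matrix product of this table would no longer equal the number of shared genes. A hedged sketch with two hypothetical vectors (listD and listE are made up for illustration), deduplicating with unique first:

```r
# Hypothetical input: listD contains "geneA" twice
dup_lists <- list(
  listD = c("geneA", "geneA", "geneB"),
  listE = c("geneA", "geneC")
)

# Without deduplication the duplicate inflates the counts
raw <- crossprod(table(stack(dup_lists)))
raw["listD", "listE"]
# 2

# Removing duplicates within each vector restores 0/1 entries
deduped <- lapply(dup_lists, unique)
fixed <- crossprod(table(stack(deduped)))
fixed["listD", "listE"]
# 1
```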

crossprod then calculates

t(out) %*% out

which returns

#       ind
#ind     listA listB listC
#  listA     3     2     1
#  listB     2     4     1
#  listC     1     1     2
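
Why this gives overlap counts: entry (i, j) of t(out) %*% out sums, over all genes, the product of the 0/1 indicators for list i and list j, and that product is 1 only when a gene is in both lists. As a sanity check, here is a sketch that compares the result with the pairwise length(intersect(...)) approach from the question. The names gene_lists, count_shared and pairwise are hypothetical helpers, not part of the original answer:

```r
gene_lists <- list(
  listA = c("geneA", "geneB", "geneC"),
  listB = c("geneA", "geneB", "geneD", "geneE"),
  listC = c("geneB", "geneF")
)

out <- table(stack(gene_lists))

# One entry computed by hand: genes present in both listA and listB
sum(out[, "listA"] * out[, "listB"])
# 2

# Pairwise approach from the question, applied to every pair of lists
count_shared <- function(x, y) length(intersect(x, y))
pairwise <- sapply(gene_lists, function(x) sapply(gene_lists, count_shared, y = x))

# Both approaches agree on every entry
all(pairwise == crossprod(out))
# TRUE
```

The nested sapply calls intersect once per pair, which is fine for a handful of lists; the crossprod version does the same work in a single matrix product.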
