Meng

Reputation: 13

How to make an overlap table from multiple lists/vectors in R?

I have multiple lists of genes, for example:

listA <- c("geneA", "geneB", "geneC")

listB <- c("geneA", "geneB", "geneD", "geneE")

listC <- c("geneB", "geneF")

...

I'd like to get a table showing the number of overlapping elements between each pair of lists, like:

       listA   listB  listC  ...
listA   3       2      1
listB   2       4      1
listC   1       1      2
...

I know how to get the number of overlaps for a single pair, e.g. length(intersect(listA, listB)). Is there an easier way to generate the whole overlap table?

Upvotes: 1

Views: 2222

Answers (2)

markus

Reputation: 26343

Here is a way in base R

crossprod(table(stack(mget(ls(pattern = "^list")))))
#       ind
#ind     listA listB listC
#  listA     3     2     1
#  listB     2     4     1
#  listC     1     1     2

mget(ls(pattern = "^list")) returns a named list of the objects in your global environment whose names begin with "list".

stack will turn this list into the following data frame

stack(mget(ls(pattern = "^list")))
#  values   ind
#1  geneA listA
#2  geneB listA
#3  geneC listA
#4  geneA listB
#5  geneB listB
#6  geneD listB
#7  geneE listB
#8  geneB listC
#9  geneF listC

Calling table on this data frame returns a gene-by-list incidence table:

out <- table(stack(mget(ls(pattern = "^list"))))
out
#       ind
#values  listA listB listC
#  geneA     1     1     0
#  geneB     1     1     1
#  geneC     1     0     0
#  geneD     0     1     0
#  geneE     0     1     0
#  geneF     0     0     1

crossprod then calculates

t(out) %*% out

which returns

#       ind
#ind     listA listB listC
#  listA     3     2     1
#  listB     2     4     1
#  listC     1     1     2
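The approach above can also be sketched without ls()/mget(), which would pick up any other object whose name happens to start with "list". This variant assumes the vectors are collected by hand into a named list (gene_lists and incidence are illustrative names, not part of the answer), and it applies unique() first, on the assumption that a gene repeated within one vector should be counted only once.

```r
listA <- c("geneA", "geneB", "geneC")
listB <- c("geneA", "geneB", "geneD", "geneE")
listC <- c("geneB", "geneF")

# Explicit named list instead of mget(ls(pattern = "^list"))
gene_lists <- list(listA = listA, listB = listB, listC = listC)

# Drop within-vector duplicates so every incidence entry is 0 or 1
incidence <- table(stack(lapply(gene_lists, unique)))

# t(incidence) %*% incidence: entry [i, j] counts genes shared by list i and list j
overlap <- crossprod(incidence)
overlap
```

With 0/1 entries, the diagonal holds the number of distinct genes in each list and the off-diagonal entries hold the pairwise overlaps.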

Upvotes: 3

IceCreamToucan

Reputation: 28685

Collect all the vectors in a list:

list.all <- list(listA, listB, listC)

Then use outer:

outer(list.all, list.all, Vectorize(function(x, y) sum(x %in% y)))

#      [,1] [,2] [,3]
# [1,]    3    2    1
# [2,]    2    4    1
# [3,]    1    1    2

Or use sapply:

sapply(list.all, function(x) sapply(list.all, function(y) sum(y %in% x)))

#      [,1] [,2] [,3]
# [1,]    3    2    1
# [2,]    2    4    1
# [3,]    1    1    2
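Both results above come back without row or column names because list.all is unnamed. A minimal sketch of the sapply version with a named list follows (gene_lists is an illustrative name). It uses length(intersect(x, y)) from the question rather than sum(x %in% y); the two agree when no vector contains repeated genes, but intersect() is also symmetric when duplicates are present.

```r
listA <- c("geneA", "geneB", "geneC")
listB <- c("geneA", "geneB", "geneD", "geneE")
listC <- c("geneB", "geneF")

# Naming the list elements makes sapply label the rows and columns
gene_lists <- list(listA = listA, listB = listB, listC = listC)

# For every pair (x, y), count the distinct genes present in both vectors
overlap <- sapply(gene_lists, function(x)
  sapply(gene_lists, function(y) length(intersect(x, y))))
overlap
```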

Upvotes: 3
