Reputation: 25
I have several lists of genes (all of various lengths) that I would like to compare visually with a Venn diagram. I wrote a small function that uses reshape2 to convert any data frame of gene names into a matrix of 1s and 0s that venneuler can use to plot a Venn diagram. My problem is that I cannot figure out how to extract or calculate the value associated with each region of the diagram. It would also be nice to add those values to my plot in R.
Here's an example of what my data look like:
A <- c("gene1", "gene2", "gene3", "gene5", "gene12", "", "")
B <- c("gene1", "gene2", "gene6", "gene7", "", "", "")
C <- c("gene2", "gene6", "gene7", "gene8", "gene9", "gene13", "gene14")
D <- c("gene7", "gene8", "gene9", "gene10", "gene11", "gene12", "")
dat <- data.frame(A,B,C,D)
Function that converts a table of gene names into a presence/absence matrix that venneuler can use:
vennfun <- function(x) {
  x[x == ""] <- NA                                            # treat empty padding strings as missing so they are not counted as a gene
  x$id <- seq_len(nrow(x))                                    # add an id column (required for melt)
  xm <- melt(x, id.vars = "id", na.rm = TRUE)                 # melt into two columns (variable & value), dropping NAs
  xc <- dcast(xm, value ~ variable, fun.aggregate = length)   # presence/absence of each value for each variable (1 or 0)
  rownames(xc) <- xc$value                                    # value column becomes the row names (required for venneuler)
  xc$value <- NULL                                            # remove the redundant value column
  xc                                                          # return the new data frame
}
Load required packages:
library(reshape2)
library(venneuler)
Run vennfun and use the output to plot a Venn diagram with venneuler:
VennDat <- vennfun(dat)
genes.venn <- venneuler(VennDat)
plot(genes.venn)
My question is: how do I get the number of genes in each possible region (i.e. A, AB, ABC, ABCD, B, BC, BCD, ABD, ACD, etc.), and/or how do I add these values to my Venn diagram?
Thanks!!
Upvotes: 0
Views: 2249
Reputation: 3694
If you're willing to switch packages, you could accomplish this with eulerr (of which I am the author):
library(eulerr)
genes.venn <- euler(VennDat)
plot(genes.venn, quantities = TRUE)
As an aside, this problem does not really lend itself well to an Euler diagram (the fit is quite poor). Perhaps you should consider an alternative?
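If you only need the numbers and not the plot, the count for each exclusive region can also be tabulated in base R. The following is a minimal sketch, not part of any package: it rebuilds the four gene lists from the question without the empty padding strings, and the names `sets`, `membership`, `region` and `region_counts` are made up for illustration.

```r
# Gene lists from the question; the "" padding entries are left out on purpose
sets <- list(
  A = c("gene1", "gene2", "gene3", "gene5", "gene12"),
  B = c("gene1", "gene2", "gene6", "gene7"),
  C = c("gene2", "gene6", "gene7", "gene8", "gene9", "gene13", "gene14"),
  D = c("gene7", "gene8", "gene9", "gene10", "gene11", "gene12")
)

# Logical presence/absence matrix: one row per gene, one column per set
genes <- sort(unique(unlist(sets)))
membership <- sapply(sets, function(s) genes %in% s)
rownames(membership) <- genes

# Label each gene with the exact combination of sets that contain it, e.g. "A&B"
region <- apply(membership, 1, function(r) {
  paste(colnames(membership)[r], collapse = "&")
})

# Number of genes in each exclusive region
region_counts <- table(region)
print(region_counts)
```

Each gene is counted exactly once, in the region for the full combination of sets it belongs to, so the counts add up to the number of distinct genes. Combinations that contain no genes simply do not appear in the table.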
Upvotes: 1
Reputation: 2628
I think my nVennR package would be a good tool for this:
library(nVennR)
A <- c("gene1", "gene2", "gene3", "gene5", "gene12", "", "")
B <- c("gene1", "gene2", "gene6", "gene7", "", "", "")
C <- c("gene2", "gene6", "gene7", "gene8", "gene9", "gene13", "gene14")
D <- c("gene7", "gene8", "gene9", "gene10", "gene11", "gene12", "")
dat <- data.frame(A,B,C,D)
myV <- plotVenn(as.list(dat))
This would plot the diagram (empty values are discarded):
You can then explore the diagram:
getVennRegion(nVennObj = myV, region = c('C', 'D'))
[1] "gene8" "gene9"
Or:
listVennRegions(nVennObj = myV)
$`0, 0, 0, 1 (D)`
[1] "gene10" "gene11"
$`0, 0, 1, 0 (C)`
[1] "gene13" "gene14"
$`0, 0, 1, 1 (C, D)`
[1] "gene8" "gene9"
$`0, 1, 1, 0 (B, C)`
[1] "gene6"
$`0, 1, 1, 1 (B, C, D)`
[1] "gene7"
$`1, 0, 0, 0 (A)`
[1] "gene3" "gene5"
$`1, 0, 0, 1 (A, D)`
[1] "gene12"
$`1, 1, 0, 0 (A, B)`
[1] "gene1"
$`1, 1, 0, 1 (A, B, D)`
[1] ""
$`1, 1, 1, 0 (A, B, C)`
[1] "gene2"
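Since the result printed above is a named list, the number of genes per region can be read off with `lengths()`. The sketch below is only an illustration: to stay runnable without nVennR it uses a hand-copied stand-in for that list (the name `regions` is made up), and it drops the `""` padding entry before counting.

```r
# Stand-in for the list printed above; with nVennR loaded,
# regions <- listVennRegions(nVennObj = myV) would take its place
regions <- list(
  "0, 0, 0, 1 (D)"       = c("gene10", "gene11"),
  "0, 0, 1, 0 (C)"       = c("gene13", "gene14"),
  "0, 0, 1, 1 (C, D)"    = c("gene8", "gene9"),
  "0, 1, 1, 0 (B, C)"    = "gene6",
  "0, 1, 1, 1 (B, C, D)" = "gene7",
  "1, 0, 0, 0 (A)"       = c("gene3", "gene5"),
  "1, 0, 0, 1 (A, D)"    = "gene12",
  "1, 1, 0, 0 (A, B)"    = "gene1",
  "1, 1, 0, 1 (A, B, D)" = "",
  "1, 1, 1, 0 (A, B, C)" = "gene2"
)

# Remove the "" padding entries, then drop any region left with no genes
regions <- lapply(regions, function(g) g[nzchar(g)])
regions <- regions[lengths(regions) > 0]

# Number of genes in each remaining region
region_sizes <- lengths(regions)
print(region_sizes)
```

The `(A, B, D)` region disappears after this cleanup because its only member was the empty string shared by the padded columns.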
You can also use a simpler web interface for up to six sets.
Upvotes: 0