user12

Reputation: 91

Volcano Plot in R

I have this line of code:

plot(gene_list$logFC, -log10(gene_list$P.Value),xlim=c(-10, 10), ylim=c(0, 15),xlab="log2 fold change", ylab="-log10 p-value")

which results in a volcano plot; however, I want to find a way to color in red the points > log(2) and

Edit: Okay so as an example I'm trying to do the following to get a volcano plot:

install.packages("ggplot2")

and then

gene_list <- read.table("/Users/Javi/Desktop/gene_list.csv", header=TRUE, sep=",")

library(ggplot2)
## Highlight genes that have an absolute fold change > 2 and a p-value < 0.05
gene_list$threshold = as.factor(abs(gene_list$logFC) > 2 & gene_list$P.Value < 0.05)

## Construct the plot object
g = ggplot(data=gene_list, aes(x=logFC, y=-log10(P.Value), colour=my_palette)) +
  geom_point(alpha=0.4, size=5) +
  theme(legend.position = "none") +
  xlim(c(-10, 10)) + ylim(c(0, 15)) +
  xlab("log2 fold change") + ylab("-log10 p-value")

What I want to do is to color in red the logFC values > 2 and in blue the logFC values < -2.

Upvotes: 0

Views: 25465

Answers (3)

Kobina

Reputation: 1

You will need to create another column that holds a text label for each gene, for example up-regulated, down-regulated, or unchanged. Then set the labels based on the logFC values:

de_genes$diffexpressed <- "NO"
de_genes$diffexpressed[de_genes$logFC > 0.58] <- "UP"
de_genes$diffexpressed[de_genes$logFC < -0.58] <- "DOWN"

Then use ggplot to generate the plot, mapping the point colour to the column that holds the labels.
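
A minimal sketch of that plotting step, assuming the data frame has logFC and P.Value columns as in the question and that -0.58 is the intended lower cutoff. The simulated de_genes data and the colours are illustrative assumptions, not part of the original answer:

```r
library(ggplot2)

# Simulated data for illustration only (hypothetical values)
set.seed(1)
de_genes <- data.frame(
  logFC   = rnorm(200, mean = 0, sd = 1.5),
  P.Value = runif(200, min = 1e-10, max = 1)
)

# Label each gene by direction of change
de_genes$diffexpressed <- "NO"
de_genes$diffexpressed[de_genes$logFC > 0.58]  <- "UP"
de_genes$diffexpressed[de_genes$logFC < -0.58] <- "DOWN"

# Map the label column to colour and pick one colour per label
p <- ggplot(de_genes, aes(x = logFC, y = -log10(P.Value), colour = diffexpressed)) +
  geom_point(alpha = 0.6) +
  scale_colour_manual(values = c(UP = "red", DOWN = "blue", NO = "grey50")) +
  xlab("log2 fold change") + ylab("-log10 p-value")
print(p)
```

Naming the colours by label in scale_colour_manual keeps the mapping stable even if one of the groups happens to be empty.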

Upvotes: 0

Al14

Reputation: 1814

You need to add a threshold column reporting the labels for logFC values > 2, logFC values < -2, and values in between:

library(dplyr)  # provides %>% and mutate()
mydata <- mydata %>% mutate(threshold = ifelse(logFC >= 2, "A", ifelse(logFC <= -2, "B", "C")))

Next, build the volcano plot: map threshold to colour in geom_point and assign the colours with scale_colour_manual. B, the label for logFC <= -2, is blue.

# assumes mydata has a P.Value column, as in the question
ggplot(mydata, aes(x=logFC, y=-log10(P.Value))) +
  geom_point(aes(colour = threshold), size=2.5) +
  scale_colour_manual(values = c("A"= "yellow", "B"="blue", "C"= "black"))

Upvotes: 1

cmbarbu

Reputation: 4534

You need to use the col argument; something like this should do it:

# first set up the plot
plot(gene_list$logFC, -log10(gene_list$P.Value),
     xlim=c(-10, 10), ylim=c(0, 15),
     xlab="log2 fold change", ylab="-log10 p-value",
     type="n")
# then add the points
sel <- which(gene_list$logFC<=log(2)) # or whatever you want to use
points(gene_list[sel,"logFC"], -log10(gene_list[sel,"P.Value"]),col="black")
sel <- which(gene_list$logFC>log(2)) # or whatever you want to use
points(gene_list[sel,"logFC"], -log10(gene_list[sel,"P.Value"]),col="red")
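
As a variation on the same idea, col also accepts a vector with one colour per point, so the selection and the plotting can happen in a single call. A sketch, assuming the cutoffs of 2 and -2 from the question and simulated data standing in for the real gene_list (the point_col name is hypothetical):

```r
# Simulated stand-in for gene_list (hypothetical values)
set.seed(42)
gene_list <- data.frame(
  logFC   = rnorm(500, mean = 0, sd = 2),
  P.Value = runif(500, min = 1e-12, max = 1)
)

# One colour per point: red above 2, blue below -2, black otherwise
point_col <- ifelse(gene_list$logFC > 2, "red",
             ifelse(gene_list$logFC < -2, "blue", "black"))

plot(gene_list$logFC, -log10(gene_list$P.Value),
     xlim = c(-10, 10), ylim = c(0, 15),
     xlab = "log2 fold change", ylab = "-log10 p-value",
     col = point_col, pch = 20)
```

Because point_col is in the same row order as gene_list, no subsetting with which() is needed.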

Upvotes: 2
