Reputation: 91
I have this line of code:
plot(gene_list$logFC, -log10(gene_list$P.Value),xlim=c(-10, 10), ylim=c(0, 15),xlab="log2 fold change", ylab="-log10 p-value")
which results in a volcano plot; however I want to find a way where I can color in red the points >log(2) and
Edit: Okay so as an example I'm trying to do the following to get a volcano plot:
install.packages("ggplot2")
and then
gene_list <- read.table("/Users/Javi/Desktop/gene_list.csv", header=T, sep=",")
require(ggplot2)
##Highlight genes that have an absolute fold change > 2 and a p-value < 0.05
gene_list$threshold = as.factor(abs(gene_list$logFC) > 2 & gene_list$P.Value < 0.05)
g = ggplot(data=gene_list, aes(x=logFC, y=-log10(P.Value), colour=threshold)) +
geom_point(alpha=0.4, size=5) +
theme(legend.position = "none") +
xlim(c(-10, 10)) + ylim(c(0, 15)) +
xlab("log2 fold change") + ylab("-log10 p-value")
What I want to do is color in red the points with logFC > 2 and in blue the points with logFC < -2.
Upvotes: 0
Views: 25465
Reputation: 1
You will need to create another column holding a text label for each gene (for example upregulated, downregulated, or unchanged), then set the labels based on the logFC values:
de_genes$diffexpressed <- "NO"
de_genes$diffexpressed[de_genes$logFC>0.58]<-"UP"
de_genes$diffexpressed[de_genes$logFC< -0.58]<-"DOWN"
Then use ggplot to generate the plot, colouring the points by the column that holds the labels.
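The plotting step described above is not shown, so here is a minimal sketch of what it could look like. The toy data frame, the colour choices, and the use of -0.58 as the cutoff for "DOWN" are assumptions made for illustration; the column names logFC and P.Value are taken from the question.

```r
library(ggplot2)

# Toy data standing in for the real table (values are made up for illustration)
de_genes <- data.frame(
  logFC   = c(-3.1, -0.2, 0.1, 1.4, 2.7),
  P.Value = c(1e-6, 0.40, 0.75, 0.01, 1e-4)
)

# Label each gene from its logFC (assumed cutoffs: +0.58 and -0.58)
de_genes$diffexpressed <- "NO"
de_genes$diffexpressed[de_genes$logFC >  0.58] <- "UP"
de_genes$diffexpressed[de_genes$logFC < -0.58] <- "DOWN"

# Map the label column to colour, then pick the colours by hand
p <- ggplot(de_genes, aes(x = logFC, y = -log10(P.Value), colour = diffexpressed)) +
  geom_point() +
  scale_colour_manual(values = c(UP = "red", DOWN = "blue", NO = "black")) +
  xlab("log2 fold change") + ylab("-log10 p-value")
print(p)
```

Naming the entries in values ties each colour to a label, so the mapping does not depend on the alphabetical order of the labels.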
Upvotes: 0
Reputation: 1814
You need to add a threshold column reporting the labels for logFC values > 2, logFC values < -2, and values in between:
library(dplyr)  # provides %>% and mutate
mydata<-mydata%>%mutate(threshold = ifelse(logFC >= 2,"A", ifelse(logFC<=-2 , "B", "C")))
Next, write the code for the volcano plot and assign colours with geom_point and scale_colour_manual. B, which is the label for logFC<=-2, is blue:
library(ggplot2)
ggplot(mydata, aes(x=logFC, y=-log10(P.Value))) +
geom_point(aes(colour = threshold), size=2.5) +
scale_colour_manual(values = c("A"= "yellow", "B"="blue", "C"= "black"))
Upvotes: 1
Reputation: 4534
You need to use the col argument; something like this should do it:
# first set up the plot
plot(gene_list$logFC, -log10(gene_list$P.Value),
xlim=c(-10, 10), ylim=c(0, 15),
xlab="log2 fold change", ylab="-log10 p-value",
type="n")
# then add the points
sel <- which(gene_list$logFC<=log(2)) # or whatever you want to use
points(gene_list[sel,"logFC"], -log10(gene_list[sel,"P.Value"]),col="black")
sel <- which(gene_list$logFC>log(2)) # or whatever you want to use
points(gene_list[sel,"logFC"], -log10(gene_list[sel,"P.Value"]),col="red")
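As a possible alternative, col also accepts a vector with one colour per point, so the whole plot can be drawn in a single call. This is a sketch under assumptions: the toy data frame is made up, and the cutoffs of 2 and -2 with red and blue are taken from the edit to the question rather than from the log(2) used above.

```r
# Toy data standing in for gene_list (values are made up for illustration)
gene_list <- data.frame(
  logFC   = c(-4.0, -1.0, 0.5, 2.5, 6.0),
  P.Value = c(1e-8, 0.20, 0.60, 1e-3, 1e-10)
)

# One colour per point: red above 2, blue below -2, black otherwise
point_col <- ifelse(gene_list$logFC > 2, "red",
             ifelse(gene_list$logFC < -2, "blue", "black"))

# col is matched to the points in order, so no subsetting is needed
plot(gene_list$logFC, -log10(gene_list$P.Value),
     xlim = c(-10, 10), ylim = c(0, 15),
     xlab = "log2 fold change", ylab = "-log10 p-value",
     col = point_col, pch = 19)
```

This avoids the empty type="n" plot followed by repeated points calls, at the cost of building the colour vector up front.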
Upvotes: 2