Reputation: 1
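I am trying to draw a volcano plot with ggplot2. I would like the points in three different colors based on the following criteria (these are the conditions used in my code below):
meth.diff > 25 and qvalue < 0.05
meth.diff < -25 and qvalue < 0.05
everything else
Similar questions have been asked here before and I tried following them, but I keep getting error messages. Any suggestions would be highly appreciated.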
Here is the raw data file:
    chr  start    end strand       pvalue       qvalue  meth.diff
16 chr1  37801  38100      * 2.246550e-05 4.487042e-04 -36.485769
17 chr1  38101  38400      * 5.699781e-06 1.376471e-04  55.755181
29 chr1  49501  49800      * 1.453030e-18 2.442391e-16 -18.381131
35 chr1  62701  63000      * 5.547627e-03 3.686303e-02 -31.871711
54 chr1 122401 122700      * 3.917230e-03 2.845933e-02  63.443366
57 chr1 130201 130500      * 8.941091e-04 9.253737e-03  -8.347167
myDiff1p$threshold = factor(ifelse(myDiff1p$meth.diff>25 & myDiff1p$qvalue< 0.05, 1,
                            ifelse(myDiff1p$meth.diff<-25 & myDiff1p$qvalue< 0.05,-1,0)))
ggplot(data=myDiff1p, aes(x=meth.diff, y=-log10(qvalue))) +
  geom_point(aes(color=myDiff1p$threshold), alpha=0.4, size=1.75)+
  geom_vline(xintercept=c(-25,25), color="red", alpha=1.0)+
  geom_hline(yintercept=2, color="blue", alpha=1.0)+
  xlab("Differential Methylation")+
  ylab("-log10 (qvalue)")+
  theme_bw()+
  xlim(c(-75, 75)) +
  ylim(c(0, 300))
Error: Discrete value supplied to continuous scale
Upvotes: 0
Views: 2927
Reputation: 29085
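You have an almost unnoticeable mistake in this line:
myDiff1p$threshold = factor(ifelse(myDiff1p$meth.diff>25 & myDiff1p$qvalue< 0.05, 1,
                            ifelse(myDiff1p$meth.diff<-25 & myDiff1p$qvalue< 0.05,-1,0)))
As there's no space in myDiff1p$meth.diff<-25, it's parsed as an assignment (myDiff1p$meth.diff <- 25 & ...) rather than the comparison myDiff1p$meth.diff < -25. As a result, meth.diff gets overwritten instead of compared, so it no longer holds your numeric values. That is most likely what triggers the "Discrete value supplied to continuous scale" error. Reload the original data before re-running the corrected code.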
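To see the parsing difference for yourself, here is a small sketch using toy vectors. The names x and q are just stand-ins for meth.diff and qvalue, not anything from your data:

```r
x <- c(-30, 10, 40)        # stand-in for meth.diff
q <- c(0.001, 0.2, 0.01)   # stand-in for qvalue

# quote() captures the expression without evaluating it
no_space   <- quote(x<-25 & q < 0.05)
with_space <- quote(x < -25 & q < 0.05)

no_space[[1]]     # `<-` : the whole expression is an assignment
with_space[[1]]   # `&`  : a logical test, as intended

eval(with_space)  # TRUE FALSE FALSE

# Running the unspaced version overwrites x as a side effect
res <- ifelse(x<-25 & q < 0.05, -1, 0)
class(x)          # "logical"
```

Because <- binds more loosely than & and <, everything to its right is assigned to x, which is why the column ends up logical rather than numeric.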
Here's what I recommend:
library(dplyr)
library(ggplot2)

# Label each row by condition; note the space in "< -25"
myDiff1p <- myDiff1p %>%
  mutate(threshold = factor(case_when(meth.diff > 25 & qvalue < 0.05 ~ "cond1",
                                      meth.diff < -25 & qvalue < 0.05 ~ "cond2",
                                      TRUE ~ "cond3")))

# Refer to the column by name inside aes() rather than with myDiff1p$
ggplot(data = myDiff1p, aes(x = meth.diff, y = -log10(qvalue))) +
  geom_point(aes(color = threshold), alpha = 0.4, size = 1.75) +
  geom_vline(xintercept = c(-25, 25), color = "red", alpha = 1.0) +
  geom_hline(yintercept = 2, color = "blue", alpha = 1.0) +
  xlab("Differential Methylation") +
  ylab("-log10 (qvalue)") +
  theme_bw() +
  xlim(c(-75, 75)) +
  ylim(c(0, 300)) +
  scale_color_manual(name = "Threshold",
                     values = c("cond1" = "red", "cond2" = "green", "cond3" = "grey"))
I labelled the threshold factor by condition and defined the mapping between condition and colour in a named vector passed to scale_color_manual(). Also, as a matter of personal preference, I think dplyr::case_when() looks neater than nested ifelse() statements.
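If you would rather not depend on dplyr, the same three-way labelling can be sketched in base R with logical indexing. This is only an illustration: the data frame below is rebuilt from the six rows shown in the question (only the two columns that matter), and the labels cond1/cond2/cond3 follow the ones used above.

```r
# Six rows copied from the question's data (meth.diff and qvalue only)
myDiff1p <- data.frame(
  meth.diff = c(-36.485769, 55.755181, -18.381131,
                -31.871711, 63.443366, -8.347167),
  qvalue    = c(4.487042e-04, 1.376471e-04, 2.442391e-16,
                3.686303e-02, 2.845933e-02, 9.253737e-03)
)

sig <- myDiff1p$qvalue < 0.05

# Start with the default label, then overwrite the rows meeting each condition
myDiff1p$threshold <- "cond3"
myDiff1p$threshold[sig & myDiff1p$meth.diff > 25]  <- "cond1"
myDiff1p$threshold[sig & myDiff1p$meth.diff < -25] <- "cond2"
myDiff1p$threshold <- factor(myDiff1p$threshold,
                             levels = c("cond1", "cond2", "cond3"))

table(myDiff1p$threshold)  # two rows in each condition for this sample
```

The resulting threshold column can be used in the ggplot call in the same way, with aes(color = threshold) and the same scale_color_manual() mapping.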
Upvotes: 2