Reputation: 1436
I'm using ggplot2 to plot two cumulative distributions on a single plot. This is straightforward using the example from ?stat_ecdf. My difficulty is in adding vertical lines through the median value of each distribution.
It's easy enough to do this with a single distribution:
df <- data.frame(x = c(rnorm(100, 5, 10), rnorm(200, 0, 10)),
g = as.factor(c(rep(1, 100), rep(2, 200))))
ggplot(df, aes(x)) +
stat_ecdf() +
geom_vline(aes(xintercept = median(x)))
But I can't figure out an easy way to add vertical lines for multiple distributions. I've tried the following without success:
ggplot(df, aes(x, colour = g)) +
stat_ecdf() +
geom_vline(aes(xintercept = median(x), colour = g))
I can sort of get the desired result by assembling the plot in stages:
p <- ggplot(df[df$g == 1, ], aes(x)) +
stat_ecdf() +
geom_vline(aes(xintercept = median(x)))
p +
stat_ecdf(data = df[df$g == 2, ]) +
geom_vline(data = df[df$g == 2, ], aes(xintercept = median(x)))
But this seems like an untidy way of doing it and also leaves me to set the different line colours by hand.
Surely there's a better way?
Upvotes: 3
Views: 6684
Reputation: 921
Try this:
ggplot(df, aes(x, colour = g)) +
stat_ecdf() +
geom_vline(aes(xintercept = median(x[g == 1]), colour = g[g == 1])) +
geom_vline(aes(xintercept = median(x[g == 2]), colour = g[g == 2]))
By adding a second geom_vline layer and specifying which group each line and colour refers to, you get two lines in the corresponding colours. Hope that helps!
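A possible alternative, offered only as a sketch: the attempt in the question draws both lines at the same position because median(x) inside aes() is computed over the whole data frame, not per group. One way around that is to compute the medians per group first and hand that small data frame to geom_vline. Subsetting inside aes() may also trigger a length-mismatch error depending on your ggplot2 version, which this avoids. The `medians` name and the set.seed call are my own additions, not part of the question.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed only so the example is reproducible
df <- data.frame(x = c(rnorm(100, 5, 10), rnorm(200, 0, 10)),
                 g = as.factor(c(rep(1, 100), rep(2, 200))))

# One row per group; the median ends up in a column that is still called x,
# so the aes(x, colour = g) inherited from ggplot() still resolves
medians <- aggregate(x ~ g, data = df, FUN = median)

p <- ggplot(df, aes(x, colour = g)) +
  stat_ecdf() +
  # this layer uses the summary data, so each group gets its own line
  geom_vline(data = medians, aes(xintercept = x, colour = g))
p
```

Because the colour is mapped from g in both layers, the lines pick up the same colours as the curves, and this should extend to more than two groups without adding extra layers.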
Upvotes: 5