Reputation: 500
I'm trying to make a plot that shows a different color when prob > 0.5, but when I use the color aesthetic, the line appears to be disconnected.
library(tidyverse)
data <- tibble(n = 1:365)
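prob <- function(x) {
  pr <- 1
  # seq_len(x - 1) + 1 gives 2..x and is empty for x = 1,
  # whereas 2:x would count down (2, 1) and return a wrong value
  for (t in seq_len(x - 1) + 1) {
    pr <- pr * ((365 - t + 1) / 365)
  }
  1 - pr
}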
data %>%
  mutate(prob = map_dbl(n, prob)) %>%
  filter(n < 100) %>%
  ggplot(aes(x = n, y = prob, color = prob > 0.5)) +
  geom_line() +
  scale_x_continuous(breaks = seq(0, 100, 10))
Does anyone know why? Removing color from aes() gives a single continuous line.
Upvotes: 1
Views: 348
Reputation: 627
This is because the condition prob > 0.5 is a discrete (logical) variable, so it splits your data into two groups, and geom_line() draws each group as a separate line. There is a gap between the two groups: the first has max(prob) = .476 (at n = 22) and the second has min(prob) = .507 (at n = 23). Hence, the (vertical) gap on the line plot is the gap between these two numbers.
You can see this if you filter the modified data for values close to .5:
data %>%
  mutate(prob = map_dbl(n, prob)) %>%
  filter(n < 100) %>%
  filter(between(prob, .4, .6))
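The grouping itself can also be inspected directly. Below is a minimal sketch, not the code from the question: it assumes the probabilities are rebuilt with a vectorized cumprod() instead of the prob() loop, and the names df, p and groups are only for illustration. layer_data() returns the data ggplot2 computes for a layer, including its group column.

```r
library(ggplot2)

# Rebuild the birthday-problem probabilities for n = 1..99.
# df is a hypothetical name; cumprod() stands in for the loop in prob().
n <- 1:99
df <- data.frame(n = n, prob = 1 - cumprod((365 - n + 1) / 365))

p <- ggplot(df, aes(x = n, y = prob, color = prob > 0.5)) +
  geom_line()

# A logical colour mapping is discrete, so the layer is split into
# one group per value (FALSE and TRUE), each drawn as its own line.
groups <- layer_data(p)$group
length(unique(groups))
```

Because each group is drawn as its own line, nothing connects the last point of the first group to the first point of the second.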
If we modify your example:
data2 <- data %>%
  mutate(prob = map_dbl(n, prob)) %>%
  filter(n < 100)
# bring the two extremes closer together
data2$prob[22] <- .49999999999999
data2$prob[23] <- .50000000000001
data2 %>%
  ggplot(aes(x = n, y = prob, color = prob >= 0.5)) +
  geom_line() +
  scale_x_continuous(breaks = seq(0, 100, 10))
The gap becomes significantly smaller:
However, it is still present (mostly horizontally): no segment is drawn between n = 22 and n = 23, because the two points belong to different groups.
A simple way of fixing this is to add the dummy aesthetic group = 1 inside aes(), which overrides the default grouping by the discrete color variable.
data %>%
  mutate(prob = map_dbl(n, prob)) %>%
  filter(n < 100) %>%
  # add 'group = 1' below
  ggplot(aes(x = n, y = prob, color = prob >= 0.5, group = 1)) +
  geom_line() +
  scale_x_continuous(breaks = seq(0, 100, 10))
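To check that group = 1 really merges everything into one line while keeping both colors, the built layer data can be inspected. This is a minimal sketch under the same kind of assumptions as the rest of the thread: the data frame is rebuilt here with cumprod() rather than the prob() function, and df, p1 and built are illustrative names only.

```r
library(ggplot2)

# Hypothetical stand-in for the filtered data: n = 1..99
n <- 1:99
df <- data.frame(n = n, prob = 1 - cumprod((365 - n + 1) / 365))

p1 <- ggplot(df, aes(x = n, y = prob, color = prob >= 0.5, group = 1)) +
  geom_line()

built <- layer_data(p1)

# One group, so geom_line() draws a single connected path...
length(unique(built$group))
# ...but the colour still takes two distinct values along that path.
length(unique(built$colour))
```

As far as I know, each segment of such a path takes the color of its starting point, so the piece connecting n = 22 to n = 23 is drawn in the "FALSE" color.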
Upvotes: 2