Reputation: 105
I have a line plot that tracks counts over time for multiple factors. A mock version of the data I am working with would be:
step factor count
1 a 10
1 b 0
1 c 5
2 a 5
2 b 10
2 c 0
... etc.
The counts are influenced by an external event, and for each step I know whether that event is happening or not. This information could be in a separate data frame or in the same one (it doesn't really matter), and it would look like this:
step event
1 FALSE
2 FALSE
...
10 TRUE
11 TRUE
...
30 FALSE
... etc.
I am writing this script to automate the plot creation since I will be dealing with lots of data, and while I know I could use geom_rect() to hard-code highlighting rectangles, it is absolutely not something that I could do manually without wasting way too much time, especially considering the event can turn on and off at different steps in different instances.
Is there any way that I can extract the x limits for geom_rect() dynamically from the data and create as many rectangles as the data set needs? Or is this completely hopeless?
Upvotes: 2
Views: 1481
Reputation: 24818
Here's an alternative to @Allan's excellent approach, which relies on preprocessing the event data into groups with dplyr:
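library(dplyr)
library(ggplot2)

# Uses `data` (counts) and `data2` (events), both defined below.
data2 %>%
  # Start a new group every time `event` flips between TRUE and FALSE
  group_by(group = cumsum(c(1, diff(event)) != 0)) %>%
  # Keep only the first and last step of each TRUE run
  dplyr::filter(event == TRUE & (step == min(step) | step == max(step))) %>%
  ggplot() +
  # One ribbon per group, spanning the full y range
  geom_ribbon(aes(x = step, group = group, ymax = Inf, ymin = -Inf),
              fill = "yellow", alpha = 0.3) +
  geom_line(data = data, aes(x = step, y = count, color = factor)) +
  facet_wrap(. ~ factor, ncol = 1)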
# Sample data (run this before the plotting code)
set.seed(3)
data <- data.frame(step = rep(1:30, each = 3),
                   factor = rep(letters[1:3], times = 30),
                   count = round(runif(90, 0, 100)))
data2 <- data.frame(step = 1:30,
                    event = rep(c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
                                c(3, 7, 2, 8, 4, 6)))
data2
#   step event
#1     1  TRUE
#2     2  TRUE
#3     3  TRUE
#...
#28   28 FALSE
#29   29 FALSE
#30   30 FALSE
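Since the question asks specifically about geom_rect(), the same grouping idea can also be collapsed into one row per TRUE run and handed to geom_rect(). This is only a sketch under a few assumptions: the name event_runs is made up for illustration, and the 0.5 padding on each side is my own choice, added so that an event lasting a single step still gets a band with visible width (the ribbon version needs two distinct steps per run to span any width).

```r
library(dplyr)
library(ggplot2)

# Same sample data as above
set.seed(3)
data <- data.frame(step = rep(1:30, each = 3),
                   factor = rep(letters[1:3], times = 30),
                   count = round(runif(90, 0, 100)))
data2 <- data.frame(step = 1:30,
                    event = rep(c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
                                c(3, 7, 2, 8, 4, 6)))

# One row per run of consecutive TRUE steps (event_runs is a hypothetical name)
event_runs <- data2 %>%
  mutate(run = cumsum(c(TRUE, diff(event) != 0))) %>%  # new id at every flip
  filter(event) %>%                                    # keep TRUE runs only
  group_by(run) %>%
  summarise(xmin = min(step), xmax = max(step), .groups = "drop")

# For this data the runs are steps 1-3, 11-12 and 21-24

p <- ggplot() +
  # Assumption: pad each rectangle by half a step so single-step runs show up
  geom_rect(data = event_runs,
            aes(xmin = xmin - 0.5, xmax = xmax + 0.5,
                ymin = -Inf, ymax = Inf),
            fill = "yellow", alpha = 0.3) +
  geom_line(data = data, aes(x = step, y = count, color = factor)) +
  facet_wrap(~factor, ncol = 1)

print(p)
```

Because the rectangle limits are computed from data2, the number and position of the bands follow whatever the event column contains, with nothing hard-coded.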
Upvotes: 3
Reputation: 173948
This may be a bit hacky, but I think it gives the result you are looking for. Let me create some data first that roughly corresponds to yours:
df <- data.frame(step = rep(1:100, 3), group = rep(letters[1:3], each = 100),
value = c(cumsum(c(50, runif(99, -1, 1))),
cumsum(c(50, runif(99, -1, 1))),
cumsum(c(50, runif(99, -1, 1)))))
df2 <- data.frame(step = 1:100, event = sample(c(TRUE, FALSE), 100, TRUE))
So the starting plot from df would look like this:
library(ggplot2)
ggplot(df, aes(step, value, colour = group)) + geom_line()
and the event data frame looks like this:
head(df2)
#> step event
#> 1 1 FALSE
#> 2 2 FALSE
#> 3 3 FALSE
#> 4 4 TRUE
#> 5 5 FALSE
#> 6 6 TRUE
The idea is that you add a semi-transparent red geom_area to the plot, making FALSE values way below the bottom of the range and TRUE values way above the top of the range, then just set coord_cartesian so that the y limits are near to the limits of your main data. This will give you red vertical bands whenever your event is TRUE:
ggplot(df, aes(step, value, colour = group)) +
geom_line() +
geom_area(data = df2, aes(x = step, y = 1000 * event),
inherit.aes = FALSE, fill = "red", alpha = 0.2) +
  coord_cartesian(ylim = c(40, 60))
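If the plots are generated automatically, the hard-coded c(40, 60) could be replaced by limits computed from the data. The following is a rough sketch, not a definitive version: the names y_limits and pad and the 5% padding are assumptions of mine, and the trick still assumes that all values lie between 0 and 1000 so that the area's baseline and top stay outside the visible window.

```r
library(ggplot2)

set.seed(1)  # seed added here only to make the sketch reproducible
df <- data.frame(step = rep(1:100, 3), group = rep(letters[1:3], each = 100),
                 value = c(cumsum(c(50, runif(99, -1, 1))),
                           cumsum(c(50, runif(99, -1, 1))),
                           cumsum(c(50, runif(99, -1, 1)))))
df2 <- data.frame(step = 1:100, event = sample(c(TRUE, FALSE), 100, TRUE))

# Derive the visible y window from the line data (hypothetical helper names)
y_limits <- range(df$value)
pad <- 0.05 * diff(y_limits)  # assumed 5% margin above and below

p <- ggplot(df, aes(step, value, colour = group)) +
  geom_line() +
  geom_area(data = df2, aes(x = step, y = 1000 * event),
            inherit.aes = FALSE, fill = "red", alpha = 0.2) +
  coord_cartesian(ylim = y_limits + c(-pad, pad))

print(p)
```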
Upvotes: 3