Reputation: 389
I want to create a plot (preferably using ggplot2) where I visualize a timeline together with a time-trend plot.
To put it in a practical example, I have aggregated unemployment rates for each year. I also have a data set denoting important legislation changes that are related to the labor market. Hence, I want to create a timeline where the unemployment rate is shown following the same x-axis (time).
I have generated some toy data; see the code below:
set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))
year <- c( 1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- c("Implemented unemployment benefit",
"Pre-school became free",
"Five-day workweek were introduced",
"Labor law reform 1976",
"Unemployment benefit were cut in half",
"Apprenticeship Act allows on-the-job training",
"Changes in discrimination law",
"Equal Pay for Equal Work was",
"9 weeks vacation were introduced",
"Unemployment benefit were removed")
imp_event <- data.frame(year, events)
I can easily plot the time-trend across the years:
library(tidyverse)
ggplot(data = un_emp, aes(x = year, y = unemployment)) +
  geom_line(color = "#FC4E07", size = 0.5) +
  theme_bw()
But how do I include the events (found in imp_event) in the plot in a nice and efficient way?
My aim is to make a timeline that looks like the one from here, but combined with the time-trend plot shown above. How can I do this?
I have tried using vline, but I cannot add the event labels.
Thanks!
Upvotes: 6
Views: 655
Reputation: 535
I think this should do the trick:
First, I drew the axis with geom_hline, using the mean you set for the data (0.05) as the y-intercept. Then I added a variable "height" to the events data frame: the mean of the unemployment series (which is close to the axis value) plus a value drawn from a normal distribution. I used this to draw the segments that run from the axis to each point. Finally, I flipped the y position of the year label so that it always sits on the opposite side of the axis from the segment.
library(tidyverse)
set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))
year <- c( 1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- c("Implemented unemployment benefit",
"Pre-school became free",
"Five-day workweek were introduced",
"Labor law reform 1976",
"Unemployment benefit were cut in half",
"Apprenticeship Act allows on-the-job training",
"Changes in discrimination law",
"Equal Pay for Equal Work was",
"9 weeks vacation were introduced",
"Unemployment benefit were removed")
imp_event <- data.frame(year, events) %>%
  mutate(height = mean(unemployment) + rnorm(n(), 0, 0.02))
ggplot(un_emp) +
  geom_hline(yintercept = 0.05) +
  geom_line(aes(x = year,
                y = unemployment),
            color = "red",
            alpha = 0.3,
            size = 1) +
  geom_segment(data = imp_event,
               aes(x = year,
                   xend = year,
                   y = 0.05,
                   yend = height)) +
  geom_text(data = imp_event,
            aes(label = year,
                x = year,
                y = 0.05 + 0.002 * sign(0.05 - height)),
            angle = 90,
            size = 3.5,
            fontface = "bold",
            check_overlap = TRUE) +
  geom_point(data = imp_event,
             aes(x = year,
                 y = height,
                 fill = as.factor(events)),
             shape = 21,
             size = 4) +
  scale_x_continuous(name = NULL,
                     labels = NULL) +
  scale_fill_discrete(name = "Event") +
  scale_y_continuous(name = "Unemployment Rate") +
  theme_bw() +
  theme(panel.border = element_blank(),
        axis.line.y = element_line(),
        axis.ticks.x = element_blank(),
        panel.grid = element_blank(),
        legend.position = "bottom")
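Because the heights above are random, two neighbouring events can end up on the same side of the axis and crowd each other. A minimal sketch of one possible alternative, assuming you would rather alternate the stems above and below the axis: the names imp_event_alt, axis_y and offset are made up for this illustration, the event labels are placeholders, and the stem length of 0.015 is an arbitrary choice.

```r
library(dplyr)

# Event years from the question; the labels here are placeholders only
year <- c(1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- paste("Event", seq_along(year))

axis_y <- 0.05   # same baseline as geom_hline(yintercept = 0.05)
offset <- 0.015  # assumed stem length; tune it to the range of the data

# Sort by year, then alternate +offset / -offset around the axis
imp_event_alt <- data.frame(year, events) %>%
  arrange(year) %>%
  mutate(height = axis_y + offset * rep_len(c(1, -1), n()))
```

Since the column is still called height, imp_event_alt should work in place of imp_event in the plotting code above.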
Upvotes: 6
Reputation: 389
I worked with Jon Spring's solution but replaced geom_segment with geom_vline, which gave a result close to what I wanted. The final code looked like this:
joined_data <- un_emp %>% left_join(imp_event, by = "year")
ggplot(data = joined_data, aes(x = year, y = unemployment)) +
  geom_line(color = "red", size = 0.5) +
  theme_classic() +
  labs(y = "Unemployment rate",
       x = "Years",
       caption = "Data from XXXX") +
  geom_vline(data = joined_data %>% filter(!is.na(events)),
             aes(xintercept = year),
             color = "gray70", linetype = "dashed") +
  ggrepel::geom_text_repel(data = joined_data,
                           aes(x = year, y = unemployment - 0.03,
                               label = str_wrap(events, 10)),
                           color = "gray70", direction = "y", size = 2.5,
                           lineheight = 0.7, point.padding = 0.8)
Which produces the following plot:
I want to award the bounty to @Jon Spring, but I am not sure how to reward a comment.
Upvotes: 2
Reputation: 1102
You can achieve this by overlaying a geom_text() call, but the label layer needs an x and a y value for every row it draws. imp_event has no unemployment column, so you can't just feed it in as a new df and overlay it with the same aesthetics.
Instead, you can achieve what you want by doing a left_join from un_emp to imp_event on year. Because imp_event only has rows for the years with an event, events will be missing (NA) for most rows of the joined df, which is perfect, as I suspect you only want each event to appear as a label once.
For example:
library(tidyverse)
joined_data <- un_emp %>% left_join(imp_event, by = "year")
ggplot(data = joined_data, aes(x = year, y = unemployment)) +
  geom_line(color = "#FC4E07", size = 0.5) +
  # size is a fixed value, so it goes outside aes(); rows with NA labels are dropped
  geom_text(aes(label = events), size = 3, na.rm = TRUE) +
  theme_bw()
Which gives you something like this:
You can have a look at the available options and play around with geom_text() here.
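As an alternative to the join, the label layer can also be given the events data frame directly, provided it supplies its own y value instead of inheriting y = unemployment. This is only a rough sketch under some assumptions: label_y is a made-up name for a fixed label height, the three events are placeholders, and the text settings (angle, hjust, vjust) are a matter of taste.

```r
library(ggplot2)

set.seed(2110)
un_emp <- data.frame(year = 1950:2020,
                     unemployment = rnorm(71, 0.05, 0.005))

# Placeholder events, for illustration only
imp_event <- data.frame(
  year   = c(1957, 1975, 1999),
  events = c("Event A", "Event B", "Event C")
)

# Assumed fixed height for all labels: the top of the data range
label_y <- max(un_emp$unemployment)

p <- ggplot(un_emp, aes(x = year, y = unemployment)) +
  geom_line(color = "#FC4E07") +
  # One dashed vertical line per event
  geom_vline(data = imp_event, aes(xintercept = year),
             color = "gray70", linetype = "dashed") +
  # The layer's own y overrides the inherited y = unemployment
  geom_text(data = imp_event,
            aes(x = year, y = label_y, label = events),
            angle = 90, hjust = 1, vjust = -0.4, size = 3) +
  theme_bw()

p
```

The join keeps each label at the height of the line, whereas this version lines all labels up at one height next to their vertical line; which one reads better depends on how close together the events are.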
Upvotes: 1