Reputation: 53
Please help guide me to the correct approach for drawing a plot.
I have a dataset with some monuments and the year they were inscribed as municipal heritage:
| monument | year |
|---|---|
| A | 1990 |
| B | 1990 |
| C | 1993 |
| D | 1995 |
| E | 1996 |
All monuments are different and unique, but there are some years in common.
I would like the x axis to show all the years, with a bar plot of the count of monuments inscribed in each year (including the years that don't have any monuments inscribed, so that the gaps in time are visible).
It would also be awesome to have a secondary axis with a line showing the cumulative sum of the monuments inscribed.
The final result would be something similar to this.
Thanks in advance!
Upvotes: 0
Views: 171
Reputation: 31
So I've always found creating a secondary axis in ggplot2 to be non-intuitive (which is by design; the ggplot2 package authors discourage secondary axes because they are often misinterpreted). However, if they must be used, the echarts4r package has a straightforward solution.
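    library(echarts4r)
    library(dplyr)
    library(zoo)  # for na.locf()

    d <- data.frame(
      monument = c("A", "B", "C", "D", "E"),
      year = c(1990, 1990, 1993, 1995, 1996))

    # One row per year in the full range, so years without inscriptions show as gaps
    plot_dat <-
      data.frame(year = seq.int(min(d$year), max(d$year))) %>%
      left_join(d %>%
                  group_by(year) %>%
                  summarize(cnt = n()) %>%
                  mutate(cum_cnt = cumsum(cnt)),
                by = "year") %>%
      mutate(year = paste(year),          # treat years as categories on the x axis
             cum_cnt = na.locf(cum_cnt),  # carry the running total across gap years
             show = TRUE)                 # flag used to switch bar labels on

    plot_dat %>%
      e_charts(year) %>%
      e_bar(cnt) %>%
      e_add("label", show) %>%
      e_line(cum_cnt, y_index = 1) %>%    # y_index = 1 puts the line on the secondary axis
      e_hide_grid_lines("y")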
The code above produces this result. I made the executive decision to only show y-axis gridlines for the secondary axis since the bars are easily annotated with labels.
Thanks for posting! I wanted a good excuse to learn echarts4r!
Upvotes: 0
Reputation: 37913
The general thing to remember with secondary axes in ggplot2 is that (1) you need to transform the input data yourself and (2) you need to specify the inverse transform in the secondary axis. Here is an example with some dummy data, where we simply use a scaling factor of 10.
    library(ggplot2)

    df <- data.frame(
      year = sample(1990:2020, 50, replace = TRUE)
    )

    scale <- 10 # scaling factor for secondary axis

    ggplot(df, aes(year)) +
      geom_bar(width = 0.5) +
      geom_line(aes(y = after_stat(cumsum(count)/scale)),
                stat = "count", colour = "red") +
      scale_y_continuous(
        sec.axis = sec_axis(~ .x * scale, name = "cumulative count")
      )
Created on 2021-02-03 by the reprex package (v1.0.0)
It is perhaps also useful to point out that you can get the cumulative counts per year with aes(y = after_stat(cumsum(count))).
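As a rough sketch of how the same idea might be applied to the data from the question, the counts can be computed up front so that years without inscriptions are kept as explicit zeros. The names years, plot_dat and scale_factor are only illustrative, and picking the scaling factor as the ratio of the two maxima is an assumption made here for convenience, not a requirement.

```r
library(ggplot2)

d <- data.frame(
  monument = c("A", "B", "C", "D", "E"),
  year = c(1990, 1990, 1993, 1995, 1996)
)

# Every year in the observed range, so gap years get a count of zero
years <- seq(min(d$year), max(d$year))
cnt <- as.vector(table(factor(d$year, levels = years)))

plot_dat <- data.frame(
  year = years,
  cnt = cnt,
  cum_cnt = cumsum(cnt)  # running total of inscribed monuments
)

# Assumed choice: scale so the cumulative line tops out at the tallest bar
scale_factor <- max(plot_dat$cum_cnt) / max(plot_dat$cnt)

p <- ggplot(plot_dat, aes(x = year)) +
  geom_col(aes(y = cnt), width = 0.5) +
  # (1) transform the data onto the primary axis ...
  geom_line(aes(y = cum_cnt / scale_factor), colour = "red") +
  scale_x_continuous(breaks = plot_dat$year) +
  scale_y_continuous(
    name = "monuments inscribed",
    # (2) ... and undo that transform for the secondary axis labels
    sec.axis = sec_axis(~ . * scale_factor, name = "cumulative count")
  )

p
```

Because plot_dat already holds one row per year, geom_col() is used instead of geom_bar() with stat = "count", and the cumulative sum no longer needs after_stat().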
Upvotes: 1