Reputation: 437
I would like to plot the ratio between the bars of a geom_bar as a line (geom_line) on a second axis. Here is my data frame:
df <- data.frame(code=c('F6', 'F6','D4', 'D4', 'F5', 'F5', 'C4', 'C4', 'F7', 'F7'),
                 group=c('0','1','0','1','0','1','0','1','0','1'),
                 count=c(80, 700, 30, 680, 100, 360, 70, 230, 40, 200))
For the moment, I plot the following figure:
library(ggplot2)
ggplot(df, aes(x=code, y=count, fill=group)) +
  geom_bar(stat="identity", position="dodge")
I would also like to show the ratio between the groups. For example, for C4 it would be 70/230*100 ≈ 30%. Here is what it could look like:
Any ideas?
Upvotes: 1
Views: 911
Reputation: 17648
You can try to normalise the ratios to the maximum y-value (count).
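library(tidyverse)
# Scale factor: a ratio of 1 (100%) is drawn at the height of the tallest bar
MAX <- max(df$count)
df %>%
  group_by(code) %>%
  # count[1]/count[2] assumes the group "0" row precedes the group "1" row within each code
  mutate(ratio = count[1]/count[2]) %>%
  mutate(ratio_norm = MAX*ratio) %>%
  ggplot(aes(x=code)) +
  geom_col(aes(y=count, fill=group), position="dodge") +
  # one point per code; group = 1 lets the line connect across the discrete x axis
  geom_point(data = . %>% distinct(code, ratio_norm), aes(y=ratio_norm)) +
  geom_line(data = . %>% distinct(code, ratio_norm), aes(y=ratio_norm, group = 1)) +
  # the secondary axis undoes the scaling and labels the ratio as a percentage
  scale_y_continuous(sec.axis = sec_axis(~./MAX, labels = scales::percent))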
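To see what the normalisation does without drawing anything, here is a small base R sketch of the same arithmetic. The names g0, g1, ratio and back are only illustrative, and it assumes (as in the question's data) that the rows of both groups appear in the same code order.

```r
df <- data.frame(code=c('F6', 'F6','D4', 'D4', 'F5', 'F5', 'C4', 'C4', 'F7', 'F7'),
                 group=c('0','1','0','1','0','1','0','1','0','1'),
                 count=c(80, 700, 30, 680, 100, 360, 70, 230, 40, 200))

MAX <- max(df$count)                      # tallest bar, 700 for this data

# Counts per group; assumes both groups list the codes in the same order
g0 <- df$count[df$group == "0"]
g1 <- df$count[df$group == "1"]

# Ratio of group 0 to group 1, named by code
ratio <- setNames(g0 / g1, df$code[df$group == "0"])

# Value actually drawn on the count axis
ratio_norm <- MAX * ratio

# What the secondary axis shows after dividing by MAX again
back <- ratio_norm / MAX

round(ratio * 100, 1)                     # percentages per code
```

For C4 this gives 70/230, about 30.4%, which is drawn at roughly 213 on the count axis and read back as about 30% on the secondary axis.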
Upvotes: 3
Reputation: 5861
You can do this by using the tidyverse library to calculate the percentage for each group, then adding that to your plot using a secondary axis:
library(tidyverse)
df <- data.frame(code=c('F6', 'F6','D4', 'D4', 'F5', 'F5', 'C4', 'C4', 'F7', 'F7'),
                 group=c('0','1','0','1','0','1','0','1','0','1'),
                 count=c(80, 700, 30, 680, 100, 360, 70, 230, 40, 200))
Now, make another data frame that calculates the percentage as you described. I used spread to do this. I also multiplied the percentage by 7, because the percentage (which goes from 0 to 100) has to share a graph whose count axis goes from 0 to 700, so 7*100 fills the entire graph. Finally, I added a numeric field called "order", because geom_line will not connect points along a discrete x variable such as code unless it gets a numeric position (or a group aesthetic).
percentage.df <- df %>%
  spread(group, count) %>%
  mutate(percentage = 7*(`0`/`1`)*100) %>%
  mutate(order = c(1:nrow(.)))
Now, when you plot this, you can specify a secondary axis, but remember to tell ggplot to divide the values by 7 so that the secondary axis labels make sense.
ggplot(df, aes(x=code, y=count)) +
  geom_bar(stat="identity", position="dodge", aes(fill=group)) +
  geom_point(data = percentage.df, aes(code, percentage)) +
  geom_line(data = percentage.df, aes(order, percentage)) +
  scale_y_continuous(sec.axis = sec_axis(~ . /7))
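If you would rather not hard-code the 7, a possible variant is to derive the factor from the data. This is only a sketch: scale_factor and the axis name "ratio (%)" are my own choices, pivot_wider is used as the newer tidyr replacement for spread, and group = 1 stands in for the "order" column.

```r
library(tidyverse)

df <- data.frame(code=c('F6', 'F6','D4', 'D4', 'F5', 'F5', 'C4', 'C4', 'F7', 'F7'),
                 group=c('0','1','0','1','0','1','0','1','0','1'),
                 count=c(80, 700, 30, 680, 100, 360, 70, 230, 40, 200))

# One row per code, with the two groups as columns `0` and `1`
percentage.df <- df %>%
  pivot_wider(names_from = group, values_from = count) %>%
  mutate(percentage = `0` / `1` * 100)

# Counts per percentage point, so that 100% reaches the tallest bar
scale_factor <- max(df$count) / 100

p <- ggplot(df, aes(x=code, y=count)) +
  geom_col(aes(fill=group), position="dodge") +
  # percentages are stretched onto the count axis ...
  geom_point(data = percentage.df, aes(y = percentage * scale_factor)) +
  geom_line(data = percentage.df, aes(y = percentage * scale_factor, group = 1)) +
  # ... and the secondary axis divides by the same factor to show them as 0-100
  scale_y_continuous(sec.axis = sec_axis(~ . / scale_factor, name = "ratio (%)"))
p
```

With this data scale_factor works out to 7, so the plot should match the one above, but it keeps working if the counts change.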
Upvotes: 4