SamPassmore

Reputation: 1365

Plot a line on a barchart in ggplot2

I have built a stacked bar chart showing the relative proportions of responses to different questions. Now I want to show one particular set of responses on top of that bar chart, to show how an individual's responses relate to the overall proportions.

I created a toy example here:

library(ggplot2)
n = 1000
n_groups = 5
overall_df = data.frame(
  state = sample(letters[1:8], n, replace = TRUE),
  frequency = runif(n, min = 0, max = 1),
  var_id = rep(LETTERS[1:n_groups], each = n / n_groups)
)

row = data.frame(
  A = "a", B = "b", C = "c", D = "h", E = "b"
)

ggplot(overall_df, 
           aes(fill=state, y=frequency, x=var_id)) + 
  geom_bar(position="fill", stat="identity") 

The goal is to plot each response in the object row as a point inside the matching segment of the corresponding bar, with a line connecting the points.

Here is a (poorly drawn) example of the desired result. Thanks for your help.

[Image: hand-drawn sketch of the desired result]

Upvotes: 0

Views: 53

Answers (2)

Jon Spring

Reputation: 66425

Here's an automated approach using dplyr. I join the label data to the original data, then use group_by + summarize to compute, for each bar, the vertical midpoint of the segment that matches the label.

library(dplyr)
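# One row per var_id naming the state to mark (a to e here, not the values in the question's `row`)
row_df <- data.frame(state = letters[1:n_groups], var_id = LETTERS[1:n_groups])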

line_df <- row_df %>%
  left_join(overall_df, by = "var_id") %>%
  group_by(var_id) %>%
  summarize(state = last(state.x),
            frequency = (sum(frequency[state.x < state.y]) + 
                         sum(frequency[state.x == state.y])/2) / sum(frequency))

ggplot(overall_df, aes(fill=state, y=frequency, x=var_id)) + 
  geom_bar(position="fill", stat="identity") +
  geom_point(data = line_df) +
  geom_line(data = line_df, aes(group = 1))

[Image: the resulting plot]
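The row_df above uses the states a to e rather than the responses in the question's row object. As a minimal sketch, assuming R 4.0 or later (so the columns of row are character, not factor), the one-row data frame can be reshaped into the same long form and passed through the same summary. The set.seed call and the name p are additions for illustration only.

```r
library(dplyr)
library(ggplot2)

# Toy data as in the question; the seed is an addition for reproducibility
set.seed(1)
n <- 1000
n_groups <- 5
overall_df <- data.frame(
  state = sample(letters[1:8], n, replace = TRUE),
  frequency = runif(n, min = 0, max = 1),
  var_id = rep(LETTERS[1:n_groups], each = n / n_groups)
)
row <- data.frame(A = "a", B = "b", C = "c", D = "h", E = "b")

# Reshape the one-row data frame to long form: one row per question
row_df <- data.frame(var_id = names(row),
                     state = unlist(row, use.names = FALSE))

# Same summary as above: share of the bar below the chosen segment
# plus half of the segment itself
line_df <- row_df %>%
  left_join(overall_df, by = "var_id") %>%
  group_by(var_id) %>%
  summarize(state = last(state.x),
            frequency = (sum(frequency[state.x < state.y]) +
                         sum(frequency[state.x == state.y]) / 2) / sum(frequency))

# Build the plot; print(p) to draw it
p <- ggplot(overall_df, aes(fill = state, y = frequency, x = var_id)) +
  geom_bar(position = "fill", stat = "identity") +
  geom_point(data = line_df) +
  geom_line(data = line_df, aes(group = 1))
```

Because the join is on var_id only, this should work for any row whose column names match the var_id values.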

Upvotes: 2

Allan Cameron

Reputation: 173793

This was trickier than I thought. I'm not sure there's any way round manually calculating the x/y co-ordinates of the line.

library(dplyr)
library(ggplot2)

df <- overall_df %>% group_by(state, var_id) %>%
  summarize(frequency = sum(frequency), .groups = "drop")

freq <- unlist(Map(function(d, val) {
  (sum(d$frequency[d$state > val]) + 0.5 * d$frequency[d$state == val]) /
    sum(d$frequency)
  }, d = split(df, df$var_id), val = row))
  
line_df <- data.frame(state = unlist(row),
                      frequency = freq,
                      var_id = names(row))

ggplot(df, aes(fill=state, y=frequency, x=var_id)) + 
  geom_col(position="fill") +
  geom_line(data = line_df, aes(group = 1)) +
  geom_point(data = line_df, aes(group = 1))

Created on 2022-03-08 by the reprex package (v2.0.1)
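To make the arithmetic behind the manual calculation concrete, here is a small worked sketch on a single bar. The totals and the helper name segment_midpoint are hypothetical, not part of the original answer. It relies on ggplot2's default stacking order, which puts the first level at the top of the bar, so every state that sorts after the chosen one lies below it.

```r
# Hypothetical totals for a single bar (not taken from the question's data)
d <- data.frame(state = c("a", "b", "c"), frequency = c(2, 3, 5))

# Hypothetical helper: vertical centre of one segment in a filled bar
segment_midpoint <- function(d, val) {
  below <- sum(d$frequency[d$state > val])   # segments stacked underneath
  own   <- sum(d$frequency[d$state == val])  # the chosen segment itself
  (below + 0.5 * own) / sum(d$frequency)
}

segment_midpoint(d, "b")  # (5 + 0.5 * 3) / 10 = 0.65
segment_midpoint(d, "a")  # (3 + 5 + 0.5 * 2) / 10 = 0.9
segment_midpoint(d, "c")  # (0 + 0.5 * 5) / 10 = 0.25
```

Wrapping the segment's own frequency in sum() is a small design choice: if the chosen state does not occur in a bar, the helper still returns a single number instead of a zero-length vector.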

Upvotes: 2
