Reputation: 1365
I have built a stacked bar chart showing the relative proportions of responses to different questions. Now I want to show a particular response on top of that bar chart, to show how an individual's response relates to the overall proportions of responses.
I created a toy example here:
library(ggplot2)

n = 1000
n_groups = 5

# Overall responses: one row per response, with a state, a weight and a question id
overall_df = data.frame(
  state = sample(letters[1:8], n, replace = TRUE),
  frequency = runif(n, min = 0, max = 1),
  var_id = rep(LETTERS[1:n_groups], each = n / n_groups)
)

# One individual's response (state) to each question (var_id)
row = data.frame(
  A = "a", B = "b", C = "c", D = "h", E = "b"
)

ggplot(overall_df,
       aes(fill=state, y=frequency, x=var_id)) +
  geom_bar(position="fill", stat="identity")
The goal here is to have the responses in the object row plotted as a point in the corresponding bar chart box, with a line connecting the points.
Here is a (poorly drawn) example of the desired result. Thanks for your help.
Upvotes: 0
Views: 53
Reputation: 66425
Here's an automated approach using dplyr. I prepare the summary by joining the label data to the original data, and then using group_by + summarize to compute the y position of each point.
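library(dplyr)

# Selected state for each var_id (here a to e for A to E)
row_df <- data.frame(state = letters[1:n_groups], var_id = LETTERS[1:n_groups])

# After the join, state.x is the selected state and state.y the observed one.
# Each point sits at the middle of the selected segment: everything stacked
# below it plus half of its own share, as a fraction of the bar total.
line_df <- row_df %>%
  left_join(overall_df, by = "var_id") %>%
  group_by(var_id) %>%
  summarize(state = last(state.x),
            frequency = (sum(frequency[state.x < state.y]) +
                           sum(frequency[state.x == state.y])/2) / sum(frequency))

ggplot(overall_df, aes(fill=state, y=frequency, x=var_id)) +
  geom_bar(position="fill", stat="identity") +
  geom_point(data = line_df) +
  geom_line(data = line_df, aes(group = 1))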
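The row_df above is hard-coded to the states a through e rather than taken from the question's row object. A minimal sketch of building it from row instead, assuming row is the one-row data frame from the question, with column names matching var_id and one state per column:

```r
# One-row data frame from the question: column names are var_id values,
# cell values are the selected state for each var_id.
row <- data.frame(A = "a", B = "b", C = "c", D = "h", E = "b")

# Reshape to long form: one row per var_id with its selected state.
# as.character() keeps this working if the columns happen to be factors.
row_df <- data.frame(
  var_id = names(row),
  state  = unname(vapply(row, as.character, character(1)))
)
```

With this row_df, the rest of the pipeline should run unchanged, and the point for D would land in the h segment rather than d.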
Upvotes: 2
Reputation: 173793
This was trickier than I thought. I'm not sure there's any way round manually calculating the x/y co-ordinates of the line.
library(dplyr)
library(ggplot2)

# Total frequency per state within each var_id
df <- overall_df %>% group_by(state, var_id) %>%
  summarize(frequency = sum(frequency), .groups = "drop")

# For each var_id, the y position is the midpoint of the selected segment:
# the share stacked below it plus half of its own share
freq <- unlist(Map(function(d, val) {
  (sum(d$frequency[d$state > val]) + 0.5 * d$frequency[d$state == val]) /
    sum(d$frequency)
}, d = split(df, df$var_id), val = row))

line_df <- data.frame(state = unlist(row),
                      frequency = freq,
                      var_id = names(row))

ggplot(df, aes(fill=state, y=frequency, x=var_id)) +
  geom_col(position="fill") +
  geom_line(data = line_df, aes(group = 1)) +
  geom_point(data = line_df, aes(group = 1))
Created on 2022-03-08 by the reprex package (v2.0.1)
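To make the midpoint arithmetic used in both answers easier to check, here is a small worked sketch on a toy bar. The helper fill_midpoint is hypothetical (it is not part of ggplot2 or dplyr), and it assumes ggplot2's default stacking order, in which the first level is drawn at the top of the bar, so every state that sorts after the selected one lies below it.

```r
# Hypothetical helper: y position of the middle of the segment for `val`
# in a bar drawn with position = "fill".
# Assumes states sorting after `val` are stacked below it.
fill_midpoint <- function(states, freq, val) {
  below <- sum(freq[states > val])   # total stacked underneath
  own   <- sum(freq[states == val])  # height of the selected segment
  (below + own / 2) / sum(freq)      # rescale to the 0-1 fill axis
}

# Toy bar with three states and totals 2, 3 and 5
states <- c("a", "b", "c")
freq   <- c(2, 3, 5)

mid_b <- fill_midpoint(states, freq, "b")  # (5 + 3 / 2) / 10
mid_a <- fill_midpoint(states, freq, "a")  # (8 + 2 / 2) / 10
```

Here b occupies the band from 0.5 to 0.8, so its midpoint is 0.65, and a occupies 0.8 to 1, giving 0.9.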
Upvotes: 2