Reputation: 327
I have a stacked bar plot, and I would like to add lines (geom_line) and p-values on top of it to illustrate the p for trend for the individual levels of the variable "clarity". I've managed to add the lines, but they all cluster at the bottom of the graph instead of following the individual clarity levels. How can I offset each line along the y-axis to the cumulative percentage that its level represents (preferably to the middle of the respective span)? Once the lines are in the right place, I would like the p-value displayed just above the rightmost end of each line. I've created two dummy variables for the p-values, VVS1_p and VVS2_p, since I will compute these separately with CochranArmitageTest() for trend.
Here is my current code:
library(ggplot2)
library(dplyr)
cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
data("diamonds")
diamonds_data <- diamonds
diamonds_data_prct <- diamonds[,c("clarity", "cut")]
diamonds_data_prct <- table(diamonds_data_prct)
diamonds_data_prct <- as.data.frame(diamonds_data_prct)
diamonds_data_prct <- diamonds_data_prct %>% group_by(cut) %>% mutate(Proportion = Freq / sum(Freq)) %>% ungroup() %>% as.data.frame()
diamonds_data_prct$clarity <- factor(diamonds_data_prct$clarity)
VVS1_p <- paste0("p = 0.002")
VVS2_p <- paste0("p = 0.543")
ggplot(diamonds_data_prct, aes(x=cut, y=Proportion, fill=clarity)) +
geom_bar(stat="identity", position="fill") +
theme_bw() +
scale_fill_manual(values=cbbPalette) +
labs(x = "Cut", y="Cumulative percentage", color="clarity") +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, 1), breaks = seq(0,1, 0.1)) +
labs(fill = "clarity") +
geom_text(aes(label=paste0(Freq, sprintf(" (%1.0f", Proportion*100),"%)")), size = (11*0.3527777), family = "Arial",
position=position_stack(), vjust=1.3) +
geom_line(aes(linetype = clarity, group = clarity, y = Proportion)) +
geom_point()
Here is the current look of the graph:
Thank you for your help!
Upvotes: 1
Views: 456
Reputation: 4338
I don't love this, but I think it gets what you're going for. I created a new column for the cumulative proportion, but you then need to reverse it for the plot to work.
# Running total of Proportion within each cut, in row order
diamonds_data_prct <- diamonds_data_prct %>%
group_by(cut) %>%
mutate(cumulative_proportion = cumsum(Proportion))
ggplot(diamonds_data_prct, aes(x=cut, y=Proportion, fill=clarity)) +
geom_bar(stat="identity", position="fill") +
theme_bw() +
scale_fill_manual(values=cbbPalette) +
labs(x = "Cut", y="Cumulative percentage", color="clarity") +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, 1), breaks = seq(0,1, 0.1)) +
labs(fill = "clarity") +
geom_text(aes(label=paste0(Freq, sprintf(" (%1.0f", Proportion*100),"%)")), size = (11*0.3527777), family = "Arial",
position=position_stack(), vjust=1.3) +
#### This is the relevant part ####
geom_line(aes(linetype = clarity, group = clarity, y = rev(cumulative_proportion))) +
#### ####
geom_point()
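One caveat: rev() inside aes() reverses the whole column rather than each cut separately, so the line heights may not match the segments exactly. Here is a rough sketch of an alternative, assuming the bars stack with the first clarity level on top (which is how ggplot2 stacks by default). It accumulates the proportions from the last level upwards within each cut and puts the line at the middle of each segment. The names prct, upper, midpoint and p_labels are my own, and the p-value strings are just the placeholders from the question, not computed results.

```r
library(dplyr)
library(ggplot2)

cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73",
                "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

# Rebuild the proportions table from the question.
prct <- as.data.frame(table(diamonds[, c("clarity", "cut")])) %>%
  group_by(cut) %>%
  mutate(Proportion = Freq / sum(Freq)) %>%
  # The last clarity level sits at the bottom of each bar,
  # so accumulate from the last level upwards within each cut.
  arrange(desc(clarity), .by_group = TRUE) %>%
  mutate(upper = cumsum(Proportion),           # top edge of the segment
         midpoint = upper - Proportion / 2) %>% # middle of the segment
  ungroup()

# Placeholder p-values (VVS1_p, VVS2_p in the question), attached to the
# rightmost cut so the label lands at the right-hand end of each line.
p_labels <- prct %>%
  filter(cut == "Ideal", clarity %in% c("VVS1", "VVS2")) %>%
  mutate(label = ifelse(clarity == "VVS1", "p = 0.002", "p = 0.543"))

p <- ggplot(prct, aes(x = cut, y = Proportion, fill = clarity)) +
  geom_col(position = "fill") +
  geom_line(aes(y = midpoint, group = clarity, linetype = clarity)) +
  geom_point(aes(y = midpoint)) +
  geom_text(data = p_labels, aes(y = midpoint, label = label),
            vjust = -0.8, hjust = 1) +
  scale_fill_manual(values = cbbPalette) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Cut", y = "Cumulative percentage", fill = "clarity") +
  theme_bw()

print(p)
```

If you want the lines along the top edge of each segment instead of the middle, use y = upper. To label more clarity levels, add them to the filter() and supply one label per level.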
Upvotes: 1