Nicola

Reputation: 446

How to show labels in geom_text that are proportional to the geom_bar grouping variable

I have been trying to produce a ggplot graph whose labels show percentages computed relative to the grouping variable defined in geom_bar. Instead of percentages of the overall population, I would like each label to show the percentage within its sub-group (here, Place A and Place B), but I have not managed to do so. A reproducible example follows.

Reproducible dataframe

Random<-data.frame(replicate(3,sample(0:3,3024,rep=TRUE)))
Random$Trxn_type <- sample(c("Debit", "Credit"),
                       size = nrow(Random), 
                       prob = c(0.76, 0.24), replace = TRUE)
Random$YN <- sample(c("Yes", "No"),
                       size = nrow(Random), 
                       prob = c(0.76, 0.24), replace = TRUE)
Random$Place <- sample(c("PlaceA", "PlaceB"),
                       size = nrow(Random), 
                       prob = c(0.76, 0.24), replace = TRUE)

Random<-Random[, 4:6]

I then applied the following code:

library(ggplot2)
library(scales)  # provides percent(), used in scale_y_continuous() below

Share<-ggplot(Random, aes(x = YN, fill=Place)) +
scale_fill_brewer(palette="Greens")+
geom_bar(aes(y = ..prop.., group = Place),position = position_dodge()) + 
facet_wrap(~ Trxn_type, scales = "free_x", ncol=2)+ 
theme(strip.text.x = element_text(size = 15, colour = "black"))+
theme(panel.background = element_rect(fill = "white"),legend.position = "bottom")+
scale_y_continuous(labels = percent)+
ylab("Frequency") + 
coord_flip()+ 
xlab("Answers") + 
theme(plot.title = element_text(size = 16, face = "bold"),
      axis.text=element_text(size=12),
      axis.title=element_text(size=12))+
geom_text(aes(y=..prop..,label=scales::percent((..count..)/tapply(..count..,..PANEL..,sum)[..PANEL..])),
          stat="count", vjust=-.5, position=position_dodge(.9)) 
Share

This produced the following output:

[Image: the resulting plot]

Instead of this percentage distribution, I would like to see the percentage of replies when Place A and Place B are treated as two separate populations. Put more simply, I would like each label to show the percentage that corresponds to the length of its bar, so that the bars for Place A in Credit sum to 100% and the bars for Place B in Credit sum to 100%. The same would apply to Debit.

Thanks!

Upvotes: 1

Views: 985

Answers (1)

Rui Barradas

Reputation: 76402

Here is a solution that computes the proportions with dplyr and then pipes the result to ggplot.
I have also put all theme settings in a single call to theme().
I have reposted the data creation code at the end of this answer, this time setting the RNG seed to make the example reproducible.

library(dplyr)
library(ggplot2)

Random %>%
  count(Trxn_type, YN, Place) %>%
  left_join(Random %>% count(Trxn_type, name = "m"), by = "Trxn_type") %>%
  mutate(Prop = n/m) %>%
  ggplot(aes(x = YN, y = Prop, fill = Place)) +
  geom_col(position = position_dodge()) +
  geom_text(aes(label = scales::percent(Prop)),
            hjust = -0.25, 
            position = position_dodge(0.9)) +
  facet_wrap(~ Trxn_type, scales = "free_x", ncol = 2) +
  scale_fill_brewer(palette = "Greens") +
  scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
  xlab("Answers") +
  ylab("Frequency") +
  coord_flip() +
  theme(panel.background = element_rect(fill = "white"),
        legend.position = "bottom",
        strip.text.x = element_text(size = 15, colour = "black"),
        plot.title = element_text(size = 16, face = "bold"),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 12))

[Image: plot produced by the code above]

Edit.

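Following the OP's comment, here is a way to compute the proportions within each Place as well, so that the bars for each Place in a panel sum to 100%. The only change to the code above is the left_join instruction, which now counts by both Trxn_type and Place: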

  left_join(Random %>% count(Trxn_type, Place, name = "m"),
            by = c("Trxn_type", "Place")) %>%

[Image: plot produced with the modified left_join]
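
As a cross-check on the by-Place version, the same proportions can also be sketched without the join, by grouping the counts and dividing by each group's total. This is only an illustration: the compact data frame below is a stand-in for Random (it skips the replicate() columns, so its values differ from the plots above), and the names n_rows, props and totals are made up for this example.

```r
library(dplyr)

# Stand-in data with the same columns and probabilities as Random.
# Assumption: built directly, so the draws differ from the data
# creation code in this answer.
set.seed(1234)
n_rows <- 3024
Random <- data.frame(
  Trxn_type = sample(c("Debit", "Credit"), n_rows,
                     prob = c(0.76, 0.24), replace = TRUE),
  YN        = sample(c("Yes", "No"), n_rows,
                     prob = c(0.76, 0.24), replace = TRUE),
  Place     = sample(c("PlaceA", "PlaceB"), n_rows,
                     prob = c(0.76, 0.24), replace = TRUE)
)

# Count each Trxn_type/YN/Place cell, then divide by the
# total of its Trxn_type/Place group
props <- Random %>%
  count(Trxn_type, YN, Place) %>%
  group_by(Trxn_type, Place) %>%
  mutate(Prop = n / sum(n)) %>%
  ungroup()

# Sum Prop within each Trxn_type/Place group; every total should be 1
totals <- props %>%
  count(Trxn_type, Place, wt = Prop, name = "total")
totals
```

If the totals all come out as 1, props can be piped into the same ggplot() call as above in place of the count/left_join/mutate steps.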

Data creation code.

set.seed(1234)
Random <- data.frame(replicate(3,sample(0:3,3024,rep=TRUE)))
Random$Trxn_type <- sample(c("Debit", "Credit"),
                           size = nrow(Random),
                           prob = c(0.76, 0.24), replace = TRUE)
Random$YN <- sample(c("Yes", "No"),
                    size = nrow(Random),
                    prob = c(0.76, 0.24), replace = TRUE)
Random$Place <- sample(c("PlaceA", "PlaceB"),
                       size = nrow(Random),
                       prob = c(0.76, 0.24), replace = TRUE)

Random <- Random[, 4:6]
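
For comparison with the approach in the question, the labels can probably also be computed inside ggplot2 by letting the count stat work per Place group, so that its computed prop variable is already the within-group share in each panel. A rough sketch, assuming ggplot2 3.3.0 or later for after_stat(); the object name p, the stand-in data frame, and the accuracy and hjust values are choices made for this example, not part of the solution above.

```r
library(ggplot2)

# Stand-in data with the same columns and probabilities as Random.
# Assumption: built directly, so the draws differ from the data
# creation code above.
set.seed(1234)
n_rows <- 3024
Random <- data.frame(
  Trxn_type = sample(c("Debit", "Credit"), n_rows,
                     prob = c(0.76, 0.24), replace = TRUE),
  YN        = sample(c("Yes", "No"), n_rows,
                     prob = c(0.76, 0.24), replace = TRUE),
  Place     = sample(c("PlaceA", "PlaceB"), n_rows,
                     prob = c(0.76, 0.24), replace = TRUE)
)

# group = Place makes stat_count() compute prop separately for each
# Place within each facet panel; bars and labels both use that value
p <- ggplot(Random, aes(x = YN, fill = Place, group = Place)) +
  geom_bar(aes(y = after_stat(prop)), position = position_dodge(0.9)) +
  geom_text(aes(y = after_stat(prop),
                label = scales::percent(after_stat(prop), accuracy = 0.1)),
            stat = "count", hjust = -0.1,
            position = position_dodge(0.9)) +
  facet_wrap(~ Trxn_type, ncol = 2) +
  scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
  coord_flip()
p
```

The design difference from the dplyr solution is that nothing is precomputed: the same stat supplies both the bar length and the label, so the two cannot drift apart.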

Upvotes: 1
