Larissa Cury

Reputation: 868

Plot 95% CIs for a proportions table in ggplot2

As far as I'm concerned, the most informative error bar is the 95% CI, so I want to plot it for my proportions table. How do I correctly calculate the 95% CI for a proportions table and plot it with ggplot2?

The table holds the proportion of the data (number of schools) collected per region (A to E).

## calculate proportions:

region <- data %>% 
  count(Q8) %>% 
  mutate(prop = round(prop.table(n) * 100, digits = 2),
         sd = round(sd(prop.table(n)), digits = 2),
         Q8 = fct_reorder(Q8, n)) %>% 
  arrange(n)

## output
> region
  Q8  n  prop   sd
1  E  3 10.34 0.12
2  C  3 10.34 0.12
3  B  4 13.79 0.12
4  A  9 31.03 0.12
5  D 10 34.48 0.12

## calculate 95% CI:

region_ci <- data.frame(DescTools::MultinomCI(region$n, conf.level = 0.95)) %>%
  mutate_if(is.numeric, round, 2)

## output
> region_ci
   est lwr.ci upr.ci
1 0.10   0.00   0.29
2 0.10   0.00   0.29
3 0.14   0.00   0.32
4 0.31   0.14   0.49
5 0.34   0.17   0.53

## plot:

region %>% 
  ggplot(aes(y = prop, x = ordered(Q8), fill = Q8)) + 
  geom_bar(stat = "identity", width = 0.3) +
  geom_errorbar(aes(ymin= region_ci$lwr.ci, ymax= region_ci$upr.ci, 
                    width= .1)) + 
  geom_text(aes(label = round(prop, 1.5)),
            nudge_y = 2) + # so the labels don't hit the tops of the bars
  labs(x = "place",
       y = '(%)')

my pic

> dput(region)
structure(list(Q8 = structure(1:5, .Label = c("E", "C", "B", 
"A", "D"), class = "factor"), n = c(3L, 3L, 4L, 9L, 10L), prop = c(10.34, 
10.34, 13.79, 31.03, 34.48), sd = c(0.12, 0.12, 0.12, 0.12, 0.12
)), row.names = c(NA, -5L), class = "data.frame")

Upvotes: 0

Views: 454

Answers (1)

SAL

Reputation: 2140

Your code is correct. The only issue is a scale mismatch: the prop values in region are percentages (prop*100), but the CI bounds in region_ci are still proportions between 0 and 1. So in the ggplot call, multiply the lower and upper CI bounds by 100 too:

region %>% 
  ggplot(aes(y = prop, x = ordered(Q8), fill = Q8)) + 
  geom_bar(stat = "identity", width = 0.3) +
  # CI bounds are proportions (0-1); multiply by 100 to match prop (%)
  geom_errorbar(aes(ymin = region_ci$lwr.ci * 100, ymax = region_ci$upr.ci * 100), 
                width = .1) + 
  geom_text(aes(label = round(prop, 1.5)),
            nudge_y = 2) + # so the labels don't hit the tops of the bars
  labs(x = "place",
       y = '(%)')

[image: graph output]
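
As a possible variation, the rescaled bounds can be stored as columns of the plotting data, so that aes() refers to columns instead of region_ci$... vectors. This is only a sketch: region and region_ci are rebuilt here from the values printed in the question so that it runs on its own, the names region_plot, lwr_pct and upr_pct are made up for illustration, and it assumes the rows of region_ci are in the same order as the rows of region.

```r
library(ggplot2)

# Counts and percentages, copied from the dput() output in the question
region <- data.frame(
  Q8   = factor(c("E", "C", "B", "A", "D"), levels = c("E", "C", "B", "A", "D")),
  n    = c(3L, 3L, 4L, 9L, 10L),
  prop = c(10.34, 10.34, 13.79, 31.03, 34.48)
)

# CI bounds on the 0-1 scale, copied from the printed region_ci in the question
region_ci <- data.frame(
  est    = c(0.10, 0.10, 0.14, 0.31, 0.34),
  lwr.ci = c(0.00, 0.00, 0.00, 0.14, 0.17),
  upr.ci = c(0.29, 0.29, 0.32, 0.49, 0.53)
)

# Hypothetical helper columns: CI bounds converted to percentages.
# Assumes region_ci rows line up one-to-one with region rows.
region_plot <- cbind(
  region,
  lwr_pct = region_ci$lwr.ci * 100,
  upr_pct = region_ci$upr.ci * 100
)

p <- ggplot(region_plot, aes(x = Q8, y = prop, fill = Q8)) +
  geom_col(width = 0.3) +  # same bars as geom_bar(stat = "identity")
  geom_errorbar(aes(ymin = lwr_pct, ymax = upr_pct), width = 0.1) +
  labs(x = "place", y = "(%)")

print(p)
```

With the bounds and the bar heights in the same data frame and on the same percent scale, it is easier to spot a mismatch like the one in the question before plotting.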

Upvotes: 1
