pauschig

Reputation: 11

How to plot a stacked and grouped bar plot in RStudio?

I have already created a grouped bar chart that shows the number of clients (n=25) grouped by age group and gender. Now I need to include all clients (n=52) who were screened for the study, to show the quality of the data. It should look like the attached bar chart, but with the total number stacked on top in a lighter color, so that one can see, for example, that in Age Group 1 we screened 15 females and 9 males, but in the end only 8 females signed up for the study. I hope this is understandable. I'll add the existing bar chart.

The code for this chart is the following:

ggplot(data = daten, aes(x = AG, y = after_stat(count), group = factor(GESCHLECHT), fill = factor(GESCHLECHT))) +
  geom_bar(position = position_dodge2(preserve = "single", padding = 0)) +
  scale_fill_manual(values = c("steelblue", "darkred"), labels = c("Männlich", "Weiblich")) +
  labs(y = "Anzahl", x = "Altersgruppen", fill = "Geschlecht:") +
  scale_x_continuous(breaks = seq(1, 3, by = 1), labels = c("25 bis <51", "51 bis <65", ">65")) +
  scale_y_continuous(limits = c(0, 10)) +
  geom_text(aes(label = after_stat(count)), stat = "count", vjust = -0.5, position = position_dodge(width = 1)) +
  theme(legend.position = "bottom")

I have no idea how to organize my data so that I can do something like that. In the end, the bar should continue above the first bar (n=8), only in a lighter color, for example with n=7 more females (that would show that we screened 15 females in total in that age group).

I think I would first need to create a data frame with all n=52 clients, with a code for age group and gender. But how can I split the bars so that the difference between screened and recruited clients becomes clear?

Upvotes: 0

Views: 275

Answers (1)

danlooo

Reputation: 10627

Something like this?

library(tidyverse)

set.seed(1)
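# example data: one row per client
n <- 100
data <- tibble(
  gender = sample(c("m", "f"), n, replace = TRUE),
  age = runif(n, 25, 70) %>% as.integer(),
  # logical flag, TRUE for 80 % of the rows; TRUE rows are drawn solid in the plot
  screened = rep(TRUE, 0.8 * n) %>% c(rep(FALSE, 0.2 * n)) %>% sample()
)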
data
#> # A tibble: 100 × 3
#>    gender   age screened
#>    <chr>  <int> <lgl>   
#>  1 m         54 TRUE    
#>  2 f         40 FALSE   
#>  3 m         37 FALSE   
#>  4 m         69 TRUE    
#>  5 f         53 TRUE    
#>  6 m         34 TRUE    
#>  7 m         30 TRUE    
#>  8 m         46 TRUE    
#>  9 f         66 TRUE    
#> 10 f         51 TRUE    
#> # … with 90 more rows

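data %>%
  mutate(
    # bin ages into groups; cut() intervals are right-closed by default, e.g. (25,51]
    age_group = age %>% cut(breaks = c(0, 25, 51, 65, Inf))
  ) %>%
  ggplot(aes(x = age_group, fill = gender, alpha = screened)) +
  # geom_bar() stacks by default: one bar per age group, split by gender and flag
  geom_bar() +
  # solid fill where the flag is TRUE, lighter fill where it is FALSE
  scale_alpha_manual(values = c(`TRUE` = 1, `FALSE` = 0.5))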

Created on 2022-04-13 by the reprex package (v2.0.0)
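
If the bars should stay side by side per gender, as in the question's chart, one possible variation is to summarise the data to one row per age group and gender and draw two dodged layers: a lighter bar for everyone screened and a solid bar for those recruited on top of it. This is only a sketch under assumptions. The column names `screened` and `recruited` are made up for illustration, and all counts are hypothetical except the Age Group 1 figures mentioned in the question. Strictly speaking the bars are overlaid rather than stacked, but the visible lighter part corresponds to the clients who were screened and not recruited.

```r
library(ggplot2)

# Hypothetical counts per age group (AG) and gender (GESCHLECHT).
# Only the AG 1 values (15 women and 9 men screened, 8 women recruited)
# come from the question; everything else is invented for illustration.
counts <- data.frame(
  AG         = rep(1:3, each = 2),
  GESCHLECHT = rep(c("m", "w"), times = 3),
  screened   = c(9, 15, 8, 7, 6, 7),  # sums to 52
  recruited  = c(5, 8, 4, 3, 2, 3)    # sums to 25
)

# Use the same dodge for every layer so bars and labels line up
dodge <- position_dodge(width = 0.9)

p <- ggplot(counts, aes(x = factor(AG), fill = GESCHLECHT, group = GESCHLECHT)) +
  # lighter bars: all screened clients
  geom_col(aes(y = screened), position = dodge, alpha = 0.4) +
  # solid bars drawn on top: recruited clients
  geom_col(aes(y = recruited), position = dodge) +
  # label each bar with the screened total
  geom_text(aes(y = screened, label = screened), position = dodge, vjust = -0.5) +
  scale_fill_manual(
    values = c(m = "steelblue", w = "darkred"),
    labels = c(m = "Male", w = "Female")
  ) +
  scale_x_discrete(labels = c("25 to <51", "51 to <65", ">65")) +
  labs(x = "Age group", y = "Count", fill = "Gender:") +
  theme(legend.position = "bottom")

p
```

With one row per client and a logical recruited flag, a table like `counts` could be produced by grouping on age group and gender and taking the number of rows and the sum of the flag.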

Upvotes: 1
