Cyrus Mohammadian

Reputation: 5193

Workflow to convert data used for stacked bar ggplot to one usable for stacked percentage plot

I have the following dataset:

(df<-structure(list(age_group = structure(c(3L, 3L, 5L, 3L, 5L, 5L, 
5L, 3L, 5L, 5L, 4L, 4L, 4L, 3L, 5L), .Label = c("65+", "55-64", 
"45-54", "35-44", "25-34", "18-24"), class = "factor"), Gender = c("F", 
"M", "M", "M", "F", "M", "M", "M", "F", "M", "M", "F", "M", "F", 
"M")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-15L), .Names = c("age_group", "Gender")))
    # A tibble: 15 x 2
   age_group Gender
   <fct>     <chr> 
 1 45-54     F     
 2 45-54     M     
 3 25-34     M     
 4 45-54     M     
 5 25-34     F     
 6 25-34     M     
 7 25-34     M     
 8 45-54     M     
 9 25-34     F     
10 25-34     M     
11 35-44     M     
12 35-44     F     
13 35-44     M     
14 45-54     F     
15 25-34     M 

From this, I created the following stacked barplot using ggplot:

[image: stacked bar plot]

I would now like to make a stacked percentage plot like the one shown in the following SO question: Create stacked barplot where each stack is scaled to sum to 100%

What is the workflow for preparing my data for the stacked percentage plot? The data in the SO question linked above has an additional column of values, which mine lacks.

Upvotes: 0

Views: 39

Answers (1)

d.b

Reputation: 32548

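library(ggplot2)

# Count the rows in each age_group/Gender combination
dat = aggregate(list(value = 1:NROW(df)), df[c("age_group", "Gender")], length)
# Divide each count by its age group's total so every bar sums to 1
dat$proportion = ave(dat$value, dat$age_group, FUN = function(x) x/sum(x))
ggplot(dat, aes(x = age_group, y = proportion, fill = Gender)) +
    geom_col() +
    coord_flip()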
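
If the aggregated table is not needed for anything else, a shorter route might be to skip the preparation step and let ggplot2 compute the proportions itself with `geom_bar(position = "fill")`. This is a different technique from the `aggregate()`/`ave()` workflow above, shown only as a minimal sketch. It assumes the data from the question (rebuilt here as a plain data.frame so the snippet runs on its own) and that the scales package is installed, which is used only for the percent axis labels.

```r
library(ggplot2)

# Same rows as the question's df, rebuilt as a plain data.frame (assumption:
# the tibble class is not needed for plotting)
df <- data.frame(
  age_group = factor(
    c("45-54", "45-54", "25-34", "45-54", "25-34", "25-34", "25-34", "45-54",
      "25-34", "25-34", "35-44", "35-44", "35-44", "45-54", "25-34"),
    levels = c("65+", "55-64", "45-54", "35-44", "25-34", "18-24")
  ),
  Gender = c("F", "M", "M", "M", "F", "M", "M", "M",
             "F", "M", "M", "F", "M", "F", "M")
)

# geom_bar() counts rows per age_group/Gender; position = "fill" rescales
# each stacked bar so that it sums to 1
p <- ggplot(df, aes(x = age_group, fill = Gender)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +  # show 0-1 as 0%-100%
  labs(y = "proportion") +
  coord_flip()
p
```

The trade-off is that the proportions live only inside the plot. If the numbers themselves are needed, for example to label the bars, the explicit `dat$proportion` column is the more convenient starting point.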

Upvotes: 1
