unknown

Reputation: 883

How to order geom_col by the largest average proportion for a group and then plot a stacked bar

I have data that I would like to order by the group with the highest average proportion and then plot as a stacked bar chart. I have managed to do this by building several intermediate data frames, but it's verbose. Is there a more concise way of doing this?

library(dplyr)
library(ggplot2)

set.seed(3)
x    <- rep(letters[1:5], 3)
fill <- rep(letters[24:26], 5)
n    <- runif(15, 0, 1)
df   <- data.frame(x, fill, n)

df2 <- df %>%
  group_by(x) %>%
  mutate(percent = n/sum(n))

df3 <- df2 %>%
  group_by(fill) %>%
  summarise(mean = mean(percent))  %>%
  ungroup() %>%
  arrange(desc(mean))

df3 <- df2[df2$fill == df3$fill[1], ] %>%
  arrange(desc(percent))
df$x <- factor(df$x, levels = df3$x)    

# df has no column named y; the value column is n
ggplot(data = df, aes(x, n, fill = fill)) +
  geom_col(position = position_fill())


Upvotes: 1

Views: 467

Answers (1)

JasonAizkalns

Reputation: 20483

I am not sure this is necessarily better, but here's one approach that yields the same graph as the one in your question:

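library(forcats)  # for fct_reorder()

df %>%
  group_by(x) %>%
  mutate(pct = n / sum(n)) %>%     # share of each fill within its x
  ungroup() %>%
  arrange(fill != "z", desc(pct)) %>%  # "z" rows first, then by descending pct
  group_by(fill) %>%
  mutate(order = row_number()) %>%     # position within each fill after sorting
  ggplot(aes(fct_reorder(x, order), pct, fill = fill)) +
  geom_col()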
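
One detail worth keeping in mind with the code above: every x has three order values (one per fill), and fct_reorder() collapses them into a single number per level using its .fun argument, which defaults to median. The small sketch below uses made-up vectors (f and v are illustrative only, not from the question) to show how the choice of .fun changes the resulting level order:

```r
library(forcats)

# Illustrative data: two groups with three values each
f <- c("a", "a", "a", "b", "b", "b")
v <- c(1, 2, 30, 4, 5, 6)

# Default summary is the median: a -> 2, b -> 5, so "a" sorts first
levels(fct_reorder(f, v))

# With max as the summary: a -> 30, b -> 6, so "b" sorts first
levels(fct_reorder(f, v, .fun = max))
```

So if the bars do not come out in the order you expect, checking which summary is applied to order is a reasonable first step.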

Depending on what you are actually trying to compare, you may want to consider a different ordering or perhaps facets. For example, consider what happens when you choose to facet vs. stacking:

df %>% 
  group_by(x) %>%
  mutate(pct = n / sum(n)) %>%
  ggplot(aes(x, pct, fill = fill)) +
  geom_col() +
  facet_wrap(~ fill)


Update 2019-02-18 (per comments): this version no longer requires knowing z ahead of time. It orders by mean(pct) for each fill, then by pct:

df %>%
  group_by(x) %>%
  mutate(pct = n / sum(n)) %>%
  group_by(fill) %>%
  mutate(mean_pct = mean(pct)) %>%
  arrange(desc(mean_pct), desc(pct)) %>%
  mutate(order = row_number()) %>%
  ggplot(aes(fct_reorder(x, order), pct, fill = fill)) +
  geom_col()
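
If you would rather make the ordering explicit instead of relying on how fct_reorder() summarises order, one option is to compute the factor levels directly. This is only a sketch under the assumption that you want x ordered by its share of whichever fill has the largest mean share; the names pct_df, top_fill and x_levels are mine, not from the question:

```r
library(dplyr)
library(ggplot2)

set.seed(3)
df <- data.frame(
  x    = rep(letters[1:5], 3),
  fill = rep(letters[24:26], 5),
  n    = runif(15, 0, 1),
  stringsAsFactors = FALSE
)

# Share of each fill within every x
pct_df <- df %>%
  group_by(x) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

# Fill level with the largest mean share across x
top_fill <- pct_df %>%
  group_by(fill) %>%
  summarise(mean_pct = mean(pct)) %>%
  arrange(desc(mean_pct)) %>%
  slice(1) %>%
  pull(fill)

# x values sorted by their share of that fill, largest first
x_levels <- pct_df %>%
  filter(fill == top_fill) %>%
  arrange(desc(pct)) %>%
  pull(x)

# Apply the levels and plot; pct already sums to 1 within each x
p <- pct_df %>%
  mutate(x = factor(x, levels = x_levels)) %>%
  ggplot(aes(x, pct, fill = fill)) +
  geom_col()

p
```

This is a few lines longer than the pipeline above, but the two intermediate values (top_fill and x_levels) can be inspected on their own, which makes it easier to confirm the bars are in the order you intended.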

Upvotes: 1
