Reputation: 883
I have data that I would like to order by the highest group average and then plot as a stacked bar chart. I have managed to do this by producing several intermediate data frames, but it's verbose, and I am wondering whether there is a more concise way of doing it.
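library(dplyr)
library(ggplot2)

set.seed(3)
x <- rep(letters[1:5], 3)
fill <- rep(letters[24:26], 5)
n <- runif(15, 0, 1)
df <- data.frame(x, fill, n)

# Share of each fill within every x
df2 <- df %>%
  group_by(x) %>%
  mutate(percent = n / sum(n))

# Mean share per fill, highest first
df3 <- df2 %>%
  group_by(fill) %>%
  summarise(mean = mean(percent)) %>%
  ungroup() %>%
  arrange(desc(mean))

# Rows of the top fill, ordered by share; these give the x order
df3 <- df2[df2$fill == df3$fill[1], ] %>%
  arrange(desc(percent))

df$x <- factor(df$x, levels = df3$x)

# y is n (the original referenced a nonexistent column y); position_fill() rescales to shares
ggplot(data = df, aes(x, n, fill = fill)) +
  geom_col(position = position_fill())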
Upvotes: 1
Views: 467
Reputation: 20483
I am not sure this is necessarily better, but here's one approach that yields the same graph as in your question:
library(forcats)  # for fct_reorder()

df %>%
  group_by(x) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup() %>%
  arrange(fill != "z", desc(pct)) %>%   # "z" rows first, then by descending share
  group_by(fill) %>%
  mutate(order = row_number()) %>%      # rank within each fill
  ggplot(aes(fct_reorder(x, order), pct, fill = fill)) +
  geom_col()
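One detail worth keeping in mind with this approach: here each x has several order values (one per fill), and fct_reorder() collapses multiple values per level with a summary function, which defaults to median. The toy vectors f and v below are made up purely for illustration; this is only a small sketch of that behaviour, not part of the original answer:

library(forcats)

# Hypothetical toy data: two levels, three values each
f <- c("a", "a", "a", "b", "b", "b")
v <- c(1, 10, 100, 2, 3, 4)

# Default summary is the median: a = 10, b = 3, so "b" sorts first
by_median <- levels(fct_reorder(f, v))

# A different summary can change the order: min gives a = 1, b = 2
by_min <- levels(fct_reorder(f, v, .fun = min))

So if the bars need to follow one particular fill exactly, it may be safer to check the resulting levels rather than assume them.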
Depending on what you are actually trying to compare, you may want to consider a different ordering or perhaps facets. For example, consider what happens when you choose to facet vs. stacking:
df %>%
  group_by(x) %>%
  mutate(pct = n / sum(n)) %>%
  ggplot(aes(x, pct, fill = fill)) +
  geom_col() +
  facet_wrap(~ fill)
Update 2019-02-18 (per comments)
Updating to abstract away knowing z ahead of time: ordering by mean(pct) for each fill, followed by pct:
df %>%
  group_by(x) %>%
  mutate(pct = n / sum(n)) %>%
  group_by(fill) %>%
  mutate(mean_pct = mean(pct)) %>%
  arrange(desc(mean_pct), desc(pct)) %>%
  mutate(order = row_number()) %>%
  ggplot(aes(fct_reorder(x, order), pct, fill = fill)) +
  geom_col()
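If the goal is for the x order to follow the top fill exactly, a more explicit alternative is to compute the factor levels directly and skip the rank column. The following is a minimal sketch under that assumption; the names pct_df, top_fill, x_levels and p are hypothetical and introduced only for illustration:

library(dplyr)
library(ggplot2)

set.seed(3)
df <- data.frame(
  x    = rep(letters[1:5], 3),
  fill = rep(letters[24:26], 5),
  n    = runif(15, 0, 1)
)

# Share of each fill within every x
pct_df <- df %>%
  group_by(x) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

# Fill with the highest mean share
top_fill <- pct_df %>%
  group_by(fill) %>%
  summarise(mean_pct = mean(pct), .groups = "drop") %>%
  slice_max(mean_pct, n = 1, with_ties = FALSE) %>%
  pull(fill)

# x values ordered by their share of top_fill, largest first
x_levels <- pct_df %>%
  filter(fill == top_fill) %>%
  arrange(desc(pct)) %>%
  pull(x)

# Apply the levels and plot
p <- pct_df %>%
  mutate(x = factor(x, levels = x_levels)) %>%
  ggplot(aes(x, pct, fill = fill)) +
  geom_col()

Printing p draws the chart. Keeping x_levels as a separate vector makes the ordering easy to inspect before plotting.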
Upvotes: 1