Turan

Reputation: 142

Ordering within groups not preserved in geom_col dodged position

I have a data frame that summarizes a larger dataset, which I tried to replicate below. I set the score column as a factor so that the labels in the plots are correct.

I want to sort this data frame by score group and, within each group, by the count column (n). The order in which I would like ggplot to show my horizontal bars is therefore: from min to max (bottom to top of the graph) on the x-axis (the y-axis in the output, since the bars are flipped/horizontal), and within a score group, starting from the min score group, in descending order of count (n). The same order should be preserved for the next group (i.e. low), but themes that didn't appear in the previous score group can be inserted according to their count (n) value.

I tried sorting my data frame, but the results are not what I expect. For example, rows 5 and 6 should be switched in my sorted data frame, since cos appeared before foo in the previous score group (i.e. min). I also tried changing the order of the factor levels using reorder and forcats, but to no avail.

library(tidyverse)

df <- tribble(
~score, ~theme, ~n,
5, "foo", 1,
5, "bar", 1,
4, "let", 3,
3, "let", 1,
3, "cos", 1,
3, "foo", 2,
2, "foo", 3,
2, "let", 4,
2, "cos", 5
)

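# arrange() ignores grouping by default, so group_by() is not needed here
data <- df %>%
      arrange(desc(score), n) %>%
      # label the numeric scores; unused levels (1 = "min") are kept in the factor
      mutate(score = factor(score, levels = c(1, 2, 3, 4, 5), labels = c("min", "low", "avg", "high", "max")))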
data

plot <- data %>%
      ggplot(mapping = aes(x = score, y = n, fill = theme)) +
      geom_col(position = position_dodge2(width = 0.9, preserve = "single")) +
      coord_flip() +
      scale_y_continuous(expand = expansion(mult = c(0, .1))) +
      guides(fill = guide_legend(ncol = 2, byrow = TRUE)) +
      labs(y = "n", x = "scoring", fill = "vars")

plot

My expected graph would be:

MAX BAR (<- unsure since equal)
MAX FOO
HIGH LET
AVG FOO
AVG LET
AVG COS
LOW FOO
LOW LET
LOW COS

Upvotes: 0

Views: 631

Answers (1)

r2evans

Reputation: 160407

You factored the score but not the theme. Know this, though: ggplot2 is going to order the bars starting from the y-axis origin, so your order of "BAR before FOO" is better stated as "BAR above FOO" or "BAR after FOO", which in factor terms means "FOO before BAR".

df$theme <- factor(df$theme, levels = rev(c("bar", "foo", "let", "cos")))
# run the 'data' and 'plot' code, unchanged

(It's not strictly necessary to use rev here, its use is purely demonstrative, declaring that the order of what we think is more important -- "bar" on top -- is opposite the direction ggplot is using.)
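To see what the rev() call does to the level order, here is a minimal base-R sketch. The names theme_order, theme and theme_f are only illustrative and are not part of the original code. The theme values are copied from the question's tribble.

```r
# Desired top-to-bottom order within each score group (from the question).
theme_order <- c("bar", "foo", "let", "cos")

# Same values as the theme column in the question's tribble.
theme <- c("foo", "bar", "let", "let", "cos", "foo", "foo", "let", "cos")

# Levels are given bottom-to-top, hence rev().
theme_f <- factor(theme, levels = rev(theme_order))

# The first level is the one placed closest to the axis origin.
levels(theme_f)

# Integer codes for the first two rows ("foo" and "bar").
as.integer(theme_f)[1:2]
```

Writing the levels directly as c("cos", "let", "foo", "bar") would give the same factor. The rev() form only keeps the vector in the order a reader thinks about the plot.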

If you want the same colors as before my change, then add

... +
  scale_fill_manual(values = c(bar="#F8766D", foo="#00BFC4", cos="#7CAE00", let="#C77CFF"))

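(I derived the colors by using gg_color_hue(4), a commonly shared helper function that reproduces ggplot2's default hues and is not part of ggplot2 itself, and then reordering the values= vector to get it right.)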
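For reference, a sketch of how such a palette can be derived. gg_color_hue is not exported by ggplot2. The definition below is the widely circulated user-written version and is an assumption about what was used here. The name pal is illustrative only. With an unordered character column, ggplot2 assigns default colors in alphabetical order of the values, which is why the names are sorted before pairing.

```r
# User-defined helper (an assumption, not a ggplot2 function): n evenly
# spaced hues on the HCL color wheel, like ggplot2's default discrete palette.
gg_color_hue <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  grDevices::hcl(h = hues, l = 65, c = 100)[1:n]
}

# Themes in alphabetical order, i.e. the order used before theme was a factor.
themes <- sort(c("bar", "foo", "let", "cos"))

# Named vector: each theme paired with the color it had originally.
pal <- setNames(gg_color_hue(length(themes)), themes)
pal
```

Because the vector is named, it can be passed as scale_fill_manual(values = pal) regardless of the current factor level order.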

Upvotes: 1
