Reputation: 21400
I have a dataframe with three variables; one ("group") is a factor with two levels, one ("word") is a character vector, and one ("duration") is numeric. For example:
DATA <- data.frame(
  group = c(rep("prefinal", 10), rep("final", 10)),
  word = c(sample(LETTERS[1:5], 10, replace = TRUE), sample(LETTERS[1:5], 10, replace = TRUE)),
  duration = rnorm(20)
)
DATA
      group word    duration
1  prefinal    C  0.16378771
2  prefinal    E  0.13370196
3  prefinal    A  0.69112398
4  prefinal    B  0.21499187
5  prefinal    D -0.28998279
6  prefinal    D -2.00353522
7  prefinal    A  0.37842555
8  prefinal    E  1.62326170
9  prefinal    A -0.26294929
10 prefinal    B -0.54276322
11    final    D  1.32772171
12    final    E -1.84902285
13    final    C  0.01058158
14    final    E  1.49529743
15    final    B  0.55291290
16    final    A -0.35484820
17    final    D -0.16822110
18    final    A  0.88667458
19    final    E  0.70889916
20    final    B  1.12217332
I'd like to depict the durations of the words by group in boxplots:
boxplot(DATA$duration ~ DATA$group + DATA$word,
        xaxt = "n",
        col = rep(c("blue", "red"), 5))
axis(1, at = seq(from = 1.5, to = 10.5, by = 2), labels = sort(unique(DATA$word)), cex.axis = 0.9)
R seems to order the boxes in alphabetical order (of the "word" variable) by default.
EDIT:
However, I'd prefer the boxes to be sorted in descending order of the median duration that each item in the "word" variable has in the "prefinal" group. How can that be achieved?
Upvotes: 1
Views: 409
Reputation: 6441
You can reorder the levels of DATA$word according to their median duration. The minus sign before DATA$duration sorts the levels in descending order.
DATA$word <- reorder(DATA$word, -DATA$duration, FUN = median)
boxplot(DATA$duration ~ DATA$group + DATA$word,
        xaxt = "n",
        col = rep(c("blue", "red"), 5))
axis(1, at = seq(from = 1.5, to = 10.5, by = 2), labels = levels(DATA$word), cex.axis = 0.9)
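To check that reorder() produces the order described above, the resulting levels can be compared with medians computed directly. This is only a minimal sketch on made-up data: the seed and the data frame name `check` are assumptions for illustration and are not part of the question.

```r
# Hypothetical example data with the same column types as DATA
set.seed(1)
check <- data.frame(
  word = rep(LETTERS[1:5], times = 4),
  duration = rnorm(20)
)

# Reorder the levels of word by descending median duration
check$word <- reorder(check$word, -check$duration, FUN = median)

# Medians per word, computed independently and sorted in descending order
med <- sort(tapply(check$duration, as.character(check$word), median),
            decreasing = TRUE)

# The factor levels should match the names of the sorted medians
identical(levels(check$word), names(med))
```

If two words happened to have exactly the same median, the comparison could differ in how the tie is broken, which should not occur with continuous random data like this.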
You can do the same using only the prefinal subgroup, but it requires an additional step:
ordered_levels <- levels(with(DATA[DATA$group == "prefinal",], reorder(word, -duration, FUN = median)))
DATA$word <- factor(DATA$word, levels = ordered_levels)
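Putting the two steps together with the plotting code, the whole thing might look like the sketch below. The seed, the data frame name `demo`, and the helper `prefinal` are assumptions made for illustration; the data is constructed so that every word occurs in both groups.

```r
# Hypothetical data with the same columns as DATA in the question
set.seed(42)
demo <- data.frame(
  group = rep(c("prefinal", "final"), each = 10),
  word = rep(LETTERS[1:5], times = 4),
  duration = rnorm(20)
)

# Order the levels of word by descending median within the "prefinal" rows only
prefinal <- demo[demo$group == "prefinal", ]
ordered_levels <- levels(reorder(prefinal$word, -prefinal$duration, FUN = median))
demo$word <- factor(demo$word, levels = ordered_levels)

# Group levels are alphabetical, so within each pair the first (blue) box is
# "final" and the second (red) box is "prefinal"
boxplot(duration ~ group + word, data = demo,
        xaxt = "n",
        col = rep(c("blue", "red"), nlevels(demo$word)))
axis(1, at = seq(from = 1.5, by = 2, length.out = nlevels(demo$word)),
     labels = levels(demo$word), cex.axis = 0.9)
```

Note that factor() turns any value not listed in `levels` into NA, so a word that never occurs in the prefinal group would be dropped from the plot unless it is appended to `ordered_levels` first.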
Upvotes: 1