order data.frame based on 2 columns and vector of variables

Question

I need to order a data.frame based on 2 columns and a given vector of variables.

Here is an example of my df:

df = data.frame(A = rnorm(45),
                B = rep(c('a', 'b', 'c'), each= 5, times = 3),
                C = rep(c(10, 20, 30), each = 15))

I need to change the order of col B from c('a', 'b', 'c') to c('c', 'a', 'b'), while keeping the rows grouped by the 3 values of col C.

Here are the first 30 rows of my desired output:

           A B  C
 -0.11451485 c 10
 -0.11860742 c 10
  0.08156183 c 10
  1.11850750 c 10
 -0.79072556 c 10
  1.24141030 a 10
  0.88538811 a 10
 -1.35548712 a 10
  0.05723677 a 10
  0.14660464 a 10
 -0.28587107 b 10
  0.59452832 b 10
  1.00163605 b 10
  1.15892322 b 10
 -1.41771696 b 10
 -2.05743546 c 20
 -1.22835358 c 20
  1.50060736 c 20
 -0.14956114 c 20
 -1.13126592 c 20
  1.08571256 a 20
 -1.04991699 a 20
 -1.50655996 a 20
 -0.63675392 a 20
 -0.26485423 a 20
  0.30509657 b 20
  0.85471772 b 20
 -0.54064736 b 20
  0.24578056 b 20
  0.14917900 b 20

Any help will be really appreciated, thanks

www · Accepted Answer

The key is to change the levels of the factor column. After that, we can use arrange from the dplyr package to sort by multiple columns. Note that sorting by column A is not a requirement in your original post; I added column A to the arrange call only to show that it is easy to pass more than two columns to arrange.

library(dplyr)

df2 <- df %>%
  # Change the level of the factor
  mutate(B = factor(B, levels = c("c", "a", "b"))) %>%
  # Arrange the column
  arrange(C, B, A)
df2
#              A B  C
# 1  -2.39317699 c 10
# 2  -1.48901928 c 10
# 3  -0.42562766 c 10
# 4   0.03383395 c 10
# 5   0.66362189 c 10
# 6  -0.65324997 a 10
# 7  -0.59408686 a 10
# 8   0.37012883 a 10
# 9   0.53238177 a 10
# 10  3.03972004 a 10
# 11 -2.03192274 b 10
# 12 -1.05138447 b 10
# 13 -0.80795342 b 10
# 14  1.74526091 b 10
# 15  2.07681466 b 10
# 16 -1.90573715 c 20
# 17 -0.72626244 c 20
# 18 -0.48017481 c 20
# 19 -0.42995920 c 20
# 20  0.17729002 c 20
# 21 -0.62947278 a 20
# 22 -0.40038152 a 20
# 23 -0.23368555 a 20
# 24  0.44218806 a 20
# 25  1.58561071 a 20
# 26 -0.66270426 b 20
# 27 -0.50256255 b 20
# 28 -0.19890974 b 20
# 29  0.26562533 b 20
# 30  1.84093124 b 20
# 31 -0.93702848 c 30
# 32  0.10804529 c 30
# 33  0.25758608 c 30
# 34  1.33084399 c 30
# 35  1.67204875 c 30
# 36 -1.88922564 a 30
# 37 -1.74551938 a 30
# 38 -1.32215854 a 30
# 39 -0.43743607 a 30
# 40  1.07554466 a 30
# 41 -0.38154167 b 30
# 42  0.53823057 b 30
# 43  0.83401316 b 30
# 44  1.04418363 b 30
# 45  2.45985490 b 30
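
If you would rather not load dplyr, the same idea can be sketched in base R. This is only a minimal sketch, not part of the original answer: the names b_order and df3 are made up for illustration, and the set.seed call is an assumption added so the example is reproducible. It uses match() to turn each value of B into its position in the desired order, then order() to sort by C first and by that position second. Because column A is not passed to order(), rows within each B group should keep their original order, which is closer to the desired output in the question.

```r
# Assumption: seed added only so the example is reproducible
set.seed(1)
df <- data.frame(A = rnorm(45),
                 B = rep(c("a", "b", "c"), each = 5, times = 3),
                 C = rep(c(10, 20, 30), each = 15))

# Hypothetical name: the desired order of the values in column B
b_order <- c("c", "a", "b")

# match() returns the position of each B value in b_order;
# order() sorts by C first, then by that position
df3 <- df[order(df$C, match(df$B, b_order)), ]

# Reset the row names so they run from 1 to 45
rownames(df3) <- NULL
head(df3, 15)
```

Unlike the factor approach, this leaves B as a character column, so the custom order is applied once here and is not remembered by later sorts or plots.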
