user213544

Reputation: 2126

Sort dataframe in R (based on column values)

I would like to sort a data frame in R semi-reversely, based on the (character) values in a column.

I have the following sample dataset:

# Sample data
df <- read.table(text="id value
                 cx-01    1
                 cx-01    2
                 cx-02    1
                 cx-02    2
                 cx-02    3
                 cx-03    1
                 cx-03    2 
                 px-01    1
                 px-01    2
                 px-02    1
                 px-02    2
                 px-02    3
                 px-03    1
                 px-03    2
                 rx-01    1
                 rx-01    2
                 rx-02    1
                 rx-02    2
                 rx-02    3
                 rx-03    1
                 rx-03    2", header=TRUE)

Expected output:

      id value
1  cx-03     2
2  cx-03     1
3  cx-02     3
4  cx-02     2
5  cx-02     1
6  cx-01     2
7  cx-01     1
8  rx-03     2
9  rx-03     1
10 rx-02     3
11 rx-02     2
12 rx-02     1
13 rx-01     2
14 rx-01     1
15 px-03     2
16 px-03     1
17 px-02     3
18 px-02     2
19 px-02     1
20 px-01     2
21 px-01     1

I tried base R's order() function, but without success. I also tried the arrange() function from the plyr package, but could not get the data into the desired order.

Is it possible to sort the labels in the first column based on a self-provided sequence (so not alphabetically)?

Upvotes: 3

Views: 1056

Answers (2)

nsinghphd

Reputation: 2022

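Using with() and order() from base R. Note that this sorts the prefix alphabetically (cx, px, rx), not in the custom order (cx, rx, px) shown in the question's expected output.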

# sample data
df <- read.table(text="id value
                 cx-01    1
                 cx-01    2
                 cx-02    1
                 cx-02    2
                 cx-02    3
                 cx-03    1
                 cx-03    2 
                 px-01    1
                 px-01    2
                 px-02    1
                 px-02    2
                 px-02    3
                 px-03    1
                 px-03    2
                 rx-01    1
                 rx-01    2
                 rx-02    1
                 rx-02    2
                 rx-02    3
                 rx-03    1
                 rx-03    2", header=TRUE, stringsAsFactors=FALSE)

# create another data frame with variables to order on:
# X1 = letter prefix, X2 = numeric suffix, df.value = value
col.ord <- data.frame(do.call(rbind, strsplit(df$id, "-")), df$value, stringsAsFactors = FALSE)

# reorder data frame
df[with(col.ord, order(X1, -as.integer(X2), -df.value)), ]
#>       id value
#> 7  cx-03     2
#> 6  cx-03     1
#> 5  cx-02     3
#> 4  cx-02     2
#> 3  cx-02     1
#> 2  cx-01     2
#> 1  cx-01     1
#> 14 px-03     2
#> 13 px-03     1
#> 12 px-02     3
#> 11 px-02     2
#> 10 px-02     1
#> 9  px-01     2
#> 8  px-01     1
#> 21 rx-03     2
#> 20 rx-03     1
#> 19 rx-02     3
#> 18 rx-02     2
#> 17 rx-02     1
#> 16 rx-01     2
#> 15 rx-01     1

Created on 2019-04-27 by the reprex package (v0.2.1)
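
To get the custom prefix order from the question (cx, rx, px) in base R, the prefix can be ranked with match() before ordering. This is a minimal sketch rather than part of the original reprex; the names prefix_order, prefix, number and sorted are made up for illustration, and the data frame is rebuilt compactly to mirror the sample data.

```r
# Rebuild the sample data compactly (same rows as in the question)
ids <- c("cx-01", "cx-02", "cx-03", "px-01", "px-02", "px-03",
         "rx-01", "rx-02", "rx-03")
n   <- c(2, 3, 2, 2, 3, 2, 2, 3, 2)
df  <- data.frame(id = rep(ids, times = n), value = sequence(n),
                  stringsAsFactors = FALSE)

# Custom prefix order, taken from the expected output in the question
prefix_order <- c("cx", "rx", "px")

# Split each id into its letter prefix and its numeric suffix
prefix <- sub("-.*$", "", df$id)
number <- as.integer(sub("^.*-", "", df$id))

# Order by custom prefix rank, then suffix and value, both descending
sorted <- df[order(match(prefix, prefix_order), -number, -df$value), ]
rownames(sorted) <- NULL
sorted
```

Any prefix missing from prefix_order gets NA from match(), and order() places NA last by default, so unknown prefixes would end up at the bottom.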

Upvotes: 3

akrun

Reputation: 887711

We can arrange on the numeric and letter parts of 'id' separately, and on 'value' in descending order. The letter part appears to follow a custom order, so either convert it to a factor with the levels specified, or use match() with a vector in the expected order to get the index in that order.

library(tidyverse)
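# match() gives the custom prefix order; parse_number() appears to read the
# "-03" part as -3, which is why ascending order puts 03 before 01
df %>%  
   arrange(match(str_remove(id, "-\\d+"), c("cx", "rx", "px")), 
          readr::parse_number(as.character(id)), desc(value))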
#      id value
#1  cx-03     2
#2  cx-03     1
#3  cx-02     3
#4  cx-02     2
#5  cx-02     1
#6  cx-01     2
#7  cx-01     1
#8  rx-03     2
#9  rx-03     1
#10 rx-02     3
#11 rx-02     2
#12 rx-02     1
#13 rx-01     2
#14 rx-01     1
#15 px-03     2
#16 px-03     1
#17 px-02     3
#18 px-02     2
#19 px-02     1
#20 px-01     2
#21 px-01     1
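
The factor alternative mentioned above could look roughly like the following. This is a sketch only: the helper columns prefix and number and the result name are made up for illustration, and the sample data is rebuilt compactly so the snippet runs on its own.

```r
library(dplyr)

# Rebuild the sample data compactly (same rows as in the question)
ids <- c("cx-01", "cx-02", "cx-03", "px-01", "px-02", "px-03",
         "rx-01", "rx-02", "rx-03")
n   <- c(2, 3, 2, 2, 3, 2, 2, 3, 2)
df  <- data.frame(id = rep(ids, times = n), value = sequence(n),
                  stringsAsFactors = FALSE)

result <- df %>%
  mutate(
    # factor levels carry the custom order; arrange() sorts by level order
    prefix = factor(sub("-.*$", "", id), levels = c("cx", "rx", "px")),
    # numeric suffix, extracted explicitly so desc() can be used on it
    number = as.integer(sub("^.*-", "", id))
  ) %>%
  arrange(prefix, desc(number), desc(value)) %>%
  select(id, value)

result
```

Compared with match(), the factor keeps the custom order attached to the column itself, which may be convenient if the same order is needed again later, for example in a plot.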

Upvotes: 2
