Reputation: 181
I have a dataframe that contains a rating column.
I would like to sort using a custom order, e.g.
AAA, AA, A, BBB, BB, B
However, default R sorting (using dplyr::arrange)
results in
A AA AAA B BB BBB
data.frame(Rating=c('AAA','AA','A','B','BB','BBB'),
Value1=c(1,2,3,4,5,6),
Value2=c(2,3,4,5,3,2)) %>%
arrange(Rating)
I found many links referring to same problem but they are not related to dataframe eg customize the sort function in R
How can I sort my data using a custom order?
Upvotes: 1
Views: 846
Reputation: 28685
If you want to keep the column as character rather than factor, you can arrange based on a match
to the order vector
rating_order <- c("AAA", "AA", "A", "BBB", "BB", "B")
df <-
data.frame(
Rating = c("A", "AA", "AAA", "B", "BB", "BBB"),
Value1 = c(1, 2, 3, 4, 5, 6),
Value2 = c(2, 3, 4, 5, 3, 2)
)
library(dplyr, warn.conflicts = FALSE)
df %>%
arrange(match(Rating, rating_order))
#> Rating Value1 Value2
#> 1 AAA 3 4
#> 2 AA 2 3
#> 3 A 1 2
#> 4 BBB 6 2
#> 5 BB 5 3
#> 6 B 4 5
Created on 2022-01-20 by the reprex package (v2.0.1)
Upvotes: 2
Reputation: 44867
If you know the complete list of possible ratings, the easiest way is to make it a factor with the values in order, e.g.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data.frame(Rating=factor(c('A', 'AA','AAA','B','BB','BBB'),
levels=c('AAA','AA','A','BBB','BB','B')),
Value1=c(1,2,3,4,5,6),
Value2=c(2,3,4,5,3,2)) %>%
arrange(Rating)
#> Rating Value1 Value2
#> 1 AAA 3 4
#> 2 AA 2 3
#> 3 A 1 2
#> 4 BBB 6 2
#> 5 BB 5 3
#> 6 B 4 5
Created on 2022-01-20 by the reprex package (v2.0.1)
Upvotes: 1
Reputation: 4067
Here is one approach using dplyr
.
In short, first sort by the letter grade, and then again by the number of letters. This would not work for ratings such as AAB
, but from what I gather from your example this isn't the case.
library(dplyr)
data.frame(Rating=c('AAA','AA','A','B','BB','BBB'),
Value1=c(1,2,3,4,5,6),
Value2=c(2,3,4,5,3,2)) %>%
mutate(grade = substr(Rating, 1,1), # Create a column with letter grade
count = nchar(Rating)) %>% # Create a column with number of letters
arrange(grade, count) %>% # Sort by grade, then count
select(-grade, count) # Optional, removes intermediary columns
#> Rating Value1 Value2 count
#> 1 A 3 4 1
#> 2 AA 2 3 2
#> 3 AAA 1 2 3
#> 4 B 4 5 1
#> 5 BB 5 3 2
#> 6 BBB 6 2 3
Created on 2022-01-20 by the reprex package (v0.3.0)
Upvotes: 1