Priit Mets

Reputation: 495

How to sort strings in alphabetical and numerical order?

I have a vector of strings that I want to sort alphabetically and then by the number at the end of each string. The final output should be "AGSHIM1", "AGSHIU1", "AGSHIZ1", "AGSHIH2", "AGSHIM2", "AGSHIU2", "AGSHIZ2"

d<-c("AGSHIZ2", "AGSHIZ1", "AGSHIU1", "AGSHIM1", "AGSHIH2", "AGSHIM2", 
"AGSHIU2")
d[order(d,as.numeric(substr(d, nchar(d), nchar(d))))]

>"AGSHIH2" "AGSHIM1" "AGSHIM2" "AGSHIZ1" "AGSHIZ2" "AGSHIU1" "AGSHIU2"

Upvotes: 0

Views: 46

Answers (3)

ThomasIsCoding

Reputation: 101753

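Here is one base R option using gsub + order: the digits extracted from each string serve as the primary sort key, and the full string breaks ties.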

> d[order(as.numeric(gsub("\\D", "", d)), d)]
[1] "AGSHIM1" "AGSHIU1" "AGSHIZ1" "AGSHIH2" "AGSHIM2" "AGSHIU2" "AGSHIZ2"
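As a hedged sketch of the same idea for cases where the trailing number may have more than one digit, the two sort keys can be built explicitly and passed to order(). The names num and prefix are illustrative only, and method = "radix" is an assumption on my part: it compares strings in the C locale, so the alphabetical tie-break does not depend on the system's collation rules.

```r
d <- c("AGSHIZ2", "AGSHIZ1", "AGSHIU1", "AGSHIM1", "AGSHIH2", "AGSHIM2",
       "AGSHIU2")

# Primary key: the trailing number (everything after the last non-digit)
num <- as.numeric(sub("^.*\\D", "", d))

# Secondary key: the string with its trailing digits removed
prefix <- sub("\\d+$", "", d)

# Sort by number first, then alphabetically (C locale) within each number
sorted <- d[order(num, prefix, method = "radix")]
sorted
# [1] "AGSHIM1" "AGSHIU1" "AGSHIZ1" "AGSHIH2" "AGSHIM2" "AGSHIU2" "AGSHIZ2"
```

Converting the number with as.numeric matters once values such as 10 appear, because as text "10" would sort before "2".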

Upvotes: 1

WilliamGram

Reputation: 683

What you can do is separate the number from the string, and sort by the number first, and then within each group of numbers sort alphabetically:

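sortSpecial <- function(d) {
  # Split each string into its letter part and its numeric part
  df <- data.frame(
    original = d,
    chars = gsub("[[:digit:]]", "", d),
    nums = as.numeric(gsub("[^[:digit:]]", "", d)),
    stringsAsFactors = FALSE
  )
  # Order by number first, then alphabetically within each number
  df <- df[with(df, order(nums, chars)), ]
  return(df$original)
}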


d <- sortSpecial(d)

d
# [1] "AGSHIM1" "AGSHIU1" "AGSHIZ1" "AGSHIH2" "AGSHIM2" "AGSHIU2" "AGSHIZ2"

There is probably a more elegant approach that I am not aware of. Nevertheless, let me know if it helps.

Update
I could not help but get inspired by Karthik S's approach. If you don't want to generate the function first, you can do the same steps as before using dplyr:

library(dplyr)
d <- data.frame(d = d) %>% 
  mutate(
    chars = gsub("[[:digit:]]", "", d),
    nums = as.numeric(gsub("[^[:digit:]]", "", d))
  ) %>% 
  arrange(nums, chars) %>% 
  pull(d)

Again, the steps are identical so the choice of approach is a matter of preference.

Upvotes: 2

Karthik S

Reputation: 11584

Here is another approach, though a shorter solution most likely exists.

library(dplyr)
library(stringr)
library(tibble)
d %>% as_tibble() %>% 
   transmute(dig = str_extract(value, '\\d+$'), ltrs = str_remove(value, '\\d+$')) %>% 
   type.convert(as.is = TRUE) %>% 
   arrange(dig, ltrs) %>% 
   transmute(d = str_c(ltrs, dig, sep = '')) %>% 
   pull(d)
[1] "AGSHIM1" "AGSHIU1" "AGSHIZ1" "AGSHIH2" "AGSHIM2" "AGSHIU2" "AGSHIZ2"

Upvotes: 1
