Reputation: 17631
I have a column in a data.table full of strings in the format string+integer, e.g.
string1, string2, string3, string4, string5, ...
When I use sort(), the strings come out in the wrong order:
string1, string10, string11, string12, string13, ..., string2, string20,
string21, string22, string23, ....
How would I sort these to be in the order
string01, string02, string03, string04, string05, ..., string10, string11,
string12, etc.?
Could one method be to add a leading 0 to each integer below 10 (1-9)?
I suspect you would extract the string with str_extract(dt$string_column, "[a-z]+") and then add a 0 to each single-digit integer... somehow with sprintf()?
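For reference, the sprintf() idea can be sketched in base R roughly as follows. This is only a sketch: the vector x and the names prefix, number and padded are made up for illustration, and it assumes every value ends in an integer below 100.

```r
# Hypothetical example values in the question's string+integer format.
x <- c("string1", "string10", "string2", "string11")

# Split each value into its non-digit prefix and its trailing integer.
prefix <- sub("\\d+$", "", x)
number <- as.integer(sub("^\\D+", "", x))

# "%02d" left-pads the integer with zeros to a width of two digits.
padded <- sprintf("%s%02d", prefix, number)
sort(padded)
```

If the integers can reach three digits, the width would need to grow accordingly (e.g. "%03d").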
Upvotes: 2
Views: 182
Reputation: 11128
Assuming the strings look something like the ones below:
library(data.table)
library(stringr)
xstring <- data.table(x = c("string1","string11","string2",'string10',"stringx"))
extracts <- str_extract(xstring$x,"(?<=string)(\\d*)")
y_string <- ifelse(nchar(extracts)==2 | extracts=="",extracts,paste0("0",extracts))
fin_string <- str_replace(xstring$x,"(?<=string)(\\d*)",y_string)
sort(fin_string)
Output:
> sort(fin_string)
[1] "string01" "string02" "string10" "string11"
[5] "stringx"
Upvotes: 1
Reputation: 32548
You could use str_extract from the stringr package to obtain the digits and order according to them:
x = c("string1","string3","stringZ","string2","stringX","string10")
library(stringr)
c(x[grepl("\\d+",x)][order(as.integer(str_extract(x[grepl("\\d+",x)],"\\d+")))],
sort(x[!grepl("\\d+",x)]))
#[1] "string1" "string2" "string3" "string10" "stringX" "stringZ"
Upvotes: 1
Reputation: 12937
You could go for mixedsort in gtools:
vec <- c("string1", "string10", "string11", "string12", "string13","string2",
"string20", "string21", "string22", "string23")
library(gtools)
mixedsort(vec)
#[1] "string1" "string2" "string10" "string11" "string12" "string13"
# "string20" "string21" "string22" "string23"
Upvotes: 1
Reputation: 887213
We can remove the characters that are not digits and sort on the resulting integers:
dt[order(as.integer(gsub("\\D+", "", col1)))]
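The same idea can be tried on a plain character vector without data.table. The following is a minimal sketch; the vector x and the names key and sorted are hypothetical and not part of the original answer.

```r
# Hypothetical example values, including one entry with no digits.
x <- c("string1", "string10", "stringx", "string2", "string11")

# Strip every non-digit character and convert what remains to an integer.
# An entry with no digits becomes "" and therefore NA.
key <- as.integer(gsub("\\D+", "", x))

# order() places NA keys last by default, so digit-less entries end up at the end.
sorted <- x[order(key)]
sorted
```

Note that this orders the original, unpadded strings rather than rewriting them, so no leading zeros are added.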
Upvotes: 6