Reputation: 198
I have a question about how sorting works in R.
If I use order() to sort a character column, the shorter string is not always sorted first.
For example, when I sort a column of character type, it puts firearm_weight above fire_weigh, which is not how a dictionary would order these strings.
How can I change this while still using order()?
Thanks!
Upvotes: 1
Views: 262
Reputation: 132706
"_" < "a" is TRUE on my system and locale.
help("Comparison") is relevant here:
Comparison of strings in character vectors is lexicographic within the strings using the collating sequence of the locale in use: see locales. The collating sequence of locales such as en_US is normally different from C (which should use ASCII) and can be surprising. Beware of making any assumptions about the collation order: [...] Collation of non-letters (spaces, punctuation signs, hyphens, fractions and so on) is even more problematic.
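To see the locale dependence described in that help text, here is a minimal sketch using the two strings from the question. It relies on order()'s documented method = "radix", which sorts character vectors as in the C locale; whether that ordering is the one you want is an assumption to check against your data.

```r
x <- c("firearm_weight", "fire_weigh")

# Collation currently in effect (the value differs between systems)
Sys.getlocale("LC_COLLATE")

# Default ordering follows the locale's collating sequence,
# so this result may differ from one machine to another
x[order(x)]

# method = "radix" compares strings as in the C locale,
# where "_" sorts before the lowercase letters
sorted_c <- x[order(x, method = "radix")]
sorted_c
```

With C collation the comparison is decided at the fifth character ("_" versus "a"), so fire_weigh comes first regardless of the session locale.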
You could substitute "_" with a character that is ordered after "z" on your system, for example "µ" on my system.
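A rough sketch of that substitution, assuming the replacement is only needed for sorting: build a temporary key and order by it, so the original strings stay unchanged. The variable names are illustrative, and whether "µ" really collates after "z" depends on your locale, so the check in the middle is worth running first.

```r
x <- c("firearm_weight", "fire_weigh")

# Temporary sort key: replace every "_" with the micro sign (U+00B5).
# fixed = TRUE treats "_" as a literal string rather than a regex.
key <- gsub("_", "\u00b5", x, fixed = TRUE)

# Locale-dependent check: is the replacement ordered after "z" here?
"\u00b5" > "z"

# Order by the key but return the original, unmodified values
sorted <- x[order(key)]
sorted
```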
Upvotes: 2