srikantrao

Reputation: 198

Sorting Algorithm in R

I have a question about how R sorts strings.

When I use order() to sort a particular column, the shorter string is not sorted first.

For example, when I sort a column of character type, it puts firearm_weight above fire_weigh, which is not how a dictionary would order these strings.

How can I change this while using the order() command?

Thanks!

Upvotes: 1

Views: 262

Answers (1)

Roland

Reputation: 132706

"_" < "a" is TRUE on my system and locale.

help("Comparison") is relevant here:

Comparison of strings in character vectors is lexicographic within the strings using the collating sequence of the locale in use: see locales. The collating sequence of locales such as en_US is normally different from C (which should use ASCII) and can be surprising. Beware of making any assumptions about the collation order: [...] Collation of non-letters (spaces, punctuation signs, hyphens, fractions and so on) is even more problematic.
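If the goal is an ordering that does not depend on the locale, one option is a minimal sketch like the following. It relies on the documented behaviour that method = "radix" sorts character vectors in C-locale (byte) order; the vector x is a made-up stand-in for the column in the question.

```r
# Stand-in for the character column from the question.
x <- c("firearm_weight", "fire_weigh")

# method = "radix" ignores the session's collation locale and compares
# character vectors in C-locale order, where "_" sorts before "a".
idx <- order(x, method = "radix")
sorted_c <- x[idx]
print(sorted_c)

# For contrast: this comparison uses the locale's collating sequence,
# so its result can differ between systems.
print("_" < "a")
```

Setting the collation for the whole session with Sys.setlocale("LC_COLLATE", "C") should give the same ordering for plain order(x), at the cost of affecting every other string comparison in that session.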

You could substitute "_" with a character that is ordered after "z" on your system, for example "µ" on mine.
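A rough sketch of that substitution, ordering by a temporary key so the original strings are left unchanged. The names x, placeholder and key are illustrative, and "\u00b5" (the micro sign) is only an assumption: check where your chosen character actually collates in your own locale before relying on the result.

```r
# Stand-in for the character column from the question.
x <- c("firearm_weight", "fire_weigh")

# Hypothetical placeholder character; its position in the collating
# sequence depends on the locale, so verify it on your system.
placeholder <- "\u00b5"

# Build a sort key with "_" replaced; fixed = TRUE treats "_" literally.
key <- gsub("_", placeholder, x, fixed = TRUE)

# Order the original strings by the key.
sorted_by_key <- x[order(key)]
print(sorted_by_key)
```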

Upvotes: 2
