gazdac

Reputation: 189

Explain this sorting in JavaScript

var a = ['a100', 'a1', 'a10'];
a.sort();

This logs: ["a1", "a10", "a100"]

var a = ['f_a100_', 'f_a1_', 'f_a10_'];
a.sort();

But this logs: ["f_a100_", "f_a10_", "f_a1_"]

Can you please explain why that is?

Upvotes: 4

Views: 127

Answers (5)

z33m

Reputation: 6043

Array.sort sorts values by converting each item to a string and then doing a lexicographical sort.

Lexicographical sorting simply means that the strings are compared character by character until a non-matching character is found. The position of that character in the character set (where letters are ordered alphabetically) decides the rank of the string.

f_a100_
    ^
f_a1_
    ^
f_a10_
    ^

Look at the first non-matching character. Here _ is greater than 0 (check their ASCII codes: 95 versus 48), so f_a100_ and f_a10_ come before f_a1_. Between those two, we move on to the next character:

f_a100_
     ^
f_a10_
     ^

Here, applying the same logic (0 is less than _), f_a100_ comes first. So the final order is ["f_a100_", "f_a10_", "f_a1_"]
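The character-by-character walk above can be sketched in code. This is only an illustration: firstDifference is a hypothetical helper, not something Array.sort exposes.

```javascript
// Hypothetical helper: reports where two strings first differ
// and which character codes are found at that position.
function firstDifference(a, b) {
  const limit = Math.min(a.length, b.length);
  for (let i = 0; i < limit; i++) {
    if (a[i] !== b[i]) {
      return { index: i, left: a.charCodeAt(i), right: b.charCodeAt(i) };
    }
  }
  return null; // one string is a prefix of the other, or they are equal
}

const diff1 = firstDifference('f_a1_', 'f_a10_');
console.log(diff1); // { index: 4, left: 95, right: 48 } -> '_' (95) sorts after '0' (48)

const diff2 = firstDifference('f_a100_', 'f_a10_');
console.log(diff2); // { index: 5, left: 48, right: 95 } -> '0' (48) sorts before '_' (95)
```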

This sorting order seems logical for simple strings. But in cases like yours it gives surprising results because of the way the character set is arranged. To get the desired behaviour, write your own compare function that extracts the number part and returns a positive, negative, or 0 value.
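A minimal sketch of such a compare function, assuming each string contains exactly one run of digits. numberPart and byNumberPart are hypothetical names, and treating a string without digits as 0 is an assumption made for the example.

```javascript
// Hypothetical helper: pulls the first run of digits out of a string as a number.
function numberPart(s) {
  const match = s.match(/\d+/);
  return match ? parseInt(match[0], 10) : 0; // assumption: "no digits" counts as 0
}

// Negative when x belongs before y, positive when after, 0 when the numbers are equal.
function byNumberPart(x, y) {
  return numberPart(x) - numberPart(y);
}

const items = ['f_a100_', 'f_a1_', 'f_a10_'];
items.sort(byNumberPart);
console.log(items); // [ 'f_a1_', 'f_a10_', 'f_a100_' ]
```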

Upvotes: 4

Halcyon

Reputation: 57729

Array.sort uses string sorting (even if the array is a list of numbers).

The sorting you're looking for is known as natural order sorting. In this sorting, numbers are treated as numbers: 100 comes after 10, 2 comes before 10, etc.

JavaScript's Array.sort has no built-in natural sort mode. I have written my own implementation, but it's quite involved. Here is a reference implementation: http://phpjs.org/functions/strnatcmp/

If you just need to sort strings of the form f_a[0-9]+_ you can write a regular expression to extract the number part.
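For strings that don't follow one fixed pattern, runtimes that implement the ECMAScript Internationalization API offer a numeric collation option that approximates natural ordering in many cases. The sketch below assumes Intl.Collator is available; the 'en' locale is an arbitrary choice for the example, and this is an alternative to a hand-written natural sort, not a port of the reference implementation linked above.

```javascript
// Assumes the runtime provides Intl.Collator; 'en' is an arbitrary locale choice.
// With numeric: true, runs of digits are compared by numeric value.
const collator = new Intl.Collator('en', { numeric: true });

const names = ['f_a100_', 'f_a1_', 'f_a10_'];
names.sort(collator.compare);
console.log(names); // [ 'f_a1_', 'f_a10_', 'f_a100_' ]
```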

Upvotes: 0

Ivey

Reputation: 499

In the first case "a1" < "a10" because, when the two strings are compared, the "a1" portion matches and "a1" then runs out of characters. A string that is a prefix of another sorts before it.

But in the second case "f_a1_" > "f_a10_", because the "f_a1" portion matches and then "_" is compared to "0". And '_' > '0' because they are compared by their ASCII values.
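Both cases can be checked directly with the relational operators, which compare strings the same way the default sort does. The variable names here are only for illustration.

```javascript
// Prefix case: "a1" is a proper prefix of "a10", so it compares as smaller.
const prefixCase = 'a1' < 'a10';
console.log(prefixCase); // true

// Mismatch case: at index 4, '_' (code 95) is compared with '0' (code 48).
const mismatchCase = 'f_a1_' > 'f_a10_';
console.log(mismatchCase); // true
```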

Upvotes: 0

semirturgay

Reputation: 4201

The JavaScript sort function sorts alphanumerically, not arithmetically, which is why you get these results. See this question, which is almost the same as yours: Array Sort in JS

Upvotes: 0

S. A.

Reputation: 3754

JavaScript sorting is string-based by default:

var a = ['a100', 'a1', 'a10'];
a.sort();

Will return:

["a1", "a10", "a100"]

Because of string comparison: "a1" < "a10" < "a100". In the other example, "f_a100_" < "f_a10_" because "0" < "_", and "f_a10_" < "f_a1_" for the same reason.

Indeed this:

[15, 13, 8].sort();

will return:

[13, 15, 8]

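This is a little weird, but that's how it's designed: without a compare function, the numbers are converted to strings and "13" < "15" < "8". If you want to change the ordering criteria, you can pass a function as a parameter. E.g. (from here):

var points = [40,100,1,5,25,10];
// Numeric compare: negative when a < b, positive when a > b, 0 when equal.
points.sort(function(a,b){return a-b});
// points is now [1, 5, 10, 25, 40, 100]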

Upvotes: 1
