Reputation: 1839
Today I noticed an interesting sorting behavior in C#. I have two lists and I sort them:
var list1 = new List<string> { "A", "B", "C" };
var list2 = new List<string> { "AA", "BB", "CC" };
list1.Sort();
list2.Sort();
The two lists now contain:
>> list1
[0]: "A"
[1]: "B"
[2]: "C"
>> list2
[0]: "BB"
[1]: "CC"
[2]: "AA"
Why is "AA" placed at the end?
Here is a demonstration: http://ideone.com/QCeUjx
Upvotes: 6
Views: 1393
Reputation: 460138
You can also use the overload of List.Sort that accepts a comparer to ignore the current culture. StringComparer.Ordinal performs a simple byte comparison that is independent of the current language:
list1.Sort(StringComparer.Ordinal);
Here is some background information, from "Normalization and Sorting":
Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. Consequently, two strings can look identical but actually consist of different characters. The existence of multiple representations for a single character complicates sorting operations. The solution to this problem is to normalize each string, then use an ordinal comparison to sort the strings....
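As a rough illustration of the "normalize, then compare ordinally" advice quoted above, here is a minimal sketch. The helper name SortNormalized and the choice of Form C are assumptions made for this example, not something the quoted documentation prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class NormalizedSortDemo
{
    // Hypothetical helper: normalizes each string to Form C (an assumed choice),
    // then sorts the results with a culture-independent ordinal comparison.
    public static List<string> SortNormalized(IEnumerable<string> items)
    {
        return items
            .Select(s => s.Normalize(NormalizationForm.FormC))
            .OrderBy(s => s, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        // "\u00C5" is the precomposed letter; "A\u030A" is "A" plus a combining ring.
        // They look identical but are different character sequences until normalized.
        var input = new List<string> { "B", "A\u030A", "\u00C5", "A" };

        foreach (var s in SortNormalized(input))
        {
            Console.WriteLine(s);
        }
    }
}
```

Without the normalization step, the two spellings of the same visible letter would land in different positions under an ordinal sort, because their code units differ.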
Upvotes: 2
Reputation: 5802
Yes, you can change the current culture with the following line of code:
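using System.Globalization;
using System.Threading;

var list1 = new List<string> { "A", "B", "C" };
var list2 = new List<string> { "BB", "AA", "CC" };
// This changes the culture for every culture-sensitive operation on this thread, not only Sort().
Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");
list1.Sort();
list2.Sort();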
Upvotes: 0
Reputation: 1839
It turns out that since I am using Danish culture settings, .NET assumes that "AA" is the Danish letter "Å" which is at the end of the Danish alphabet.
Setting the locale to en-US gives me the sort order I expected ("AA", "BB", "CC").
This article has some background information.
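To compare the behaviors side by side without touching the thread's culture, something like the sketch below may help. The helper SortedWith is a hypothetical name introduced for this example. What the da-DK line prints depends on the runtime and the collation data it uses, so no output is claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CultureSortDemo
{
    // Hypothetical helper: returns a sorted copy of the items using the given comparer,
    // leaving the input and the current thread culture untouched.
    public static List<string> SortedWith(IEnumerable<string> items, IComparer<string> comparer)
    {
        var copy = new List<string>(items);
        copy.Sort(comparer);
        return copy;
    }

    public static void Main()
    {
        var items = new[] { "BB", "AA", "CC" };

        // Culture-sensitive comparers built explicitly (case-sensitive).
        var danish = StringComparer.Create(new CultureInfo("da-DK"), false);
        var english = StringComparer.Create(new CultureInfo("en-US"), false);

        Console.WriteLine("da-DK:   " + string.Join(", ", SortedWith(items, danish)));
        Console.WriteLine("en-US:   " + string.Join(", ", SortedWith(items, english)));
        Console.WriteLine("Ordinal: " + string.Join(", ", SortedWith(items, StringComparer.Ordinal)));
    }
}
```

Passing a comparer to Sort keeps the choice local to that one call, which can be safer than changing Thread.CurrentThread.CurrentCulture when other code on the same thread relies on the user's culture for formatting.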
Upvotes: 10