dlebech

Reputation: 1839

Sorting with a specific culture - "BB" may come before "AA" in Danish and Norwegian

Today I noticed an interesting sorting behavior in C#. I have two lists and I sort them:

var list1 = new List<string> { "A", "B", "C" };
var list2 = new List<string> { "AA", "BB", "CC" };
list1.Sort();
list2.Sort();

The two lists now contain:

>> list1
[0]: "A"
[1]: "B"
[2]: "C"

>> list2
[0]: "BB"
[1]: "CC"
[2]: "AA"

Why is "AA" put at the end?

Here is a demonstration: http://ideone.com/QCeUjx

Upvotes: 6

Views: 1393

Answers (3)

Tim Schmelter

Reputation: 460138

You can also pass a comparer to the List.Sort overload to ignore the current culture. StringComparer.Ordinal compares strings by the numeric values of their UTF-16 code units, so the result is independent of the current language:

list2.Sort(StringComparer.Ordinal);

Demonstration

Here is some background information: Normalization and Sorting

Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. Consequently, two strings can look identical but actually consist of different characters. The existence of multiple representations for a single character complicates sorting operations. The solution to this problem is to normalize each string, then use an ordinal comparison to sort the strings....
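The normalize-then-compare idea from the quote can be sketched as follows. This is only an illustration, assuming normalization form C is the right choice for the data and that C# top-level statements (.NET 5 or later) are available; the variable names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// "Å" as one precomposed code point (U+00C5) and as "A" followed by
// a combining ring above (U+0041 U+030A). Both render the same way.
string precomposed = "\u00C5";
string decomposed = "A\u030A";

// Ordinal comparison looks at code units, so the raw strings differ.
bool equalBefore = string.Equals(precomposed, decomposed, StringComparison.Ordinal);

// After normalizing both sides to form C they are ordinally equal.
bool equalAfter = string.Equals(
    precomposed.Normalize(NormalizationForm.FormC),
    decomposed.Normalize(NormalizationForm.FormC),
    StringComparison.Ordinal);

// Normalize every string first, then sort with the ordinal comparer.
var words = new List<string> { "BB", decomposed, "CC", "AA", precomposed };
List<string> sorted = words
    .Select(w => w.Normalize(NormalizationForm.FormC))
    .OrderBy(w => w, StringComparer.Ordinal)
    .ToList();

Console.WriteLine($"{equalBefore} {equalAfter}");   // False True
Console.WriteLine(string.Join(", ", sorted));
```

Note that an ordinal sort places "Å" after "CC" purely because of its code point value, not because of any Danish alphabet rule.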

Upvotes: 2

Thilina H

Reputation: 5802

Yes, you can change the current culture of the thread with the following code:

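// Requires: using System.Globalization; using System.Threading;
var list1 = new List<string> { "A", "B", "C" };
var list2 = new List<string> { "BB", "AA", "CC" };

// Every culture-sensitive operation on this thread now uses en-US rules.
Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");

list1.Sort();
list2.Sort();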
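Changing the thread culture also affects later formatting and parsing on that thread. If that is not wanted, one option is to restore the previous culture after sorting. A minimal sketch, assuming a runtime where the CultureInfo.CurrentCulture setter is available (.NET Framework 4.6 or later, or .NET Core) and where top-level statements can be used:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

var list2 = new List<string> { "BB", "AA", "CC" };

// Remember the culture that was active before the change.
CultureInfo previous = CultureInfo.CurrentCulture;
try
{
    CultureInfo.CurrentCulture = new CultureInfo("en-US");
    list2.Sort();   // parameterless Sort uses the current culture's rules
}
finally
{
    // Restore the original culture so other code on this thread is unaffected.
    CultureInfo.CurrentCulture = previous;
}

Console.WriteLine(string.Join(", ", list2));   // AA, BB, CC
```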

Upvotes: 0

dlebech

Reputation: 1839

It turns out that because I am using Danish culture settings, .NET treats "AA" as the Danish letter "Å", which comes at the end of the Danish alphabet.

Setting the locale to en-US gives me the sort order I expected ("AA", "BB", "CC").
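To compare the two behaviors side by side without changing the thread's culture, a culture-specific comparer can be passed to Sort. The sketch below assumes top-level statements and uses StringComparer.Create; the variable names are illustrative. The en-US result is "AA, BB, CC", while the da-DK result may depend on the collation data of the runtime and operating system, so no output is claimed for it here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

var words = new List<string> { "BB", "CC", "AA" };

// Culture-specific, case-sensitive comparers; the thread culture is untouched.
StringComparer danish = StringComparer.Create(new CultureInfo("da-DK"), ignoreCase: false);
StringComparer english = StringComparer.Create(new CultureInfo("en-US"), ignoreCase: false);

// Sort copies so the same input is used for both cultures.
var danishOrder = new List<string>(words);
danishOrder.Sort(danish);

var englishOrder = new List<string>(words);
englishOrder.Sort(english);

Console.WriteLine("da-DK: " + string.Join(", ", danishOrder));
Console.WriteLine("en-US: " + string.Join(", ", englishOrder));
```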

This article has some background information.

Upvotes: 10
