Mocking Bird

Reputation: 176

When does std::sort stop the comparison

As the title says: when does std::sort() stop comparing two strings?

I have a vector that looks like this:

city name :: Marseille
city name :: Mont Saint Martin
city name :: Mont de Marsan

The sort doesn't change this order. I expected:

city name :: Marseille
city name :: Mont de Marsan
city name :: Mont Saint Martin

I've already tried these calls:

std::sort(vector.begin(), vector.end());
std::sort(vector.begin(), vector.end(), std::less<std::string>());

If std::sort() stops at the first space, is there a way to get around that, and how?

Upvotes: 1

Views: 156

Answers (1)

Jerry Coffin

Reputation: 490728

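Most typical systems use ASCII or some derivative of it. In ASCII, all the lower case letters come after all the upper case letters, so 'A' < 'Z', and 'a' < 'z', and (the part you may not have expected) 'Z' < 'a'. That is, the order (with some others interspersed in between) is A..Za..z. So std::sort isn't stopping at the space: it compares "Mont Saint Martin" and "Mont de Marsan" up to the first character that differs, 'S' versus 'd', and 'S' comes first.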
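To see that on the city names from the question, here is a minimal sketch (the helper name default_sorted is made up for illustration, and an ASCII-based character set is assumed):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts a copy with the default operator<,
// which compares the strings character by character by code value.
std::vector<std::string> default_sorted(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// Assuming an ASCII-based character set:
//   'Z' (90) < 'a' (97), and likewise 'S' (83) < 'd' (100),
// so "Mont Saint Martin" compares less than "Mont de Marsan".
// The two strings first differ at index 5 ('S' versus 'd'),
// which is after the space at index 4.
```

Calling default_sorted on the three names should therefore leave "Mont Saint Martin" ahead of "Mont de Marsan", which matches what the question reports.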

When alphabetizing, most people (apparently including you) would generally prefer something like AaBbCc...Zz instead.
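As a point of reference, a hand-rolled approximation of that order could simply ignore case while comparing. This is only a sketch: the names less_ignoring_case and sorted_ignoring_case are made up for illustration, and std::tolower in the default "C" locale only handles unaccented letters, so it is not a substitute for real collation.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical comparator: orders two strings by their lower-cased characters.
bool less_ignoring_case(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
            // unsigned char avoids undefined behavior for negative char values.
            return std::tolower(x) < std::tolower(y);
        });
}

// Hypothetical helper: sorts a copy of the names with that comparator.
std::vector<std::string> sorted_ignoring_case(std::vector<std::string> names) {
    std::sort(names.begin(), names.end(), less_ignoring_case);
    return names;
}
```

With this comparator 'd' and 'S' are compared as 'd' and 's', so "Mont de Marsan" comes before "Mont Saint Martin".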

We could implement this with a table specifying the relative order we want, but this requirement is common enough that the standard library already provides for it. std::locale includes a collate facet, and std::locale itself overloads operator() to compare two strings using that facet, in a way suitable for that locale. Because of that overload, a std::locale object can be passed directly to std::sort as the comparison object, so we can do something like this:

std::sort(cities.begin(), cities.end(), std::locale(""));

The "locale without a name" selects the locale for which the computer has been configured, so it's typically a fairly safe choice. It looks like you're dealing with French, where you also have letters with acute and grave accents and such. The locale should know how to sort those correctly as well.
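Put together, a slightly more defensive version of that call might look like the sketch below. The helper names are made up for illustration. The fallback is there because constructing std::locale("") can throw std::runtime_error if the environment names a locale that isn't available; in that case this sketch falls back to the classic "C" locale, which gives the same order as the default sort.

```cpp
#include <algorithm>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the user's configured locale, or the classic
// "C" locale if the configured one cannot be constructed.
std::locale user_locale_or_classic() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Hypothetical helper: sorts a copy of the names using the locale as the
// comparison object.
std::vector<std::string> locale_sorted(std::vector<std::string> names,
                                       const std::locale& loc) {
    // std::locale::operator() compares two strings with the locale's collate facet.
    std::sort(names.begin(), names.end(), loc);
    return names;
}
```

The resulting order depends on how the machine is configured, so the same program can legitimately print different orders on different systems.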

If you need to specify sorting for some specific locale (regardless of how the user's computer is configured) you can do that as well. For example, if I wanted to use French-Canadian sorting even though my computer is configured for US English, I could specify:

std::sort(cities.begin(), cities.end(), std::locale("fr-CA"));

The exact set of strings that are accepted varies with the compiler and platform. The only ones listed in the standard are "C" (which is what you already got by default), and "". It's up to the implementer to decide on what others to support. The "fr-CA" I used above is supported by Microsoft's compiler, but if you were using gcc on Linux (for example) you might need to specify some other string to get the same result, typically something like "fr_CA.UTF-8", provided that locale is installed.
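Since an unsupported name makes the std::locale constructor throw std::runtime_error, one way to cope with the differing names is to try a few candidates in turn. This is only a sketch: first_available_locale is a made-up helper, and the candidate names you pass are assumptions about what a given system accepts.

```cpp
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the first locale in `candidates` that the
// implementation accepts, or the classic "C" locale if none is available.
std::locale first_available_locale(const std::vector<std::string>& candidates) {
    for (const std::string& name : candidates) {
        try {
            return std::locale(name.c_str());
        } catch (const std::runtime_error&) {
            // This name is not supported here; try the next one.
        }
    }
    return std::locale::classic();
}
```

It could then be used as std::sort(cities.begin(), cities.end(), first_available_locale({"fr-CA", "fr_CA.UTF-8"})); keeping in mind that the silent fallback to "C" gives you the plain code-value order again.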

At least with Microsoft's current compiler, either "" or "fr-CA" will do to sort these strings as you want them:

Marseille
Mont de Marsan
Mont Saint Martin

For these characters, almost any locale other than "C" will probably do the job. If you might have diacritical marks, however, you'll just about need the right locale to get them correct.

Upvotes: 3
