Reputation: 176
As the title says: when does the STL's std::sort() stop comparing?
I have a vector like this:
city name :: Marseille
city name :: Mont Saint Martin
city name :: Mont de Marsan
and std::sort() doesn't change this order. I would expect it to be:
city name :: Marseille
city name :: Mont de Marsan
city name :: Mont Saint Martin
I've already tried those syntaxes:
std::sort(vector.begin(), vector.end());
std::sort(vector.begin(), vector.end(), std::less<std::string>());
If std::sort() stops at the first space, is there a way to get around that, and how?
Upvotes: 1
Views: 156
Reputation: 490728
Most typical systems use ASCII or some derivative of it. In ASCII, all the lower case letters come after all the upper case letters, so 'A' < 'Z', 'a' < 'z', and (the part you may not have expected) 'Z' < 'a'. That is, the order (with some other characters interspersed) is A..Za..z. So std::sort() isn't stopping at the space: it compares 'S' with 'd', and since 'S' < 'd', "Mont Saint Martin" sorts before "Mont de Marsan".
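To see this concretely, here is a minimal sketch of the default behaviour. The helper name `sort_bytewise` is only for illustration and is not part of the question's code; it assumes an ASCII-compatible execution character set.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Sorts with the default operator< on std::string, which compares
// character codes one by one and knows nothing about locales.
std::vector<std::string> sort_bytewise(std::vector<std::string> cities) {
    std::sort(cities.begin(), cities.end());
    return cities;
}
```

Given the three city names from the question, this keeps "Mont Saint Martin" ahead of "Mont de Marsan", because the first differing characters are 'S' and 'd'.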
When alphabetizing, most people (apparently including you) would generally prefer something like AaBbCc...Zz instead.
We could implement this with a table specifying the relative order we want, but this requirement is common enough that the standard library already provides for it. std::locale includes a collate facet, and std::locale itself overloads operator() to compare two strings using that facet, in a way suitable for that locale. That overload will be used automatically by std::sort if we pass the locale as the comparison object, so we can do something like this:
std::sort(cities.begin(), cities.end(), std::locale(""));
The "locale without a name" selects the locale for which the computer has been configured, so it's typically a fairly safe choice. It looks like you're dealing with French, where you also have letters with acute and grave accents and the like. The locale should know how to sort those correctly as well.
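For a slightly fuller version of that call, here is a hedged sketch. The helper names are made up for illustration, and the fallback to the classic "C" locale is my own assumption about what to do when std::locale("") throws because the environment names a locale that isn't installed; with that fallback you get the plain ASCII ordering again.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the user's preferred locale, or the classic
// "C" locale if the environment names a locale that is not available.
std::locale user_locale_or_classic() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Sorts using std::locale::operator(), which compares two strings with
// the locale's collate facet.
std::vector<std::string> sort_with_locale(std::vector<std::string> cities,
                                          const std::locale& loc) {
    std::sort(cities.begin(), cities.end(), loc);
    return cities;
}
```

Calling `sort_with_locale(cities, user_locale_or_classic())` then orders the names according to whatever locale the machine is configured for.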
If you need to specify sorting for some specific locale (regardless of how the user's computer is configured) you can do that as well. For example, if I wanted to use French-Canadian sorting even though my computer is configured for US English, I could specify:
std::sort(cities.begin(), cities.end(), std::locale("fr-CA"));
The exact set of strings that are accepted varies with the compiler. The only ones listed in the standard are "C" (which is what you already got by default), and "". It's up to the implementer to decide on what others to support. The "fr-CA" I used above is supported by Microsoft's compiler, but if you were using gcc on Linux (for example) you might need to specify some other string to get the same result.
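Since the accepted names differ between implementations, one way to cope is to try several spellings in turn. This is only a sketch: `first_available_locale` is a hypothetical helper, and a candidate such as "fr_CA.UTF-8" is an assumption about what a typical Linux system might call this locale, not something the standard guarantees.

```cpp
#include <cassert>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: tries each candidate name in order and returns the
// first locale the implementation accepts. Falls back to the classic "C"
// locale if none of the names is recognised.
std::locale first_available_locale(const std::vector<std::string>& names) {
    for (const std::string& name : names) {
        try {
            return std::locale(name);
        } catch (const std::runtime_error&) {
            // This name is not known here; try the next one.
        }
    }
    return std::locale::classic();
}
```

For example, `first_available_locale({"fr-CA", "fr_CA.UTF-8", "fr_CA"})` could be passed as the third argument to std::sort in place of a single hard-coded name.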
At least with Microsoft's current compiler, either "" or "fr-CA" will do to sort these strings as you want them:
Marseille
Mont de Marsan
Mont Saint Martin
For these characters, almost any locale other than "C" will probably do the job. If you might have diacritical marks, however, you'll just about need the right locale to get them correct.
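If no suitable locale is installed at all, the hand-rolled route mentioned earlier is still open. The following is a rough sketch of that alternative, not the locale-based approach: `less_ignoring_case` is a hypothetical comparator that simply ignores case, and it assumes plain ASCII names, so it will not handle diacritical marks.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical comparator: lexicographic comparison that ignores case.
// Characters are passed as unsigned char because std::tolower requires
// a value representable as unsigned char.
bool less_ignoring_case(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
        });
}
```

Passing it as `std::sort(cities.begin(), cities.end(), less_ignoring_case)` puts "Mont de Marsan" before "Mont Saint Martin", since 'd' now compares against 's' rather than 'S'.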
Upvotes: 3