kovacsmarcell

Reputation: 461

C++ non-ASCII letters

How do I loop through the letters of a string when it contains non-ASCII characters? This works on Windows:

for (std::size_t i = 0; i < text.length(); i++)
{
    std::cout << text[i];
}

But on Linux, if I do:

std::string text = "á";
std::cout << text.length() << std::endl;

it tells me the string "á" has a length of 2, while on Windows it is only 1. With ASCII letters it works fine!

Upvotes: 0

Views: 2168

Answers (2)

Baum mit Augen

Reputation: 50061

In your Windows system's code page, á is a single-byte character, i.e. every char in the string is indeed a character. So you can just loop and print them.

On Linux, á is represented as the multibyte (2 bytes, to be exact) UTF-8 sequence C3 A1. This means that in your string, the á actually consists of two chars, and printing those (or handling them in any way) separately yields nonsense. This never happens with ASCII characters because the UTF-8 representation of every ASCII character fits in a single byte.
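To see those two bytes for yourself, here is a minimal sketch that dumps the bytes of a std::string as hex. The helper name to_hex_bytes is made up for illustration, and the sketch assumes the string already holds UTF-8 data:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not a standard function): returns the bytes of s
// as space-separated, two-digit uppercase hex values.
std::string to_hex_bytes(const std::string& s)
{
    std::string out;
    char buf[4];
    for (unsigned char byte : s)
    {
        // Format one byte; the cast avoids sign extension of negative chars.
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(byte));
        if (!out.empty())
            out += ' ';
        out += buf;
    }
    return out;
}
```

Calling to_hex_bytes("\xC3\xA1") (the UTF-8 bytes of á written out explicitly) gives "C3 A1", while to_hex_bytes("a") gives "61".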

Unfortunately, UTF-8 is not really supported by the C++ standard facilities. As long as you only handle the string as a whole, and neither access individual chars from it nor assume that its length equals the number of actual characters it contains, std::string will most likely do fine.
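If you only need to walk a UTF-8 string one code point at a time, a rough sketch is to group each lead byte with the continuation bytes (bit pattern 10xxxxxx) that follow it. The helper name split_utf8 is made up for illustration, and it assumes the input is valid UTF-8 (it does no validation):

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a standard function): splits UTF-8 text into
// one std::string per code point. Assumes valid UTF-8 input.
std::vector<std::string> split_utf8(const std::string& text)
{
    std::vector<std::string> parts;
    for (unsigned char byte : text)
    {
        // Continuation bytes have the form 10xxxxxx and belong to the
        // code point started by the most recent lead byte.
        const bool continuation = (byte & 0xC0) == 0x80;
        if (continuation && !parts.empty())
            parts.back() += static_cast<char>(byte);
        else
            parts.push_back(std::string(1, static_cast<char>(byte)));
    }
    return parts;
}
```

You can then loop over the result and print each element with std::cout, and split_utf8(text).size() gives the number of code points rather than bytes. Note that a code point is still not always what a user sees as one letter (a base letter followed by a combining accent is two code points), which is one more reason to use a library for anything beyond simple cases.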

If you need more UTF-8 support, look for a good library that implements what you need.

You might also want to read this for a more detailed discussion on different character sets on different systems and advice regarding string vs. wstring.

Also have a look at this for information on how to handle different character encodings portably.

Upvotes: 3

Avilius

Reputation: 25

Try using std::wstring. The encoding it uses isn't specified by the standard as far as I know, so I wouldn't save its contents to a file without a library of some sort that handles a specific format. It stores wide characters, so you can use letters and symbols that are not part of ASCII.

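#include <clocale>
#include <cstddef>
#include <iostream>
#include <string>

int main()
{
    // Use the environment's locale so std::wcout can convert non-ASCII characters.
    std::setlocale(LC_ALL, "");

    std::wstring text = L"áéíóú";

    // Each element of the wstring holds one of these letters.
    for (std::size_t i = 0; i < text.length(); i++)
        std::wcout << text[i];

    std::wcout << L'\n' << text.length() << std::endl;
}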

Upvotes: 1
