BullyWiiPlaza

Reputation: 19243

Working with UTF-8 std::string objects in C++

I'm using Visual Studio and C++ on Windows to work with small-caps text such as ʜᴇʟʟᴏ ꜱᴛᴀᴄᴋᴏᴠᴇʀꜰʟᴏᴡ (generated with, e.g., this website). Whenever I read this text from a file or put it directly into my source code as a std::string, the text visualizer in Visual Studio shows it in the wrong encoding; presumably the visualizer assumes the Windows (ANSI) code page. How can I get Visual Studio to let me work with UTF-8 strings properly?

std::string message_or_file_path = "...";
auto message = message_or_file_path;

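// If the path exists, read the message from that file.
// GetFileAttributesA takes a narrow (char) path, matching std::string, and
// GetLastError() is only meaningful after a failed call, so it is not checked here.
if (GetFileAttributesA(message_or_file_path.c_str()) != INVALID_FILE_ATTRIBUTES)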
{
    std::ifstream file_stream(message_or_file_path);
    std::string text_file_contents((std::istreambuf_iterator<char>(file_stream)),
        std::istreambuf_iterator<char>());
    message = text_file_contents; // Displayed in wrong encoding
    message = "ʜᴇʟʟᴏ ꜱᴛᴀᴄᴋᴏᴠᴇʀꜰʟᴏᴡ"; // Displayed in wrong encoding
    std::wstring wide_message = L"ʜᴇʟʟᴏ ꜱᴛᴀᴄᴋᴏᴠᴇʀꜰʟᴏᴡ"; // Displayed in correct encoding
}

I tried compiling with the additional command-line option /utf-8 and setting the locale:

std::locale::global(std::locale(""));
std::cout.imbue(std::locale());

Neither of those fixed the encoding issue.

Upvotes: 2

Views: 1981

Answers (2)

R Sahu

Reputation: 206747

According to the article "What's Wrong with My UTF-8 Strings in Visual Studio?", there are a couple of ways to view the contents of a UTF-8-encoded std::string in the debugger.

Let's say you have a variable with the following initialization:

std::string s2 = "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9f\x8d\x8c";

Use a Watch window.

  • Add the variable to Watch.
  • In the Watch window, append ,s8 to the variable name (for example, s2,s8) to display its contents as UTF-8.

Here's what I see in Visual Studio 2015.

[image: screenshot of the Watch window]

Use the Command Window.

  • In the Command Window, use ? &s2[0],s8 to display the text as UTF-8.

Here's what I see in Visual Studio 2015.

[image: screenshot of the Command Window]
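To confirm independently of the debugger that s2 really holds UTF-8 data, the bytes can be inspected in code. The following is a minimal sketch, not part of the referenced article: count_code_points is a hypothetical helper that assumes well-formed UTF-8 and counts every byte that is not a continuation byte.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 byte string by
// counting every byte that is not a continuation byte (bit pattern 10xxxxxx).
// Assumes the input is well-formed UTF-8; it performs no validation.
std::size_t count_code_points(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8)
    {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// The string from the answer: "z", "ß", "水" and a banana emoji,
// encoded as 1 + 2 + 3 + 4 bytes.
const std::string s2 = "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9f\x8d\x8c";
```

Here s2.size() reports the number of bytes, while count_code_points(s2) reports the number of characters the debugger should show once the ,s8 specifier is applied.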

Upvotes: 5

BullyWiiPlaza

Reputation: 19243

The solution that worked for me was to replace every std::string with std::wstring and adapt the surrounding code accordingly; as noted in the question, the visualizer displays std::wstring correctly. Everything now works as expected.
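The answer does not say how UTF-8 data read from a file ends up in a std::wstring. One possible approach is sketched below, under the assumption that the input is well-formed UTF-8; utf8_to_wide is a hypothetical helper name, not something from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original answer): converts UTF-8 bytes to
// std::wstring. Assumes well-formed UTF-8; malformed input is not handled.
std::wstring utf8_to_wide(const std::string& utf8)
{
    std::wstring wide;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t code_point = 0;
        std::size_t length = 1;

        // The lead byte determines how many bytes the sequence occupies.
        if (lead < 0x80)                { code_point = lead;        length = 1; }
        else if ((lead & 0xE0) == 0xC0) { code_point = lead & 0x1F; length = 2; }
        else if ((lead & 0xF0) == 0xE0) { code_point = lead & 0x0F; length = 3; }
        else                            { code_point = lead & 0x07; length = 4; }

        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < length && i + k < utf8.size(); ++k)
            code_point = (code_point << 6)
                       | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        i += length;

        assert(code_point <= 0x10FFFF); // holds for well-formed UTF-8
        if (sizeof(wchar_t) == 2 && code_point > 0xFFFF)
        {
            // 16-bit wchar_t (as on Windows): emit a UTF-16 surrogate pair.
            code_point -= 0x10000;
            wide.push_back(static_cast<wchar_t>(0xD800 + (code_point >> 10)));
            wide.push_back(static_cast<wchar_t>(0xDC00 + (code_point & 0x3FF)));
        }
        else
        {
            wide.push_back(static_cast<wchar_t>(code_point));
        }
    }
    return wide;
}
```

With this helper, text_file_contents from the question could be passed through utf8_to_wide before further processing. On Windows, the Win32 function MultiByteToWideChar with the CP_UTF8 code page performs the same conversion and also copes with invalid input, so it is likely the better choice in production code.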

Upvotes: 0
