I'm not familiar with Windows at all.
I'm struggling to write a function that reads a file containing Chinese characters and runs a regex replacement on its contents.
Roughly:
std::ifstream t(input_file);
std::stringstream buffer;
buffer << t.rdbuf();
std::string page_contents = buffer.str();
...
page_contents = std::regex_replace(page_contents, std::regex("([a-z]{3})你好"), "$1再见");
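For reference, here is a self-contained sketch of the same read-and-replace step. It rests on a few assumptions that are not stated above: the input file is UTF-8, the helper names read_file_bytes and replace_greeting are made up for illustration, and the Chinese text is written as explicit UTF-8 byte escapes so that the bytes in the binary do not depend on how the compiler interprets the source file. The file is opened in binary mode so that Windows text-mode translation does not touch the bytes. std::regex over std::string matches byte by byte, so the multi-byte characters are treated as plain byte sequences here.

```cpp
#include <fstream>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: read a whole file as raw bytes.
// std::ios::binary avoids text-mode translation on Windows.
std::string read_file_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::stringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Hypothetical helper: the replacement from the question, with the
// Chinese text spelled out as UTF-8 bytes (assumes UTF-8 input).
std::string replace_greeting(const std::string& page_contents) {
    // "\xE4\xBD\xA0\xE5\xA5\xBD" is the UTF-8 encoding of 你好.
    static const std::regex pattern("([a-z]{3})\xE4\xBD\xA0\xE5\xA5\xBD");
    // "\xE5\x86\x8D\xE8\xA7\x81" is the UTF-8 encoding of 再见.
    return std::regex_replace(page_contents, pattern,
                              "$1\xE5\x86\x8D\xE8\xA7\x81");
}
```

Usage would look like page_contents = replace_greeting(read_file_bytes(input_file));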
This works fine on Debian, but the Windows build can't seem to handle the Chinese characters in the file at all. I'm cross-compiling from Debian using MXE (mingw).
I did some further testing:
#ifdef _WIN32
SetConsoleOutputCP(CP_UTF8);
setvbuf(stdout, nullptr, _IOFBF, 1000);
#endif
std::cout << "你好" << std::endl;
And found that where Debian outputted "你好" (E4 BD A0 E5 A5 BD), Windows outputted "ä½ å¥½" (C3 A4 C2 BD C2 A0 C3 A5 C2 A5 C2 BD).
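One observation about the two byte sequences above: the Windows bytes are exactly what you get if each of the six UTF-8 bytes is treated as a separate Latin-1 code point and then encoded to UTF-8 again. The sketch below reproduces that arithmetic. The function name latin1_to_utf8 is made up for illustration, and the code only demonstrates the byte pattern; it does not show where in the toolchain or console the extra conversion happens.

```cpp
#include <string>

// Treat every byte of `bytes` as a Latin-1 code point (U+0000..U+00FF)
// and encode that code point as UTF-8.
std::string latin1_to_utf8(const std::string& bytes) {
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            // ASCII range: one byte, unchanged.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Feeding it E4 BD A0 E5 A5 BD yields C3 A4 C2 BD C2 A0 C3 A5 C2 A5 C2 BD, which matches the Windows output byte for byte.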
I'm completely at a loss as to how to handle this. Thanks a million in advance to anyone who can point me in the right direction.
Upvotes: 0
Views: 63