Reputation: 3614
Suppose I am in an ANSI-codepage-only environment.
Is this conversion from wide char
to char
:
char ansi_cstr[size_of_ansi_str];
WideCharToMultiByte(CP_ACP, 0, ansi_wstr.c_str(), -1, ansi_cstr, size_of_ansi_str, 0, 0);
std::string ansi_str = std::string(ansi_cstr);
equal to the following:
std::string ansi_str = std::string(ansi_wstr.begin(), ansi_wstr.end());
And is this conversion from char
to wide char:
wchar_t ansi_wcstr[size_of_ansi_str];
MultiByteToWideChar(CP_ACP, 0, ansi_str.c_str(), -1, ansi_wcstr, size_of_ansi_str);
std::wstring ansi_wstr = std::wstring(ansi_wcstr);
equal to
std::wstring ansi_wstr = std::wstring(ansi_str.begin(), ansi_str.end());
Do these two pairs behave identically in an ANSI-codepage-only environment?
Upvotes: 0
Views: 3915
Reputation: 595392
WideCharToMultiByte(CP_ACP, 0, ansi_wstr.c_str(), -1, ansi_cstr, size_of_ansi_str, 0, 0);
IS NOT the same as
std::string ansi_str = std::string(ansi_wstr.begin(), ansi_wstr.end());
WideCharToMultiByte()
performs a real conversion from UTF-16 to ANSI using the codepage that CP_ACP
refers to on that PC (which can differ from PC to PC depending on the system locale settings). std::string(begin, end)
merely loops through the source container type-casting each element to char
and does not perform any codepage conversion at all.
Likewise:
MultiByteToWideChar(CP_ACP, 0, ansi_str.c_str(), -1, ansi_wcstr, size_of_ansi_str);
IS NOT the same as
std::wstring ansi_wstr = std::wstring(ansi_str.begin(), ansi_str.end());
for the same reason. MultiByteToWideChar()
performs a real conversion from ANSI to UTF-16 using the CP_ACP
codepage, whereas std::wstring(begin, end)
simply type-casts the source elements to wchar_t
without any conversion at all.
The type-casts would be equivalent to the API conversions ONLY if the source strings contain nothing but ASCII characters in the 0x00-0x7F
range. If they contain any non-ASCII characters, all bets are off.
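To make the difference concrete without calling the Windows API, here is a minimal portable sketch of what the iterator-range constructor does in the narrowing direction. The names narrow_by_cast, is_ascii and demo_narrowing are hypothetical helpers introduced only for illustration, and the low-byte result relies on how common compilers narrow an out-of-range value to char.

```cpp
#include <cassert>
#include <string>

// Element-wise narrowing, i.e. what std::string(w.begin(), w.end()) does:
// each wchar_t is converted to char on its own, and no code page is consulted.
std::string narrow_by_cast(const std::wstring& w) {
    return std::string(w.begin(), w.end());
}

// Hypothetical helper: true when every code unit is in the ASCII range
// 0x00-0x7F, the only case where the cast and the API conversion agree.
bool is_ascii(const std::wstring& w) {
    for (wchar_t c : w) {
        if (static_cast<unsigned long>(c) > 0x7FUL) return false;
    }
    return true;
}

// Small self-check of the behavior described above.
void demo_narrowing() {
    // Pure ASCII survives the cast unchanged.
    assert(is_ascii(L"abc"));
    assert(narrow_by_cast(L"abc") == "abc");

    // U+20AC (Euro sign) is not ASCII. On common compilers the cast keeps
    // only the low byte, 0xAC, whereas Windows-1252 encodes the Euro sign
    // as the single byte 0x80.
    const std::string euro = narrow_by_cast(L"\u20AC");
    assert(!is_ascii(L"\u20AC"));
    assert(euro.size() == 1);
    assert(static_cast<unsigned char>(euro[0]) == 0xAC);
}
```

A check like is_ascii could be used to decide whether the shortcut is safe for a given string; for anything else, WideCharToMultiByte() is the call that actually consults the code page.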
Upvotes: 1
Reputation: 179779
There's no such thing as the ANSI code page environment. There are dozens.
Your two "shortcut" conversions are incorrect in all of them.
The conversion from ASCII char to UTF-16 wchar_t
would work with your last method, but it fails for the upper half (0x80-0xFF) of most ANSI code pages. It works best with the Western European code page (Windows-1252), where it gets only about 32 characters wrong. For instance, the Euro sign € will always be mis-converted.
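As a rough illustration of the Euro sign case, the sketch below compares element-wise widening with a deliberately tiny excerpt of the Windows-1252 mapping. The function names are hypothetical, the table covers only three of the bytes in 0x80-0x9F, and it is not a substitute for MultiByteToWideChar().

```cpp
#include <cassert>
#include <string>

// Element-wise widening, i.e. what std::wstring(s.begin(), s.end()) does:
// each char is converted to wchar_t on its own, with no code page lookup.
std::wstring widen_by_cast(const std::string& s) {
    return std::wstring(s.begin(), s.end());
}

// Illustrative excerpt of the Windows-1252 to Unicode mapping, NOT a full table.
char32_t cp1252_excerpt_to_unicode(unsigned char byte) {
    switch (byte) {
        case 0x80: return 0x20AC; // Euro sign
        case 0x85: return 0x2026; // horizontal ellipsis
        case 0x99: return 0x2122; // trade mark sign
        default:   return byte;   // only correct for 0x00-0x7F and 0xA0-0xFF
    }
}

// Small self-check of the behavior described above.
void demo_widening() {
    // ASCII text widens correctly with the shortcut.
    assert(widen_by_cast("abc") == L"abc");

    // Byte 0x80 is the Euro sign in Windows-1252, but the cast never
    // produces U+20AC, whatever the signedness of char.
    assert(widen_by_cast("\x80")[0] != static_cast<wchar_t>(0x20AC));
    assert(cp1252_excerpt_to_unicode(0x80) == 0x20AC);
}
```

The point of the excerpt is only that the correct result depends on a per-code-page table, which is exactly the information the shortcut throws away.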
Upvotes: 4