Reputation: 97
There is a question that confuses me. What exactly is the difference between std::codecvt and std::codecvt_utf8? According to the standard library reference, std::codecvt_utf8 is a class derived from std::codecvt, so could you please tell me why the following code throws an exception?
std::wstring_convert<std::codecvt<wchar_t, char, std::mbstate_t>> cvtUtf8 { new std::codecvt_byname<wchar_t, char, std::mbstate_t>(".65001") };
std::wstring_convert<std::codecvt_utf8<wchar_t>> cvt_utf8;
std::string strUtf8 = cvt_utf8.to_bytes(L"你好");
std::string strUtf8Failed = cvtUtf8.to_bytes(L"你好"); // throws an exception: bad conversion
Upvotes: 2
Views: 5065
Reputation: 238401
codecvt is a template intended to be used as a base of a conversion facet for converting strings between different encodings and different sizes of code units. It has a protected destructor, which practically prevents it from being used without inheritance.
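To illustrate the protected destructor point, here is a minimal sketch. The names native_cvt and narrow_ascii are hypothetical and appear in neither the question nor the standard library. Deriving a class gives the facet a publicly accessible destructor, which wstring_convert needs because it owns and deletes the facet. The sketch assumes that pure ASCII text converts unchanged in the default "C" locale. Anything beyond ASCII depends on the system's native encodings.

```cpp
#include <cwchar>   // std::mbstate_t
#include <locale>   // std::codecvt, std::wstring_convert
#include <string>

// Hypothetical wrapper: the derived class has a public destructor, so
// wstring_convert is able to delete the facet it owns.
struct native_cvt : std::codecvt<wchar_t, char, std::mbstate_t> {
    ~native_cvt() = default;
};

// Converts using the native wide/narrow facet. Only ASCII input is
// assumed to be portable here; other characters depend on the locale.
std::string narrow_ascii(const std::wstring& wide) {
    std::wstring_convert<native_cvt> conv;  // default-constructs the facet
    return conv.to_bytes(wide);
}
```

Using std::codecvt<wchar_t, char, std::mbstate_t> directly as the wstring_convert template argument would not compile, since its destructor is not accessible from there.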
The codecvt<wchar_t, char, mbstate_t> specialization in particular is a conversion facet for "conversion between the system's native wide and the single-byte narrow character sets".
codecvt_utf8 inherits from codecvt and is a facet for conversion between "UTF-8 encoded byte string and UCS2 or UCS4 character string". It has a public destructor.
If the system's native wide encoding is not UCS2 or UCS4, or if the system's native narrow encoding isn't UTF-8, then the two facets do different things.
could you please tell me why this function would throw an exception?
Probably because the C++ source file was not encoded in the same encoding as the converter expects the input to be.
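One way to take the source file encoding out of the picture is to spell the characters as universal character names. The sketch below assumes that approach. to_utf8 and hello_utf8 are hypothetical helper names, and the code points U+4F60 and U+597D correspond to the two characters in the question.

```cpp
#include <codecvt>  // std::codecvt_utf8 (deprecated since C++17)
#include <locale>   // std::wstring_convert
#include <string>

// Converts a wide string to UTF-8 with the codecvt_utf8 facet.
std::string to_utf8(const std::wstring& wide) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.to_bytes(wide);
}

// The literal uses \u escapes, so its value does not depend on how the
// compiler decodes the bytes of the source file.
std::string hello_utf8() {
    return to_utf8(L"\u4f60\u597d");
}
```

If this version works while the version with the characters typed directly into the literal fails, the source file encoding (or the compiler's assumption about it) is the likely culprit.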
new std::codecvt<wchar_t, char, std::mbstate_t>(".65001")
codecvt has no constructor that accepts a string.
It might be worth noting that codecvt_utf8 (along with the rest of the <codecvt> header) and wstring_convert have been deprecated since C++17.
What should be used instead of codecvt?
The standard committee chose to deprecate these facilities before providing an alternative. You can either keep using them - with the knowledge that they may be replaced by something else in future, and with the knowledge that they have serious shortcomings that are cause for deprecation - or you can do what you could do prior to C++11: implement the conversion yourself, or use a third-party implementation.
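For the "implement the conversion yourself" option, a hand-written UTF-32 to UTF-8 encoder is fairly short. The following is only a sketch: utf32_to_utf8 is a hypothetical name, and replacing invalid code points with U+FFFD is one possible error policy among several.

```cpp
#include <string>

// Hypothetical helper, not part of the standard library: encodes Unicode
// code points (UTF-32) as UTF-8. Surrogates and values above U+10FFFF are
// replaced with U+FFFD (an assumed error policy).
std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;
        }
        if (cp < 0x80) {                    // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {          // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                            // 4 bytes: 11110xxx + three 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The input here is a std::u32string rather than a std::wstring because wchar_t is 16 bits wide on some platforms, where a wide string would first need surrogate pairs decoded.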
Upvotes: 4