Reputation: 451
I want to know if there is any way to convert a Unicode code point to a string or char in C++11. I've been trying with the extended Latin letter Á (as an example), which has this encoding:
letter: Á
Unicode: 0x00C1
UTF-8 literal: \xc3\x81
I've been able to do so if it's hardcoded as:
const char* c = u8"\u00C1";
But if I get the code point value as a short, how can I do the equivalent to get the char* or std::string 'Á'?
EDIT, SOLUTION:
I was finally able to do so, here is the solution if anyone needs it:
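// Needs <codecvt>, <locale> and <string>.
std::wstring ws;
for (short input : inputList)
{
    // Go through unsigned short so values >= 0x8000 are not sign-extended.
    ws += static_cast<wchar_t>(static_cast<unsigned short>(input));
}
// Convert the wide string to UTF-8 bytes (str is a std::string).
std::wstring_convert<std::codecvt_utf8<wchar_t>> cv;
str = cv.to_bytes(ws);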
Thanks for the comments; they were very helpful.
Upvotes: 1
Views: 2043
Reputation: 71989
The C++11 standard contains codecvt_utf8, which converts between some internal character type (try char16_t if your compiler has it, otherwise wchar_t) and UTF-8 encoding.
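As a rough sketch of how that could look for a single code point, assuming a standard library that still ships <codecvt> (it is deprecated since C++17) and using char32_t as the internal type. The helper name code_point_to_utf8 is made up for illustration:

```cpp
#include <codecvt>  // std::codecvt_utf8 (deprecated since C++17)
#include <locale>   // std::wstring_convert
#include <string>

// Hypothetical helper: encode one Unicode code point as a UTF-8 std::string.
// to_bytes throws std::range_error if the conversion fails.
std::string code_point_to_utf8(char32_t code_point)
{
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    return converter.to_bytes(code_point);
}
```

Calling code_point_to_utf8(0x00C1) should give the two bytes \xc3\x81 from the question.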
Upvotes: 3
Reputation: 6642
The problem is that char is only one byte long, while most Unicode characters need more than one byte (Á, for example, takes two bytes in UTF-8 and one 16-bit unit in UTF-16).
You can still treat UTF-16 data as char*, but you must remember that you are not dealing with an ASCII string (there will be zero bytes in it).
You may have to switch to wchar_t.
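To make the byte counts concrete, here is a minimal hand-written UTF-8 encoder that needs no library support. It is only a sketch: the name encode_utf8 is hypothetical, and it assumes the input is a valid code point (it does not reject surrogates or values above 0x10FFFF).

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode one code point into 1 to 4 UTF-8 bytes.
// Assumes cp is a valid Unicode scalar value; no validation is done.
std::string encode_utf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For Á (0x00C1) this takes the two-byte branch and produces \xc3\x81, which fits in an ordinary std::string without any zero bytes.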
Upvotes: 1