Reputation: 91
What is the default Unicode character encoding used in Windows, specifically in Windows programming (Win32 and WinRT)? When I programmed with the WinAPI, "char" mapped to 1 byte of character storage and "wchar_t" mapped to 2 bytes. If UTF-16 encodes every character beyond 65535 in 4 bytes, how does Windows map these characters into a "wchar_t" data type? I know my question is not entirely clear, but I hope you understand my concern. Thank you very much!
Upvotes: 4
Views: 6323
Reputation: 595349
Windows uses UTF-16LE for all things Unicode (except for MultiByteToWideChar() and WideCharToMultiByte(), which support UTF-7, UTF-8, and UTF-16, amongst other charsets installed in the OS). UTF-16 uses surrogate pairs (two 16-bit values working together) to encode Unicode values above 0xFFFF, so such a character occupies two "wchar_t" elements rather than one. For example, Unicode codepoint U+1D11E is encoded as 0xD834 0xDD1E (bytes 0x34 0xD8 0x1E 0xDD) in UTF-16LE.
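To make the surrogate-pair arithmetic concrete, here is a minimal sketch in portable C. The function names utf16_encode and utf16le_bytes are hypothetical helpers written for illustration, not part of the Windows API, and the sketch uses uint16_t instead of wchar_t so that it behaves the same on platforms where wchar_t is wider than 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16 code units.
   Returns the number of 16-bit units written (1 or 2), or 0 if the value
   is not a valid code point (above U+10FFFF or inside the surrogate range). */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        /* Basic Multilingual Plane: a single 16-bit unit holds the value. */
        out[0] = (uint16_t)cp;
        return 1;
    }
    /* Supplementary plane: subtract 0x10000, then split the remaining
       20 bits into a high (lead) and a low (trail) surrogate. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* bottom 10 bits */
    return 2;
}

/* Hypothetical helper: serialize one code unit in little-endian byte order,
   which is the in-memory layout of UTF-16LE. */
void utf16le_bytes(uint16_t unit, uint8_t out[2])
{
    out[0] = (uint8_t)(unit & 0xFF); /* low byte first */
    out[1] = (uint8_t)(unit >> 8);   /* high byte second */
}
```

Calling utf16_encode(0x1D11E, units) writes 0xD834 and 0xDD1E and returns 2, and passing each unit through utf16le_bytes yields the bytes 0x34 0xD8 0x1E 0xDD, matching the example above.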
Upvotes: 3