Reputation: 1809
Excuse me if this is a stupid question, but it has me confused. Suppose I have an application (whether in C, C++, .NET, or Java) on Windows XP that receives data from a remote machine, and the data contains Chinese characters. If the Chinese characters come out as junk, is it correct to say that Windows has nothing to do with the issue, because Windows uses UTF-16 and can handle Chinese characters properly?
On the other hand, suppose Windows used ASCII as its internal encoding. Would that mean that no application on it could ever display Chinese characters correctly?
Thanks in advance.
Upvotes: 2
Views: 4849
Reputation: 62048
The Windows NT kernel uses UNICODE_STRING for many (or is it most?) named objects (e.g. files). The encoding is UTF-16.
Many user-mode callable APIs expose pairs of almost identical functions, where one in the pair accepts Unicode strings and the other accepts ANSI strings. The ANSI versions end up converting names from ANSI to Unicode.
For example, when you call C's fopen() function, which accepts 8-bit non-Unicode file names, it ends up invoking CreateFileA() (ANSI), and that eventually calls NtCreateFile(), which accepts Unicode file names. One of NtCreateFile()'s parameters, the OBJECT_ATTRIBUTES structure, contains a pointer to a UNICODE_STRING structure.
If you, on the other hand, call MSVC++'s _wfopen() function, it will reach NtCreateFile() through CreateFileW() (Unicode) without the conversion.
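To make the ANSI-to-Unicode step concrete, here is a rough sketch of the kind of conversion the ANSI entry points perform. It is written in Python only so that it runs anywhere; it does not call any Windows API. The function name `ansi_to_utf16` is made up for this illustration, and GBK (code page 936) is just an assumed ANSI code page; on a real system Windows does the conversion itself using whatever code page is active.

```python
def ansi_to_utf16(name_bytes: bytes, ansi_codepage: str = "gbk") -> bytes:
    """Hypothetical helper: decode bytes from the assumed ANSI code page
    and re-encode them as UTF-16LE, the form the kernel works with."""
    return name_bytes.decode(ansi_codepage).encode("utf-16-le")


# "\u4e2d\u6587.txt" is a file name starting with two Chinese characters.
name = "\u4e2d\u6587.txt"

# What an 8-bit caller would pass: 2 bytes per Chinese character in GBK,
# 1 byte per ASCII character.
ansi_name = name.encode("gbk")

# What the kernel-facing call would receive: 2 bytes per character here.
wide_name = ansi_to_utf16(ansi_name)

print(ansi_name.hex(" "))
print(wide_name.hex(" "))
```

If the caller's bytes are not actually in the code page the conversion assumes, the decoded name will be wrong, which is one reason to prefer the wide-character path when you can.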
Upvotes: 5
Reputation: 522005
To store any text in memory and display it on screen, the OS needs to handle that text in some encoding behind the scenes. Which encoding that is shouldn't matter to you. It could handle it as HTML-encoded ASCII for all you know, as long as the APIs accept your text and output the right thing.
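As a small illustration of the "HTML-encoded ASCII" remark, the sketch below stores Chinese text using nothing but ASCII bytes and still recovers it exactly. It is plain Python with standard-library calls only, and it is meant as a thought experiment, not a claim about how any real OS stores text.

```python
import html

text = "\u4e2d\u6587"  # two Chinese characters

# Store the text as pure ASCII by replacing each non-ASCII character
# with a numeric character reference.
stored = text.encode("ascii", errors="xmlcharrefreplace")
print(stored)

# Every stored byte is in the ASCII range.
all_ascii = all(b < 0x80 for b in stored)

# Decoding the references gives back the original characters.
restored = html.unescape(stored.decode("ascii"))
```

So even an ASCII-only internal representation would not by itself rule out Chinese text; it would just be an awkward way to carry it.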
"Windows uses UTF-16 internally" means Windows happens to store and handle text internally as UTF-16. It also supports Chinese text. These two things aren't necessarily connected. That said, using UTF-16 internally does make it easier to support Chinese, which is probably why the Windows engineers chose to go with UTF-16.
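For the original question about junk characters, here is a minimal sketch of how they typically arise: the bytes coming off the wire are decoded with a different encoding than the sender used. The assumption here is that the remote machine sent UTF-8 and the receiving code decoded it as Latin-1; your actual pair of encodings may well differ.

```python
text = "\u4e2d\u6587"  # two Chinese characters

# Assumed: the remote machine encodes the text as UTF-8 before sending.
wire = text.encode("utf-8")

# The receiver guesses the wrong encoding: every byte becomes its own
# character, so the result is six junk characters instead of two.
wrong = wire.decode("latin-1")

# The receiver uses the sender's encoding: the text comes back intact.
right = wire.decode("utf-8")

# Once decoded correctly, representing it as UTF-16 is unproblematic.
as_utf16 = right.encode("utf-16-le")

print(repr(wrong))
print(repr(right))
```

In this sketch the damage happens in the decode step inside the application, before the text ever reaches anything the OS stores as UTF-16.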
Upvotes: 0