Dan Byström

Reputation: 9244

Unicode string literals in C# vs C++/CLI

C#:
char z = '\u201D';   // U+201D, right double quotation mark
int i = (int)z;

C++/CLI:
wchar_t z = '\u201D';   // note: a narrow literal (no L prefix)
int i = (int)z;

In C#, "i" becomes, just as I expect, 8221 ($201D). In C++/CLI, on the other hand, it becomes 65428 ($FF94). Can some kind soul explain this to me?

EDIT: The size of wchar_t cannot be the issue here, because:

C++/CLI:
wchar_t z = (wchar_t)8221;
int i = (int)z;

Here too, i becomes 8221, so wchar_t is indeed up to the job of holding a 16-bit value on my system.
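
For what it's worth, the stray value can be traced step by step. A minimal sketch, assuming MSVC with a Windows-1252 execution character set (which appears to be where $FF94 comes from):

C++/CLI:
char c = '\u201D';   // narrow literal: U+201D becomes byte 0x94 in Windows-1252
wchar_t z = c;       // char is signed here, so 0x94 sign-extends to -108,
                     // which stored in an unsigned 16-bit wchar_t is 0xFF94
int i = (int)z;      // i == 65428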

Upvotes: 1

Views: 2001

Answers (2)

plinth

Reputation: 49179

You want:

wchar_t z = L'\x201D';

from here. In a narrow character literal, a \u escape naming a character that has no encoding in the execution character set yields an implementation-defined value.
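
A minimal sketch of the fix (assuming MSVC with /clr; the L prefix makes it a wide literal, so the full 16-bit value survives):

C++/CLI:
wchar_t z = L'\x201D';   // wide literal: value is exactly 0x201D
int i = (int)z;          // i == 8221, same as the C# result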

Upvotes: 4

kͩeͣmͮpͥ ͩ

Reputation: 7846

According to Wikipedia:

"The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text. The wchar_t type is intended for storing compiler-defined wide characters, which may be Unicode characters in some compilers."

You shouldn't make any assumptions about how it's implemented.
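
A quick way to check on your own toolchain (a sketch; the sizes in the comment are typical, not guaranteed by the standard):

C++:
#include <cstdio>

int main()
{
    // 2 bytes on MSVC/Windows, commonly 4 on GCC/Linux
    printf("sizeof(wchar_t) = %u\n", (unsigned)sizeof(wchar_t));
    return 0;
}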

Upvotes: 0
