Reputation: 153
What is the string terminator sequence for a UTF-16 string?
EDIT:
Let me rephrase the question in an attempt to clarify. How does the call to wcslen() work?
Upvotes: 15
Views: 12392
Reputation: 60902
Unicode does not define string terminators. Your environment or language does. For instance, C strings use 0x0 as a string terminator, whereas .NET strings store the length of the string in a separate value in the String class.
To answer your second question, wcslen looks for a terminating L'\0' character. As I read it, that is a run of 0x00 bytes whose length depends on the compiler (it is as wide as wchar_t), but it will likely be the two-byte sequence 0x00 0x00 if you're using UTF-16 (encoding U+0000, 'NUL').
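To make the scan concrete, here is a minimal sketch of what a wcslen-style function does. The name count_wide_chars is hypothetical and this is not any particular library's implementation; real implementations are usually optimized, but the observable behavior is the same: walk wchar_t by wchar_t until one compares equal to L'\0'.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not the library's actual code): counts the wide
   characters that precede the terminating null wide character. */
size_t count_wide_chars(const wchar_t *s)
{
    assert(s != NULL);

    const wchar_t *p = s;
    /* The comparison is per wchar_t, not per byte, so a single 0x00 byte
       inside a character such as L'a' (0x0061) does not end the scan. */
    while (*p != L'\0') {
        ++p;
    }
    return (size_t)(p - s);
}
```

Calling count_wide_chars(L"ab") should give the same result as wcslen(L"ab"). How many bytes the terminator occupies is sizeof(wchar_t), which is why it differs between compilers.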
Upvotes: 17
Reputation: 108968
7.24.4.6.1 The wcslen function (from the Standard)
...
[#3] The wcslen function returns the number of wide characters that precede the terminating null wide character.
And the null wide character is L'\0'.
Upvotes: 5
Reputation: 1038710
There isn't any. String terminators are not part of an encoding.
For example, if you had the string ab, it would be encoded in UTF-16 (little-endian) as the following sequence of bytes: 61 00 62 00. And if you had 大家, you would get 27 59 B6 5B. So, as you can see, there is no predetermined terminator sequence.
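The byte sequences above can be reproduced with a small sketch. The helper name utf16le_bytes is hypothetical, and it assumes the input is already an array of UTF-16 code units (so it does not handle conversion from other encodings). It writes each code unit low byte first and appends nothing after the last one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes n UTF-16 code units to out as little-endian
   bytes and returns the number of bytes written (always 2 * n).
   No terminator is appended; the encoding itself does not define one. */
size_t utf16le_bytes(const uint16_t *units, size_t n, unsigned char *out)
{
    assert(units != NULL && out != NULL);

    for (size_t i = 0; i < n; ++i) {
        out[2 * i]     = (unsigned char)(units[i] & 0xFF); /* low byte  */
        out[2 * i + 1] = (unsigned char)(units[i] >> 8);   /* high byte */
    }
    return 2 * n;
}
```

Passing the code units 0x0061 0x0062 (ab) yields 61 00 62 00, and 0x5927 0x5BB6 (大家) yields 27 59 B6 5B. If the caller wants a C-style terminated buffer, it has to add the extra 00 00 itself.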
Upvotes: 4