Reputation: 2090
I am developing an application that has to be compatible with different character encodings. To do that, I always use TCHAR* instead of char* to define strings. Therefore I use _tcslen to get the length of my strings.
Today, I saw in my company's version control system that one of my coworkers edited the line where I wrote _tcslen to use _tcsclen instead.
The only link I found that discusses this function is this one, and it doesn't explain the difference between the two functions.
Can someone explain the difference between _tcslen and _tcsclen?
Upvotes: 5
Views: 8815
Reputation: 1044
When the _MBCS preprocessor symbol is defined, _tcslen maps to strlen and _tcsclen maps to _mbslen. When _UNICODE is defined, both generic-text functions map to wcslen.
Upvotes: 2
Reputation: 91885
The _t prefix means that these are text-handling functions (actually macros) that map to different implementations, depending on whether you're compiling for "Unicode" (actually UTF-16) or not.
When you're compiling for Unicode (_UNICODE is defined), they map to the same function, wcslen, which returns the length of the string in wide (two-byte) characters, i.e. UTF-16 code units.
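To make "length in wide characters" concrete: on Windows a wide character is a 16-bit UTF-16 code unit, so a character outside the Basic Multilingual Plane is stored as a surrogate pair and counts as two. The sketch below is portable C rather than Windows code; utf16_unit_count and utf16_code_point_count are hypothetical helpers written for this illustration, and the second one assumes well-formed UTF-16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts 16-bit units up to the terminating 0.
   This is the kind of count wcslen gives when wchar_t is 16 bits wide. */
size_t utf16_unit_count(const uint16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Hypothetical helper: counts code points in well-formed UTF-16 by
   not counting the second (low) unit of each surrogate pair. */
size_t utf16_code_point_count(const uint16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != 0; s++) {
        /* Low surrogates occupy the range 0xDC00-0xDFFF. */
        if (*s < 0xDC00 || *s > 0xDFFF)
            n++;
    }
    return n;
}
```

For the units 0x0061, 0xD83D, 0xDE00 (the letter "a" followed by U+1F600), the first helper reports 3 and the second reports 2.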
When you're not compiling for Unicode (_MBCS is defined), they map to different functions:
_tcslen maps to strlen, which returns the length of the string in bytes, not counting the terminating null. This is intended so that you can allocate buffers of the correct size.
_tcsclen maps to _mbslen, the documentation for which is fairly sparse. I'm guessing, however, that the c in _tcsclen is intended to mean characters.
The difference between characters and bytes is that, in a multi-byte encoding, a particular character can take more than one byte (one or two in the double-byte code pages that the _mbs functions handle). Thus: _tcsclen (_mbslen) tells you how many characters are in the string, which is useful for rendering, and _tcslen (strlen) tells you how many bytes are in the string, which you need for memory allocation.
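Here is a small sketch of the byte-versus-character distinction, written in portable C so it does not need the Microsoft CRT. It assumes Shift-JIS as the multi-byte encoding; sjis_char_count and is_sjis_lead_byte are hypothetical names invented for this example, standing in for what _mbslen would report under a Shift-JIS code page, not the CRT's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if c is the first byte of a two-byte Shift-JIS character. */
static int is_sjis_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical stand-in for _mbslen under a Shift-JIS code page:
   counts characters rather than bytes. Assumes s is valid Shift-JIS. */
size_t sjis_char_count(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    while (*s != '\0') {
        /* A lead byte followed by another byte forms one two-byte character;
           the s[1] check avoids stepping past a truncated string. */
        if (is_sjis_lead_byte((unsigned char)*s) && s[1] != '\0')
            s += 2;
        else
            s += 1;
        count++;
    }
    return count;
}
```

For the Shift-JIS bytes 93 FA 96 7B (two kanji) followed by "AB", strlen reports 6 bytes while sjis_char_count reports 4 characters. A buffer for a copy would need strlen + 1 bytes, whereas the 4 is what matters when laying out text.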
In general, if you're working primarily on Windows, you'll just compile for Unicode and be done with it. You only need to deal with other character encodings if you're talking to another system (reading/writing files, network messages, etc.), and you'll usually convert to and from UTF-8.
Note that when the Windows SDK documentation refers to "multi-byte", it means older multi-byte encodings, such as Shift-JIS, rather than UTF-8 (which is also a multi-byte encoding).
Upvotes: 5