user3162468

Reputation: 435

PHP difference between Single-byte strings and Multi-byte strings

For dummies, in PHP what is the difference between single-byte strings and multi-byte strings and in which situations should we consider one or another?

For single-byte strings (e.g. US-ASCII, ISO 8859 family, etc.) use substr and for multi-byte strings (e.g. UTF-8, UTF-16, etc.) use mb_substr:

// singlebyte strings
$result = substr($myStr, 0, 5);
// multibyte strings
$result = mb_substr($myStr, 0, 5);

For instance, if I plan to develop something to be used in China, do I need to take any special measures because of the Chinese characters? Isn't UTF-8 encoding good enough?

Upvotes: 3

Views: 2119

Answers (1)

ZhukV

Reputation: 3188

The function strlen (a single-byte function) returns the number of bytes in the string, while mb_strlen returns the number of characters.

A single character can take more than one byte (in UTF-8, for example).
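
To make the byte count versus character count distinction concrete, here is a minimal sketch. The sample strings are my own, and it assumes the file is saved as UTF-8 and the mbstring extension is loaded:

```php
<?php
// Assumption: this file is saved as UTF-8 and the mbstring extension is available.
$ascii   = 'hello'; // 5 characters, 1 byte each
$chinese = '你好';   // 2 characters, 3 bytes each in UTF-8

var_dump(strlen($ascii));               // int(5): counts bytes
var_dump(mb_strlen($ascii, 'UTF-8'));   // int(5): counts characters
var_dump(strlen($chinese));             // int(6): counts bytes
var_dump(mb_strlen($chinese, 'UTF-8')); // int(2): counts characters
```

For pure ASCII text both functions agree; they only diverge once a character needs more than one byte.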

For your example:

$myStr = '៘៥឴ឨឆ';
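// Byte-based: takes the first 5 bytes, which cuts a multi-byte character in half
$result = substr($myStr, 0, 5);
// Character-based: takes the first 5 whole characters in the detected encoding
$result = mb_substr($myStr, 0, 5, mb_detect_encoding($myStr));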

In this example substr returns an invalid value, because each character takes more than one byte and the cut lands in the middle of a character, whereas mb_substr returns the correct data.
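
One way to check this yourself is sketched below. The Chinese sample string and the variable names are hypothetical, and the encoding is passed explicitly as 'UTF-8' instead of being detected, which is a choice made for this sketch. It assumes a UTF-8 source file and the mbstring extension:

```php
<?php
// Assumption: this file is saved as UTF-8 and the mbstring extension is available.
$myStr = '你好世界'; // 4 characters, 12 bytes in UTF-8

// Byte-based cut: all 3 bytes of the first character plus 2 of the 3 bytes of the second.
$byteCut = substr($myStr, 0, 5);
// Character-based cut: the first 2 whole characters.
$charCut = mb_substr($myStr, 0, 2, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false): ends in a truncated sequence
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
var_dump($charCut === '你好');                   // bool(true)
```

If the encoding is known in advance, passing it explicitly is likely to be more predictable than relying on detection.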

Upvotes: 3
