nEAnnam

Reputation: 1256

Unicode working in PHP

Can someone explain why all this code works normally if PHP is only supposed to support a 256-character set?

I know that the Content-Type tag makes the browser interpret these characters correctly when it is set to UTF-8. But why does PHP handle them?

echo "匝";

if (preg_match('/啊/', "啊"))
    echo "Match";

if (preg_match('/\w/', "啊"))
    echo "Match";

Upvotes: 1

Views: 249

Answers (2)

zerkms

Reputation: 255055

Compare your code to:

if (preg_match('/^\w$/', "啊"))
    echo "Match";

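The regex /\w/ matches because, without the /u modifier, PCRE treats the subject as a sequence of single bytes rather than as one multibyte character, so \w only needs one of those bytes to look like a word character. For example, 匝 is U+531D; stored as the two bytes 0x53 and 0x1D, the first byte, 0x53, looks like the valid single-byte character S.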
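To see the byte-wise behaviour for yourself, here is a minimal sketch. It assumes the string is UTF-8 and writes the bytes of "啊" (U+554A) explicitly, so the result does not depend on how the source file is saved; the variable name $s is just for illustration.

```php
<?php
// "啊" (U+554A) as explicit UTF-8 bytes (assumption: UTF-8 encoding).
$s = "\xE5\x95\x8A";

echo strlen($s), "\n";   // strlen() counts bytes, not characters: 3
echo bin2hex($s), "\n";  // e5958a

// Without /u, "." consumes exactly one byte, so a three-byte
// string is not seen as a single character.
var_dump(preg_match('/^.$/', $s));   // int(0)

// With /u, PCRE decodes the same three bytes as one code point.
var_dump(preg_match('/^.$/u', $s));  // int(1)
```

The same reasoning explains why an unanchored pattern can "accidentally" match: it only has to succeed on one of the bytes.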

PS: this is a valid way to match a single multibyte letter:

var_dump(preg_match('/^\p{L}$/u', "匝", $matches));
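As a rough illustration of the same \p{L} plus /u combination on a longer string, the sketch below counts letters as code points. The sample string and variable names are hypothetical, and it assumes UTF-8 input and a PCRE build with Unicode property support.

```php
<?php
// Hypothetical sample: "匝", "啊" and "a" as explicit UTF-8 bytes.
$text = "\xE5\x8C\x9D\xE5\x95\x8Aa";

echo strlen($text), "\n";  // 7 (bytes: 3 + 3 + 1)

// /u makes PCRE decode the subject as UTF-8, so \p{L} matches
// whole letters instead of individual bytes.
$count = preg_match_all('/\p{L}/u', $text, $matches);
echo $count, "\n";         // 3 (letters)
```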

Upvotes: 1

Jay Sidri

Reputation: 6406

Most likely your PCRE was compiled with Unicode support enabled (--enable-utf8 --enable-unicode-properties), which allows preg_match() to match Unicode characters.
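One way to check this on a given installation is to try a pattern that needs those features. This is only a sketch: the variable names are made up, and it relies on preg_match() returning false when a pattern fails to compile.

```php
<?php
// PCRE_VERSION is a predefined constant describing the PCRE library in use.
echo "PCRE version: ", PCRE_VERSION, "\n";

// "啊" as explicit UTF-8 bytes.
$char = "\xE5\x95\x8A";

// If UTF-8 support is missing, a /u pattern fails to compile and
// preg_match() returns false (the warning is suppressed with @).
$hasUtf8 = @preg_match('/^.$/u', $char) === 1;

// \p{L} additionally requires Unicode property support.
$hasProps = @preg_match('/^\p{L}$/u', $char) === 1;

var_dump($hasUtf8, $hasProps);
```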

Upvotes: 0
