Todd Jenk

Reputation: 311

substr() on foreign characters gives me black diamonds with white question marks

I'm taking foreign (Japanese) characters from a database and using substr() to limit the length of the string.

However, when I do this, it cuts a character off partway through and leaves behind one of those question marks in a black diamond, the replacement character (�).

Everything (documents, charset, table encoding) is set to UTF-8.

Here is an example of what happens:

$string = "日本最大級のポータルサイト。";
echo substr($string, 0, 10);

Which outputs 日本最�

How do you recommend I find/replace this question mark icon?

Upvotes: 3

Views: 1558

Answers (2)

Alma Do

Reputation: 37365

You cannot use substr() when dealing with UTF-8 strings, since each non-ASCII character there is represented by multiple bytes rather than a single byte, and substr() works with bytes. In your example each Japanese character takes 3 bytes, so 10 bytes gives you three whole characters plus the first byte of a fourth, which is displayed as �. Instead you should use mb_substr(), which counts characters and will safely and correctly return the desired result.

To work with multibyte strings in PHP there is the mbstring extension, and mb_substr() is part of it.
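As a minimal sketch of the difference, assuming the mbstring extension is loaded and the source file itself is saved as UTF-8, the snippet below runs both functions on the string from the question. The variable names $bytes and $chars are only for illustration.

```php
<?php
// Same string as in the question; every character here is 3 bytes in UTF-8.
$string = "日本最大級のポータルサイト。";

// substr() counts bytes: 10 bytes = 3 whole characters (9 bytes) + 1 stray byte.
$bytes = substr($string, 0, 10);

// mb_substr() counts characters; the encoding is passed explicitly so the
// result does not depend on the server's default internal encoding.
$chars = mb_substr($string, 0, 10, 'UTF-8');

var_dump(mb_check_encoding($bytes, 'UTF-8')); // bool(false): cut mid-character
echo $chars, PHP_EOL;                         // 日本最大級のポータル
```

Passing 'UTF-8' as the fourth argument is optional when the internal encoding is already UTF-8, but spelling it out keeps the behaviour the same across servers.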

Upvotes: 5

tamewhale

Reputation: 1061

You should use mb_substr() so long as it is enabled on your server.

http://php.net/manual/en/book.mbstring.php
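If you are not sure whether mbstring is enabled, one possible approach is to check for the function at runtime and fall back to a Unicode-aware regular expression. This is only a sketch: truncate_utf8() is a hypothetical helper name, not part of PHP, and it assumes the input is valid UTF-8 and that the requested length is not negative.

```php
<?php
// Hypothetical helper (not a PHP built-in): return the first $length
// characters of a UTF-8 string without splitting a multibyte character.
function truncate_utf8(string $text, int $length): string
{
    $length = max(0, $length); // assumption: negative lengths are treated as 0

    // Preferred path: mbstring is available.
    if (function_exists('mb_substr')) {
        return mb_substr($text, 0, $length, 'UTF-8');
    }

    // Fallback: the /u modifier makes "." match one whole UTF-8 character,
    // and /s lets it match newlines as well.
    if (preg_match('/^.{0,' . $length . '}/us', $text, $match) === 1) {
        return $match[0];
    }

    // preg_match() fails on invalid UTF-8 input; return an empty string then.
    return '';
}

echo truncate_utf8("日本最大級のポータルサイト。", 10), PHP_EOL; // 日本最大級のポータル
```

You can also confirm the extension from the command line with `php -m`, which lists the loaded modules.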

Upvotes: 0
