lovespring

Reputation: 19559

Multibyte strings vs. normal strings in PHP

How do I know whether a string is a multibyte string, so that I should use mb_strlen instead of strlen?

Upvotes: 1

Views: 321

Answers (4)

turbod

Reputation: 1988

Compare the results of strlen and mb_strlen. If they do not match, the string contains multibyte characters. This only works if mb_strlen is given the string's actual encoding.
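
A minimal sketch of that comparison, assuming the string is known to be UTF-8. The helper name hasMultibyteChars is made up for illustration and is not a PHP built-in:

```php
<?php
// Hypothetical helper, not part of PHP: true when the byte count and the
// character count differ. Assumes $encoding really is the string's encoding.
function hasMultibyteChars(string $s, string $encoding = 'UTF-8'): bool
{
    return strlen($s) !== mb_strlen($s, $encoding);
}

$ascii    = 'hello';
$accented = "h\xC3\xA9llo"; // "héllo": the é takes two bytes in UTF-8

var_dump(hasMultibyteChars($ascii));    // bool(false): 5 bytes, 5 characters
var_dump(hasMultibyteChars($accented)); // bool(true): 6 bytes, 5 characters
```

If the encoding passed in is wrong, the result is meaningless, so this check does not replace knowing the encoding.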

Upvotes: 2

ZZ Coder

Reputation: 75456

No. A string is a string. There is no way to tell if it contains multiple byte characters.

You can guess with something like mb_detect_encoding(), but your mileage may vary depending on the charset and encoding. For example, UTF-8 has a very distinct byte pattern, so you will get very good results. Other encodings, such as GB2312, are really hard to detect.
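
As a rough illustration of such guessing, here is a sketch that uses mb_detect_encoding() in strict mode. The candidate list and the sample bytes are assumptions chosen for the example, and the result is still only a guess:

```php
<?php
// Candidate encodings are tried against the bytes; the list is an
// assumption for this example, not a recommendation.
$candidates = ['UTF-8', 'ISO-8859-1'];

// "héllo" encoded as ISO-8859-1: the lone 0xE9 byte is not valid UTF-8.
$latin1 = "h\xE9llo";
$guess  = mb_detect_encoding($latin1, $candidates, true);
var_dump($guess); // string(10) "ISO-8859-1"

// Bytes that match none of the candidates yield false in strict mode.
$unknown = mb_detect_encoding("\xFF\xFE", ['UTF-8'], true);
var_dump($unknown); // bool(false)
```

Note that almost any byte sequence is valid ISO-8859-1, so a match there says little. This is why single-byte and legacy encodings are hard to tell apart.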

If you are designing a new protocol or system, it's best to carry the encoding information along with the data.

Upvotes: 2

Pekka

Reputation: 449395

You always need to know what encoding a string is in, and whether that encoding is a multibyte one. After all, you need to pass the string's encoding as the second parameter to mb_strlen() to get reliable results, right?
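
To illustrate why the second parameter matters, here is a small sketch in which the same two bytes give different lengths depending on the encoding passed in. The sample bytes are chosen for the example:

```php
<?php
// These two bytes are "é" in UTF-8, but two separate characters
// when interpreted as ISO-8859-1.
$bytes = "\xC3\xA9";

echo strlen($bytes), "\n";                  // 2 (bytes)
echo mb_strlen($bytes, 'UTF-8'), "\n";      // 1 (character)
echo mb_strlen($bytes, 'ISO-8859-1'), "\n"; // 2 (characters)
```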

The encoding of incoming data will always be defined in some way: the page's encoding when processing form data, the database connection's and tables' encoding when processing database data, and so on. It is your job to build the flow so that you always know which data is in which encoding at every point.

The only exception is when you're dealing with arbitrary third-party data that doesn't declare its encoding properly. Then, and only then, is it okay to employ sniffing functions like mb_detect_encoding() and colleagues. Remember that those functions are very error-prone and can give you only an educated guess about what encoding a string is in, not hard, reliable information.

Upvotes: 7

CharlesLeaf

Reputation: 3201

Isn't mb_check_encoding or mb_detect_encoding supposed to be used for that?
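
For what it's worth, mb_check_encoding() only validates a string against an encoding you name; it does not detect one. A short sketch, with sample bytes assumed for the example:

```php
<?php
$utf8   = "h\xC3\xA9llo"; // "héllo" as valid UTF-8
$latin1 = "h\xE9llo";     // "héllo" as ISO-8859-1, which is invalid UTF-8

var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
```

A true result means the bytes are well-formed in that encoding, not that the string was actually written in it.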

Upvotes: 1
