Joseph U.

Reputation: 4607

How do I detect and strip out UTF-8 characters within PHP?

I am generating CSV files. Occasionally the data source will pass along characters with accents, etc., that I would like to strip out. Is there a reasonably straightforward way to detect and strip out UTF-8 characters?

Upvotes: 0

Views: 98

Answers (2)

MatsLindh

Reputation: 52822

If you're sure you're getting UTF-8 as input, use iconv to convert the values to the encoding you're using in your output. Detecting UTF-8 characters isn't failsafe, because the same byte values are also valid ISO-8859-1 characters (or valid in any 8-bit encoding, really).
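As a rough sketch of that idea, the snippet below checks whether a value is well-formed UTF-8 before converting it. The helper name to_output_encoding and the ISO-8859-1 default target are assumptions for illustration, and mb_check_encoding() requires the mbstring extension. Passing the check only means the bytes form valid UTF-8, not that the sender intended UTF-8.

```php
<?php
// Hypothetical helper (not from the answer): convert a value to the output
// encoding only when its bytes are well-formed UTF-8.
function to_output_encoding(string $value, string $target = 'ISO-8859-1'): string
{
    // Well-formedness check only; it cannot prove the data was meant as UTF-8.
    if (!mb_check_encoding($value, 'UTF-8')) {
        return $value; // leave values that are not valid UTF-8 untouched
    }

    // //TRANSLIT asks iconv to approximate characters missing in the target.
    $converted = iconv('UTF-8', $target . '//TRANSLIT', $value);

    // iconv() returns false on failure; fall back to the original value.
    return $converted === false ? $value : $converted;
}
```

Values that fail the check are returned unchanged here; a stricter pipeline might log or reject them instead.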

If you just want to use the regular ASCII set of values (byte values 0-127), you can let iconv convert to the 'ascii' encoding and transliterate:

iconv("utf-8", "ascii//TRANSLIT", "Hei og hå")

will result in

Hei og ha

being returned. The exact transliteration depends on the iconv implementation and the current locale.
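For the "strip out" part of the question, one possible approach is to transliterate first and then drop whatever non-ASCII bytes remain. This is a minimal sketch under assumptions: the function name ascii_only is made up for illustration, and the transliteration step may produce different replacements (or fail) depending on the platform, which is why a byte-level fallback follows it.

```php
<?php
// Hypothetical helper: reduce a string to plain ASCII (bytes 0-127).
function ascii_only(string $value): string
{
    // Try to transliterate accented characters to ASCII look-alikes.
    // The @ suppresses the notice iconv raises on invalid input.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $value);
    if ($ascii === false) {
        $ascii = $value; // conversion failed; strip from the raw bytes instead
    }

    // Remove any remaining byte outside the 7-bit ASCII range.
    return preg_replace('/[^\x00-\x7F]/', '', $ascii) ?? '';
}
```

The regular expression works on bytes (no /u modifier), so it also copes with input that is not valid UTF-8.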

Upvotes: 1

Fluffeh

Reputation: 33512

utf8_decode($string)

This converts UTF-8 to ISO-8859-1. However, characters that are available in UTF-8 but not in ISO-8859-1 cannot be represented and are replaced with a question mark.
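Since utf8_decode() is deprecated as of PHP 8.2, the same conversion can be sketched with mb_convert_encoding(), assuming the mbstring extension is available. The wrapper name utf8_to_latin1 is hypothetical.

```php
<?php
// Hypothetical wrapper: UTF-8 to ISO-8859-1 without utf8_decode().
function utf8_to_latin1(string $value): string
{
    // Arguments are: string, target encoding, source encoding.
    // Characters with no ISO-8859-1 equivalent are substituted
    // (typically with "?", depending on mbstring settings).
    return mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
}
```

As with utf8_decode(), the result is still an 8-bit encoding rather than plain ASCII, so accented characters such as å survive as single bytes.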

Upvotes: 0
