Reputation: 6177
I'd like to match some specific UTF-8 characters, in my case German umlauts. This is our example code:
{UTF-8 file}
<?php
$search = 'ä,ö,ü';
$replace = 'ae,oe,ue';
$string = str_replace(explode(',', $search), explode(',', $replace), $string);
?>
This file is saved as UTF-8. Now I'd like to ensure that the code works independently of the charset the source file is saved in (at least for the most commonly used ones).
Is this the way I should go (using a UTF-8 check)?
{ISO file}
<?php
$search = 'ä,ö,ü';
$search = preg_match('~~u', $search) ? $search : utf8_encode($search);
$replace = 'ae,oe,ue';
$string = str_replace(explode(',', $search), explode(',', $replace), $string);
?>
Upvotes: 1
Views: 236
Reputation: 522005
utf8_encode is not guaranteed to fix your problem at all, since it only converts from ISO-8859-1 and your file could be saved in some other encoding. To be 100% agnostic of your source code file's encoding, denote your characters as raw bytes:
$search = "\xC3\xA4,\xC3\xB6,\xC3\xBC"; // ä, ö and ü in UTF-8
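Combined with the replacement from the question, this could look roughly like the sketch below. The sample input and the $result variable are assumptions for illustration; the input is also written as escaped bytes so that the whole file stays plain ASCII.

```php
<?php
// The umlauts are written as escaped UTF-8 bytes, so this file contains only
// ASCII and behaves the same whatever ASCII-compatible encoding it is saved in.
$search  = ["\xC3\xA4", "\xC3\xB6", "\xC3\xBC"]; // ä, ö and ü in UTF-8
$replace = ['ae', 'oe', 'ue'];

// Sample input (an assumption): "Bär, schön, müde" encoded as UTF-8.
$string = "B\xC3\xA4r, sch\xC3\xB6n, m\xC3\xBCde";

// str_replace compares bytes, so this only matches if $string is UTF-8 too.
$result = str_replace($search, $replace, $string);
```

Passing arrays directly also avoids the explode() calls from the question.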
Note that this still doesn't guarantee what encoding $string will be in; you'll need to know and/or control its encoding separately from the issue at hand. At some point you just have to nail down the encodings you use; you can't be agnostic of them all the way through.
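One way to nail down the encoding of $string is to normalize it to UTF-8 at the point where it enters your code. The following is only a sketch: toUtf8 is a hypothetical helper name, ISO-8859-1 as the fallback is an assumption about where your data comes from, and it requires the mbstring extension.

```php
<?php
// Hypothetical helper (not part of the answer above): return $input as UTF-8.
function toUtf8(string $input, string $fallbackEncoding = 'ISO-8859-1'): string
{
    // Byte sequences that are already valid UTF-8 are passed through unchanged.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }
    // Otherwise treat the input as the fallback encoding. This is an
    // assumption about the data source, not real encoding detection.
    return mb_convert_encoding($input, 'UTF-8', $fallbackEncoding);
}

$latin1 = "B\xE4r";          // "Bär" encoded as ISO-8859-1
$utf8   = toUtf8($latin1);   // now the UTF-8 bytes "B\xC3\xA4r"
```

A validity check like this is a heuristic: some ISO-8859-1 byte sequences also happen to be valid UTF-8, so knowing the real encoding of your input is still the more reliable option.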
Upvotes: 1