Reputation: 880
I've been googling for a while and also searched here, but can't find a solution. I'm using PHP. I'm reading a text string (part of an X509 cert) in which é is encoded as \xC3\xA9 (André => Andr\xC3\xA9).
I've tried MonkeyPhysics's solution:
preg_replace("#(\\\x[0-9A-F]{2})#ei", "chr(hexdec('\\1'))", $string);
but then I get AndrÃ©
I've played around with the replacement part:
mb_convert_encoding('&#' . hexdec('\\1') . ';', 'ISO-8859-1', 'UTF-8')
(I also tried swapping the to_encoding and from_encoding arguments.)
I've also looked at "How to transliterate non-latin scripts?" but got no closer.
Surely this should be a standard conversion?
Upvotes: 1
Views: 261
Reputation: 785581
The e modifier is deprecated as of PHP 5.5 and was removed in PHP 7.0. Use preg_replace_callback instead, with the /u modifier for handling Unicode strings.
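// Single quotes keep each \xNN sequence as literal text, as it appears in the input.
$string = 'His nickname was \xE2\x80\x98the Angel\xE2\x80\x99,
which is kind of a clich\xC3\xA9 in my opinion.';
// Capture only the two hex digits so hexdec() gets clean input; each match becomes one raw byte.
$repl = preg_replace_callback('#\\\\x([0-9A-F]{2})#ui',
function ($m) { return chr(hexdec($m[1])); }, $string);
echo $repl;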
Output:
His nickname was ‘the Angel’,
which is kind of a cliché in my opinion.
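If the decoded text still shows up as something like AndrÃ©, the replacement itself is probably fine and the bytes are simply being displayed as ISO-8859-1 rather than UTF-8. That is an assumption about your setup, not something I can confirm from the question. Here is a minimal sketch for checking it: decodeHexEscapes is a hypothetical helper name, and the final conversion is only needed if your output really has to be ISO-8859-1.

```php
<?php
// Hypothetical helper name, introduced here for illustration only.
function decodeHexEscapes(string $s): string
{
    // Match a literal backslash, "x", and exactly two hex digits.
    // Only the digits are captured, so hexdec() sees clean input.
    $result = preg_replace_callback(
        '/\\\\x([0-9A-Fa-f]{2})/',
        function (array $m): string {
            return chr(hexdec($m[1])); // one escape becomes one raw byte
        },
        $s
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $s;
}

$decoded = decodeHexEscapes('Andr\xC3\xA9');

// Inspect the raw bytes: c3 a9 is the UTF-8 encoding of é.
echo bin2hex($decoded), "\n"; // 416e6472c3a9

// Confirm the result is well-formed UTF-8 (requires the mbstring extension).
var_dump(mb_check_encoding($decoded, 'UTF-8')); // bool(true)

// Assumption: only do this if the page or terminal is really ISO-8859-1.
$latin1 = mb_convert_encoding($decoded, 'ISO-8859-1', 'UTF-8');
echo bin2hex($latin1), "\n"; // 416e6472e9
```

If the bytes are c3 a9 as above, it is usually simpler to keep the string as UTF-8 and declare the charset on output, for example with header('Content-Type: text/html; charset=utf-8'), than to convert it.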
Upvotes: 1