Djave

Reputation: 9329

Convert to utf8 two byte encoded data PHP

I have some data in a database, showing as the below:

Ã¸Ã¥Ã±Ã‰Ã©

Judging from this, Ã¸ should be ø. I'm not sure about a few things, but my research so far suggests that the data is two-byte UTF-8 being displayed as single-byte characters, so one character (ø) shows up as two (Ã and ¸).
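A minimal sketch of how this kind of garbling can arise, assuming the bytes were read as Windows-1252 somewhere along the way (that is an assumption about the pipeline, not something confirmed by the database itself). It uses byte escapes so the result does not depend on how the script file is saved:

```php
<?php
// "ø" (U+00F8) is two bytes in UTF-8: C3 B8.
$original = "\xC3\xB8";
echo bin2hex($original), "\n";

// Treat those two bytes as Windows-1252 and re-encode them as UTF-8.
// Each byte becomes its own character: 0xC3 is "Ã" and 0xB8 is "¸".
$garbled = iconv('Windows-1252', 'UTF-8', $original);
echo bin2hex($garbled), "\n";

// Character count of the garbled string: one letter has become two.
echo iconv_strlen($garbled, 'UTF-8'), "\n";
```

Plain ASCII letters are unaffected because their single bytes mean the same thing in both encodings.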

So how do I convert it? At the moment I have tried the following:

$text = "Ã¸Ã¥Ã±Ã‰Ã©";
echo "Original: " . $text . "<br/>";
echo "iconv detect: " . iconv(mb_detect_encoding($text, mb_detect_order(), true), "UTF-8", $text) . "<br/>";
echo "ASCII convert: " . iconv('ASCII', 'UTF-8//IGNORE', $text) . "<br/>";  
echo "MB Convert: " . mb_convert_encoding($text, "UTF-8", "iso-8859-1") . "<br/>";  

// Wrong way around?

echo "ASCII convert: " . iconv('UTF-8', 'ASCII//IGNORE', $text) . "<br/>";  
echo "MB Convert: " . mb_convert_encoding($text, "iso-8859-1", "UTF-8") . "<br/>";  

Original: Ã¸Ã¥Ã±Ã‰Ã©

iconv detect: Ã¸Ã¥Ã±Ã‰Ã©

ASCII convert:

MB Convert: ÃÂ¸ÃÂ¥ÃÂ±Ãâ°ÃÂ©

ASCII convert:

MB Convert: øåñ�?é

It's worth noting that this only affects special characters. Plain letters such as abcdefghijkl... are all fine; it's just the accented and special characters that come out garbled.

Upvotes: 2

Views: 1307

Answers (1)

Djave

Reputation: 9329

Ah, I have it. In case anyone needs it in future:

$text = "JÃ¸rgen FurÃ¸y HÃ¥kansson SahlÃ©n";

echo "Original: ". $text . "<br/>";
echo "Windows iconv: " . iconv("UTF-8","Windows-1252",$text) . "<br/>"; 

Gives:

Original: JÃ¸rgen FurÃ¸y HÃ¥kansson SahlÃ©n
Windows iconv: Jørgen Furøy Håkansson Sahlén

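So it's the all-important Windows-1252. Converting from UTF-8 to Windows-1252 maps each mis-decoded character back to its original byte, and those bytes, read as UTF-8, are the intended text: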

iconv("UTF-8","Windows-1252",$text)
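If the same column might hold a mix of garbled and clean values, a guarded version could look something like the sketch below. The function name fix_mojibake is made up for illustration, and the code assumes PHP 7 or later (for the type hints and the \u{...} escapes). It only accepts the conversion when the recovered bytes are valid UTF-8, and otherwise returns the input untouched:

```php
<?php
/**
 * Hypothetical helper: undo UTF-8 text that was mis-decoded as Windows-1252.
 * Returns the input unchanged when the repair does not apply.
 */
function fix_mojibake(string $text): string
{
    // Map each character back to its Windows-1252 byte. iconv() returns
    // false (with a diagnostic, silenced here) when a character cannot be
    // converted, which suggests the text was not garbled this way.
    $bytes = @iconv('UTF-8', 'Windows-1252', $text);
    if ($bytes === false) {
        return $text;
    }

    // An empty pattern with the /u flag only matches valid UTF-8 subjects,
    // so this keeps the result only if the recovered bytes are valid UTF-8.
    return preg_match('//u', $bytes) === 1 ? $bytes : $text;
}

$garbled = "J\u{00C3}\u{00B8}rgen";   // "JÃ¸rgen"
$clean   = "J\u{00F8}rgen";           // "Jørgen"

echo fix_mojibake($garbled), "\n";    // repaired
echo fix_mojibake($clean), "\n";      // already correct, left alone
echo fix_mojibake("abc"), "\n";       // plain ASCII, left alone
```

One caveat: Windows-1252 leaves a few byte values undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so characters whose UTF-8 encoding contains one of those bytes may not round-trip cleanly. It is worth testing against a sample of the real data before running this over a whole table.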

Upvotes: 2
