Reputation: 9329
I have some data in a database, showing as the below:
Judging from this, Ã¸ should be a ø. I'm not sure of a few things, but so far my research points toward these being encoded as two-byte UTF-8 but displayed as single bytes, hence one character (ø) shows as two (Ã and ¸).
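To make the "one character shows as two" effect concrete, here is a minimal sketch of how it can arise. It assumes PHP 7+ with the mbstring extension, and the variable names are illustrative only, not taken from the database in question.

```php
<?php
// "ø" is U+00F8, which UTF-8 stores as the two bytes 0xC3 0xB8.
$original = "\xC3\xB8";
$bytes = bin2hex($original);

// Misreading those two bytes as Windows-1252 treats each byte as its own
// character (0xC3 = Ã, 0xB8 = ¸); re-encoding that as UTF-8 gives mojibake.
$garbled = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

echo $bytes, "\n";    // c3b8
echo $garbled, "\n";  // Ã¸
```

If this is what happened to the stored data, the fix is to run the same conversion in the opposite direction.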
So how do I convert it? At the moment I have tried the following:
$text = "Ã¸Ã¥Ã±Ã‰Ã©";
echo "Original: " . $text . "<br/>";
echo "iconv detect: " . iconv(mb_detect_encoding($text, mb_detect_order(), true), "UTF-8", $text) . "<br/>";
echo "ASCII convert: " . iconv('ASCII', 'UTF-8//IGNORE', $text) . "<br/>";
echo "MB Convert: " . mb_convert_encoding($text, "UTF-8", "iso-8859-1") . "<br/>";
// Wrong way around?
echo "ASCII convert: " . iconv('UTF-8', 'ASCII//IGNORE', $text) . "<br/>";
echo "MB Convert: " . mb_convert_encoding($text, "iso-8859-1", "UTF-8") . "<br/>";
Original: Ã¸Ã¥Ã±Ã‰Ã©
iconv detect: Ã¸Ã¥Ã±Ã‰Ã©
ASCII convert:
MB Convert: øÃ¥ñÃâ°Ã©
ASCII convert:
MB Convert: øåñ�?é
It's worth noting that this only affects special characters. Plain letters such as abcdefghijkl... are all fine; only accented and special characters come out mangled.
Upvotes: 2
Views: 1307
Reputation: 9329
Ah, I have it, but in case anyone needs it in future:
$text = "JÃ¸rgen FurÃ¸y HÃ¥kansson SahlÃ©n";
echo "Original: ". $text . "<br/>";
echo "Windows iconv: " . iconv("UTF-8","Windows-1252",$text) . "<br/>";
Gives:
Original: JÃ¸rgen FurÃ¸y HÃ¥kansson SahlÃ©n
Windows iconv: Jørgen Furøy Håkansson Sahlén
So it's the all-important Windows-1252:
iconv("UTF-8","Windows-1252",$text)
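If the data is a mix of garbled and already-correct strings, a guarded wrapper around that call may be safer than converting everything blindly. The sketch below is one possible approach, not part of the original fix: `fixMojibake` is a hypothetical helper name, and it assumes PHP 7+ with the iconv and mbstring extensions.

```php
<?php
// Hypothetical helper: undo one round of "UTF-8 bytes misread as
// Windows-1252" mojibake, leaving other strings untouched.
function fixMojibake(string $text): string
{
    // iconv() returns false (and raises a notice) when a character has no
    // Windows-1252 equivalent, so suppress the notice and check the result.
    $candidate = @iconv('UTF-8', 'Windows-1252', $text);

    // Accept the repair only if the recovered bytes are valid UTF-8.
    if ($candidate !== false && mb_check_encoding($candidate, 'UTF-8')) {
        return $candidate;
    }
    return $text;
}

// Escapes are used so the example does not depend on the file's encoding.
echo fixMojibake("J\u{00C3}\u{00B8}rgen"), "\n";  // Jørgen (repaired)
echo fixMojibake("J\u{00F8}rgen"), "\n";          // Jørgen (already correct, unchanged)
```

The validity check is only a heuristic. A string that legitimately contains a sequence such as "Ã©" would still be converted, and some characters may not round-trip if the original misreading dropped or replaced bytes, so it is worth spot-checking results before writing them back to the database.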
Upvotes: 2