Andrewboy

Reputation: 364

PHP: reading a line from a CSV file returns characters in the wrong charset

I have a CSV file. If I set the charset to ISO-8859-2 (Eastern Europe) in LibreOffice Calc, it renders the characters correctly. However, the server's locale is set to EN-UK, and in PHP I cannot read the characters correctly. For example, it returns T�t instead of Tót.

I tried many things like:

echo (mb_detect_encoding("T�t","ISO-8859-2","UTF-8"));

I know the character probably does not exist in UTF-8, but I tried anyway.

I also tried to set the correct charset in the header:

header('Content-Type: text/html; charset=iso-8859-2');
echo "T�th";

but it returns TÄĹźËth instead of Tóth.

Please help me solve this. Thanks in advance.

Upvotes: -1

Views: 661

Answers (1)

jspit

Reputation: 7703

I advise against setting the header to charset=iso-8859-2. It is usual to work with UTF-8. If the data comes in a different encoding, it should be converted to UTF-8 first and then processed as CSV. The following example code can be kept this simple because the newline characters are encoded identically in UTF-8 and ISO-8859-2, so the file can be read line by line before conversion.

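$fileName = "yourpath/Iso8859_2.csv";
$fp = fopen($fileName, "r");
if ($fp === false) {
  die("Cannot open $fileName");
}
// Compare with false explicitly so a final line "0" without a trailing newline is not mistaken for end of file.
while (($row = fgets($fp)) !== false) {
  // Convert the raw ISO-8859-2 line to UTF-8 before parsing it as CSV.
  $strUtf8 = mb_convert_encoding($row, 'UTF-8', 'ISO-8859-2');
  $arr = str_getcsv($strUtf8);
  var_dump($arr);
}
fclose($fp);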
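
A possible variation, not part of the original answer: if the file may contain quoted fields that span several lines, reading with fgets splits such a field apart. One way around that is to let a stream filter do the conversion and let fgetcsv do the parsing. This is only a sketch. It assumes the iconv extension is available, and the temporary file and its sample rows are made up so the example runs on its own.

```php
<?php
// Sketch: decode ISO-8859-2 on the fly with a read filter, then parse with fgetcsv().
// Assumption: the iconv extension is loaded (it provides the convert.iconv.* filters).

// Hypothetical sample file, created here only to keep the example self-contained.
$fileName = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($fileName, "name,city\nT\xF3th,\"Gy\xF5r\"\n"); // 0xF3 = ó, 0xF5 = ő in ISO-8859-2

$fp = fopen($fileName, 'r');
if ($fp === false) {
    throw new RuntimeException("Cannot open $fileName");
}

// Everything read from $fp after this call is already UTF-8.
stream_filter_append($fp, 'convert.iconv.ISO-8859-2/UTF-8', STREAM_FILTER_READ);

$rows = [];
while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($fp);
unlink($fileName);

var_dump($rows);
```

The trade-off is an extra dependency on iconv in exchange for correct handling of multi-line fields. For simple one-record-per-line files, the fgets version above is enough.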

The exact encoding of the CSV file must be known. mb_detect_encoding is not suitable for determining the encoding of a file.
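
To illustrate why the encoding has to be known rather than guessed, here is a small sketch. The same byte is a valid character in both ISO-8859-1 and ISO-8859-2, so nothing in the data itself says which decoding is right. What can be checked reliably is whether a string is already valid UTF-8. The helper name toUtf8 and its default encoding are assumptions made for this example.

```php
<?php
// The byte 0xF5 is valid in both single-byte encodings but means different characters.
$byte = "\xF5";
$asLatin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1'); // õ (U+00F5)
$asLatin2 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-2'); // ő (U+0151)
var_dump($asLatin1 === $asLatin2); // the two decodings differ

// Hypothetical helper: leave valid UTF-8 untouched, otherwise convert
// from an encoding that the caller must know in advance.
function toUtf8(string $raw, string $knownEncoding = 'ISO-8859-2'): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }
    return mb_convert_encoding($raw, 'UTF-8', $knownEncoding);
}

$fromFile = "T\xF3th";            // "Tóth" as stored in an ISO-8859-2 file
$utf8 = toUtf8($fromFile);        // converted, because 0xF3 followed by "t" is not valid UTF-8
$unchanged = toUtf8("T\u{00F3}th"); // already UTF-8, returned as-is
var_dump($utf8 === $unchanged);
```

The validity check only protects against converting text twice. It does not identify the source encoding, which still has to come from whoever produced the file.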

Upvotes: 1
