user006779

Reputation: 1031

Problem with fgetcsv() and Unicode

I have some code that reads a CSV file containing Unicode characters. On localhost it reads the file without any problem, but after I upload the code to my host the text column comes back empty. Why does this happen, and what is the solution?

while (($data = fgetcsv($fin, 5000, ",")) !== FALSE)
{
    var_dump($data[0]);  // on the host the output is `string(0) ""`, but locally I can see the text
    var_dump($data[1]);  // $data[1] is an integer and I can see its output
}

Upvotes: 3

Views: 12593

Answers (3)

dogan

Reputation: 11

I used iconv() to convert the encoding, and it works almost perfectly in my situation. I hope it helps someone else too.

$csvFile = fopen('file/path', "r");
// Read and discard the first row.
fgetcsv($csvFile);
while (($row = fgetcsv($csvFile, 1000, ";")) !== FALSE) {
    for ($c = 0; $c < count($row); $c++) {
        // Convert each field from Windows-1252 to UTF-8 before printing it.
        echo iconv("Windows-1252", "UTF-8", $row[$c]);
    }
}
fclose($csvFile);
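
If some files are already UTF-8, converting them a second time would garble the text. Here is a minimal sketch that converts a field only when it is not valid UTF-8. The helper names toUtf8() and readCsvAsUtf8() are made up for illustration, Windows-1252 is just an assumed source encoding, and the mbstring and iconv extensions must be enabled. The check is a heuristic: a Windows-1252 byte sequence that happens to be valid UTF-8 is left untouched.

```php
<?php
// Hypothetical helper: return $field as UTF-8.
// $sourceEncoding is an assumption about how the file was saved.
function toUtf8(string $field, string $sourceEncoding = 'Windows-1252'): string
{
    // Already valid UTF-8 (this includes plain ASCII): leave it alone.
    if (mb_check_encoding($field, 'UTF-8')) {
        return $field;
    }
    $converted = iconv($sourceEncoding, 'UTF-8', $field);
    // iconv() returns false on failure; fall back to the raw bytes.
    return $converted === false ? $field : $converted;
}

// Hypothetical helper: read a whole CSV file into an array of UTF-8 rows.
function readCsvAsUtf8(string $path, string $delimiter = ';'): array
{
    $rows = [];
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $rows;
    }
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        // fgetcsv() returns [null] for a blank line; skip those.
        if ($row === [null]) {
            continue;
        }
        $rows[] = array_map('toUtf8', $row);
    }
    fclose($handle);
    return $rows;
}
```

Calling readCsvAsUtf8('file/path') would then give rows that can be echoed directly on a UTF-8 page.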

Upvotes: 1

timdream

Reputation: 5922

Note:

Locale setting is taken into account by this function. If LANG is e.g. en_US.UTF-8, files in one-byte encoding are read wrong by this function.

http://php.net/fgetcsv

One possible solution is to use setlocale().
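
A rough sketch of what that could look like, assuming the file is UTF-8 and the host has at least one UTF-8 locale installed. The locale names in the list are guesses (they differ from host to host), and useUtf8Ctype() is a made-up helper name.

```php
<?php
// Hypothetical helper: switch LC_CTYPE to the first UTF-8 locale the host offers.
// Returns the name of the locale that was applied, or null if none is available.
function useUtf8Ctype(array $candidates = ['en_US.UTF-8', 'en_US.utf8', 'C.UTF-8']): ?string
{
    // setlocale() accepts an array of names and tries them in order;
    // it returns the applied locale name, or false if none could be set.
    $applied = setlocale(LC_CTYPE, $candidates);
    return $applied === false ? null : $applied;
}
```

Call useUtf8Ctype() once before the fgetcsv() loop. If it returns null, the host has none of these locales installed (on a Linux host, `locale -a` lists what is available), and converting the data or asking the hosting provider may be the only option.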

Upvotes: 8

Jaro

Reputation: 3887

One possible cause is a UTF byte order mark (BOM) at the start of the file. The BOM is the code point U+FEFF, which UTF-8 encodes as three bytes (0xef, 0xbb and 0xbf) that sit at the very beginning of the text file. In UTF-16 it is used to indicate the byte order. In UTF-8 it is not really necessary.

So you need to detect those three bytes and remove the BOM. Below is a simplified example of how to detect and remove them.

$str = file_get_contents('file.utf8.csv');
$bom = pack("CCC", 0xef, 0xbb, 0xbf);
if (0 == strncmp($str, $bom, 3)) {
    echo "BOM detected - file is UTF-8\n";
    $str = substr($str, 3);
}
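
Since the question reads the file through a handle with fgetcsv(), the same check can be done on the handle instead of loading the whole file into a string. This is only a sketch, and openCsvSkippingBom() is a made-up helper name.

```php
<?php
// Hypothetical helper: open a CSV file and leave the handle positioned
// just after a UTF-8 BOM, if the file starts with one.
// Returns the file handle, or false if the file cannot be opened.
function openCsvSkippingBom(string $path)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return false;
    }
    // Read the first three bytes and compare them with the UTF-8 BOM.
    if (fread($handle, 3) !== "\xEF\xBB\xBF") {
        // No BOM: go back to the start so the first field is not cut off.
        rewind($handle);
    }
    return $handle;
}
```

The returned handle can be passed straight to fgetcsv(), and the first field of the first row will no longer start with the BOM bytes.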

That's all

Upvotes: 2
