Justin808

Reputation: 21522

PHP and UTF-8 string comparisons

I'm using this bit of code and things like it quite a lot to parse a file...

if ($dataLines[0] == "0 HEAD" && 
    ($dataLines[count($dataLines) - 1] == "0 TRLR" ||
     $dataLines[count($dataLines) - 2] == "0 TRLR")) {
    // More Code Here
}

I've added the following else for debugging...

} else {
    $this->error("import(): File is not a gedcom datafile: " . $filename);
    $this->debug("import(): Lines: " . count($dataLines));
    $this->debug("import(): Lines: dataLines[0] = [" . $dataLines[0] ."]");
    $this->debug("import(): Lines: dataLines[count($dataLines) - 1] = [" . $dataLines[count($dataLines) - 1] ."]");
}

When I parse an ANSI file, things work. I've been given a file in UTF-8 and things break. My output is:

Starting gedcom read
import(): File is not a gedcom datafile: /Users/jzaun/Development/www/assets/trees/greek/tree.ged
import(): Lines: 10712
import(): Lines: dataLines[0] = [0 HEAD ]

and I get an error too:

PHP Fatal error: Uncaught exception 'ErrorException' with message 'Array to string conversion' in /Users/jzaun/Development/www/classes/App/Gedcom.php:478 Stack trace:

To load in the file I'm using:

function file_get_contents_utf8($fn) {
    $content = file_get_contents($fn);
    return mb_convert_encoding($content, 'UTF-8', mb_detect_encoding($content));
}

$data = $this->file_get_contents_utf8($filename);
$dataLines = explode("\n", trim($data));
if (count($dataLines) == 1) {
    $dataLines = explode("\r", trim($data));
}

I'm guessing I am either loading the file wrong or I shouldn't be doing things like $dataLines[0] == "0 HEAD". How should I go about parsing the file so it works with UTF-8?

Upvotes: 0

Views: 126

Answers (1)

Pekka

Reputation: 449613

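The invisible character sequence at the start of your first line (the bytes 0xEF 0xBB 0xBF, which often show up as ï»¿ when the file is viewed as Latin-1) is the UTF-8 Byte Order Mark (BOM). It's likely your problem: it changes the first line, so your comparison with "0 HEAD" fails.

You will have to ignore or remove the first three bytes of the file if they equal 0xEF 0xBB 0xBF. See this answer for one example.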
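
As a rough sketch of that idea (not taken from the linked answer), something like the following could work. The helper names strip_utf8_bom and split_lines are made up for illustration, and the sample string only stands in for the real file contents. Trimming each line is an extra assumption on my part: it guards against a stray "\r" left over from CRLF line endings, which would break the comparison in the same way.

```php
<?php
// Hypothetical helper (name invented for this sketch):
// remove a leading UTF-8 BOM if one is present.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF"; // the three bytes of the UTF-8 BOM
    if (strncmp($text, $bom, 3) === 0) {
        return (string) substr($text, 3);
    }
    return $text;
}

// Hypothetical helper: split on any common line ending and trim every
// line, so leftover "\r" characters cannot break string comparisons.
function split_lines(string $text): array
{
    $lines = preg_split('/\r\n|\r|\n/', trim(strip_utf8_bom($text)));
    return array_map('trim', $lines);
}

// Stand-in for the file contents: BOM + CRLF line endings.
$sample = "\xEF\xBB\xBF0 HEAD\r\n1 CHAR UTF-8\r\n0 TRLR\r\n";
$dataLines = split_lines($sample);

var_dump($dataLines[0] === "0 HEAD");                      // bool(true)
var_dump($dataLines[count($dataLines) - 1] === "0 TRLR");  // bool(true)
```

In your loader you would pass the result of file_get_contents_utf8() through split_lines() instead of calling explode() directly. The existing checks against "0 HEAD" and "0 TRLR" should then behave the same for files with and without a BOM.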

Upvotes: 1
