Faiz

Reputation: 283

Eliminating a special character issue in PHP/XML

I don't know much about this topic. I have a feed that runs every day. It had been running fine for months until yesterday, when it threw this error (in the output XML document):

<b>Warning</b>:  simplexml_load_string(): Entity: line 93191: parser error : Entity 'frasl' not defined in <b> folderpath

<b>Warning</b>:  simplexml_load_string():           &lt;g:color&gt;Gold &amp;frasl; White&lt;/g:color&gt; in 

Having looked in the feed document, there's a &frasl; entity, which is simply a forward slash /. I think this is what's causing the problem; it's the first time I have seen it in our source document, from which the feeds are made. I looked online for this issue and this is the answer I deemed appropriate:

The code I now have is:

function sxe($feed)
{   
$feed = file_get_contents($feed);
foreach ($http_response_header as $header)
{   
    if (preg_match("&frasl;", $header, $m))
    {   
        switch (strtolower($m[1]))
        {   
            case 'utf-8':
                // do nothing
                break;

            case 'iso-8859-1':
                $feed = utf8_encode($feed);
                break;

            default:
                $feed = iconv($m[1], 'utf-8', $feed);
        }
        break;
    }
}

return simplexml_load_string($feed);

}

I changed it a bit to suit my needs. It now outputs errors in the XML:

1) It swaps the character < for &lt;, > for &gt; and " for &quot;.

2) The errors are:

Undefined variable: http_response_header in <b> folderpath

Invalid argument supplied for foreach() in <b> folderpath

Does anyone know what I can do to resolve the issue?

Upvotes: 0

Views: 83

Answers (1)

trincot

Reputation: 350127

The fix you tried no longer makes sense after your edit to it:

if (preg_match("&frasl;", $header, $m))
{   
    switch (strtolower($m[1]))
    {   
        case 'utf-8':
            // do nothing
            break;

        case 'iso-8859-1':
            $feed = utf8_encode($feed);
            break;

        default:
            $feed = iconv($m[1], 'utf-8', $feed);
    }
    break;
}

Because:

  • The first argument of preg_match must be a valid regular expression, which &frasl; is not. The call will produce a warning and not assign a value to $m, so $m will be null;
  • It is very unlikely that the response headers will contain the string &frasl;;
  • There is no way you can get a match this way with utf-8 or iso-8859-1, so the default case will always be applied, ruining the format of your $feed.
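
To illustrate the first point, here is a minimal sketch of what happens when "&frasl;" is passed as the pattern. The header value is a made-up example, and the @ operator is used only to keep the demonstration quiet:

```php
<?php
// Hypothetical header line; a real one would come from $http_response_header.
$header = 'Content-Type: text/xml; charset=utf-8';
$m = null;

// PHP reads "&" as the pattern delimiter and then finds no closing "&",
// so the pattern cannot be compiled. @ hides the resulting warning here.
$result = @preg_match("&frasl;", $header, $m);

var_dump($result); // false: the pattern is invalid, so nothing was matched
var_dump($m);      // $m was never filled in
```

Because $m is never populated, $m[1] in the switch has no usable value and the code falls through to the default branch.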

The code you copied only makes sense if you leave the preg_match arguments as they are.
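
For context, a charset-detecting pattern of that kind might look like the sketch below. The pattern and the header lines are assumptions for illustration, not necessarily what the code you copied contained:

```php
<?php
// Hypothetical header lines; real ones come from $http_response_header
// after file_get_contents() has fetched an http:// or https:// URL.
$headers = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/xml; charset=ISO-8859-1',
];

$charset = null;
foreach ($headers as $header) {
    // Illustrative pattern: capture the charset name from a Content-Type header.
    if (preg_match('/^Content-Type:.*charset=([\w-]+)/i', $header, $m)) {
        $charset = strtolower($m[1]); // $m[1] is the first capture group
        break;
    }
}

echo $charset; // iso-8859-1
```

With a pattern like this, $m[1] holds the declared charset, which is what the switch statement expects to compare against.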

The real problem seems to be that the content of your XML document contains &frasl;. This is an HTML named entity that XML does not predefine, so the parser does not recognise it. You could replace it with the equivalent numeric character reference &#8260;.

Code:

$feed = str_replace('&frasl;', '&#8260;', $feed);
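
As a rough end-to-end sketch, the replacement would run on the raw string before it is parsed. The sample element and the g: namespace URI below are assumptions standing in for your real feed:

```php
<?php
// Hypothetical one-item feed; the g: namespace is declared so the sample parses.
$feed = '<item xmlns:g="http://base.google.com/ns/1.0">'
      . '<g:color>Gold &frasl; White</g:color>'
      . '</item>';

// Swap the HTML-only entity for a numeric reference that any XML parser accepts.
$feed = str_replace('&frasl;', '&#8260;', $feed);

$xml = simplexml_load_string($feed);

// Read the g:color element; children('g', true) selects by namespace prefix.
$color = (string) $xml->children('g', true)->color;
echo $color;
```

Doing the replacement on the string keeps the rest of the feed untouched, so no charset conversion is needed for this particular error.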

Upvotes: 1
