TheFrack

Reputation: 2873

Issue parsing XML, unknown encoding

I'm trying to read an XML feed. I'm not sure the encoding is correct, but it is declared as UTF-8, and when I try to parse it in PHP with SimpleXML, it fails on "BöðVar" (note the special "o" characters).

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$XMLOutputXMLObj = simplexml_load_string($xml_string);
if ($XMLOutputXMLObj !== false)
{
    // do stuff
}

This is all I get for an error:

Entity 'ouml' not defined

Entity 'eth' not defined

I tried using mb_convert_encoding in various ways, but that failed.

How can I resolve this issue for any character, i.e. WITHOUT manually replacing ö with &#246;?

Even better: is there a way to make SimpleXML not care what it is parsing, as long as the tags are intact?

Thanks

Upvotes: 0

Views: 834

Answers (1)

pho

Reputation: 530

Have you tried wrapping the node's text/value in a CDATA section, i.e. putting <![CDATA[ before it and ]]> after it? For example:

<?xml version="1.0" encoding="UTF-8"?>
<fmsdata>
  <result><![CDATA[Success !@#$%^&*()]]></result>
</fmsdata>
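As a rough illustration of the CDATA suggestion, here is a minimal PHP sketch. It assumes the feed can be changed so that the value is wrapped in CDATA, and it reuses the element names from the example above. The sample value "B&ouml;&eth;Var" is an assumption based on the error messages in the question. Inside CDATA the parser treats "&ouml;" as literal text, so the named entities still have to be decoded afterwards, here with html_entity_decode.

```php
<?php
// Sample feed (assumed): the value is wrapped in CDATA, so the parser
// does not try to resolve &ouml; or &eth; as XML entities.
$xml_string = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<fmsdata>
  <result><![CDATA[B&ouml;&eth;Var]]></result>
</fmsdata>
XML;

libxml_use_internal_errors(true);
$doc = simplexml_load_string($xml_string);

$decoded = null;
if ($doc === false) {
    // Print whatever libxml collected, e.g. "Entity 'ouml' not defined".
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
} else {
    // Casting to string yields the literal CDATA text "B&ouml;&eth;Var".
    $raw = (string) $doc->result;
    // Turn the HTML named entities into UTF-8 characters.
    $decoded = html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    echo $decoded, "\n";
}
```

This only helps if whoever produces the feed can add the CDATA wrapper. Without it, the undefined entities still cause the parse error.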

Upvotes: 2
