Reputation: 621
I'm trying to add XML signing to my API response, but I can't load the response into a DOMDocument because it contains 'unknown' XML entities (e.g. &auml; and other converted HTML characters).
I want to return the response as is, but the signing library needs a DOMDocument. Loading the string throws warnings about the entities, and documentElement is then not set.
I could parse it as HTML instead, but that adds a duplicate <xml> tag and <p> tags, even with the LIBXML_HTML_NOIMPLIED flag.
I could also pass LIBXML_DTDLOAD, which lets it parse correctly, but then my XML entities file is printed directly into the response, which makes the response many times longer.
So, is there a way to make DOMDocument load my XML string without it failing on the unknown entities and without touching the XML string, or to parse the DTD without turning the result into a standalone XML document?
I'm currently doing the following, which does not parse correctly.
// minified XML string
$output = '<?xml version="1.0" standalone="no"?>
<!DOCTYPE api_result [
<!ELEMENT api_result (#PCDATA)>
<!ENTITY % xhtml-all SYSTEM "http://server.local/entities/xhtml-all.ent">
%xhtml-all;
]>
<api_result>&auml;</api_result>';
$xml = new DOMDocument();
// Whitespace must be preserved
$xml->preserveWhiteSpace = true;
$xml->formatOutput = false;
$xml->strictErrorChecking = false;
$xml->loadXML($output);
// Canonicalize the content, exclusive and without comments
if (!$xml->documentElement) { // this fails to have any content
    throw new XmlSignerException('Undefined document element');
}
Upvotes: 2
Views: 73
Reputation: 22959
How about decoding your output entities and re-encoding them as XML? You can do it easily with preg_replace_callback() like this:
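$converted = preg_replace_callback(
    '/&[a-zA-Z][a-zA-Z0-9]*;/', // match named entities such as &auml;
    function ($m) {
        // Decode the HTML entity to its character, then re-escape only
        // the characters that XML itself reserves (&, <, >).
        return htmlentities(html_entity_decode($m[0]), ENT_XML1);
    },
    $output
);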
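As a rough illustration of this first workaround end to end, here is a minimal sketch that converts a sample string and then loads it. The sample input is hypothetical (it leaves out the DOCTYPE so the sketch runs offline), and the explicit ENT_QUOTES | ENT_HTML5 and 'UTF-8' arguments are my own assumption to make the decoding independent of the PHP version's defaults.

```php
<?php
// Hypothetical sample response; the real one would also carry the DOCTYPE.
$output = '<?xml version="1.0"?><api_result>&auml; &amp; &ouml;</api_result>';

$converted = preg_replace_callback(
    '/&[a-zA-Z][a-zA-Z0-9]*;/',
    function ($m) {
        // Decode the named entity to a UTF-8 character (assumed flags).
        $char = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
        // Re-escape only what XML reserves, so &amp; survives the round trip.
        return htmlentities($char, ENT_XML1, 'UTF-8');
    },
    $output
);

$xml = new DOMDocument();
$xml->preserveWhiteSpace = true;
$loaded = $xml->loadXML($converted);
```

After this, $xml->documentElement should be set and the text content should hold the literal characters rather than entity references. Keep in mind that the signed bytes then differ from the original string, which is the limitation noted at the end of this answer.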
Alternatively, you could define the entity in your DOCTYPE:
<!DOCTYPE api_result [
<!ELEMENT api_result (#PCDATA)>
<!ENTITY % xhtml-all SYSTEM "http://server.local/entities/xhtml-all.ent">
<!ENTITY auml "ä">
%xhtml-all;
]>
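A minimal sketch of how that second variant might be loaded, assuming the default loadXML() options (no LIBXML_DTDLOAD or LIBXML_NOENT), under which I would not expect the external file to be fetched. The entity value is written as the character reference &#228; here, which is my own choice to keep the sketch independent of the source file's encoding.

```php
<?php
// The internal subset declares auml itself, so the parser does not need
// the external xhtml-all.ent file to resolve &auml;.
$output = '<?xml version="1.0" standalone="no"?>
<!DOCTYPE api_result [
<!ELEMENT api_result (#PCDATA)>
<!ENTITY % xhtml-all SYSTEM "http://server.local/entities/xhtml-all.ent">
<!ENTITY auml "&#228;">
%xhtml-all;
]>
<api_result>&auml;</api_result>';

$xml = new DOMDocument();
$xml->preserveWhiteSpace = true;
// Default options: no DTD loading and no entity substitution requested.
$loaded = $xml->loadXML($output);
```

Every entity used in the response would need its own declaration like this, so this only scales if the set of entities is small and known in advance.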
Note that both of my workarounds involve modifying your XML, which you've clarified is not acceptable. Unfortunately, I don't think there are any other options if you need to use DOMDocument.
Upvotes: 0