Reputation: 897
I have an XML/SVG. Part of it:
<text id="p6_segmentMainLabel5-outer" class="p6_segmentMainLabel-outer" style="font-size: 11px; font-family: arial; fill: rgb(170, 170, 170);">BüG [349]</text>
There is a special character inside it. How do I clean the entire XML of such special characters without escaping all the "<" and ">" to &lt; and &gt;? I could build an array of all the characters I want to convert, but I would prefer a method that converts everything except <, > and quotes, so that I end up with clean XML.
Upvotes: 0
Views: 264
Reputation: 19482
Encoding the umlauts does not make your XML "cleaner"; it only makes it harder to read.
There is no need to encode umlauts and other non-ASCII characters unless you want to create ASCII-only XML, which is rarely necessary.
Use UTF-8 as the encoding for your XML and you will be fine 99% of the time.
If you do need ASCII, specify the encoding in the XML API (the default is UTF-8):
$dom = new DOMDocument('1.0', 'ASCII');
$dom
    ->appendChild($dom->createElement('text'))
    ->appendChild($dom->createTextNode('ÄÖÜ'));
echo $dom->saveXml();
Output:
<?xml version="1.0" encoding="ASCII"?>
<text>&#196;&#214;&#220;</text>
It is possible to load the XML into a DOM and copy all the nodes to a new DOM defined to use ASCII:
$source = new DOMDocument();
$source->loadXml(
    '<?xml version="1.0" encoding="utf-8" ?><text>ÄÖÜ</text>'
);
$target = new DOMDocument('1.0', 'ASCII');
$target->appendChild(
    $target->importNode(
        $source->documentElement, TRUE
    )
);
echo $target->saveXml();
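A possibly shorter variant of the same idea, as a minimal sketch: instead of copying the nodes, it changes the writable `encoding` property of the document that was already loaded. The shortened SVG fragment and the variable names are assumptions for illustration, and the source file itself is assumed to be saved as UTF-8.

```php
<?php
// Shortened fragment from the question (assumed to be UTF-8).
$svg = '<text id="p6_segmentMainLabel5-outer" class="p6_segmentMainLabel-outer">BüG [349]</text>';

$document = new DOMDocument();
$document->loadXML($svg);

// Switch the output encoding of the loaded document to ASCII.
$document->encoding = 'ASCII';

// The serializer now has to write every non-ASCII character as a
// numeric character reference; the markup itself is left alone.
$ascii = $document->saveXML();
echo $ascii;
```

Loading the result back into a DOM should give the original text again, because the parser resolves the character references.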
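If you generate the XML as text, you can convert a string with the htmlentities() function. Be aware that it emits named HTML entities such as &uuml;, which an XML parser rejects unless a DTD declares them.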
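For the text-based route, here is a minimal sketch that swaps in `mb_encode_numericentity()` from the mbstring extension, which is not what the answer names. It converts only the characters above ASCII, so "<", ">", "&" and quotes stay as they are. The helper name `encodeNonAscii` and the sample string are hypothetical, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper; requires the mbstring extension.
function encodeNonAscii(string $xml): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    // Only code points from U+0080 upwards are converted, so the
    // markup characters <, >, & and quotes are not touched.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($xml, $map, 'UTF-8');
}

$xml = '<text id="label">BüG [349]</text>';
echo encodeNonAscii($xml), "\n";
// prints: <text id="label">B&#252;G [349]</text>
```

Because this works on the raw string, it would also rewrite characters inside comments and CDATA sections, where character references are not resolved; the DOM-based approach avoids that.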
Upvotes: 2