Rose

Reputation: 35

Stripping line breaks pre-XML leaves spaces: what is the proper method?

I'm wondering what the advised way is to strip line breaks from XML-destined PHP strings. Using the following method, I'm left with a varying 2-4 spaces between my XML tags.

$current = $xml->saveXML();
$current = str_replace(array("\r\n", "\r", "\n"), "", $current);

What is the proper syntax to remove line breaks so XML tags appear end-to-end, without having spaces added between them?

Upvotes: 1

Views: 4838

Answers (1)

hakre

Reputation: 197757

First, some basics: $xml->saveXML() suggests you're using SimpleXML. It uses only one line-separator character in its output: "\n".

So searching for "\r\n" and "\r" is unnecessary. I would also use strtr() rather than str_replace() here:

$current = strtr($current, array("\n" => ''));

Since this only removes the line breaks, the space characters between XML elements are neither removed nor changed.
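To make that concrete, here is a minimal sketch. The sample document in $source is made up for illustration and is not taken from the question:

```php
<?php
// Hypothetical sample document, indented with two spaces per level.
$source = "<root>\n  <item>one</item>\n  <item>two</item>\n</root>";
$xml    = simplexml_load_string($source);

// Serialize, then remove only the "\n" characters.
$current = $xml->saveXML();
$current = strtr($current, array("\n" => ''));

// The line breaks are gone, but the two-space indentation that followed
// each of them is still sitting between the tags.
echo $current, "\n";
```

The leftover spaces are exactly the indentation of the input, which is why their number varies with the nesting depth.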

However, those space characters depend a lot on your input XML. XML can contain both significant whitespace (removing it would change the content) and insignificant whitespace (safe to remove), but SimpleXML and DOMDocument do not (and cannot) know which is which.

As the libraries themselves do not know, it is you who needs to tell them. For example, it looks like you want to trim all text nodes. As SimpleXMLElement does not give access to the text nodes, you need to use DOMXPath. But have no fear, it's not that complicated:

$doc   = dom_import_simplexml($xml)->ownerDocument;
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//text()') as $text) {
    $text->data = trim($text->data);
}

This simply iterates over all text nodes in document order and trims each one.
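Be aware that trimming every text node also removes whitespace that may be significant. The sketch below contrasts the loop above with a more conservative variant that only drops whitespace-only text nodes. That variant, the XPath predicate it uses, and the sample input are my own assumptions for illustration. Whitespace-only nodes between inline elements can be significant too, so treat it as a starting point rather than a guarantee:

```php
<?php
// Hypothetical input: the space after "Hello" is part of the text.
$source = "<list>\n  <p>Hello <b>world</b></p>\n</list>";

// Variant A: trim every text node, as in the loop above.
$docA = new DOMDocument();
$docA->loadXML($source);
$xpathA = new DOMXPath($docA);
foreach ($xpathA->query('//text()') as $text) {
    $text->data = trim($text->data);
}
// Serialize only the root element (no XML declaration).
$trimmed = $docA->saveXML($docA->documentElement);

// Variant B (assumption): remove only text nodes made up entirely of whitespace.
$docB = new DOMDocument();
$docB->loadXML($source);
$xpathB = new DOMXPath($docB);
$blank  = iterator_to_array($xpathB->query('//text()[normalize-space(.) = ""]'));
foreach ($blank as $text) {
    $text->parentNode->removeChild($text);
}
$blanksRemoved = $docB->saveXML($docB->documentElement);

echo $trimmed, "\n";       // <list><p>Hello<b>world</b></p></list>
echo $blanksRemoved, "\n"; // <list><p>Hello <b>world</b></p></list>
```

Variant A glues "Hello" and "world" together, while variant B keeps the space inside the paragraph and still removes the indentation.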

Then you only need to get the XML starting with the document element. That will strip the XML declaration and any preceding comments and processing instructions (I assume you want that):

$current = $doc->saveXML($doc->documentElement);

If not, the line-separator rules from above apply. You can then do this instead:

$current = $xml->saveXML();
$current = strtr($current, array("\n" => ''));
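Putting the pieces together, a complete run could look like the sketch below. The sample input and the variable names $withoutDeclaration and $withDeclaration are hypothetical and only there to show both output options side by side:

```php
<?php
// Hypothetical sample input with indentation and padded text.
$xml = simplexml_load_string("<root>\n  <item> one </item>\n  <item>two</item>\n</root>");

// Trim every text node through the DOM view of the same document.
$doc   = dom_import_simplexml($xml)->ownerDocument;
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//text()') as $text) {
    $text->data = trim($text->data);
}

// Option 1: serialize the root element only, which omits the XML declaration.
$withoutDeclaration = $doc->saveXML($doc->documentElement);

// Option 2: serialize the whole document, then strip the "\n" the serializer adds.
$withDeclaration = strtr($xml->saveXML(), array("\n" => ''));

echo $withoutDeclaration, "\n"; // <root><item>one</item><item>two</item></root>
echo $withDeclaration, "\n";    // <?xml version="1.0"?><root><item>one</item><item>two</item></root>
```

Because dom_import_simplexml() works on the same underlying document, the trimming done through DOM is also visible when serializing via $xml.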

And that's it. I hope this is helpful.

Upvotes: 3
