Reputation: 22959
If I use saveHTML() without the optional DOMNode parameter, it works as expected:
$html = '<html><body><div>123</div><div>456</div></body></html>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);
echo $dom->saveHTML();
<html><body><div>123</div><div>456</div></body></html>
But when I add a DOMNode parameter to output a subset of the document, it seems to ignore the formatOutput property and adds a bunch of unwanted whitespace:
$body = $dom->getElementsByTagName('body')->item(0);
echo $dom->saveHTML($body);
<body> <div>123</div> <div>456</div> </body>
What gives? Is this a bug? Is there a workaround?
Upvotes: 6
Views: 1884
Reputation: 3936
Is this a bug?
Yes, it's a bug and it's reported here
Is there a workaround?
Stick with Nigel's solution for now
Did they fix it?
Yes, the bug is fixed as of PHP 7.3.0 alpha3
Check it here
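For code that has to run on both sides of that fix, a version check is one option. The following is only a sketch: the helper name saveNodeHtml() is hypothetical, the 70300 threshold assumes the fix is present in every 7.3.0 release build, and the fallback is the saveXML() workaround from another answer, so it only suits documents that are also valid XML.

```php
<?php
// Hypothetical helper: serialize one node without the extra whitespace.
function saveNodeHtml(DOMDocument $dom, DOMNode $node): string
{
    if (PHP_VERSION_ID >= 70300) {
        // Assumption: the fix mentioned above is available from 7.3.0 on.
        return $dom->saveHTML($node);
    }
    // Older versions: fall back to XML serialization of the node.
    return $dom->saveXML($node);
}

$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML('<html><body><div>123</div><div>456</div></body></html>', LIBXML_HTML_NODEFDTD);

$body = $dom->getElementsByTagName('body')->item(0);
echo saveNodeHtml($dom, $body);
```

Keeping the check in one helper means the fallback can be deleted in a single place once older PHP versions no longer need support.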
Upvotes: 5
Reputation: 57131
If you know your document is going to be valid XML as well, you can use saveXML() instead...
$html = '<html><body><div>123</div><div>456</div></body></html>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);
$body = $dom->getElementsByTagName('body')->item(0);
echo $dom->saveXML($body);
which gives...
<body><div>123</div><div>456</div></body>
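The "valid XML" condition matters because saveXML() serializes with XML rules rather than HTML rules. A small sketch of where the two can differ, using an illustrative input with a void element (this markup is an assumption, not part of the question, and the exact output may vary with the libxml version):

```php
<?php
// Illustrative input (not from the question): it contains a void element.
$html = '<html><body><div>123<br>456</div></body></html>';

$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);

$body = $dom->getElementsByTagName('body')->item(0);

$asXml  = $dom->saveXML($body);  // XML rules: the br is written self-closed
$asHtml = $dom->saveHTML($body); // HTML rules: the br is written as a bare tag

echo $asXml, "\n", $asHtml, "\n";
```

If the markup you serialize contains elements like this, compare both outputs before relying on saveXML() as a drop-in replacement.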
Upvotes: 4
Reputation: 6393
Well, it's a pretty ugly workaround, but it gets the job done:
$html = '<html><body><div>123</div><div>456</div></body></html>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);
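// Serialize only <body>, strip the line breaks saveHTML() inserted,
// then reload the result as a fragment (no implied <html>/<body> wrappers).
$body = $dom->getElementsByTagName('body')->item(0);
$bodyHtml = str_replace("\n", "", $dom->saveHTML($body));
$dom->loadHTML($bodyHtml, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);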
echo $dom->saveHTML();
Since saveHTML() returns a string, pass the node to it, strip the line breaks from the result, then pass that string to loadHTML(). Note that this also removes any line breaks that are part of the content itself.
Upvotes: 2