DOMDocument formatting

I am trying to read in the body of a certain webpage to display on a separate webpage, but I am having a bit of trouble with it. Right now, I use the following code:

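<?php
libxml_use_internal_errors(true); // collect HTML parser warnings instead of silencing each call with @
$doc = new DOMDocument();
$doc->loadHTMLFile('http://foo.com');
$tags = $doc->getElementsByTagName('body');
$index_text = ''; // initialize before appending to avoid an undefined-variable notice
foreach ($tags as $tag) {
    $index_text .= $tag->nodeValue;
    print nl2br($tag->nodeValue).'<br />';
}
?>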

This code works, but it seems to remove a lot of formatting that is important to me, such as line breaks. How do I stop that from happening?

Upvotes: 3

Views: 1829

Answers (1)

Jon Cram

Reputation: 17309

The formatOutput property of a DOMDocument will do this:

$doc->formatOutput = true;

This makes the serialized DOM output easier for a human to read, with line breaks and indentation where you'd need them, i.e. 'pretty printing'.
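As a rough illustration of the effect, here is a minimal sketch that serializes the same small document with and without formatOutput. The element names and text are made up for the example, and it uses saveXML() on a document built in memory rather than a page loaded from a URL.

```php
<?php
// Build a tiny document in memory (hypothetical content, for illustration only).
$doc = new DOMDocument('1.0', 'UTF-8');
$root = $doc->createElement('page');
$root->appendChild($doc->createElement('title', 'Hello'));
$root->appendChild($doc->createElement('note', 'World'));
$doc->appendChild($root);

// With formatOutput left at its default, the child elements are serialized on one line.
$compact = $doc->saveXML();

// With formatOutput enabled, the serializer adds line breaks and indentation.
$doc->formatOutput = true;
$pretty = $doc->saveXML();

echo $compact, "\n", $pretty;
```

The property only influences how the document is serialized, so it needs to be set before calling a save method.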

The default value of this property is false, so you have to set it to true explicitly when you need it.
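One thing worth keeping in mind: nodeValue returns only the text content of a node, so tags such as <br> are dropped regardless of formatOutput. If the goal is to keep the markup of the body, one possible approach (an assumption about what the asker needs, not something stated in the question) is to serialize the child nodes with saveHTML(), which accepts a node argument in PHP 5.3.6 and later. The inline HTML string below is a stand-in for the remote page.

```php
<?php
// Inline sample markup standing in for the remote page (hypothetical content).
$html = '<html><body><p>First line<br>Second line</p></body></html>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$body = $doc->getElementsByTagName('body')->item(0);

// nodeValue is the concatenated text only; the <p> and <br> tags are not included.
$textOnly = $body->nodeValue;

// saveHTML($node) serializes a node together with its markup.
$markup = '';
foreach ($body->childNodes as $child) {
    $markup .= $doc->saveHTML($child);
}

echo $textOnly, "\n", $markup, "\n";
```

Looping over childNodes rather than serializing the body element itself leaves out the surrounding <body> tags, which is usually what you want when embedding the content in another page.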

Upvotes: 7
