altern

Reputation: 5949

Getting the content of a node that has child elements via DOMDocument

I have the following HTML:

<html ><body >Body text <div >div content</div></body></html>

How can I get the content of body without the nested <div>? I need to get 'Body text', but I have no idea how to do this.

Running

// loadHTML() is an instance method; calling it statically fails on PHP 8
$domhtml = new DOMDocument();
$domhtml->loadHTML($html);
print $domhtml->getElementsByTagName('body')->item(0)->nodeValue;

prints 'Body textdiv content', which is not what I want.

Upvotes: 3

Views: 9714

Answers (3)

John

Reputation: 1540

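Based on the comments on php.net, this should work for you:

$domhtml = new DOMDocument();
$domhtml->loadHTML($html);
// getElementsByTagName() returns a DOMNodeList, so pick the element with item(0) first
print $domhtml->getElementsByTagName('body')->item(0)->firstChild->nodeValue;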
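The one-liner above only reads the first child of body, so it works only when the text comes before any nested element. As a minimal sketch, assuming you want every direct text child regardless of position, you could loop over childNodes and keep only the text nodes. The helper name directText is hypothetical and not part of the DOM extension.

```php
<?php
// Hypothetical helper (not a built-in): concatenates the direct
// text-node children of $node and skips nested elements entirely.
function directText(DOMNode $node): string
{
    $text = '';
    foreach ($node->childNodes as $child) {
        // XML_TEXT_NODE marks plain text nodes (DOMText)
        if ($child->nodeType === XML_TEXT_NODE) {
            $text .= $child->nodeValue;
        }
    }
    // Trim the whitespace left around the removed elements
    return trim($text);
}

$html = '<html><body>Body text <div>div content</div></body></html>';
$dom = new DOMDocument();
$dom->loadHTML($html);
$body = $dom->getElementsByTagName('body')->item(0);

echo directText($body);
```

Because the loop checks the node type instead of the position, the same helper also picks up text that follows the nested <div>.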

Upvotes: 1

dnagirl

Reputation: 20456

I prefer DOMXPath for problems like this. It's very flexible:

$domhtml = new DOMDocument();
$domhtml->loadHTML($html);
$xpath = new DOMXPath($domhtml);
$query = "/html/body/text()"; // gets all text nodes that are direct children of body

$txtnodes = $xpath->query($query);

foreach ($txtnodes as $txt) {
    echo $txt->nodeValue;
}
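If you need the result as a single cleaned-up string rather than echoed piece by piece, the same query can be wrapped in a small function. This is only a sketch: the function name bodyTextViaXPath and its default query parameter are assumptions made for illustration, while DOMXPath::query() and the text() node test are standard.

```php
<?php
// Hypothetical wrapper (illustrative name): runs an XPath query that
// selects text nodes and joins the non-empty, trimmed pieces.
function bodyTextViaXPath(string $html, string $query = '/html/body/text()'): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $parts = [];
    foreach ($xpath->query($query) as $textNode) {
        $piece = trim($textNode->nodeValue);
        if ($piece !== '') {
            $parts[] = $piece;
        }
    }
    return implode(' ', $parts);
}

$html = '<html><body>Body text <div>div content</div></body></html>';
echo bodyTextViaXPath($html); // prints: Body text
```

Changing the query (for example to '//div/text()') selects the direct text of other elements without touching the rest of the code.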

Upvotes: 7

mcandre

Reputation: 24602

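$domhtml = new DOMDocument();
$domhtml->loadHTML($html);
// Note: on an element node, textContent also includes the text of descendants,
// so this prints 'Body textdiv content', the same as nodeValue
print $domhtml->getElementsByTagName('body')->item(0)->textContent;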

Upvotes: 4
