Tin Amaranth

Reputation: 701

PHP Simple HTML Dom: Get childNodes nodeValue?

a.php:

<ul id="ul1">
    <li id="pt1">Point 1
         <ul id="ul2">
             <li id="pt11">Point 1.1</li>
             <li id="pt12">Point 1.2</li>
                <pre class="CodeDisplay">
                some codes
                </pre>
             <li id="ref">Reference: <a href="link.html" target="_blank">link</a></li>
         </ul>
    </li> 
</ul>

I would like to get the nodeValue "Point 1" only. In JS, it is:

alert(document.getElementsByTagName("li")[0].childNodes[0].nodeValue);

But I would like to get the nodeValue in PHP (Simple HTML Dom). Here's the code snippet in another PHP page (b.php):

<?php

include('simple_html_dom.php');
$html = file_get_html('http://lifelearning.net63.net/a.php');

// stuck here:
echo $html->getElementsByTagName('ul',0)->getElementsByTagName('li',0)->nodeValue;
//

?>

I have tried textContent, but it also returns the text of every descendant under Point 1. This is not what I want. I only want "Point 1". Any help is appreciated!

Upvotes: 1

Views: 6442

Answers (3)

Tin Amaranth

Reputation: 701

With help from others online, I found a simpler solution:

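$html = new DOMDocument();
libxml_use_internal_errors(true); // keep warnings about imperfect markup out of the output
$html->loadHTMLFile('http://lifelearning.net63.net/a.php');
// The first child of the first <li> is the text node "Point 1" plus trailing whitespace, hence trim()
echo trim($html->getElementsByTagName('li')->item(0)->childNodes->item(0)->textContent); // "Point 1"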

What I've learnt:

First, no external library is required in my case; DOMDocument does the job of loading the HTML DOM of a webpage.

Second, use item() and childNodes, much like in JS:

document.getElementsByTagName("li")[0].childNodes[0].nodeValue

But thank you for all your replies.
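
For anyone who wants this to be a bit more robust, here is a minimal sketch that collects only the direct text-node children of an element instead of relying on the text being the very first child. It assumes an abridged copy of the markup from the question, loaded from a string rather than the URL; directText() is a hypothetical helper name, not part of DOMDocument.

```php
<?php
// Abridged markup from the question, loaded from a string instead of a URL.
$markup = <<<HTML
<ul id="ul1">
    <li id="pt1">Point 1
         <ul id="ul2">
             <li id="pt11">Point 1.1</li>
             <li id="pt12">Point 1.2</li>
             <li id="ref">Reference: <a href="link.html" target="_blank">link</a></li>
         </ul>
    </li>
</ul>
HTML;

// Hypothetical helper: concatenates only the direct text-node children
// of $node and ignores text inside nested elements.
function directText(DOMNode $node): string
{
    $text = '';
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE) {
            $text .= $child->textContent;
        }
    }
    return trim($text);
}

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($markup);

$firstLi = $doc->getElementsByTagName('li')->item(0);
echo directText($firstLi), "\n"; // Point 1
```

The nodeType check plays the same role as childNodes[0].nodeValue in the JS version, but it still works if the text node is not the first child.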

Upvotes: 1

AKS

Reputation: 4658

Try this:

<?php
include('simple_html_dom.php');
$html = file_get_html('http://lifelearning.net63.net/a.php');
echo $html->find('li[id=pt1] li', 0)->innertext;

The snippet above finds the first li tag that is a descendant of li#pt1 and gives you its inner text (the content between the tags, including any HTML inside it).

Have a look at the Simple HTML DOM docs. They show many ways, with examples, to find content (by ID, class, etc.) in the HTML output. Simple HTML DOM mostly follows jQuery/CSS selector syntax.

Note that if you do not read the innertext property, find() gives you a Simple HTML DOM node object that you need to process before displaying.

If there are no matching elements, find() with an index returns null, and reading innertext from it raises a PHP notice or warning. So make sure your input contains the required elements, or check that the element is present (for example with isset()) before using it.
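
A minimal sketch of that check, assuming find() returns null when nothing matches. innerTextOrDefault() is a hypothetical helper, not part of Simple HTML DOM, and the stand-in object below only mimics a found node so the snippet runs without the library.

```php
<?php
// Hypothetical helper: returns the node's innertext, or a fallback
// when the lookup returned null (no matching element).
function innerTextOrDefault($node, $default = '')
{
    return $node !== null ? $node->innertext : $default;
}

// Intended use with Simple HTML DOM, assuming $html came from file_get_html():
// echo innerTextOrDefault($html->find('li[id=pt1] li', 0), '(not found)');

// Stand-in for a found node, for illustration only.
$found = new stdClass();
$found->innertext = 'Point 1.1';

echo innerTextOrDefault($found), "\n";              // Point 1.1
echo innerTextOrDefault(null, '(not found)'), "\n"; // (not found)
```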

Upvotes: 1

echo_Me

Reputation: 37233

You may be looking for this (note that "Point 1" is wrapped in a div here):

<?php
// Sample markup; "Point 1" is wrapped in a <div> so it can be matched by tag name.
$str2 = ' <ul id="ul1"> '
      . '<li id="pt1"><div>Point 1</div> '
      . ' <ul id="ul2"> '
      . '     <li id="pt11">Point 1.1</li>'
      . '    <li id="pt12">Point 1.2</li>'
      . '     <pre class="CodeDisplay">'
      . '     some codes'
      . '     </pre>'
      . '    <li id="ref">Reference: <a href="link.html" target="_blank">link</a></li>'
      . '  </ul>'
      . '   </li> '
      . ' </ul>';

// Returns the text inside the first <$tagname> element, or null if there is none.
function getTextBetweenTags($string, $tagname) {
    // [^>]* skips any attributes; (.*?) is lazy, so it stops at the first closing tag.
    $pattern = "/<$tagname\b[^>]*>(.*?)<\/$tagname>/s";
    return preg_match($pattern, $string, $matches) ? $matches[1] : null;
}

$txt = getTextBetweenTags($str2, "div");
echo $txt;
?>

This will output: Point 1

Upvotes: 0
