WhiteLine

Reputation: 1991

Get DIV Element contents thru DOMDocument PHP

I need to extract some news items from a page on a site. The news list is structured as follows:

The HTML Markup:

<ul id="news-accordion" class="rounded" style="padding: 2px;">
   <li class="o">
         <h3>
            <span>TITLE ARTICLE</span>
            <span>30/10/2014</span>
         </h3>
         <div style="display: none;">
              <p>text of article</p>
         </div>
   </li>
   <li class="e">
         <h3>
            <span>TITLE ARTICLE</span>
            <span>28/10/2014</span>
         </h3>
         <div style="display: none;">
              <p>text of article</p>
         </div>
   </li>
   <li class="o">
         <h3>
            <span>TITLE ARTICLE</span>
            <span>29/10/2014</span>
         </h3>
         <div style="display: none;">
              <p>text of article</p>
         </div>
   </li>                                                     
</ul>

PHP

<?php 

$doc = new DomDocument;
$doc->validateOnParse = true;
$doc->loadHtml(file_get_contents('http://www.xxxxxxxxx/news.php'));

$news = $doc->getElementById('news-accordion');

$li = $news->getElementsByTagName('li'); 

foreach ($li as $row){ 

    $title = $row->getElementsByTagName('h3'); 
    echo $title->item(0)->nodeValue."<br><br>"; 

    /*foreach ($title as $row2){ 
    echo $row2->nodeValue."<br><br>";
    //echo $row2->item(0)->nodeValue."<br><br>"; 
    }*/

    $text = $row->getElementsByTagName('p'); 
    echo utf8_decode($text->item(0)->nodeValue)."<br><br><br>"; 

}

?>

The code works correctly, but when I print the contents of the h3 tag with echo $title->item(0)->nodeValue;, the text of the two spans is printed together.

How can I get the contents of the two spans separately? Thanks.

Upvotes: 2

Views: 1179

Answers (2)

Amit Kumar Sahu

Reputation: 495

$title = $row->getElementsByTagName('h3'); 
echo $title->item(0)->nodeValue."<br><br>"; 

Replace the two lines above with the lines below (select the span tags instead of the h3 tag):

$title = $row->getElementsByTagName('span'); 
echo $title->item(0)->nodeValue."<br><br>"; 
echo $title->item(1)->nodeValue."<br><br>"; 

It's working for me.

Upvotes: -1

Kevin

Reputation: 41885

Yes, you can: just adjust the ->item() index. As you already did for the other elements, select the h3 header element first, then explicitly select its span children:

foreach ($li as $row){ 

    $h3 = $row->getElementsByTagName('h3')->item(0);
    $spans = $h3->getElementsByTagName('span'); // look up the spans once
    $title = $spans->item(0); // first span: title
    $date = $spans->item(1); // second span: date

    echo $title->nodeValue . '<br/>';
    echo $date->nodeValue . '<br/>';

    $text = $row->getElementsByTagName('p'); 
    echo utf8_decode($text->item(0)->nodeValue)."<br><br><br>"; 

}
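
For anyone who wants to try this without hitting the remote site, here is a minimal self-contained sketch of the same idea. It parses a shortened copy of the markup from the question out of a string and uses DOMXPath rather than nested getElementsByTagName() calls, which is a swapped-in technique and not what the code above does. The function name extractNews() and the array keys are hypothetical, chosen only for illustration.

```php
<?php
// Sample markup copied from the question, shortened to two items.
$html = <<<'HTML'
<ul id="news-accordion" class="rounded" style="padding: 2px;">
   <li class="o">
         <h3>
            <span>TITLE ARTICLE</span>
            <span>30/10/2014</span>
         </h3>
         <div style="display: none;">
              <p>text of article</p>
         </div>
   </li>
   <li class="e">
         <h3>
            <span>TITLE ARTICLE</span>
            <span>28/10/2014</span>
         </h3>
         <div style="display: none;">
              <p>text of article</p>
         </div>
   </li>
</ul>
HTML;

// Hypothetical helper: returns one array per news item with
// 'title', 'date' and 'text' keys.
function extractNews(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $items = [];

    foreach ($xpath->query('//ul[@id="news-accordion"]/li') as $li) {
        // Only spans that are direct children of the h3, in document order.
        $spans = $xpath->query('./h3/span', $li);
        $p = $xpath->query('.//p', $li)->item(0);

        // Skip items that do not match the expected structure.
        if ($spans->length < 2 || $p === null) {
            continue;
        }

        $items[] = [
            'title' => trim($spans->item(0)->textContent),
            'date'  => trim($spans->item(1)->textContent),
            'text'  => trim($p->textContent),
        ];
    }

    return $items;
}

$news = extractNews($html);

foreach ($news as $item) {
    echo $item['title'], ' | ', $item['date'], ' | ', $item['text'], PHP_EOL;
}
```

The length check is there because ->item() returns null when the index does not exist, so reading ->nodeValue on a missing span would fail. Returning plain arrays also keeps the extraction separate from the echo formatting.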

Upvotes: 2
