Reputation: 474
I've tried this a few different ways and nothing seems to work. I tried all the examples from "How to imitate child selector with Simple HTML DOM?", using the code as is and changing only what I needed, i.e. the class=xxx selector and the URL.
I'm trying to pull some information out of a web page. As far as the DOM is concerned there are no children to work with, and the XPath method returned nothing. I'm guessing I'm doing something wrong.
<div id="wpp-6" class="widget popular-posts">
<div class="widget_title">POPULAR</div><!-- Wordpress Popular Posts Plugin v2.3.2 [Widget] [daily] [regular] -->
<ul>
<li>
<a href="http://link.html" title="Title of post" class="wpp-post-title">THE DATA I WANT</a> <span class="post-stats"></span>
</li>
<!-- More lists -->
</ul>
</div>
There are about 9 more list items after that. Any suggestions?
Upvotes: 0
Views: 307
Reputation: 146201
Using PHP Simple HTML DOM Parser you can do this easily. Just download the simple_html_dom.php file and use it as follows:
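include('simple_html_dom.php');
// Fetch and parse the page
$html = file_get_html('http://psfk.com');
// Select every link inside the list of the div with id="wpp-6"
foreach ($html->find('div#wpp-6 ul li a') as $a) {
    echo $a->innertext . '<br />';
}
Output: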
Google Flu Map Depicts Worst Outbreaks In The USA
Scotch-Tape Portraits Contort Human Faces [Pics]
New Design For Orwell’s Nineteen Eighty-Four Highlights Theme Of Censorship
Vodka Made From Filtering The Liquor Over Nude Models [Video]
Samsung Debuts Flexible Screens
McDonald’s Changes Its Name In Australia
Samsung’s Transparent Screen Is The Retail Window Of The Future [CES]
Dita Von Teese Sews QR Codes Directly Into Her Clothing
Abercrombie & Fitch Boss Makes Flight Attendants Wear Only Boxers & Sandals On Private Jet
Mirror App Shows Women How They Will Age If They Keep Drinking
If you want to print each title together with its link, i.e. the whole <a>...</a> element, just use echo $a; instead.
Upvotes: 2
Reputation: 854
It's been a while since I used XPath, so here is my solution without it. You can traverse the DOM tree this way, checking the id and class of the element you need:
<?php
libxml_use_internal_errors(true); // keep libxml quiet about the page's invalid markup
$url = "http://www.psfk.com";
$xml = new DOMDocument("1.0", 'UTF-8');
$str = file_get_contents($url);
// Load the URL's contents into the DOM
$xml->loadHTML($str);
// Loop through all divs in the DOM until we find the one we need
foreach ($xml->getElementsByTagName('div') as $div) {
    if ($div->getAttribute('id') == 'wpp-6' && $div->getAttribute('class') == 'widget popular-posts') {
        // Only accept the widget when its parent is the element with id="right"
        if ($div->parentNode->getAttribute('id') == 'right') {
            foreach ($div->getElementsByTagName('li') as $li) {
                foreach ($li->getElementsByTagName('a') as $link) {
                    echo $link->textContent . "<br>";
                }
            }
        }
    }
}
?>
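Since the question mentions that XPath returned nothing, here is a minimal sketch of how the same lookup might be written with PHP's built-in DOMXPath. It assumes the markup matches the snippet in the question, which is inlined here as $html so the example runs without fetching the page; the variable names and the exact query are my own choices, not something taken from the site.

```php
<?php
// Sample markup copied from the question (assumption: the live page has the same structure).
$html = <<<'HTML'
<div id="wpp-6" class="widget popular-posts">
<div class="widget_title">POPULAR</div>
<ul>
<li>
<a href="http://link.html" title="Title of post" class="wpp-post-title">THE DATA I WANT</a> <span class="post-stats"></span>
</li>
</ul>
</div>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Every <a> that is a direct child of an <li> somewhere inside the div with id="wpp-6".
$links = $xpath->query('//div[@id="wpp-6"]//li/a');

$titles = [];
foreach ($links as $link) {
    $titles[] = trim($link->textContent);
}

echo implode("\n", $titles), "\n"; // prints: THE DATA I WANT
```

To run it against the real page, $html would be replaced by the result of file_get_contents($url). If the query comes back empty there, the id on the live page probably differs from the one in the question.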
Upvotes: 0