user1510709

Reputation: 1

Get content from complex html page?

I've been trying to loop through this HTML page to get the business names located within the section of code below, which is nested pretty deep. All the IDs are unique. I tried simple_html_dom but had trouble with it. I'm pretty new to PHP but an avid learner all the same, so with a point in the right direction I hope I'll crack this.

The webpage I'm trying to use is http://yellow.co.nz/yellow+pages/funeral+home/New+Zealand?page=1&stageName=Composite+search&activeSort=name-asc&suppressMobileListings=false

<div class="result standard">
    <div class="resultBody"> 
        <div class="listingMain">
            <div class="vcard">
                <a class="fn openPreview">
                    <span>Biz Name</span>

Upvotes: 0

Views: 247

Answers (2)

deefour

Reputation: 35360

You might try Goutte and do something like this:

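// Assumes Goutte was installed with Composer.
require 'vendor/autoload.php';

use Goutte\Client;

$client = new Client();
$crawler = $client->request('GET', 'http://yellow.co.nz/yellow+pages/funeral+home/New+Zealand?page=1&stageName=Composite+search&activeSort=name-asc&suppressMobileListings=false');

// Class names need a leading dot in CSS selectors.
// Capture $businessNames by reference so the closure can append to it.
$businessNames = array();
$crawler->filter('.vcard > .fn > span')->each(function ($node, $i) use (&$businessNames) {
  $businessNames[] = $node->text();
});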

Upvotes: 1

Chris Trahey

Reputation: 18290

When I have had similar problems in the past (digging through an arbitrary hierarchy to my target nodes), I found XPath to be the most helpful solution:

PHP DOM Xpath documentation

It allows you to use a very straightforward XPath query to immediately target the nodes of interest.
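As a rough illustration of that approach, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath classes. It runs against the markup fragment from the question rather than the live page; the closing tags are added here as an assumption so the sample is complete, and the real page may be structured differently. The contains(concat(...)) expression is a common XPath 1.0 idiom for matching one class name inside a multi-valued class attribute.

```php
<?php
// Sample markup based on the fragment in the question.
// The closing tags are an assumption added to complete the sample.
$html = <<<HTML
<div class="result standard">
    <div class="resultBody">
        <div class="listingMain">
            <div class="vcard">
                <a class="fn openPreview">
                    <span>Biz Name</span>
                </a>
            </div>
        </div>
    </div>
</div>
HTML;

// Real-world HTML is often malformed, so collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match <span> inside <a class="... fn ..."> inside <div class="... vcard ...">,
// at any depth in the document.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " vcard ")]'
       . '/a[contains(concat(" ", normalize-space(@class), " "), " fn ")]'
       . '/span';

$businessNames = [];
foreach ($xpath->query($query) as $span) {
    $businessNames[] = trim($span->textContent);
}

print_r($businessNames);
```

To run this against the actual page, the $html string would be replaced with the fetched page source (for example via file_get_contents or cURL), and the query may need adjusting if the live markup differs from the fragment above.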

Upvotes: 0
