Lukasz B

Reputation: 23

Missing elements in getElementsByTagName

I'm trying to get all the links from this site: https://www.supremecourt.uk/cases/search-results.html?q=affidavit

with the following code:

libxml_use_internal_errors(true);

$html = file_get_contents("https://www.supremecourt.uk/cases/search-results.html?q=affidavit");

$docs = new DOMDocument();
$docs->loadHTML($html);

$anchors = $docs->getElementsByTagName('a');

$links = array();

foreach ($anchors as $anchor) {
    // Store each href and print it on its own line
    $href = $anchor->getAttribute('href');
    $links[] = $href;
    echo $href;
    echo '<br>';
}

but the returned links do not include links from the search results. Why is that, and how can I fix it?

Upvotes: 2

Views: 144

Answers (1)

CrazyCrow

Reputation: 4235

The search results on this site are provided by Google CSE via a JSONP request, so they are not part of the HTML that file_get_contents downloads. I have never tried to "break" CSE, but the request to Google carries a signature, so reproducing it by hand is certainly not easy. The results probably can't be obtained from PHP alone, or by any other approach that doesn't involve a headless browser capable of running the page's JavaScript (PhantomJS, CasperJS, Selenium).
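To illustrate why the result links are missing, here is a minimal sketch using a made-up page rather than the real site. The markup, the `/static-page.html` and `/injected-result.html` URLs, and the `results` id are all assumptions for demonstration. DOMDocument parses the markup but never runs the script, so only the anchor already present in the HTML is found:

```php
<?php
// Hypothetical page: one link is in the static markup, the other
// would only exist after a browser executes the script.
$html = <<<'HTML'
<html><body>
  <a href="/static-page.html">Static link</a>
  <div id="results"></div>
  <script>
    var a = document.createElement("a");
    a.href = "/injected-result.html";
    document.getElementById("results").appendChild(a);
  </script>
</body></html>
HTML;

libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Collect the href of every <a> element the parser can see.
$links = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

print_r($links);
```

Only `/static-page.html` ends up in `$links`. The injected link never appears because nothing executes the JavaScript, which is the same situation as with the CSE results.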

Upvotes: 1
