Reputation: 1099
I'm trying to scrape this page http://www.buddytv.com/trivia/game-of-thrones-trivia.aspx and it's not working.
I tried
$html = new simple_html_dom();
$html->load_file($url);
But the question element I'm looking to grab (.trivia-question) can't be found. Can anybody tell me what I'm doing wrong?
Thanks a lot!
I also tried:
<?php
$Page = file_get_contents('http://www.buddytv.com/trivia/game-of-thrones-trivia.aspx');
$dom_document = new DOMDocument();
// Suppress errors, because loadHTML() emits warnings on mismatched HTML tags
@$dom_document->loadHTML($Page);
$dom_xpath = new DOMXPath($dom_document);
$elements = $dom_xpath->query('//*[@id="id60questionText"]');
var_dump($elements);
Upvotes: 3
Views: 4442
Reputation: 10074
OK, here is a PhantomJS example.
Download PhantomJS from http://phantomjs.org/ and put it somewhere your script can easily access.
Test it by running {installationdir}/bin/phantomjs --version (phantomjs.exe on Windows).
Then create a JS file somewhere in your project, e.g. browser.js:
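var page = require('webpage').create();
// Load the page in a headless browser so its JavaScript runs
page.open('http://www.buddytv.com/trivia/game-of-thrones-trivia.aspx', function () {
    // Inject jQuery so the element can be selected inside the page
    page.includeJs('http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js', function () {
        // evaluate() runs in the page context and returns the question text
        var search = page.evaluate(function () {
            return $('#id60questionText').text();
        });
        console.log(search);
        phantom.exit();
    });
});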
Then, in your PHP script, read the output like this:
$pathToPhatomJs = '/home/aurimas/Downloads/phantomjs/phantomjs-1.9.1-linux-x86_64/bin/phantomjs';
$pathToJsScript = '/home/aurimas/Downloads/phantomjs/phantomjs-1.9.1-linux-x86_64/browser.js';
$stdOut = exec(sprintf('%s %s', $pathToPhatomJs, $pathToJsScript), $out);
echo $stdOut;
Change $pathToPhatomJs and $pathToJsScript according to your configuration.
If you are on Windows, this may not work. In that case, change the PHP script to:
$pathToPhatomJs = '/home/aurimas/Downloads/phantomjs/phantomjs-1.9.1-linux-x86_64/bin/phantomjs';
$pathToJsScript = '/home/aurimas/Downloads/phantomjs/phantomjs-1.9.1-linux-x86_64/browser.js';
exec(sprintf('%s %s > phatom.txt', $pathToPhatomJs, $pathToJsScript), $out);
$fileContents = file_get_contents('phatom.txt');
echo $fileContents;
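The exec() calls above can be made a little more defensive. The following is only a sketch, not part of the original answer: runPhantomScript() is a hypothetical helper name, the paths in the usage comment are placeholders, and it assumes a Unix-like shell. It quotes both paths, collects every output line (exec() itself returns only the last one), and fails loudly when PhantomJS exits with a non-zero code.

```php
<?php
// Hypothetical helper (not from the original answer): runs a PhantomJS
// script and returns everything it printed to stdout.
function runPhantomScript(string $phantomBin, string $script): string
{
    // Quote both paths so spaces or shell metacharacters cannot break the command.
    $command = escapeshellarg($phantomBin) . ' ' . escapeshellarg($script);

    $lines = [];
    $exitCode = 0;
    exec($command, $lines, $exitCode);

    if ($exitCode !== 0) {
        throw new RuntimeException("PhantomJS exited with code $exitCode");
    }

    // $lines holds every line of output, without trailing newlines.
    return implode("\n", $lines);
}

// Placeholder paths; adjust them to your installation.
// echo runPhantomScript('/path/to/phantomjs', '/path/to/browser.js');
```

Because the helper throws on failure, a missing binary or a crashing script surfaces as an exception instead of an empty string.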
Upvotes: 5