Reputation: 11
I want to scrape several websites that are apparently rendered using JavaScript. Specifically, I want to target this website: http://cve.mitre.org/find/index.html
This is my code:
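// Assumption: Client is Goutte's client (the post does not name the library).
require __DIR__ . '/vendor/autoload.php';
use Goutte\Client;

$client = new Client();
// Fetch the search page, locate the form by its "Search" button, and submit it.
$crawler = $client->request('GET', 'http://cve.mitre.org/find/index.html');
$form = $crawler->selectButton('Search')->form();
$crawler = $client->submit($form, array('search' => 'Symphony'));
print $crawler->html();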
If I view the source code, I don't see the results HTML, because the request is made by JavaScript. Does anyone know how to scrape this kind of website?
Upvotes: 1
Views: 3003
Reputation: 20439
This site has bolted on a lazy "Google custom search" rather than implementing its own, which means that the site comes with all sorts of JavaScript cruft.
It looks like the actual search might be done by a traditional form submission, in which case you just need to post to a form using the elements that Google renders. However, it may not be that easy: Google may check referrers and so forth, and block the request anyway.
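As a rough illustration of what a "traditional form submission" amounts to for a GET form, here is a minimal sketch that builds the URL a browser would request. The function name `buildSearchUrl`, the endpoint `https://example.com/search` and the field name `q` are all hypothetical placeholders, not the real values used by the site or by Google; you would need to read the actual action URL and field names from the rendered form.

```php
<?php
// Sketch only: build the URL a plain GET form submission would request.
// The action URL and field names passed in are assumptions that must be
// taken from the form the page actually renders.
function buildSearchUrl(string $action, array $fields): string
{
    // Use "&" if the action URL already carries a query string, "?" otherwise.
    $separator = (strpos($action, '?') === false) ? '?' : '&';

    // http_build_query URL-encodes keys and values (spaces become "+").
    return $action . $separator . http_build_query($fields);
}

// Hypothetical usage with placeholder endpoint and field name.
echo buildSearchUrl('https://example.com/search', array('q' => 'Symfony CVE')), "\n";
```

The resulting URL could then be fetched with the same `$client->request('GET', ...)` call used in the question, although the remote side may still reject requests that do not look like they came from the page.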
You have a few options, I think:
domain:cve.mitre.org
as appropriate
Upvotes: 3