Reputation: 5757
This site doesn't return anything when I try to fetch it using cURL. Here is my code:
/* gets the data from a URL */
function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$returned_content = get_data('http://casting.backstage.com/jobseekerx/SearchJobs.asp?SubmitToSearch=Search&lctr=1&rvsd=-1&o1=2&p1=1&ipp=10&city=&fromsearchpage=true&cg=11&cg=12&cg=13&cg=14&cg=15&cg=16&cg=17&cg=18&cg=19&cg=20&cg=22&kwrd=&kwdt=1&lcta=1&btnSearch=Run+Search+Now');
print $returned_content;
I've always used this method and have never run into this problem before. I've also tried Simple DOM Parser and got the same result. The URL in question is the one passed to get_data() above.
Is there some sort of anti-crawling code running on this page?
Upvotes: 0
Views: 671
Reputation: 1955
Have you checked what error you're getting? Calling echo curl_error($ch) after curl_exec(), and before curl_close(), shows exactly what error you're encountering. In many cases that is enough to work out how to fix the problem. In this particular case, I added a CURLOPT_USERAGENT option and it worked perfectly.
<?php
function get_data($url)
{
    $ch = curl_init();
    $timeout = 30;
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the response as a string; with false, curl_exec() prints the
    // body itself and returns true, so the print below would output "1".
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_POST, false);
    // Send a browser-like User-Agent header; this is the change that made the request work.
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0");
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$url = 'http://casting.backstage.com/jobseekerx/SearchJobs.asp?SubmitToSearch=Search&lctr=1&rvsd=-1&o1=2&p1=1&ipp=10&city=&fromsearchpage=true&cg=11&cg=12&cg=13&cg=14&cg=15&cg=16&cg=17&cg=18&cg=19&cg=20&cg=22&kwrd=&kwdt=1&lcta=1&btnSearch=Run+Search+Now';
$returned_content = get_data($url);
print $returned_content;
?>
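To make the curl_error() suggestion concrete, here is a minimal sketch of a variant that collects the diagnostics instead of returning only the body. The helper name fetch_with_diagnostics() and the shape of the returned array are my own assumptions for illustration, not part of the question's code. The cURL calls themselves (curl_error, curl_errno, curl_getinfo with CURLINFO_HTTP_CODE) are standard PHP.

```php
<?php
/**
 * Fetch a URL and report why the request failed, if it did.
 * Hypothetical helper: the name and the returned array layout are
 * illustrative assumptions, not from the original code.
 */
function fetch_with_diagnostics($url, $userAgent = null)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    if ($userAgent !== null) {
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    }

    $body = curl_exec($ch); // false on a transport-level failure

    // Read the diagnostics while the handle is still alive.
    $result = array(
        'body'   => ($body === false) ? null : $body,
        'error'  => curl_error($ch),   // empty string when no error occurred
        'errno'  => curl_errno($ch),   // 0 when no error occurred
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
    );

    // The handle is released when $ch goes out of scope at return.
    return $result;
}
```

You could call it as $r = fetch_with_diagnostics($url, $someUserAgent); and print $r['error'] whenever $r['body'] is null. If the error string is empty but the body is still empty or the status is not 200, the server answered and is probably filtering the request, which is when trying a different User-Agent is worth a shot.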
I hope this helps you.
Upvotes: 1