I am trying to read some data from my website using cURL. To do this, I send about 50-60 requests per minute to my server. After about 30 requests the script seemed to stop working, but I figured out that my cURL requests suddenly start returning status code 500.
The routine is nothing special: it increments the day of the month until it reaches the end of the month. For each day I read something (not part of this code).
The following code shows how I make the cURL requests. After 30 requests the server sends back a 500, but when I then run request 31 on its own, without the loop, it works fine. So it only fails when I send the requests in bulk.
Any ideas where the problem might be?
Thanks!
// To get an ASP.NET SessionID I first visit the page as usual...
$c = curl_init();
curl_setopt($c, CURLOPT_URL, "http://www.mypage.de/mysite.aspx");
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_COOKIEFILE, "cookies.txt");
curl_setopt($c, CURLOPT_COOKIEJAR, "cookies.txt");
$o = curl_exec($c);
curl_close($c);
//start the request
$c = curl_init();
curl_setopt($c, CURLOPT_URL, "http://www.mypage.de/mysite.aspx");
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($c, CURLOPT_COOKIEFILE, "cookies.txt");
curl_setopt($c, CURLOPT_COOKIEJAR, "cookies.txt");
// Request headers must be passed as a list of "Name: value" strings, not as an associative array.
// Content-Length is not set by hand: cURL derives it from CURLOPT_POSTFIELDS.
curl_setopt($c, CURLOPT_POST, true);
curl_setopt($c, CURLOPT_HEADER, 1);
$headers = array();
//$headers[] = "Referer: http://www.mypage.de/mysite.aspx";
curl_setopt($c, CURLOPT_HTTPHEADER, $headers);
$data = "somevalidpostdata";
curl_setopt($c, CURLOPT_POSTFIELDS, $data);
$o = curl_exec($c);
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);
echo "\r\n" . $status . "\r\n";
curl_close($c);
Thanks, WorldSignia
Upvotes: 1
Views: 15875
Reputation: 15931
HTTP 500 means something went wrong on the server while processing the request. You will need to see what the error is on http://www.mypage.de/mysite.aspx. Is there a message or payload in the response that you can inspect? It may contain the error being thrown by the application.
It's unclear to me whether you control the application that your script is connecting to. If not, and you are just scraping a page, then you should definitely introduce a sleep of a few seconds before each request, or else threat management gateway applications will block your script (because it is basically a Denial of Service attack). You should also check for a robots.txt on the target website and respect it.
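One way to look at that payload from the script side is to keep the response body instead of printing only the status code. The following is a minimal sketch, assuming CURLOPT_HEADER is enabled as in the question; the function names splitResponse and postWithDiagnostics are made up for illustration and are not part of cURL.

```php
<?php
// Hypothetical helper: split a raw cURL response (headers + body) using the
// header size reported by curl_getinfo($c, CURLINFO_HEADER_SIZE).
function splitResponse(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => substr($raw, $headerSize),
    ];
}

// Hypothetical wrapper: POST $data to $url and return status, headers and body.
function postWithDiagnostics(string $url, string $data): array
{
    $c = curl_init();
    curl_setopt($c, CURLOPT_URL, $url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($c, CURLOPT_HEADER, true); // include response headers in the output
    curl_setopt($c, CURLOPT_COOKIEFILE, "cookies.txt");
    curl_setopt($c, CURLOPT_COOKIEJAR, "cookies.txt");
    curl_setopt($c, CURLOPT_POST, true);
    curl_setopt($c, CURLOPT_POSTFIELDS, $data);

    $raw = curl_exec($c);
    if ($raw === false) {
        // Transport-level failure (DNS, timeout, ...), not an HTTP status.
        $error = curl_error($c);
        curl_close($c);
        throw new RuntimeException("cURL error: " . $error);
    }
    $status = curl_getinfo($c, CURLINFO_HTTP_CODE);
    $headerSize = curl_getinfo($c, CURLINFO_HEADER_SIZE);
    curl_close($c);

    return ['status' => $status] + splitResponse($raw, $headerSize);
}
```

If the returned 'status' is 500, writing 'body' to a log file should show whatever error page the ASP.NET application produced, provided the server is configured to send error details at all.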
Upvotes: 1
Reputation: 879
500 means Internal Server Error.
Maybe you are sending the requests too fast.
Try adding usleep(500000);
between the requests. usleep() takes microseconds, so this pauses for half a second.
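A rough sketch of how that pause could fit into the per-day loop described in the question. The helper names daysOfMonth and throttledEach are hypothetical, and the callback stands in for the cURL request, which is not shown here.

```php
<?php
// Hypothetical helper: list the days of a given month as Y-m-d strings.
function daysOfMonth(int $year, int $month): array
{
    // date('t') returns the number of days in the month of the timestamp.
    $count = (int) date('t', mktime(0, 0, 0, $month, 1, $year));
    $days = [];
    for ($d = 1; $d <= $count; $d++) {
        $days[] = sprintf('%04d-%02d-%02d', $year, $month, $d);
    }
    return $days;
}

// Run $request once per item, pausing $pauseMicros microseconds between calls.
function throttledEach(array $items, callable $request, int $pauseMicros = 500000): array
{
    $results = [];
    $last = count($items) - 1;
    foreach (array_values($items) as $i => $item) {
        $results[$item] = $request($item);
        if ($i < $last) {
            usleep($pauseMicros); // 500000 microseconds = 0.5 seconds
        }
    }
    return $results;
}
```

For example, throttledEach(daysOfMonth(2024, 2), $fetchDay) would call $fetchDay once per day of February 2024 with half a second between calls, where $fetchDay is assumed to wrap the cURL request from the question. Whether half a second is enough depends on the server, so the pause may need to be longer.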
Upvotes: 3