GrinderZ

Reputation: 664

Using PHP cURL to parse a heavily loaded site?

I use PHP cURL to parse a site that is under heavy load (it rarely opens even in a browser). As a result I get either server response code 503 or 0 (no response at all). Can you give me advice, or point me to cURL features that would help me get a normal server response?

Here are my cURL options:

$options = array(
    CURLOPT_REFERER => $url,
    CURLOPT_TIMEOUT => 1800,
    CURLOPT_HEADER => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_SSL_VERIFYHOST => false,
    CURLOPT_SSL_VERIFYPEER => false,
    CURLOPT_HEADERFUNCTION => "curlHeaderCallback",
    CURLOPT_COOKIE => Cookies::arrayToString(Cookies::instance()->load()),
    CURLOPT_USERAGENT => "Opera/9.80 (Windows NT 6.1; U; ru) Presto/2.9.168 Version/11.50",
    CURLOPT_HTTPHEADER => $headers
);

The problem is that I can't get a response containing the page's code.

There are two outcomes: 1. The server doesn't answer; 2. The server answers with a 503 "server is overloaded" page.

curlHeaderCallback() code:

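`function curlHeaderCallback($ch, $str)
{
    // Header names are case-insensitive, so compare without regard to case.
    if (strncasecmp($str, "Set-Cookie:", 11) === 0) {
        $cookie = trim(substr($str, 11));
        // Keep only the "name=value" pair; drop attributes such as path or expires.
        $pieces = explode(";", $cookie, 2);
        $pair = explode("=", $pieces[0], 2);
        // Skip malformed cookies that have no "=" instead of raising a notice.
        if (count($pair) === 2) {
            Cookies::instance()->set(trim($pair[0]), $pair[1]);
        }
    }
    // A blank line ends the header block: refresh the cookies cURL will send next.
    if (trim($str) == "") {
        curl_setopt($ch, CURLOPT_COOKIE, Cookies::arrayToString(Cookies::instance()->load()));
    }
    // cURL expects the callback to return the number of bytes it handled.
    return strlen($str);
}`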

My calls are: $response = curl_exec($ch); $info = curl_getinfo($ch);

Either I get no response and $info["http_code"] is 0, or the response is the 503 page and $info["http_code"] = 503.

Oh, one more option is:

CURLOPT_CONNECTTIMEOUT => 30

Diagram is here: http://s61.radikal.ru/i172/1212/d6/33471472ee8e.png

Upvotes: 3

Views: 492

Answers (1)

Stu

Reputation: 4150

If you're just after the HTTP code, use curl_getinfo() with CURLINFO_HTTP_CODE. For example:

$handle = curl_init($url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($handle);
$httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
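
Since the question mentions both a code of 0 and a code of 503, it may help to tell the two apart. A code of 0 generally means cURL never received an HTTP response (for example a timeout or a refused connection), and curl_errno() / curl_error() describe why. The following is only a rough sketch: classifyCurlResult() and fetchWithDiagnostics() are hypothetical helper names, not part of cURL, and the labels they return are arbitrary.

```php
<?php
// Hypothetical helper: maps a cURL result to one of the outcomes
// described in the question. The label strings are illustrative only.
function classifyCurlResult($response, $httpCode, $errno)
{
    // No HTTP response at all: curl_exec() failed or no status was received.
    if ($response === false || $errno !== 0 || $httpCode === 0) {
        return 'no_response';
    }
    // The server answered, but reported that it is overloaded.
    if ($httpCode === 503) {
        return 'overloaded';
    }
    // Any other 4xx/5xx status.
    if ($httpCode >= 400) {
        return 'http_error';
    }
    return 'ok';
}

// Hypothetical wrapper showing where the values come from.
function fetchWithDiagnostics($url)
{
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, 30);

    $response = curl_exec($handle);
    $httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    $errno    = curl_errno($handle);

    return array(
        'status' => classifyCurlResult($response, $httpCode, $errno),
        'error'  => curl_error($handle), // empty string when there was no error
        'body'   => $response,
    );
}
```

Calling fetchWithDiagnostics($url) and checking the 'status' and 'error' entries should show whether the request failed before reaching the server or whether the server itself returned the 503 page.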

Upvotes: 2
