Oscar Godson

Reputation: 32776

Send browser headers via PHP

How can I send a header to a website as if PHP / Apache is a browser? I'm trying to scrape a site, but it looks like they send a 404 error if it's coming from another server...

Or do you know of any other good ways to scrape content from a site?

Also, here is my current code:

<?php
    $curl_handle=curl_init();
    curl_setopt($curl_handle,CURLOPT_URL,$_GET['url']);
    curl_setopt($curl_handle, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)");
    curl_setopt($curl_handle, CURLOPT_REFERER, "http://google.com");
    curl_setopt($curl_handle,CURLOPT_CONNECTTIMEOUT,2);
    curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,1);
    $buffer = curl_exec($curl_handle);
    curl_close($curl_handle);
    echo $buffer;
?>

So, I'll be making an AJAX request like:

/spider.php?url=http://target.com

This returns an empty string. I know it is set up right, though, because if I swap the target for twitter.com it works. What am I missing to make this look like a full browser?

Upvotes: 2

Views: 2837

Answers (2)

Daniel Kluev

Reputation: 11325

For cURL, there is the CURLOPT_USERAGENT option for that:

curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)");

However, the site may also check the Referer header, which you can set via:

curl_setopt($ch, CURLOPT_REFERER, "http://<somesite>");

Upvotes: 3

Daniel Egeberg

Reputation: 8382

If you're using cURL, you can use the CURLOPT_HTTPHEADER option, which takes an array of headers you wish to send with the request.
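As a rough sketch of the CURLOPT_HTTPHEADER approach, something like the following might work. The helper names browserHeaders() and fetchWithHeaders() are made up for illustration, and the header values are only an assumption about what a browser typically sends; the target site may check something else entirely.

```php
<?php
// Hypothetical helper: a browser-like header list in the "Name: value"
// format that CURLOPT_HTTPHEADER expects. The values are illustrative.
function browserHeaders()
{
    return [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.5',
        'Connection: keep-alive',
    ];
}

// Hypothetical helper: fetches $url with the headers above.
// Returns the response body as a string, or false on failure.
function fetchWithHeaders($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, browserHeaders());
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)');
    curl_setopt($ch, CURLOPT_REFERER, 'http://google.com');
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects like a browser would
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_error() explains why the transfer failed (timeout, DNS, ...)
        error_log('cURL error: ' . curl_error($ch));
    }
    // The handle is released when $ch goes out of scope.
    return $body;
}
```

Calling fetchWithHeaders('http://target.com') and checking for false (or reading the logged curl_error() message) should help tell a blocked request apart from a connection timeout, which an empty result alone does not.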

If you're using file_get_contents(), you can pass it a stream context created with stream_context_create().
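For the file_get_contents() route, a minimal sketch might look like this. It uses PHP's built-in stream_context_create(); browserContext() is a hypothetical helper name, and the header values are assumptions rather than anything the target site is known to require.

```php
<?php
// Hypothetical helper: builds an HTTP stream context that sends
// browser-like headers. The values are illustrative only.
function browserContext()
{
    $options = [
        'http' => [
            'method'     => 'GET',
            'user_agent' => 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)',
            // Extra headers are separated by CRLF
            'header'     => "Accept: text/html\r\nReferer: http://google.com\r\n",
            'timeout'    => 10, // seconds
        ],
    ];
    return stream_context_create($options);
}

// Usage (http://example.com/ is a placeholder URL):
// $html = file_get_contents('http://example.com/', false, browserContext());
// if ($html === false) { /* request failed or was refused */ }
```

The context goes in as the third argument; the second argument (use_include_path) is left as false.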

Upvotes: 2
