Alex Strauss

Reputation: 121

Scrape Facebook fan page in PHP

I'm trying to scrape a Facebook fan page using curl in PHP, but it just gives me a blank page. Here is my code:

function curlFunction($source_url){
  $ch = curl_init();

  $userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1';
  curl_setopt($ch, CURLOPT_USERAGENT,       $userAgent);
  curl_setopt($ch, CURLOPT_URL,             $source_url);
  curl_setopt($ch, CURLOPT_HEADER,      false);
  curl_setopt($ch, CURLOPT_FAILONERROR,     true);
  curl_setopt($ch, CURLOPT_ENCODING,        "UTF-8" );
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION,  true);
  curl_setopt($ch, CURLOPT_AUTOREFERER,         true);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER,  true);
  curl_setopt($ch, CURLOPT_TIMEOUT,             60);

  $html= curl_exec($ch);
  curl_close($ch);
  return $html;
}   

$token = "CAACEdEose0cBADMEK5uLLfSTj1nZCG8eogAZBi6Dfkr4gJN9o6fFuyfEHkPtO94br9i9YP9gmiYPunHxRxr1PqU3YNy34PziACwEaMXl4NT9zZBMgdWD6WFh6mAL2dlqsjnYs9sKQ5sz7ZCVBn7ZA8lVrZCJRq8O0ZD";

$url = "https://graph.facebook.com/StarHub/feed?accesstoken=" . $token;

$html = curlFunction($url);
echo $html;

I already use this function to scrape pages from other websites and it works fine. With Facebook, however, I get a blank page when I use https, while plain http works fine. The Facebook Graph API requires me to use https to get the contents.

Upvotes: 2

Views: 3973

Answers (2)

andyrandy

Reputation: 73984

Pages are public and the feed can be read even with an App Access token. Try changing the access token like this:

$token = "APP-ID|APP-SECRET";

(App ID and App Secret, with a Pipe in the middle)

That's the only token that never expires; it only becomes invalid if you change the App ID or the App Secret of your app.

Another solution with the PHP SDK:

$result = $facebook->api('/PAGE-ID/feed', array('access_token' => 'APP-ID|APP-SECRET'));
var_dump($result['data']);

You might even be able to do that without the access token: if no user is authorized, the SDK should use the App Access Token anyway.
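For the plain curl approach from the question, the app access token can be assembled into a request URL roughly as in the sketch below. The helper name buildFeedUrl and the placeholder credentials are assumptions made for illustration and are not part of the Facebook SDK. Note that the query parameter is spelled access_token, with an underscore.

```php
<?php
// Hypothetical helper (not part of the Facebook SDK): builds a Graph API
// feed URL that authenticates with an app access token.
function buildFeedUrl(string $pageId, string $appId, string $appSecret): string
{
    // An app access token is the App ID and App Secret joined by a pipe.
    $token = $appId . '|' . $appSecret;

    // http_build_query() URL-encodes the value, so the pipe becomes %7C.
    $query = http_build_query(['access_token' => $token]);

    return 'https://graph.facebook.com/' . rawurlencode($pageId) . '/feed?' . $query;
}

// Placeholder credentials; substitute the values from your own app.
echo buildFeedUrl('StarHub', 'APP-ID', 'APP-SECRET'), "\n";
```

The resulting string can be passed straight to a curl-based fetch function such as the one in the question.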

Upvotes: 4

likeitlikeit

Reputation: 5638

The problem seems to be that, because of an invalid access token, the server is returning a 400 Bad Request error. Because the CURLOPT_FAILONERROR option is set, curl_exec() then returns false, which prints as an empty string. See the curl_setopt() page of the PHP manual for descriptions of this and other curl options.
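To make this kind of failure visible instead of getting a blank page, the return value of curl_exec() can be checked explicitly. The following is a minimal sketch, assuming PHP 7.0 or later; fetchWithDiagnostics and describeTransfer are hypothetical names introduced for this example.

```php
<?php
// Hypothetical helper: turns the outcome of a transfer into a readable message.
// $body is the return value of curl_exec(): a string on success, false on failure.
function describeTransfer($body, int $errno, string $error, int $status): string
{
    if ($body === false) {
        // With CURLOPT_FAILONERROR enabled, HTTP status codes >= 400 end up here.
        return "curl failed (errno $errno): $error";
    }
    return "HTTP $status, " . strlen($body) . ' bytes received';
}

// Hypothetical wrapper: performs the request and reports what happened.
function fetchWithDiagnostics(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);

    $body   = curl_exec($ch);
    $errno  = curl_errno($ch);   // 0 when the transfer succeeded
    $error  = curl_error($ch);   // empty string when the transfer succeeded
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    return describeTransfer($body, $errno, $error, $status);
}
```

Echoing the result of fetchWithDiagnostics($url) in place of the raw response should show whether the blank page comes from an HTTP error or from a transport problem such as a failed SSL handshake.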

The following code returns the same results as a regular browser request to the same URL:

function curlFunction($source_url){
  $ch = curl_init();

  $userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1';
  curl_setopt($ch, CURLOPT_USERAGENT,       $userAgent);
  curl_setopt($ch, CURLOPT_URL,             $source_url);
  curl_setopt($ch, CURLOPT_HEADER,      false);
  curl_setopt($ch, CURLOPT_FAILONERROR,     false); // keep the response body on HTTP errors (>= 400)
  curl_setopt($ch, CURLOPT_ENCODING,        "" );   // "" accepts every content encoding curl supports; "UTF-8" is a charset
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION,  true);
  curl_setopt($ch, CURLOPT_AUTOREFERER,         true);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER,  true);
  curl_setopt($ch, CURLOPT_TIMEOUT,             60);

  $html= curl_exec($ch);
  curl_close($ch);
  return $html;
}   

$token = "CAACEdEose0cBADMEK5uLLfSTj1nZCG8eogAZBi6Dfkr4gJN9o6fFuyfEHkPtO94br9i9YP9gmiYPunHxRxr1PqU3YNy34PziACwEaMXl4NT9zZBMgdWD6WFh6mAL2dlqsjnYs9sKQ5sz7ZCVBn7ZA8lVrZCJRq8O0ZD";

// The Graph API expects the parameter name access_token (with an underscore).
$url = "https://graph.facebook.com/StarHub/feed?access_token=" . $token;

$html = curlFunction($url);
echo $html;
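When CURLOPT_FAILONERROR is disabled, a rejected Graph API request yields a JSON body instead of nothing, and that body can be inspected. The sketch below assumes the error is reported under a top-level error key with message, type and code fields; graphErrorMessage is a hypothetical helper name and the sample body is made up for illustration.

```php
<?php
// Hypothetical helper: returns a readable message if the response body is a
// Graph API error, or null if it does not look like one.
function graphErrorMessage(string $json): ?string
{
    $data = json_decode($json, true);

    // Assumed error shape: {"error": {"message": ..., "type": ..., "code": ...}}
    if (!is_array($data) || !isset($data['error']) || !is_array($data['error'])) {
        return null;
    }

    $error   = $data['error'];
    $type    = $error['type'] ?? 'UnknownError';
    $code    = $error['code'] ?? 0;
    $message = $error['message'] ?? '';

    return sprintf('%s (code %d): %s', $type, $code, $message);
}

// Example body, made up for illustration.
$sample = '{"error":{"message":"Invalid OAuth access token.","type":"OAuthException","code":190}}';
echo graphErrorMessage($sample), "\n";
```

A null return value means the body did not match the assumed error shape, so it can be treated as regular feed data or checked further.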

Upvotes: 0
