Reputation: 373
How to find the total number of inbound and outbound links of a website using PHP?
Upvotes: 1
Views: 3030
Reputation: 2361
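// $host is the domain name, e.g. "example.com".
// getPageData() is not shown here; it is assumed to fetch a URL and return the page body as a string.
function getGoogleLinks($host)
{
    $request = "http://www.google.com/search?q=" . urlencode("link:" . $host) . "&hl=en";
    $data = getPageData($request);
    // Pull the result count out of the "resultStats" element; fall back to "n/a" if it is missing
    $found = preg_match('/<div id=resultStats>(About )?([\d,]+) result/si', $data, $l);
    $value = $found ? $l[2] : "n/a";
    // Return the count as a link to the search that produced it
    $string = "<a href=\"" . htmlspecialchars($request) . "\">" . $value . "</a>";
    return $string;
}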
Upvotes: 0
Reputation: 267077
For outbound links, you will have to parse the HTML code of the website as some here have suggested.
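As a rough illustration of the HTML-parsing approach, here is a minimal sketch using PHP's built-in DOMDocument. The function name countOutboundLinks() and its parameters are made up for this example, and it assumes an "outbound" link is any anchor whose href has a host different from the site's own host (so www.example.com and example.com would be treated as different sites).

```php
<?php
// Count links in an HTML string that point to a host other than $siteHost.
// countOutboundLinks() is an illustrative name, not part of any library.
function countOutboundLinks(string $html, string $siteHost): int
{
    $doc = new DOMDocument();
    // Collect parser warnings internally, since real-world markup is often malformed
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $count = 0;
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $host = parse_url($anchor->getAttribute('href'), PHP_URL_HOST);
        // Relative URLs have no host component, so they count as internal
        if (is_string($host) && strcasecmp($host, $siteHost) !== 0) {
            $count++;
        }
    }
    return $count;
}

$html = '<p><a href="/about">About</a> <a href="http://example.org/">Other</a> '
      . '<a href="http://example.com/contact">Contact</a></p>';
echo countOutboundLinks($html, 'example.com'); // prints 1
```

In practice the HTML would be fetched first (for example with file_get_contents or cURL), and the count would have to be repeated for every page of the site.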
For inbound links, I suggest using the Google Custom Search API, because sending requests directly to Google's search pages can get your IP banned. See Google's documentation for the search API. Here is a function I use in my code for this API:
function doGoogleSearch($searchTerm)
{
    $referer = 'http://your-site.com';
    $key = 'your-api-key';
    $endpoint = 'web';
    $url = "http://ajax.googleapis.com/ajax/services/search/" . $endpoint;

    // Query parameters: search term, API version, and API key
    $args = array();
    $args['q'] = $searchTerm;
    $args['v'] = '1.0';
    $args['key'] = $key;
    $url .= '?' . http_build_query($args, '', '&');

    // Send the request with a referer header and capture the body
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    $body = curl_exec($ch);
    curl_close($ch);

    // Decode and return the response (null if the request failed)
    return ($body === false) ? null : json_decode($body);
}
After calling this function as $result = doGoogleSearch('link:site.com'), the estimated number of results is available in $result->responseData->cursor->estimatedResultCount.
Upvotes: 1
Reputation: 29536
To count outbound links
To count inbound links
Upvotes: 1
Reputation: 26380
PHP can't determine the inbound links of a page through some trivial action. You either have to monitor all incoming visitors and check what their referrer is, or crawl the entire internet for links that point to that site. The first method misses links that nobody follows, and the second is best left to Google.
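To make the referrer-monitoring idea concrete, here is a small sketch. The helper name externalReferrerHost() is hypothetical, the in-memory $tally array stands in for whatever storage you would really use (a database table, a log file), and the sample $visits list replaces the live $_SERVER['HTTP_REFERER'] value. Keep in mind that the Referer header is optional and can be forged, so this only approximates the real set of inbound links.

```php
<?php
// Return the referring host if the visit came from another site, else null.
// externalReferrerHost() is a hypothetical helper name.
function externalReferrerHost(?string $referer, string $siteHost): ?string
{
    if ($referer === null || $referer === '') {
        return null; // direct visit, or the browser withheld the header
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host) || strcasecmp($host, $siteHost) === 0) {
        return null; // unparsable value, or navigation within the same site
    }
    return strtolower($host);
}

// On a live page the value would come from $_SERVER['HTTP_REFERER'] ?? null
$visits = [
    'http://blog.example.org/post',
    'http://example.com/home',
    '',
    'http://Blog.example.org/other',
];

// Tally visits per referring host
$tally = [];
foreach ($visits as $referer) {
    $host = externalReferrerHost($referer, 'example.com');
    if ($host !== null) {
        $tally[$host] = ($tally[$host] ?? 0) + 1;
    }
}
print_r($tally);
```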
On the other hand, counting the outbound links from a site is doable. You can read in a page, scan the text for links with a regular expression, and count up the total.
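A minimal sketch of the regular-expression approach might look like the following. The function name countOutboundLinksRegex() is invented for this example, and the pattern only recognises quoted, absolute http(s) URLs inside anchor tags, so unquoted attributes and links built by JavaScript are not counted. A regex is a rough tool for HTML; treat the result as an estimate.

```php
<?php
// Count absolute http(s) links in $html whose host differs from $siteHost.
// countOutboundLinksRegex() is an illustrative name, not from any library.
function countOutboundLinksRegex(string $html, string $siteHost): int
{
    // Capture the quoted href value of every <a> tag that has an absolute URL
    $pattern = '/<a\s[^>]*?href\s*=\s*["\'](https?:\/\/[^"\']+)["\']/i';
    if (!preg_match_all($pattern, $html, $matches)) {
        return 0; // no matches, or the pattern failed to run
    }
    // Keep only URLs that point at a different host
    $external = array_filter($matches[1], function (string $url) use ($siteHost): bool {
        $host = parse_url($url, PHP_URL_HOST);
        return is_string($host) && strcasecmp($host, $siteHost) !== 0;
    });
    return count($external);
}

// In practice $html might come from file_get_contents($pageUrl)
$html = '<a href="http://example.com/a">in</a> '
      . '<a class="x" href=\'https://other.example/b\'>out</a> '
      . '<a href="/c">relative</a> <a href="http://third.example/">out</a>';
echo countOutboundLinksRegex($html, 'example.com'); // prints 2
```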
Upvotes: 0