Reputation: 19
I'm trying to detect broken links. The following PHP, which reads the links from a MySQL table, works for almost everything, although it is slow because of fopen:
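function fileExists($path) {
    $handle = @fopen($path, "r"); // @ suppresses the warning when the open fails.
    if ($handle === false) {
        return false;
    }
    fclose($handle); // Close the handle so it is not leaked on every check.
    return true;
}
$result = mysql_query(" SELECT id, title, link from table ");
while ($row = mysql_fetch_array($result)) {
    $status = ""; // Reset for each row so one broken link does not mark the rest.
    $id = $row['id'];
    $title = $row['title'];
    $link = $row['link'];
    // etc.
    if ($link) {
        if (!fileExists($link)) {
            $status = 'BROKEN_LINK';
        }
    }
    // Here do something if the status gets set to broken
}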
The problem is links like this:
torrentfreak.com/unblocking-the-pirate-bay-the-hard-way-is-fun-for-geeks-120506
Here the URL doesn't point to a file; the server generates the content when the page is requested. What is the best way to detect broken links correctly in these cases, when the links are not on your own domain?
Thanks!
Mordak
Upvotes: 1
Views: 107
Reputation: 4024
You can try cURL instead and check the HTTP status code of each link:
function fileExists($pageScrape, $path) { // $pageScrape is a reusable cURL handle.
    curl_setopt($pageScrape, CURLOPT_URL, $path); // Set the URL to check.
    curl_setopt($pageScrape, CURLOPT_RETURNTRANSFER, true); // Return the page instead of printing it.
    curl_exec($pageScrape); // Execute the request.
    $status = curl_getinfo($pageScrape, CURLINFO_HTTP_CODE); // HTTP status code of the response.
    return $status >= 200 && $status <= 299; // Any 2xx code counts as success.
}
$pageScrape = curl_init();
$result = mysql_query(" SELECT id, title, link from table ");
while ($row = mysql_fetch_array($result)) {
    $status = ""; // Reset for each row so one broken link does not mark the rest.
    $id = $row['id'];
    $title = $row['title'];
    $link = $row['link'];
    // etc.
    if ($link) {
        if (!fileExists($pageScrape, $link)) {
            $status = 'BROKEN_LINK';
        }
    }
    // Here do something if the status gets set to broken
}
curl_close($pageScrape);
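The function above downloads every page in full and does not follow redirects, so a moved page is reported as broken. A possible variant, sketched below, sends a HEAD request, follows redirects, and applies a timeout. The helper names linkCheckOptions and fetchStatus, the 10-second timeout, and the redirect limit of 5 are assumptions chosen for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: builds the cURL options for a lightweight link check.
function linkCheckOptions(string $url, int $timeoutSeconds = 10): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_NOBODY         => true, // Send a HEAD request and skip the body.
        CURLOPT_RETURNTRANSFER => true, // Never print the response.
        CURLOPT_FOLLOWLOCATION => true, // Follow redirects to the final page.
        CURLOPT_MAXREDIRS      => 5,    // Assumed limit, guards against redirect loops.
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ];
}

// Hypothetical helper: returns the final HTTP status code,
// or 0 when no HTTP response was received at all.
function fetchStatus($handle, string $url): int
{
    curl_setopt_array($handle, linkCheckOptions($url));
    curl_exec($handle);
    return (int) curl_getinfo($handle, CURLINFO_HTTP_CODE);
}
```

Inside the loop you would call fetchStatus($pageScrape, $link) and compare the result against the codes you consider healthy. Some servers reject HEAD requests, so a link that fails this check may deserve a second attempt with a normal GET before it is flagged.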
You can fine-tune the status check by consulting the list of HTTP status codes on Wikipedia.
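As a rough illustration of such fine-tuning, the sketch below maps a status code to a coarse link state instead of a plain true/false. The function name classifyStatus and every label except BROKEN_LINK are assumptions; adjust the groupings to whatever you want to treat as broken.

```php
<?php
// Hypothetical helper: maps an HTTP status code to a coarse link state.
function classifyStatus(int $status): string
{
    if ($status === 0) {
        return 'UNREACHABLE';  // cURL reports 0 when no HTTP response arrived.
    }
    if ($status >= 200 && $status <= 299) {
        return 'OK';
    }
    if ($status >= 300 && $status <= 399) {
        return 'REDIRECT';     // The page has probably moved.
    }
    if ($status === 404 || $status === 410) {
        return 'BROKEN_LINK';  // Not Found or Gone.
    }
    if ($status >= 500 && $status <= 599) {
        return 'SERVER_ERROR'; // Often temporary, worth retrying later.
    }
    return 'CHECK_MANUALLY';   // For example 401, 403, or 429.
}
```

For example, classifyStatus(curl_getinfo($pageScrape, CURLINFO_HTTP_CODE)) could replace the 2xx test, so that only 404 and 410 responses are flagged as BROKEN_LINK while server errors are retried later.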
Upvotes: 1