Reputation: 365
I have a fairly simple piece of code here: I add a bunch of links to the database, then check each link for a 200 OK.
<?php
function check_alive($url, $timeout = 10) {
    $ch = curl_init($url);
    // Set request options
    curl_setopt_array($ch, array(
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_NOBODY => true,
        CURLOPT_TIMEOUT => $timeout,
        CURLOPT_USERAGENT => "page-check/1.0"
    ));
    // Execute request
    curl_exec($ch);
    // Check if an error occurred
    if(curl_errno($ch)) {
        curl_close($ch);
        return false;
    }
    // Get HTTP response code
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    // Page is alive if 200 OK is received
    return $code === 200;
}
if (isset($_GET['cron'])) {
    // database connection
    $c = mysqli_connect("localhost", "paydayci_gsa", "", "paydayci_gsa");
    //$files = scandir('Links/');
    $files = glob("Links/*.{*}", GLOB_BRACE);
    foreach($files as $file)
    {
        $json = file_get_contents($file);
        $data = json_decode($json, true);
        if(!is_array($data)) continue;
        foreach ($data as $platform => $urls)
        {
            foreach($urls as $link)
            {
                //echo $link;
                $lnk = parse_url($link);
                $resUnique = $c->query("SELECT * FROM `links_to_check` WHERE `link_url` like '%".$lnk['host']."%'");
                // If no duplicate insert in database
                if(!$resUnique->num_rows)
                {
                    $i = $c->query("INSERT INTO `links_to_check` (link_id,link_url,link_platform) VALUES ('','".$link."','".$platform."')");
                }
            }
        }
        // at the very end delete the file
        unlink($file);
    }
    // check if the urls are alive
    $select = $c->query("SELECT * FROM `links_to_check` ORDER BY `link_id` ASC");
    while($row = $select->fetch_array()){
        $alive = check_alive($row['link_url']);
        $live = "";
        if ($alive == true)
        {
            $live = "Y";
            $lnk = parse_url($row['link_url']);
            // Check for duplicate
            $resUnique = $c->query("SELECT * FROM `links` WHERE `link_url` like '%".$row['link_url']."%'");
            echo $resUnique;
            // If no duplicate insert in database
            if(!$resUnique->num_rows)
            {
                $i = $c->query("INSERT INTO links (link_id,link_url,link_platform,link_active,link_date) VALUES ('','".$row['link_url']."','".$row['link_platform']."','".$live."',NOW())");
            }
        }
        $c->query("DELETE FROM `links_to_check` WHERE link_id = '".$row['link_id']."'");
    }
}
?>
I'm trying not to add duplicate URLs to the database, but they are still getting in. Have I missed something obvious in my code? I have looked over it a few times and can't see anything staring out at me.
Upvotes: 0
Views: 34
Reputation: 24579
If you are trying to enforce unique values in a database, you should rely on the database itself to enforce that constraint. You can add a unique index (assuming you are using MySQL or a variant, which the syntax suggests) like this:
ALTER TABLE `links` ADD UNIQUE INDEX `idx_link_url` (`link_url`);
One thing to be aware of is leading or trailing whitespace, so use trim() on the values. You should also strip trailing slashes with rtrim() to keep everything consistent (so you don't get dupes).
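As a rough sketch of how the two pieces could fit together: the helper names normalize_link_url() and insert_link() below are made up for illustration, and the insert assumes that link_id is an AUTO_INCREMENT column and that the unique index on link_url already exists. It is not a drop-in replacement for your script, just one way to normalize the value first and then let the index reject duplicates.

```php
<?php
// Hypothetical helper: normalize a URL before comparing or storing it.
function normalize_link_url(string $url): string
{
    // Remove leading/trailing whitespace first, then any trailing slashes.
    return rtrim(trim($url), '/');
}

// Hypothetical helper: insert a link and let the unique index reject duplicates.
// Assumes link_id is AUTO_INCREMENT, so it is left out of the column list.
function insert_link(mysqli $c, string $url, string $platform): bool
{
    $url = normalize_link_url($url);

    // INSERT IGNORE skips the row instead of raising a duplicate-key error.
    // The prepared statement also avoids building SQL by string concatenation.
    $stmt = $c->prepare(
        "INSERT IGNORE INTO `links` (link_url, link_platform, link_active, link_date)
         VALUES (?, ?, 'Y', NOW())"
    );
    $stmt->bind_param('ss', $url, $platform);
    $stmt->execute();

    // One affected row means the link was new; zero means it was a duplicate.
    $inserted = $stmt->affected_rows === 1;
    $stmt->close();

    return $inserted;
}
```

With this, " http://example.com/ " and "http://example.com" end up as the same stored value, so the unique index treats them as one link. It does not cover every variation (for example http vs. https, or a "www." prefix), so extend the normalization if those count as duplicates for you.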
Upvotes: 2