Reputation: 385
So the title may be a little confusing, but it's hard to describe the problem in just a few words. So I have two tables that are important right now:
CREATE TABLE IF NOT EXISTS `private_crawler_urls` (
`id` int(11) NOT NULL,
`url` text NOT NULL,
`hash` varchar(47) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB AUTO_INCREMENT=100 DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `private_crawler_url_checks` (
`id` int(11) NOT NULL,
`url_id` int(11) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`total_proxies` int(11) NOT NULL,
`working_proxies` int(11) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=78 DEFAULT CHARSET=latin1;
Now I want to select all the urls from private_crawler_urls
where there's no entry in private_crawler_url_checks
which is newer than a given Timespan (e.g. 10 Minutes). That's what I have right now:
SELECT
u.id, u.url, c.created_at
FROM
private_crawler_urls u
INNER JOIN
private_crawler_url_checks c ON (c.url_id = u.id)
WHERE
c.created_at < NOW() - INTERVAL 10 MINUTE
ORDER BY c.created_at ASC
The problem is that this can't work as I don't want to delete the old entries in private_crawler_url_checks
and so there's always an entry that's older than the given Timespan even if there's a fresh one in the table.
As I'm pretty new to MySQL I have no Idea how I can accomplish this, so I need your help. Thanks! In case you need more information, just leave a comment!
Upvotes: 0
Views: 25
Reputation: 71891
I think this will make you keep your old records and not interfere with your wish. Not all that certain about performance impact though:
SELECT
pcu.url
FROM
private_crawler_urls as pcu
WHERE
pcu.id NOT IN (
SELECT
pcuc.url_id
FROM
private_crawler_url_checks as pcuc
WHERE
pcuc.created_at > DATE_SUB(NOW(), INTERVAL 10 MINUTE)
)
First it selects all url_id
, where there is a created_at
in the last 10 minutes. After that it will select all url
where the id
is not in this list
Upvotes: 1