Cyril N.
Cyril N.

Reputation: 39859

How to transform a NOT IN into a LEFT JOIN (or a NOT EXISTS)

I have the following query that is quite complex and even though I tried to understand how to do using various sources online, all the examples uses simple queries where mine is more complex, and for that, I don't find the solution.

Here's my current query :

SELECT id, category_id, name 
    FROM orders AS u1 
WHERE added < (UTC_TIMESTAMP() - INTERVAL 60 SECOND) 
    AND (executed IS NULL OR executed < (UTC_DATE() - INTERVAL 1 MONTH)) 
    AND category_id NOT IN (SELECT category_id 
                                FROM orders AS u2 
                            WHERE executed > (UTC_TIMESTAMP() - INTERVAL 5 SECOND) 
                            GROUP BY category_id) 
GROUP BY category_id 
ORDER BY added ASC 
LIMIT 10;

The table orders is like this:

id
category_id
name
added
executed

The purpose of the query is to list n orders (here, 10) that belong in different categories (I have hundreds of categories), so 10 category_id different. The orders showed here must be older than a minute ago (INTERVAL 60 SECOND) and never executed (IS NULL) or executed more than a month ago.

The NOT IN query is to avoid treating a category_id that has already been treated less than 5 seconds ago. So in the result, I remove all the categories that have been treated less than 5 seconds ago.

I've tried to change the NOT IN in a LEFT JOIN clause or a NOT EXISTS but the switch results in a different set of entries so I believe it's not correct.

Here's what I have so far :

SELECT u1.id, u1.category_id, u1.name, u1.added 
    FROM orders AS u1 
    LEFT JOIN orders AS u2 
        ON u1.category_id = u2.category_id 
        AND u2.executed > (UTC_TIMESTAMP() - INTERVAL 5 SECOND) 
WHERE u1.added < (UTC_TIMESTAMP() - INTERVAL 60 SECOND) 
AND (u1.executed IS NULL OR u1.executed < (UTC_DATE() - INTERVAL 1 MONTH))
AND u2.category_id IS NULL 
GROUP BY u1.category_id 
LIMIT 10

Thank you for your help.

Here's a sample data to try. In that case, there is no "older than 5 seconds" since it's near impossible to get a correct value, but it gives you some data to help out :)

Upvotes: 0

Views: 385

Answers (1)

Adam Copley
Adam Copley

Reputation: 1495

Your query is using a column which doesn't exist in the table as a join condition.

ON u1.domain = u2.category_id 

There is no column in your example data called "domain"

Your query is also using the incorrect operator for your 2nd join condition.

AND u2.executed > (UTC_TIMESTAMP() - INTERVAL 5 SECOND)

should be

AND u2.executed < (UTC_TIMESTAMP() - INTERVAL 5 SECOND)

as is used in your first query

Upvotes: 1

Related Questions