Reputation: 477
An example of a table, data along with the query can be found in http://sqlfiddle.com/#!9/2e65dd/3
I'm interested in finding all distinct user_id's that don't have certain record_type.
In my actual case, this table is huge and it has several million records in it and have an index on user_id column. Although i'm planning to retrieve it in batches by limiting the output to 1000 at a time.
select distinct user_id from
records o where
not exists (
select *
from records i
where i.user_id=o.user_id and i.record_type=3)
limit 0, 1000
Is there a better approach to achieve this need ?
Upvotes: 1
Views: 1771
Reputation: 15961
I's suggest a join as well, but mine would have differed from Bill K's like so:
SELECT DISTINCT r.user_id
FROM records AS r
LEFT JOIN (SELECT DISTINCT user_id FROM records WHERE record_type = 3) AS rt3users
ON r.user_id = rt3users.user_id
WHERE rt3users.user_id IS NULL
;
However, an alternative I would not expect better performance from, but is worth checking, since performance can vary based on size and content of data...
SELECT DISTINCT r.user_id
FROM records AS r
WHERE r.user_id NOT IN (
SELECT DISTINCT user_id
FROM records
WHERE record_type = 3
)
;
Note, this one is more similar to your original but does away with the correlated nature of the original subquery.
Upvotes: 1
Reputation: 1177
You could create a temporary table with record types equal 3 like
Select distinct user_id
into #users
from records
where record_type=3
Then create unique index (or primary key) on this table. Then you query would search indexes in both tables.
I can’t say the performance would be better you’d have to test it on your data.
Upvotes: 0
Reputation: 562661
I would do it this way:
SELECT u.user_id
FROM (SELECT DISTINCT user_id FROM records) AS u
LEFT OUTER JOIN records as r
ON u.user_id = r.user_id AND r.record_type = 3
WHERE r.user_id IS NULL
That avoids the correlated subquery in your NOT EXISTS solution.
Alternatively, you should have another table that just lists users, so you don't have to do the subquery:
SELECT u.user_id
FROM users AS u
LEFT OUTER JOIN records as r
ON u.user_id = r.user_id AND r.record_type = 3
WHERE r.user_id IS NULL
In either case, it would help optimize the JOIN to add a compound index on the pair of columns:
ALTER TABLE records ADD KEY (user_id, record_type)
Upvotes: 1