Reputation: 1892
I have following query which is filtering results using sub queries with NOT IN
how can I avoid NOT IN
? Is there any alternative for NOT IN
? How can I optimise overall query?
One option I see adding index on lower(display_name)
but Is the any better or further optimisation I can do here?
SELECT table_name.* FROM table_name
WHERE (
table_name.id IN (
SELECT distinct on (name, mode, formulation, strength, generic_name) id FROM table_name
where lower(display_name)
NOT IN(SELECT lower(display_name) FROM table_name WHERE table_name.user_id IS NULL)
)
ORDER BY table_name.created_at DESC LIMIT 100 OFFSET 0
Query Plan
------------------------------------------------------------------------------------------------
Limit (cost=9277.38..9277.38 rows=1 width=296)
-> Sort (cost=9277.38..9351.72 rows=29736 width=296)
Sort Key: table_name.created_at DESC
-> Nested Loop (cost=8015.41..9128.70 rows=29736 width=296)
-> HashAggregate (cost=8015.12..8017.12 rows=200 width=4)
Group Key: table_name_1.id
-> Unique (cost=7486.65..7932.69 rows=6594 width=93)
-> Sort (cost=7486.65..7560.99 rows=29736 width=93)
Sort Key: table_name_1.name, table_name_1.mode, table_name_1.formulation, table_name_1.strength, table_name_1.generic_name
-> Seq Scan on table_name table_name_1 (cost=2286.19..5277.28 rows=29736 width=93)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Bitmap Heap Scan on table_name table_name_2 (cost=106.16..2272.68 rows=5402 width=32)
Recheck Cond: (user_id IS NULL)
-> Bitmap Index Scan on index_table_name_on_user_id (cost=0.00..104.81 rows=5402 width=0)
Index Cond: (user_id IS NULL)
-> Index Scan using table_name_pkey on table_name (cost=0.29..6.61 rows=1 width=296)
Index Cond: (id = table_name_1.id)
What I am trying to do is table_name is drugs which holds two list like
I am returning two results in search 1. admin drugs 2. all users drugs like
Admin list
admin_list = select * from table_name user_id IS NULL and LOWER(table_name.name) like %cal%
All display names from admin list which need to skip in following query
SELECT lower(display_name) FROM table_name WHERE table_name.user_id IS NULL
All users drugs but skipping / filtering out admin list based on display_name like
user_list = SELECT table_name.* FROM table_name
WHERE (
table_name.id IN (
SELECT distinct on (name, mode, formulation, strength, generic_name) id FROM table_name
where lower(display_name)
NOT IN(SELECT lower(display_name) FROM table_name WHERE table_name.user_id IS NULL)
)
AND LOWER(table_name.name) LIKE '%cal%'
ORDER BY table_name.created_at DESC LIMIT 100 OFFSET 0
Goal to have two distinct lists admin_list and user_list where user_list should not contain any admin list display_name
Upvotes: 1
Views: 112
Reputation: 1269873
You seem to want one id per name, mode, formulation, strength, generic_name
combination where the display_name
has no null
user_id
.
If that is correct, I'm sure there are simpler ways to write the query:
select t.*
from (select distinct on (name, mode, formulation, strength, generic_name) t.*
from (select t.*,
count(*) filter (where user_id is null) over (partition by lower(display_name)) as cnt
from table_name
) t
where cnt = 0
order by name, mode, formulation, strength, generic_name, created_at desc
) t
order by created_at desc
limit 100 offset 0
Upvotes: 0
Reputation:
More often than not, NOT EXISTS if faster than NOT IN
SELECT t1.*
FROM table_name t1
WHERE table_name.id IN (SELECT distinct on (t2.name, t2.mode, t2.formulation, t2.strength, t2.generic_name) t2.id
FROM table_name t2
where not exists (select *
FROM table_name t3
WHERE t3.user_id IS NULL
and t2.lower(display_name) = t3.lower(display_name) )
)
ORDER BY table_name.created_at DESC
LIMIT 100 OFFSET 0
An index on table_name ( (lower(display_name)) ) where user_is null
should improve that even more.
Note that distinct on ()
without an ORDER BY is typically not a good idea.
Upvotes: 2