Meo Beo
Meo Beo

Reputation: 484

Postgresql - Optimize query with IN operator

I'm using PostgreSQL 13 and having a performance issue.

I have a post table and an index:

create table post {
id int primary key,
is_deleted boolean default false,
status int default 1, -- status can be 1, 2, 3
commented_at timestamp
}

create index post_deleted_status_commented_id on post(is_deleted, status, commented, id);

When I run the following query, I hoped it would use the above index but actually it did not.

explain analyze
select p.id
from post as p 
where is_deleted = false and (p.status = 1 or p.status = 2)
order by newest desc, id desc limit 20;

Here is the query plan:

Limit  (cost=8767.81..8770.14 rows=20 width=12) (actual time=61.289..61.345 rows=20 loops=1)
  ->  Gather Merge  (cost=8767.81..18738.60 rows=85458 width=12) (actual time=61.287..61.341 rows=20 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Sort  (cost=7767.78..7874.61 rows=42729 width=12) (actual time=54.776..54.778 rows=19 loops=3)
              Sort Key: newest DESC, id DESC
              Sort Method: top-N heapsort  Memory: 26kB
              Worker 0:  Sort Method: top-N heapsort  Memory: 26kB
              Worker 1:  Sort Method: top-N heapsort  Memory: 26kB
              ->  Parallel Seq Scan on post p  (cost=0.00..6630.78 rows=42729 width=12) (actual time=0.011..40.473 rows=39998 loops=3)
                    Filter: ((NOT is_deleted) AND ((status = 1) OR (status = 2)))
                    Rows Removed by Filter: 8848
Planning Time: 0.149 ms
Execution Time: 61.370 ms

The post table has 146538 rows (120770 rows with is_deleted = false). If I just use p.status = 1, the index will be used. So is anyway to optimize the query? Thanks.

Update 1: Partial index does not help.

The newest field is updated frequently (when there is a new comment).

Upvotes: 0

Views: 103

Answers (1)

jjanes
jjanes

Reputation: 44423

If the query is always exactly that, then the optimal index would be a filtered/partial index which matches the WHERE clause and is indexed compatible with the ORDER BY clause.

create index on post (newest,id) WHERE
   is_deleted = false and (status = 1 or status = 2);

Now your query gives:

    QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..1.03 rows=20 width=12) (actual time=0.013..0.021 rows=20 loops=1)
   ->  Index Only Scan Backward using post_newest_id_idx on post p  (cost=0.42..2040.81 rows=66693 width=12) (actual time=0.012..0.017 rows=20 loops=1)
         Heap Fetches: 0
 Planning Time: 0.121 ms
 Execution Time: 0.039 ms

Upvotes: 1

Related Questions