leonardorame

Reputation: 1161

Query not using index on timestamp without time zone field

I have a table with more than 3 million rows; one of its columns, creationdate, is a timestamp without time zone.

I created a couple of indexes on it, like:

"idx_routingtasks_creationdate" btree (creationdate)
"idx_routingtasks_creationdate2" btree ((creationdate::date))

When filtering by creationdate::date (casting to date), the expression index idx_routingtasks_creationdate2 is not used:

explain analyze select * from routingtasks where creationdate::date < (now() - interval '1 day')::date;                                                                                 
                                                             QUERY PLAN                                                             
------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on routingtasks  (cost=0.00..1315631.93 rows=2811638 width=715) (actual time=186642.012..413763.643 rows=2800659 loops=1)
   Filter: ((creationdate)::date < ((now() - '1 day'::interval))::date)
   Rows Removed by Filter: 212248
 Planning time: 0.195 ms
 Execution time: 413875.829 ms
(5 rows)

The same happens when not casting to date:

explain analyze select * from routingtasks where creationdate < now() - interval '1 day';
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on routingtasks  (cost=0.00..1300588.39 rows=2918447 width=715) (actual time=72089.312..327288.333 rows=2876756 loops=1)
   Filter: (creationdate < (now() - '1 day'::interval))
   Rows Removed by Filter: 141052
 Planning time: 0.104 ms
 Execution time: 327401.745 ms
(5 rows)

How can I create an index on the creationdate column so that this filter can use it?

Upvotes: 0

Views: 1554

Answers (1)

Laurenz Albe

Reputation: 247625

The answer lies in this part of the execution plan:

Seq Scan ... (actual ... rows=2876756 ...)
  ...
  Rows Removed by Filter: 141052

Since almost all rows are returned anyway, using a sequential scan and discarding the few rows that are filtered out is the most efficient way to process the query.
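
As an illustration (a sketch, assuming the existing idx_routingtasks_creationdate index): an index scan typically wins only when the condition is selective, e.g. when just the most recent rows are needed:

-- only rows from the last day qualify, so an index scan on
-- idx_routingtasks_creationdate is likely to be chosen here
EXPLAIN ANALYZE
SELECT * FROM routingtasks
WHERE creationdate >= now() - interval '1 day';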

If you want to verify that, temporarily

SET enable_seqscan = off;

to make PostgreSQL avoid a sequential scan if possible. Then you can test whether query execution gets faster or not.
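
A possible verification sequence (a sketch; enable_seqscan affects only the current session and should be reset afterwards):

SET enable_seqscan = off;   -- planner now treats sequential scans as prohibitively expensive
EXPLAIN ANALYZE
SELECT * FROM routingtasks
WHERE creationdate::date < (now() - interval '1 day')::date;
RESET enable_seqscan;       -- restore the default planner behaviour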

Upvotes: 3
