delalma
delalma

Reputation: 928

SQL relative versus absolute date impact query time

I run the following query and it takes 50 seconds.

select created_at, currency, balance 
from YYY
where id in (ZZZ) and currency = 'XXX'
and created_at >= '2020-08-28'
order by created_at desc
limit 1;

explain:

Limit  (cost=100.12..1439.97 rows=1 width=72)  
->  Foreign Scan on yyy  (cost=100.12..21537.65 rows=16 width=72)
        Filter: (("substring"((object_key)::text, '\w+:(\d+):.*'::text))::integer = 723120)

Then I run the following query and it "infinite" time. Too long to wait until the end.

select created_at, currency, balance 
from YYY
where id in (ZZZ) and currency = 'XXX'
and created_at >= NOW() - INTERVAL '1 DAY'
order by created_at desc
limit 1;

explain:

Limit  (cost=53293831.90..53293831.91 rows=1 width=72)
  ->  Result  (cost=53293831.90..53293987.46 rows=17284 width=72)
        ->  Sort  (cost=53293831.90..53293840.54 rows=17284 width=556)
              Sort Key: yyy.created_at DESC
              ->  Foreign Scan on yyy  (cost=100.00..53293814.62 rows=17284 width=556)
                    Filter: ((created_at >= (now() - '1 day'::interval)) AND (("substring"((object_key)::text, '\w+:(\d+):.*'::text))::integer = 723120))

What could make this huge difference between those query. I know that index are used to improve performance. What can we infer from here? Any contribution would be appreciated.

Upvotes: 0

Views: 259

Answers (1)

marcothesane
marcothesane

Reputation: 6741

With a literal, the optimizer has an easy game to plan an efficient data access using the right index.

With an expression like NOW - INTERVAL '4 DAY', you run at least into two challenges:

  • It is a stable, not an immutable, expression. Let alone a literal.
  • The expression is a TIMESTAMP WITH TIME ZONE, not a DATE, and you need an implicit type cast.

You just make the life of the optimizer difficult ...

I just created a single-column table yyy with 12 years' worth of distinct dates in my PostgreSQL database. No indexes. Already here, you see a difference in the cost of the explain plan.

$ psql -c "explain select * from yyy where created_at >= '2020-08-28'"
                      QUERY PLAN                      
------------------------------------------------------
 Seq Scan on yyy  (cost=0.00..74.79 rows=126 width=4)
   Filter: (created_at >= '2020-08-28'::date)

And:

$ psql -c "explain select * from yyy where created_at >= now() - interval '4 day'"
                       QUERY PLAN                       
--------------------------------------------------------
 Seq Scan on yyy  (cost=0.00..96.70 rows=126 width=4)
   Filter: (created_at >= (now() - '4 days'::interval))
(2 rows)

It will be a much worse difference with the existence of an index ....

Upvotes: 1

Related Questions