xalo

Reputation: 275

ORDER BY ASC 100x faster than ORDER BY DESC? Why?

I have a complex query generated by Hibernate for JBPM. I can't really modify it, and I'm trying to optimize it as much as possible.

I found out that ORDER BY DESC is way slower than ORDER BY ASC. Do you have any idea why?

PostgreSQL version: 9.4
Schema: https://pastebin.com/qNZhrbef
Query:

select 
taskinstan0_.ID_ as ID1_27_, 
taskinstan0_.VERSION_ as VERSION3_27_, 
taskinstan0_.NAME_ as NAME4_27_, 
taskinstan0_.DESCRIPTION_ as DESCRIPT5_27_, 
taskinstan0_.ACTORID_ as ACTORID6_27_, 
taskinstan0_.CREATE_ as CREATE7_27_, 
taskinstan0_.START_ as START8_27_, 
taskinstan0_.END_ as END9_27_,
taskinstan0_.DUEDATE_ as DUEDATE10_27_, 
taskinstan0_.PRIORITY_ as PRIORITY11_27_, 
taskinstan0_.ISCANCELLED_ as ISCANCE12_27_, 
taskinstan0_.ISSUSPENDED_ as ISSUSPE13_27_, 
taskinstan0_.ISOPEN_ as ISOPEN14_27_, 
taskinstan0_.ISSIGNALLING_ as ISSIGNA15_27_, 
taskinstan0_.ISBLOCKING_ as ISBLOCKING16_27_, 
taskinstan0_.LOCKED as LOCKED27_, 
taskinstan0_.QUEUE as QUEUE27_, 
taskinstan0_.TASK_ as TASK19_27_, 
taskinstan0_.TOKEN_ as TOKEN20_27_, 
taskinstan0_.PROCINST_ as PROCINST21_27_, 
taskinstan0_.SWIMLANINSTANCE_ as SWIMLAN22_27_, 
taskinstan0_.TASKMGMTINSTANCE_ as TASKMGM23_27_ 
from JBPM_TASKINSTANCE taskinstan0_, JBPM_VARIABLEINSTANCE stringinst1_, JBPM_PROCESSINSTANCE processins2_, JBPM_VARIABLEINSTANCE variablein3_ 

where stringinst1_.CLASS_='S' 
    and taskinstan0_.PROCINST_=processins2_.ID_ 
    and taskinstan0_.ID_=variablein3_.TASKINSTANCE_ 
    and variablein3_.NAME_ = 'NIR' 
    and taskinstan0_.QUEUE = 'ERT_TPS'
    and (processins2_.ORGAPATH_ like '/ERT%')
    and taskinstan0_.ISOPEN_= 't'
    and variablein3_.ID_=stringinst1_.ID_
order by stringinst1_.STRINGVALUE_ ASC limit '10';

EXPLAIN result for ASC:

 Limit  (cost=1.71..11652.93 rows=10 width=646) (actual time=6.588..82.407 rows=10 loops=1)
   ->  Nested Loop  (cost=1.71..6215929.27 rows=5335 width=646) (actual time=6.587..82.402 rows=10 loops=1)
         ->  Nested Loop  (cost=1.29..6213170.78 rows=5335 width=646) (actual time=6.578..82.363 rows=10 loops=1)
               ->  Nested Loop  (cost=1.00..6159814.66 rows=153812 width=13) (actual time=0.537..82.130 rows=149 loops=1)
                     ->  Index Scan Backward using totoidx10 on jbpm_variableinstance stringinst1_  (cost=0.56..558481.07 rows=11199905 width=13) (actual time=0.018..11.914 rows=40182 loops=1)
                           Filter: (class_ = 'S'::bpchar)
                     ->  Index Scan using jbpm_variableinstance_pkey on jbpm_variableinstance variablein3_  (cost=0.43..0.49 rows=1 width=16) (actual time=0.002..0.002 rows=0 loops=40182)
                           Index Cond: (id_ = stringinst1_.id_)
                           Filter: ((name_)::text = 'NIR'::text)
                           Rows Removed by Filter: 1
               ->  Index Scan using jbpm_taskinstance_pkey on jbpm_taskinstance taskinstan0_  (cost=0.29..0.34 rows=1 width=641) (actual time=0.001..0.001 rows=0 loops=149)
                     Index Cond: (id_ = variablein3_.taskinstance_)
                     Filter: (isopen_ AND ((queue)::text = 'ERT_TPS'::text))
                     Rows Removed by Filter: 0
         ->  Index Only Scan using idx_procin_2 on jbpm_processinstance processins2_  (cost=0.42..0.51 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=10)
               Index Cond: (id_ = taskinstan0_.procinst_)
               Filter: ((orgapath_)::text ~~ '/ERT%'::text)
               Heap Fetches: 0
 Planning time: 2.598 ms
 Execution time: 82.513 ms

EXPLAIN result for DESC:

 Limit  (cost=1.71..11652.93 rows=10 width=646) (actual time=8144.871..8144.986 rows=10 loops=1)
   ->  Nested Loop  (cost=1.71..6215929.27 rows=5335 width=646) (actual time=8144.870..8144.984 rows=10 loops=1)
         ->  Nested Loop  (cost=1.29..6213170.78 rows=5335 width=646) (actual time=8144.858..8144.951 rows=10 loops=1)
               ->  Nested Loop  (cost=1.00..6159814.66 rows=153812 width=13) (actual time=8144.838..8144.910 rows=20 loops=1)
                     ->  Index Scan using totoidx10 on jbpm_variableinstance stringinst1_  (cost=0.56..558481.07 rows=11199905 width=13) (actual time=0.066..2351.727 rows=2619671 loops=1)
                           Filter: (class_ = 'S'::bpchar)
                           Rows Removed by Filter: 906237
                     ->  Index Scan using jbpm_variableinstance_pkey on jbpm_variableinstance variablein3_  (cost=0.43..0.49 rows=1 width=16) (actual time=0.002..0.002 rows=0 loops=2619671)
                           Index Cond: (id_ = stringinst1_.id_)
                           Filter: ((name_)::text = 'NIR'::text)
                           Rows Removed by Filter: 1
               ->  Index Scan using jbpm_taskinstance_pkey on jbpm_taskinstance taskinstan0_  (cost=0.29..0.34 rows=1 width=641) (actual time=0.002..0.002 rows=0 loops=20)
                     Index Cond: (id_ = variablein3_.taskinstance_)
                     Filter: (isopen_ AND ((queue)::text = 'ERT_TPS'::text))
         ->  Index Only Scan using idx_procin_2 on jbpm_processinstance processins2_  (cost=0.42..0.51 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=10)
               Index Cond: (id_ = taskinstan0_.procinst_)
               Filter: ((orgapath_)::text ~~ '/ERT%'::text)
               Heap Fetches: 0
 Planning time: 2.080 ms
 Execution time: 8145.053 ms

Table sizes:
jbpm_variableinstance: 12,100,592 rows
jbpm_taskinstance: 69,913 rows
jbpm_processinstance: 97,546 rows

If you have any ideas, thanks in advance.

Upvotes: 4

Views: 1940

Answers (1)

Erwin Brandstetter

Reputation: 657617

This typically only happens when OFFSET and / or LIMIT are involved (as is the case here).

The key difference is this line in the EXPLAIN output for the query with DESC:

Rows Removed by Filter: 906237

Meaning: while the first rows read from the index totoidx10 quickly yield 10 qualifying results when scanning backwards (which matches your ASC ordering), Postgres has to filter ~900k rows before it finally finds qualifying rows when scanning the same index forward (your DESC ordering).
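
To illustrate the mechanism with a self-contained toy example (made-up names, nothing to do with the jBPM schema): when a LIMIT query is driven by an index scan, the cost depends entirely on where in the index the qualifying rows happen to sit.

-- Toy reproduction of the effect; all names here are invented for illustration.
CREATE TABLE demo (id serial PRIMARY KEY, val int, flag boolean);
INSERT INTO demo (val, flag)
SELECT g, g <= 100000              -- only the smallest values pass the filter
FROM   generate_series(1, 1000000) g;
CREATE INDEX demo_val_idx ON demo (val);
ANALYZE demo;

-- Fast: scanning demo_val_idx from the low end (ASC), the first rows already qualify.
EXPLAIN ANALYZE SELECT * FROM demo WHERE flag ORDER BY val ASC  LIMIT 10;

-- Slow: scanning from the high end (DESC), ~900k non-qualifying rows are filtered out
-- before the first 10 matches turn up. Without the LIMIT, both directions would have
-- to visit every qualifying row and the asymmetry would largely disappear.
EXPLAIN ANALYZE SELECT * FROM demo WHERE flag ORDER BY val DESC LIMIT 10;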

A matching multicolumn index (with the right sort order) might help a lot.
Or, since Postgres chooses an unfavorable query plan here, updated (or more detailed) table statistics or adjusted cost settings might already be enough.
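
For example (the concrete column choices below are assumptions read off the plans above, not a tested fix; verify against the real schema with EXPLAIN (ANALYZE)), a multicolumn index covering the CLASS_ filter and the sort column, plus more detailed statistics, could look like this:

-- Hypothetical multicolumn index: lets the scan stay inside CLASS_ = 'S' while
-- reading STRINGVALUE_ in sorted order (forward or backward), so no rows need
-- to be removed by the class_ filter.
CREATE INDEX jbpm_varinst_class_stringvalue_idx
    ON jbpm_variableinstance (class_, stringvalue_);

-- Alternatively, a partial index if 'S' is the only class this query ever filters on:
-- CREATE INDEX ... ON jbpm_variableinstance (stringvalue_) WHERE class_ = 'S';

-- More detailed statistics for the relevant columns, then refresh planner stats.
-- (1000 is just an example target; the default is 100.)
ALTER TABLE jbpm_variableinstance ALTER COLUMN stringvalue_ SET STATISTICS 1000;
ALTER TABLE jbpm_variableinstance ALTER COLUMN name_        SET STATISTICS 1000;
ANALYZE jbpm_variableinstance;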

Upvotes: 5
